[
https://issues.apache.org/jira/browse/PIG-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohini Palaniswamy updated PIG-4627:
------------------------------------
Description: Since PIG-4495, which writes multiple inputs into the same Tez
input, self join does not produce correct results when null values are
present. The
https://issues.apache.org/jira/secure/attachment/12628162/PIG-3761-1.patch fix
of PIG-3761 is needed to handle that by comparing indexes in the raw
comparators. (was:
These issues stem from slow comparators or from bugs in comparators.
Tez uses PigTupleSortComparator and MapReduce uses
PigTupleWritableComparator on the map side for comparing tuples.
PigTupleSortComparator is very inefficient and makes group by really slow.
Since PIG-4495, which writes multiple inputs into the same Tez input, self
join does not produce correct results when null values are present. The
https://issues.apache.org/jira/secure/attachment/12628162/PIG-3761-1.patch fix
of PIG-3761 is needed to handle that by comparing indexes in the raw
comparators.)
Summary: [Pig on Tez] Self join does not handle null values correctly
(was: [Pig on Tez] Group by on multiple keys is slow and Self join does not
handle null values correctly)
> [Pig on Tez] Self join does not handle null values correctly
> ------------------------------------------------------------
>
> Key: PIG-4627
> URL: https://issues.apache.org/jira/browse/PIG-4627
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0, 0.15.1
>
> Attachments: PIG-4627-1.patch
>
>
> Since PIG-4495, which writes multiple inputs into the same Tez input, self
> join does not produce correct results when null values are present. The
> https://issues.apache.org/jira/secure/attachment/12628162/PIG-3761-1.patch
> fix of PIG-3761 is needed to handle that by comparing indexes in the raw
> comparators.
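
For readers unfamiliar with the idea of "comparing indexes" for null keys, the
following is a minimal sketch and not the code from either patch. It assumes
that each key carries the index of the input it came from and that two null
keys from different inputs must not compare as equal, since null keys do not
match in a join. All names here (NullKeyIndexDemo, IndexedKey,
BY_KEY_THEN_INDEX_FOR_NULLS) are hypothetical, and the sketch compares
deserialized objects, whereas a raw comparator would work on serialized bytes.

```java
import java.util.Comparator;

// Hypothetical illustration only; none of these names come from Pig.
final class NullKeyIndexDemo {

    /** A join key (possibly null) tagged with the index of the input it came from. */
    static final class IndexedKey {
        final Integer key;
        final int inputIndex;

        IndexedKey(Integer key, int inputIndex) {
            this.key = key;
            this.inputIndex = inputIndex;
        }
    }

    /**
     * Sorts null keys first. Two null keys are ordered by input index
     * instead of comparing equal, so nulls from different inputs never
     * end up in the same group. Non-null keys ignore the index.
     */
    static final Comparator<IndexedKey> BY_KEY_THEN_INDEX_FOR_NULLS = (a, b) -> {
        if (a.key == null && b.key == null) {
            return Integer.compare(a.inputIndex, b.inputIndex);
        }
        if (a.key == null) {
            return -1;
        }
        if (b.key == null) {
            return 1;
        }
        return a.key.compareTo(b.key);
    };

    public static void main(String[] args) {
        IndexedKey leftNull = new IndexedKey(null, 0);
        IndexedKey rightNull = new IndexedKey(null, 1);
        IndexedKey leftOne = new IndexedKey(1, 0);
        IndexedKey rightOne = new IndexedKey(1, 1);

        System.out.println("null vs null across inputs: "
                + BY_KEY_THEN_INDEX_FOR_NULLS.compare(leftNull, rightNull));
        System.out.println("1 vs 1 across inputs: "
                + BY_KEY_THEN_INDEX_FOR_NULLS.compare(leftOne, rightOne));
    }
}
```

In this sketch, equal non-null keys from the two inputs still compare as equal
and so would be grouped together for the join, while null keys from the two
inputs stay apart.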