[
https://issues.apache.org/jira/browse/PIG-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751834#action_12751834
]
Daniel Dai commented on PIG-578:
--------------------------------
One minor comment: there are tabs in LogToPhyTranslationVisitor.java and
QueryParser.jjt. We should change them to spaces. The rest looks good to me.
> join ... outer, ... outer semantics are a no-op, should produce
> corresponding null values
> ------------------------------------------------------------------------------------------
>
> Key: PIG-578
> URL: https://issues.apache.org/jira/browse/PIG-578
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.2.0
> Reporter: David Ciemiewicz
> Assignee: Pradeep Kamath
> Fix For: 0.4.0
>
> Attachments: PIG-578-2.patch, PIG-578.patch
>
>
> Currently, using the "OUTER" modifier in the JOIN statement is a no-op. The
> result of a JOIN is always an INNER join. Now that the Pig types branch
> supports null values properly, the semantics of JOIN ... OUTER, ... OUTER
> should be corrected to do proper outer joins and populate the corresponding
> empty values with nulls.
> Here's the example:
> A = load 'a.txt' using PigStorage() as ( comment, value );
> B = load 'b.txt' using PigStorage() as ( comment, value );
> --
> -- The OUTER clause is ignored in the JOIN statement and does not populate the
> -- tuple with null values as it should. Otherwise OUTER is a meaningless no-op modifier.
> --
> ABOuterJoin = join A by ( comment ) outer, B by ( comment ) outer;
> describe ABOuterJoin;
> dump ABOuterJoin;
> The file a.txt contains:
> a-only 1
> ab-both 2
> The file b.txt contains:
> ab-both 2
> b-only 3
> When you execute the script today, the dump results are:
> (ab-both,2,ab-both,2)
> The expected dump results are:
> (a-only,1,,)
> (ab-both,2,ab-both,2)
> (,,b-only,3)
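
For illustration only, the expected semantics can be sketched outside Pig. The
Python below is not Pig code and is not taken from either patch; the function
name full_outer_join and its parameters are hypothetical. It assumes two-field
tuples joined on the first field, and it assumes that a null key never matches
another key, as in SQL outer joins.

```python
from collections import defaultdict


def full_outer_join(left, right, left_width=2, right_width=2, key=0):
    """Hypothetical sketch of a full outer join on the field at index `key`.

    Unmatched rows are padded with None on the missing side.
    Assumption: a None key never matches anything, as in SQL.
    """
    # Index the right-hand rows by join key for fast lookup.
    right_by_key = defaultdict(list)
    for row in right:
        right_by_key[row[key]].append(row)

    matched_keys = set()
    joined = []

    # Emit every left row: joined with its matches, or padded with nulls.
    for l_row in left:
        matches = right_by_key.get(l_row[key], [])
        if l_row[key] is not None and matches:
            matched_keys.add(l_row[key])
            for r_row in matches:
                joined.append(l_row + r_row)
        else:
            joined.append(l_row + (None,) * right_width)

    # Emit right rows that no left row matched, padded with nulls on the left.
    for r_row in right:
        if r_row[key] is None or r_row[key] not in matched_keys:
            joined.append((None,) * left_width + r_row)

    return joined


# The sample data from the issue description.
A = [("a-only", 1), ("ab-both", 2)]
B = [("ab-both", 2), ("b-only", 3)]

for row in full_outer_join(A, B):
    print(row)
```

With the sample data this yields three tuples: the a-only row padded with nulls
on the right, the matched ab-both row, and the b-only row padded with nulls on
the left, which corresponds to the expected dump above.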
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.