[
https://issues.apache.org/jira/browse/PIG-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-4057:
----------------------------
Attachment: PIG-4057-3.patch
Address [~cheolsoo]'s review comments.
> Group All followed by CROSS with default parallelism produces wrong results
> ---------------------------------------------------------------------------
>
> Key: PIG-4057
> URL: https://issues.apache.org/jira/browse/PIG-4057
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Daniel Dai
> Fix For: 0.14.0
>
> Attachments: PIG-4057-1.patch, PIG-4057-2.patch, PIG-4057-3.patch
>
>
> SET default_parallel 199;
> ......
> by_size = ...
> uniq_vals = .....
> grpd = group uniq_vals all;
> all_vals = FOREACH grpd GENERATE uniq_vals;
> cross_result = CROSS by_size, all_vals;
> store cross_result into '/tmp/roh/cross/out/recipient_asns';
> Job1: grpd, all_vals, cross_result (The plan applies the GFCross function to
> all_vals here, assuming a cross parallelism of 1 taken from the current job,
> even though it should use the default parallelism of 199 from Job2. The
> parallelism of Job1 is 1 because of group all.)
> Job2: cross_result (the actual CROSS of by_size and all_vals)
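
To illustrate why a parallelism mismatch between the two inputs can drop rows, here is a minimal Python sketch of a grid-based cross. It is a simplified model, not Pig's GFCross code and not the patch: the names `cross_keys` and `simulate_cross` are hypothetical, and the grid size `ceil(parallelism ** (1 / num_inputs))` is an assumption of this sketch. Each tuple picks a random coordinate on its own axis and is replicated along the other axis; pairs are produced only where both inputs land in the same cell.

```python
import itertools
import math
import random
from collections import defaultdict


def cross_keys(input_index, parallelism, rng, num_inputs=2):
    """Return the grid cells that one tuple of the given input is sent to."""
    # Assumed grid size per axis; with parallelism 199 and 2 inputs this is 15.
    per_dim = math.ceil(parallelism ** (1.0 / num_inputs))
    own = rng.randrange(per_dim)  # random coordinate on this input's own axis
    # Fixed coordinate on the own axis, every coordinate on the other axes.
    axes = [[own] if d == input_index else range(per_dim)
            for d in range(num_inputs)]
    return list(itertools.product(*axes))


def simulate_cross(left, right, left_parallelism, right_parallelism, seed=0):
    """Cross two inputs whose keys were generated with the given parallelisms."""
    rng = random.Random(seed)
    cells = defaultdict(lambda: ([], []))
    for t in left:
        for key in cross_keys(0, left_parallelism, rng):
            cells[key][0].append(t)
    for t in right:
        for key in cross_keys(1, right_parallelism, rng):
            cells[key][1].append(t)
    # Emit pairs only inside cells that received tuples from both inputs.
    return [(l, r) for ls, rs in cells.values() for l in ls for r in rs]
```

In this model, when both sides use 199, every pair meets in exactly one cell and the full cross is produced. When one side is keyed with 199 and the other with 1 (as described for all_vals in Job1), the second side only reaches cell (0, 0), so tuples from the first side that picked any other row never find a match and are missing from the output.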