[
https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692379#comment-13692379
]
Daniel Dai commented on PIG-3347:
---------------------------------
This is caused by an incorrect PushUpFilter optimization. It can be worked around by disabling the PushUpFilter rule:
pig -t PushUpFilter -x local xxx.pig
Look at the logical plan:
{code}
c: (Name: LOStore Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})
|
|---b: (Name: LOForEach Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})
| |
| (Name: LOGenerate[false,false] Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})ColumnPrune:InputUids=[29, 28]ColumnPrune:OutputUids=[29, 28]
| | |
| | group:(Name: Project Type: bytearray Uid: 28 Input: 0 Column: (*))
| | |
| | a_distinct:(Name: Project Type: bag Uid: 29 Input: 1 Column: (*))
| |
| |---(Name: LOInnerLoad[0] Schema: group#28:bytearray)
| |
| |---a_distinct: (Name: LODistinct Schema: #31:bytearray)
| |
| |---1-7: (Name: LOForEach Schema: #31:bytearray)
| | |
| | (Name: LOGenerate[false] Schema: #31:bytearray)
| | | |
| | | (Name: Project Type: bytearray Uid: 31 Input: 0 Column: (*))
| | |
| | |---(Name: LOInnerLoad[0] Schema: #31:bytearray)
| |
| |---a: (Name: LOInnerLoad[1] Schema: null)
|
|---c: (Name: LOFilter Schema: group#28:bytearray,a#29:bag{#36:tuple()})
| |
| (Name: Equal Type: boolean Uid: 35)
| |
| |---(Name: UserFunc(org.apache.pig.builtin.BagSize) Type: long Uid: 32)
| | |
| | |---a:(Name: Project Type: bag Uid: 29 Input: 0 Column: 1)
| |
| |---(Name: Cast Type: long Uid: 33)
| |
| |---(Name: Constant Type: int Uid: 33)
|
|---a_group: (Name: LOCogroup Schema: group#28:bytearray,a#29:bag{#36:tuple()})
| |
| (Name: Project Type: bytearray Uid: 28 Input: 0 Column: 0)
|
|---a: (Name: LOLoad Schema: null)RequiredFields:null
{code}
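The filter is pushed in front of the foreach, which is wrong: it now tests SIZE of the raw bag a (Uid 29, Column 1) instead of a_distinct. For the sample input that bag holds two tuples, so the only group is filtered out.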
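To make the effect of the reordering concrete, here is a minimal sketch in plain Python (not Pig code) that models the two orderings on the sample input from the report. The helper names (group_by_first, distinct_first, intended_plan, pushed_up_plan) are made up for illustration and only approximate the semantics of group, nested distinct and SIZE.

```python
from collections import defaultdict


def group_by_first(rows):
    """Rough model of `group a by $0`: key -> bag (list) of tuples."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return dict(groups)


def distinct_first(bag):
    """Rough model of `distinct a.$0` inside the nested foreach."""
    return sorted({(t[0],) for t in bag})


def intended_plan(rows):
    # As written in the script: foreach first, then filter on SIZE(a_distinct).
    b = [(key, distinct_first(bag)) for key, bag in group_by_first(rows).items()]
    return [(key, d) for key, d in b if len(d) == 1]


def pushed_up_plan(rows):
    # As in the logical plan above: filter on SIZE(a) first, then foreach.
    kept = [(key, bag) for key, bag in group_by_first(rows).items() if len(bag) == 1]
    return [(key, distinct_first(bag)) for key, bag in kept]


rows = [("1",), ("1",)]  # the two input lines from the report
print(intended_plan(rows))   # [('1', [('1',)])]
print(pushed_up_plan(rows))  # []
```

The second call returns an empty list because the raw bag for key 1 has two tuples, which matches the empty output file described in the report.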
> Store invocation in local mode brings side effect
> -------------------------------------------------
>
> Key: PIG-3347
> URL: https://issues.apache.org/jira/browse/PIG-3347
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.11
> Environment: local mode
> Reporter: Sergey
>
> The problem is that an intermediate 'store' invocation changes the final store
> output. It looks like it introduces some kind of side effect. We ran the script
> in 'local' mode.
> Here is the input data:
> 1
> 1
> Here is the script:
> {code}
> a = load 'test';
> a_group = group a by $0;
> b = foreach a_group {
>     a_distinct = distinct a.$0;
>     generate group, a_distinct;
> }
> --store b into 'b';
> c = filter b by SIZE(a_distinct) == 1;
> store c into 'out';
> {code}
> We expect output to be:
> 1 1
> The output is an empty file.
> Uncomment the {code}--store b into 'b';{code} line and see the difference.
> You would get the expected output.