[
https://issues.apache.org/jira/browse/PIG-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thejas M Nair updated PIG-2316:
-------------------------------
Attachment: pig-2316-trunk-v2.txt
{code}
Applying pig-2316-trunk-v1.txt triggers another bug. For the following filter
clause, the filter plan in the MR plan is incomplete: only the (col1 != 2)
branch of the OR is present.
B = FILTER A BY ((col1==1) OR (col1 != 2));
Filter in MR plan -
B: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-11
|
|---B: Filter[bag] - scope-7
    |   |
    |   Not Equal To[boolean] - scope-10
    |   |
    |   |---Project[int][0] - scope-8
    |   |
    |   |---Constant(2) - scope-9
    |
    |---A: New For Each(false,false)[bag] - scope-6
{code}
pig-2316-trunk-v2.txt has the fix for this issue.
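One way to inspect the filter plan for a given build, sketched here as an assumption (the plan above may have come from a unit test rather than the Grunt shell), is to run EXPLAIN on the alias, reusing weird.txt from the description below:
{code}
-- weird.txt is the three-row sample file from the issue description
A = LOAD 'weird.txt' USING PigStorage(',') AS (col1:int, col2);
B = FILTER A BY ((col1 == 1) OR (col1 != 2));
-- print the logical, physical, and MR plans for B
EXPLAIN B;
{code}
Running the same script with pig -optimizer_off All, as in the description, should give an unoptimized plan to compare against.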
> Incorrect results for FILTER *** BY ( *** OR ***) with
> FilterLogicExpressionSimplifier optimizer turned on
> ----------------------------------------------------------------------------------------------------------
>
> Key: PIG-2316
> URL: https://issues.apache.org/jira/browse/PIG-2316
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.9.1
> Reporter: Huanyu Zhao
> Priority: Critical
> Fix For: 0.8.1, 0.9.2
>
> Attachments: pig-2316-trunk-v1.txt, pig-2316-trunk-v2.txt
>
>
> An example for this bug:
> cat weird.txt
> 1,a
> 2,b
> 3,c
> When running pig with the following statements:
> A = LOAD 'weird.txt' using PigStorage(',') AS (col1:int,col2);
> B = FILTER A BY ((col1==1) OR (col1 != 1));
> DUMP B;
> I expect to get all three rows back, but I receive only two rows:
> (2,b)
> (3,c)
> When we start Pig with the optimizer turned off:
> pig -optimizer_off All
> we get the expected result of three rows for the same statements:
> (1,a)
> (2,b)
> (3,c)
> --------------------------------------------------------
> This bug was tested on:
> pig-0.9.1,
> pig-0.9.0,
> pig-0.8.1,
> pig-0.8.0
> All produced the same incorrect results.
> --------------------------------------------------------
> When we looked at the logical plan for this example, we found that the
> FilterLogicExpressionSimplifier optimizer produced an incorrect logical
> plan, so we suspect that this optimizer is the cause of the bug.