Incorrect results when there is a dump in between statements. 
--------------------------------------------------------------

                 Key: PIG-153
                 URL: https://issues.apache.org/jira/browse/PIG-153
             Project: Pig
          Issue Type: Bug
         Environment: Pig + Hadoop
            Reporter: Amir Youssefi


Following scenario is with Pig + Hadoop. 

A similar run with Local Pig showed correct results.


Here is test file data/test/test2.txt: 

a1      1       5700
b1      2       2001
c2      2       

I run the following script step by step:

grunt> a = load 'data/test/test2.txt';
grunt> dump a;
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job 
---     --
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: 
[/user/amiry/dat     a/test/test2.txt:org.apache.pig.builtin.PigStorage()]
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]]
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: 
/tmp/temp135967     7959/tmp-246846292:org.apache.pig.builtin.BinStorage
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
2008-03-18 06:41:55,163 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism: 
-1
2008-03-18 06:41:57,472 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 0%
2008-03-18 06:41:58,477 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 50%
2008-03-18 06:42:04,495 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 100%
(a1, 1, 5700)
(b1, 2, 2001)
(c2, 2, )
grunt> b = filter a by $0 eq 'a1';
grunt> dump b;
2008-03-18 06:42:23,881 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job 
---     --
2008-03-18 06:42:23,881 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: 
[/tmp/temp135967     7959/tmp-246846292:org.apache.pig.builtin.BinStorage]
2008-03-18 06:42:23,881 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->[FILTER 
BY (     [PROJECT $0] eq ['a1'])]]
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: 
/tmp/temp135967     7959/tmp1851797397:org.apache.pig.builtin.BinStorage
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
2008-03-18 06:42:23,882 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism: 
-1
2008-03-18 06:42:25,938 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 0%
2008-03-18 06:42:28,946 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 50%
2008-03-18 06:42:34,963 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 100%
(a1, 1, 5700)
grunt> c = filter a by $0 eq 'b1';
grunt> dump c;
2008-03-18 06:42:59,884 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job 
---     --
2008-03-18 06:42:59,884 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: 
[/tmp/temp135967     7959/tmp1851797397:org.apache.pig.builtin.BinStorage]
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->[FILTER 
BY (     [PROJECT $0] eq ['b1'])]]
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: 
/tmp/temp135967     7959/tmp-1157182212:org.apache.pig.builtin.BinStorage
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
2008-03-18 06:42:59,885 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism: 
-1
2008-03-18 06:43:01,964 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 0%
2008-03-18 06:43:04,974 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 50%
2008-03-18 06:43:06,980 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - 
Pig      progress = 100%
grunt>


Meaning c is empty.




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to