Hi Jamin,

>> Out of bound access. Trying to access non-existent column: 8. Schema activityID:chararray,reqHost:chararray,rspPylByt:long,pylByt:long,reqTime:double,reqDur:double,rspTime:double,rspDur:double has 8 column(s).
Have you tried disabling the ColumnMapKeyPrune optimization? You can do that by adding "-t ColumnMapKeyPrune" to the command line. Also, there have been a few bug fixes to ColumnMapKeyPrune since the 0.12 release, so please try branch-0.12 <https://github.com/apache/pig/tree/branch-0.12> in the Pig repo.

Thanks,
Cheolsoo

On Sun, Mar 23, 2014 at 10:18 PM, XIAMING CHEN <[email protected]> wrote:

> I found that PIG gets confused about the schema after a complicated but
> correct nested FOREACH operation.
>
> My script is attached with no modification and it gives error messages
> below:
>
> Picked up _JAVA_OPTIONS: -Xmx1G
> 2014-03-24 13:05:18,662 [main] INFO org.apache.pig.Main - Apache Pig
> version 0.12.0 (r1529718) compiled Oct 07 2013, 12:20:14
> 2014-03-24 13:05:18,663 [main] INFO org.apache.pig.Main - Logging error
> messages to:
> /mnt/tera/workspace/OmnilabMisc/sjtuwifi/activities/pig_1395637518659.log
> 2014-03-24 13:05:18,897 [main] INFO org.apache.pig.impl.util.Utils -
> Default bootup file /home/chenxm/.pigbootup not found
> 2014-03-24 13:05:18,990 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> Connecting to hadoop file system at: file:///
> activities: {group: chararray,brief: {(activityID: chararray,reqHost:
> chararray,rspPylByt: long,pylByt: long,reqTime: double,reqDur:
> double,rspTime: double,rspDur: double)}}
> 2014-03-24 13:05:19,766 [main] WARN org.apache.pig.PigServer -
> Encountered Warning IMPLICIT_CAST_TO_DOUBLE 5 time(s).
> features: {activityID: chararray,service: chararray,volume: long,size:
> long,ADur: double,MWTime: double,MEdur: double,VMR: double,CI: double,PABw:
> double}
> 2014-03-24 13:05:19,904 [main] WARN org.apache.pig.PigServer -
> Encountered Warning IMPLICIT_CAST_TO_DOUBLE 11 time(s).
> 2014-03-24 13:05:19,904 [main] WARN org.apache.pig.PigServer -
> Encountered Warning IMPLICIT_CAST_TO_LONG 2 time(s).
> filtered: {activityID: chararray,service: chararray,volume: long,size:
> long,ADur: double,MWTime: double,MEdur: double,VMR: double,CI: double,PABw:
> double}
> 2014-03-24 13:05:20,049 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1000:
> <file
> /home/chenxm/tera/workspace/OmnilabMisc/sjtuwifi/activities/features_perf.pig,
> line 47, column 142> Out of bound access. Trying to access non-existent
> column: 8. Schema
> activityID:chararray,reqHost:chararray,rspPylByt:long,pylByt:long,reqTime:double,reqDur:double,rspTime:double,rspDur:double
> has 8 column(s).
> Details at logfile: ************/pig_1395637518659.log
> [Finished in 1.7s with exit code 6]
>
> In the output, schema of 'filtered' projection is correct but in the
> following FOREACH [line 47], PIG treats 'filtered' with another schema the
> same to 'brief' [line 16].
> I do not know why PIG is confused about this. Is this a bug or my usage in
> an incorrect way?
>
> Best,
>
> Jamin
> [email protected]
