It looks like a bug in Pig. I tried the following script:

    a = load 'data/a.txt' as (b:bag{t:tuple(f1:int,f2:int)});
    result = foreach a generate FLATTEN(b) as c;
    describe result;

The output is:

    result: {c: int,f2: int}

So c is treated as a single field of the tuple rather than as the tuple itself.

On Tue, Jul 20, 2010 at 6:00 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Hi,
>
> I would greatly appreciate somebody's help with the following pig error
> during MR
>
> all mappers fail with the following stack trace
>
> java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:152)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POAnd.getNext(POAnd.java:67)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:85)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:255)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> the pig script fragment causing this is as follows :
>
> IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings) as
> contentRating;
> IMP_F3 = filter IMP_F2 by contentRating is not null and
> contentRating.vendorId==1
>
> if i remove IMP_F3 line then the job goes thru but adding IMP_F3
> filtering causes this.
>
> describe IMP_F2 produces
>
> IMP_F2: {... ,contentRating: (vendorId: int, ... ), ... }
>
> i also tried casts like 'filter by ...
> (int)(contentRating.vendorId)==1' which did not change anything.
>
> Any ideas for workaround are appreciated.
>
> Thanks in advance.
> -Dmitriy

--
Best Regards

Jeff Zhang