I'm fairly new to Pig and am having a problem with a Pig script that works fine 
in local mode but fails in Hadoop mode. I'm using Cloudera CDH2, which 
includes Pig 0.5.0 and Hadoop 0.20.1. 

The line my script fails on is:
flattened = FOREACH joined GENERATE flatten($0) AS (session, feature, action), 
$3 AS count;

When I dump joined, each tuple looks like the following:

(fWq2XYmhvZWdAZ2FpbGJvcmRlbi5pbmZvLFE2L1RET0k4bUp1UHJRPT01yZ4kx,{(fWq2XYmhvZWdAZ2FpbGJvcmRlbi5pbmZvLFE2L1RET0k4bUp1UHJRPT01yZ4kx,mail_delete,btn)},,)

And here is what I get from describe on joined:

joined: {crossed::group: (dist_sessions::session: bytearray,dist_actions::feature: chararray,dist_actions::action: chararray),crossed::crossed: {dist_sessions::session: bytearray,dist_actions::feature: chararray,dist_actions::action: chararray},counted::id: (session: bytearray,feature: chararray,action: chararray),counted::count: long}

The exception I get is:

Backend error message
---------------------
java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.Tuple
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:309)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:418)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:386)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:366)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:238)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Any help would be appreciated.

Thanks,
Jon
