Hi, I have some code that looks like this:

    top_hits = foreach regrouped {
        result = TOP(1, 6, projected_joined_albums); -- field 6 = score
        generate flatten(result);
    };

I'm not too keen on the TOP syntax because it's opaque and you need the comment there to explain what's going on.

I've seen the same thing achieved like so, in a more transparent way, and in fact I've used this in other cases myself:

    top_hits = foreach regrouped {
        sorted = order projected_joined_albums by score desc;
        result = limit sorted 1;
        generate flatten(result);
    };

However, although the first form works for me, the second dies with the following error:

    java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:138)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:291)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:355)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:433)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        (etc.)

Is there a reason why it would fail in this case? I can't understand the meaning of the error; it'd be nice if it reported *which* Tuple was failing a cast.

regrouped has the following schema:

    {group: (artistid: int,country: int,week: chararray),projected_joined_albums: {joined_albums_2::joined_albums_1::flattened_albums::key: (artistid: int,country: int,week: chararray),joined_albums_2::joined_albums_1::flattened_albums::timestamp: long,joined_albums_2::joined_albums_1::flattened_albums::albumid: int,track_counts::numtracks: long,joined_albums_2::reach::reach: int,joined_albums_2::joined_albums_1::album_titles::title_len: long,score: long}}

That's a bit complex, so I tried extracting the individual fields with a foreach .. generate beforehand, which gives this schema:

    {group: (artistid: int,country: int,week: chararray),projected_joined_albums: {key: (artistid: int,country: int,week: chararray),timestamp: long,albumid: int,numtracks: long,reach: int,title_len: long,score: long}}

It didn't affect the error, though.

Thanks for any suggestions,

Andrew.

--
http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg