Hi! Are you sure about your types? Can you add a DESCRIBE statement for each relation before the line that causes the error?
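A minimal sketch of that check, reusing the relation names from the script quoted below. Where exactly the statements go is an assumption, and the comments only say what to look for, not what Pig will print:

```pig
-- Print the schema Pig has inferred for each relation that feeds the filter.
describe t;
describe rt;
describe j;
describe j3;   -- check which type is reported for the isrt field

-- The statement that fails at runtime according to the stack trace.
j4 = filter j3 by (isrt > 0);
```

If the type reported for isrt is not int, the mismatch comes from before the filter, not from the comparison itself.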
Ruslan

On Wed, Sep 19, 2012 at 4:22 PM, Björn-Elmar Macek <[email protected]> wrote:
> Hi,
>
> during execution of the following Pig script I ran into the class cast
> exception mentioned in the title of this mail. The log indicates that
> the error is happening in the reduce process, and I somehow have the
> feeling that the problem exists because of my UDF, since the error
> happens in GreaterThanExpr.java:106, which could only occur in the step
> that defines variable "j4". BUT ...
> 1. My UDF returns an int (both in code and as defined by @outputSchema)
> 2. The returned value is compared to an int (so this should not be the
> problem)
>
>
> Is there anybody who sees a solution and a reason for this behaviour?
>
> Best regards,
> Elmar
>
>
> ##########################################
> register 'myUDF.py' using jython as moins;
>
> t = load 'rt_bow_notag';
> rt = load 'rt_only_notag';
> j = join t by $1, rt by $2;
>
> j3 = foreach j generate $0 as ots, $1 as oauthor, $2 as omention, $3 as
> otag1, $4 as otag2, $5 as ourl, $6 as ort, $7 as oown, $8 as omsg, $9 as
> rtts, $10 as rtauthor, $11 as rtmention, moins.isRT($8,$17) as isrt;
> j4 = filter j3 by (isrt > 0);
> j5 = limit j4 5;
>
> dump j5;
> ###########################################
>
> The UDF "isRT" is defined in myUDF.py as follows:
> ###########################################
> @outputSchema("isRT:int")
> def isRT(bag1, bag2):
>     if bag1 is None or bag2 is None: return 0;
>     intersection = set(bag1) & set(bag2)
>     intersection_size = len(intersection)
>     size1 = len(set(bag1));
>     size2 = len(set(bag2));
>     max_size = max([size1,size2])
>     if max_size == 0: return 1
>     if intersection_size/max_size > 0.5: return 1
>     return 0
> ############################################
>
> And finally the stack trace looks like the following:
> ############################################
> 2012-09-19 13:35:18,358 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 18% complete
> 2012-09-19 13:38:25,393 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
> 2012-09-19 13:41:33,896 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2012-09-19 13:41:38,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201209181449_0024 has failed! Stop running all dependent jobs
> 2012-09-19 13:41:38,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2012-09-19 13:41:38,683 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backed error: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
>     at java.lang.Double.compareTo(Double.java:49)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.doComparison(GreaterThanExpr.java:106)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:74)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:117)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:460)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:428)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:400)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262)
>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> 2012-09-19 13:41:38,684 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2012-09-19 13:41:38,685 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
>
