Hi,

During execution of the following Pig script I ran into the ClassCastException mentioned in the title of this mail. The log indicates that the error happens in the reduce phase, and I have the feeling that my UDF is involved, since the error is raised at GreaterThanExpr.java:106, which can only be reached in the step that defines the relation "j4". But:
1. My UDF returns an int (both in code and as defined by @outputSchema)
2. The returned value is compared to an int (so this should not be the problem)


Does anybody see a reason for this behaviour, and a solution?

Best regards,
Elmar


##########################################
register 'myUDF.py' using jython as moins;

t = load 'rt_bow_notag';
rt = load 'rt_only_notag';
j = join t by $1, rt by $2;

j3 = foreach j generate $0 as ots, $1 as oauthor, $2 as omention, $3 as otag1, $4 as otag2, $5 as ourl, $6 as ort, $7 as oown, $8 as omsg, $9 as rtts, $10 as rtauthor, $11 as rtmention, moins.isRT($8,$17) as isrt;
j4 = filter j3 by (isrt > 0);
j5 = limit j4 5;

dump j5;
###########################################

The UDF "isRT" is defined in myUDF.py as the following:
###########################################
@outputSchema("isRT:int")
def isRT(bag1, bag2):
    if bag1 is None or bag2 is None: return 0;
    intersection = set(bag1) & set(bag2)
    intersection_size = len(intersection)
    size1 = len(set(bag1));
    size2 = len(set(bag2));
    max_size = max([size1,size2])
    if max_size == 0: return 1
    if intersection_size/max_size > 0.5: return 1
    return 0
############################################
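A side note on the ratio test in the UDF above, sketched under the assumption that the script runs on Jython 2.x: there, "/" between two ints is integer division, so intersection_size/max_size is either 0 or 1 and the "> 0.5" test only passes when the two sets have identical size and content. This may be unrelated to the ClassCastException itself. The names overlap_ratio and is_rt below are hypothetical, and the inputs are assumed to be iterables of hashable items.

```python
def overlap_ratio(items1, items2):
    """Share of distinct common items, relative to the larger of the two sets."""
    set1, set2 = set(items1), set(items2)
    max_size = max(len(set1), len(set2))
    if max_size == 0:
        # Two empty inputs count as a full match, as in the original UDF.
        return 1.0
    # float() forces true division on Python 2 / Jython 2.x as well as Python 3.
    return float(len(set1 & set2)) / max_size


def is_rt(items1, items2):
    """Hypothetical variant of isRT: 1 if more than half of the items overlap, else 0."""
    if items1 is None or items2 is None:
        return 0
    if overlap_ratio(items1, items2) > 0.5:
        return 1
    return 0
```

With plain integer division, two inputs sharing 2 of 3 distinct items would give 2/3 == 0 and be rejected; with the float conversion the ratio is about 0.67 and the pair is accepted.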

And finally the stacktrace looks like the following:
############################################
2012-09-19 13:35:18,358 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 18% complete
2012-09-19 13:38:25,393 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2012-09-19 13:41:33,896 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2012-09-19 13:41:38,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201209181449_0024 has failed! Stop running all dependent jobs
2012-09-19 13:41:38,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2012-09-19 13:41:38,683 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backed error: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
    at java.lang.Double.compareTo(Double.java:49)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.doComparison(GreaterThanExpr.java:106)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:74)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:117)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:460)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:428)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:400)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

2012-09-19 13:41:38,684 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2012-09-19 13:41:38,685 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

