Thanks. I made a last-ditch effort and bounced my cluster. The error went away;
must be a Cloudera gremlin.

Thanks for the suggestions and help.

Best,
Steven

On Apr 24, 2014, at 12:25 PM, Darpan R <[email protected]> wrote:

> Please do a sanity check of the data: colA2 might not be castable to
> numeric for one or more records.
> 
> 
> 
> 
> On 24 April 2014 22:24, Pradeep Gollakota <[email protected]> wrote:
> 
>> What's the LoadFunc you're using?
>> 
>> 
>> On Thu, Apr 24, 2014 at 9:28 AM, Swapnil Shinde <[email protected]> wrote:
>> 
>>> I am facing a very weird problem with multiplication.
>>> Simplified Pig code snippet:
>>> A = LOAD 'file_A' AS (colA1 : double, colA2 : double);
>>> describe A;
>>>     *A: {colA1: double,colA2: double}*
>>> B = LOAD 'file_B' AS (colB1 : double, colB2 : double);
>>> describe B;
>>>     *B: {colB1: double,colB2: double}*
>>> 
>>> joined = JOIN A BY (colA1) LEFT OUTER, B BY (colB1) USING 'replicated';
>>> SPLIT joined INTO split1 IF B::colB1 IS NOT NULL,
>>>                   split2 IF (B::colB1 IS NULL AND A::colA2 == 2),
>>>                   split3 IF (B::colB1 IS NULL AND A::colA2 != 2);
>>> describe split1;
>>>     *split1: {A::colA1: double,A::colA2: double,B::colB1: double,B::colB2: double}*
>>> 
>>> 
>>> D = FOREACH split1 GENERATE (A::colA1 * B::colB1) AS newCol;
>>> 
>>> *Error-*
>>> 2014-04-24 10:02:30,458 [main] ERROR
>>> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 0: Exception while
>>> executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3
>>> Operator Key: scope-3) children: null at []], [POCast (Name: Cast[double] -
>>> scope-5 Operator Key: scope-5) children: [[ConstantExpression (Name:
>>> Constant(3) - scope-4 Operator Key: scope-4) children: null at []]] at
>>> []]] at []]: java.lang.ClassCastException:
>>> org.apache.pig.data.DataByteArray cannot be cast to java.lang.Number
>>> 
>>> Stack trace-
>>> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception
>>> while executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3
>>> Operator Key: scope-3) children: null at []], [POCast (Name: Cast[double] -
>>> scope-5 Operator Key: scope-5) children: [[ConstantExpression (Name:
>>> Constant(3) - scope-4 Operator Key: scope-4) children: null at []]] at
>>> []]] at []]: java.lang.ClassCastException:
>>> org.apache.pig.data.DataByteArray cannot be cast to java.lang.Number
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:681)
>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>>>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at javax.security.auth.Subject.doAs(Subject.java:396)
>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray
>>> cannot be cast to java.lang.Number
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.genericGetNext(Multiply.java:89)
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.getNextDouble(Multiply.java:104)
>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:317)
>>>   ... 13 more
>>> 
>>> 
>>> I tried the options below, but no luck:
>>> 1) Doing addition instead of multiplication; I get a similar error.
>>> 2) I verified that multiplication of doubles works with a few sample files.
>>> 3) I tried casting to double again before the multiplication.
>>> 4) I tried storing the result before multiplication and loading it back;
>>> still the same error.
>>> 
>>> I am not sure why it's throwing a ClassCastException when the schema has
>>> double as the data type.
>>> Please let me know if you need any further information or if I am missing
>>> something in the above simplified snippet.
>>> Any help is very much appreciated.
>>> 
>>> Thanks
>>> 
>> 
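
For anyone hitting the same error, the data sanity check suggested in the
thread could be sketched roughly as below. This is only a sketch under
assumptions: it relies on the default loader, reuses the 'file_A' path and
column names from the simplified snippet, and the relation names A_raw and
A_bad are made up for illustration. It also relies on Pig returning null when
a chararray value cannot be cast to double.

```pig
-- Load both columns as text so that no numeric cast happens at load time.
A_raw = LOAD 'file_A' AS (colA1:chararray, colA2:chararray);

-- Keep rows where a value is present but its cast to double comes back null,
-- i.e. the text is not a parseable number.
A_bad = FILTER A_raw BY
    (colA1 IS NOT NULL AND (double)colA1 IS NULL) OR
    (colA2 IS NOT NULL AND (double)colA2 IS NULL);

-- Inspect the offending records, if any.
DUMP A_bad;
```

If A_bad comes back empty, the data itself is probably not the cause, and the
LoadFunc or the cluster state is the next thing to look at.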
