Swapnil, sorry, I only partially saw the title and thought Darpan/Pradeep were responding to my earlier post. My problem was not the same as yours.

Best,
Steven

On Apr 24, 2014, at 2:11 PM, Swapnil Shinde <[email protected]> wrote:

> Thanks for reply..
> @ Pradeep - I am using PigStorage load function.
> @ Darpan - I forgot to mention but I made sure that all values in columns
> are numeric and can be cast to double.
> @ Steven - Could you please explain more what resolved your error?
>
> Thanks
>
> On Thu, Apr 24, 2014 at 2:59 PM, Steven E. Waldren <[email protected]> wrote:
>
>> Thanks I made a last ditch effort and bounced my cluster. The error went
>> away must be Cloudera gremlin.
>>
>> Thanks for the suggestions and help.
>>
>> Best,
>> Steven
>>
>> On Apr 24, 2014, at 12:25 PM, Darpan R <[email protected]> wrote:
>>
>>> Please do a sanity check of the data: colA2 might not be cast-able to
>>> numeric for one or more records.
>>>
>>> On 24 April 2014 22:24, Pradeep Gollakota <[email protected]> wrote:
>>>
>>>> What's the LoadFunc you're using?
>>>>
>>>> On Thu, Apr 24, 2014 at 9:28 AM, Swapnil Shinde <[email protected]> wrote:
>>>>
>>>>> I am facing very weird problem while multiplication.
>>>>> Pig simplified code snippet-
>>>>> A = LOAD 'file_A' AS (colA1 : double, colA2 : double);
>>>>> describe A;
>>>>> *A: {colA1: double,colA2: double}*
>>>>> B = LOAD 'file_B' AS (colB1 : double, colB2 : double);
>>>>> describe B;
>>>>> *B: {colB1: double,colB2: double}*
>>>>>
>>>>> joined = JOIN A BY (colA1) LEFT OUTER, B BY (colB1) USING 'replicated';
>>>>> SPLIT joined INTO split1 IF A::colB1 IS NOT NULL,
>>>>>                   split2 IF (A::colB1 IS NULL AND A::colA2 == 2),
>>>>>                   split3 IF (A::colB1 IS NULL AND A::colA2 != 2);
>>>>> describe split1;
>>>>> *split1: {A::colA1: double,A::colA2: double,B::colB1: double,B::colB2: double}*
>>>>>
>>>>> D = FOREACH split1 GENERATE (A::colA1 * B::colB1) AS newCol;
>>>>>
>>>>> *Error-*
>>>>> 2014-04-24 10:02:30,458 [main] ERROR
>>>>> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 0: Exception while
>>>>> executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
>>>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3 Operator
>>>>> Key: scope-3) children: null at []], [POCast (Name: Cast[double] - scope-5
>>>>> Operator Key: scope-5) children: [[ConstantExpression (Name: Constant(3) -
>>>>> scope-4 Operator Key: scope-4) children: null at []]] at []]] at []]:
>>>>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
>>>>> cast to java.lang.Number
>>>>>
>>>>> Stack trace-
>>>>> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception
>>>>> while executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
>>>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3 Operator
>>>>> Key: scope-3) children: null at []], [POCast (Name: Cast[double] - scope-5
>>>>> Operator Key: scope-5) children: [[ConstantExpression (Name: Constant(3) -
>>>>> scope-4 Operator Key: scope-4) children: null at []]] at []]] at []]:
>>>>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
>>>>> cast to java.lang.Number
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
>>>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
>>>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>>>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>>>>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:681)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>>>> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray
>>>>> cannot be cast to java.lang.Number
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.genericGetNext(Multiply.java:89)
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.getNextDouble(Multiply.java:104)
>>>>> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:317)
>>>>> ... 13 more
>>>>>
>>>>> I tried below options but no luck-
>>>>> 1) Doing addition instead of multiplication and I get similar error.
>>>>> 2) I verified multiplication for double works with few sample files.
>>>>> 3) I tried casting it again to double before multiplication too.
>>>>> 4) I tried storing result before multiplication and loading it back. still
>>>>> same error.
>>>>>
>>>>> I am not sure why it's throwing classCastException when schema has double
>>>>> as data type.
>>>>> Please let me know if need any further information or missing something in
>>>>> above simplified snippet.
>>>>> Any help is very much appreciated.
>>>>>
>>>>> Thanks
