One possibility off the top of my head is that the delimiter might be
wrong. PigStorage splits on tabs by default, so can you try passing the
correct delimiter to PigStorage explicitly?

E.g., for CSV files:

A = LOAD 'file_A' USING PigStorage(',') AS (colA1 : double, colA2 : double);
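
If you want to check whether the delimiter really is the culprit, something
along these lines might help. It is only a rough sketch that I have not run
against your data: the aliases raw, first_rows and typed are names I made up,
and the ',' delimiter is an assumption about how file_A is laid out.

```pig
-- Load without a schema so every field stays a bytearray.
-- The ',' delimiter is an assumption; use whatever file_A actually uses.
raw = LOAD 'file_A' USING PigStorage(',');

-- Look at a few rows to see how many fields each line was split into.
first_rows = LIMIT raw 5;
DUMP first_rows;

-- Cast explicitly after loading, then confirm the resulting schema.
typed = FOREACH raw GENERATE (double)$0 AS colA1, (double)$1 AS colA2;
DESCRIBE typed;
```

If the DUMP shows each whole line sitting in a single field, the delimiter
passed to PigStorage does not match the file.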



On Thu, Apr 24, 2014 at 12:48 PM, Steven E. Waldren <[email protected]> wrote:

> Swapnil, sorry, I only partially saw the title and thought Darpan/Pradeep were
> responding to my earlier post. My problem was not the same as yours.
>
> Best,
> Steven
>
> On Apr 24, 2014, at 2:11 PM, Swapnil Shinde <[email protected]>
> wrote:
>
> > Thanks for the replies.
> > @ Pradeep - I am using the PigStorage load function.
> > @ Darpan - I forgot to mention it, but I made sure that all values in the
> > columns are numeric and can be cast to double.
> > @ Steven - Could you please explain more about what resolved your error?
> >
> > Thanks
> >
> >
> >
> > On Thu, Apr 24, 2014 at 2:59 PM, Steven E. Waldren <[email protected]> wrote:
> >
> >> Thanks. I made a last-ditch effort and bounced my cluster. The error went
> >> away; it must be a Cloudera gremlin.
> >>
> >> Thanks for the suggestions and help.
> >>
> >> Best,
> >> Steven
> >>
> >> On Apr 24, 2014, at 12:25 PM, Darpan R <[email protected]> wrote:
> >>
> >>> Please do a sanity check of the data: colA2 might not be castable to
> >>> numeric for one or more records.
> >>>
> >>>
> >>>
> >>>
> >>> On 24 April 2014 22:24, Pradeep Gollakota <[email protected]> wrote:
> >>>
> >>>> What's the LoadFunc you're using?
> >>>>
> >>>>
> >>>> On Thu, Apr 24, 2014 at 9:28 AM, Swapnil Shinde <[email protected]> wrote:
> >>>>
> >>>>> I am facing a very weird problem with multiplication.
> >>>>> Simplified Pig code snippet:
> >>>>> A = LOAD 'file_A' AS (colA1 : double, colA2 : double);
> >>>>> describe A;
> >>>>>    *A: {colA1: double,colA2: double}*
> >>>>> B = LOAD 'file_B' AS (colB1 : double, colB2 : double);
> >>>>> describe B;
> >>>>>    *B: {colB1: double,colB2: double}*
> >>>>>
> >>>>> joined = JOIN A BY (colA1) LEFT OUTER, B BY (colB1) USING 'replicated';
> >>>>> SPLIT joined INTO split1 IF B::colB1 IS NOT NULL,
> >>>>>                   split2 IF (B::colB1 IS NULL AND A::colA2 == 2),
> >>>>>                   split3 IF (B::colB1 IS NULL AND A::colA2 != 2);
> >>>>> describe split1;
> >>>>>    *split1: {A::colA1: double,A::colA2: double,B::colB1: double,B::colB2: double}*
> >>>>>
> >>>>>
> >>>>> D = FOREACH split1 GENERATE (A::colA1 * B::colB1) AS newCol;
> >>>>>
> >>>>> *Error-*
> >>>>> 2014-04-24 10:02:30,458 [main] ERROR
> >>>>> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 0: Exception while
> >>>>> executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
> >>>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3 Operator
> >>>>> Key: scope-3) children: null at []], [POCast (Name: Cast[double] - scope-5
> >>>>> Operator Key: scope-5) children: [[ConstantExpression (Name: Constant(3) -
> >>>>> scope-4 Operator Key: scope-4) children: null at []]] at []]] at []]:
> >>>>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> >>>>> cast to java.lang.Number
> >>>>>
> >>>>> Stack trace-
> >>>>> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception
> >>>>> while executing [Multiply (Name: Multiply[double] - scope-6 Operator Key:
> >>>>> scope-6) children: [[POProject (Name: Project[double][1] - scope-3 Operator
> >>>>> Key: scope-3) children: null at []], [POCast (Name: Cast[double] - scope-5
> >>>>> Operator Key: scope-5) children: [[ConstantExpression (Name: Constant(3) -
> >>>>> scope-4 Operator Key: scope-4) children: null at []]] at []]] at []]:
> >>>>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be
> >>>>> cast to java.lang.Number
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
> >>>>>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >>>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:681)
> >>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> >>>>>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >>>>>   at java.security.AccessController.doPrivileged(Native Method)
> >>>>>   at javax.security.auth.Subject.doAs(Subject.java:396)
> >>>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >>>>>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >>>>> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray
> >>>>> cannot be cast to java.lang.Number
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.genericGetNext(Multiply.java:89)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Multiply.getNextDouble(Multiply.java:104)
> >>>>>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:317)
> >>>>>   ... 13 more
> >>>>>
> >>>>>
> >>>>> I tried the options below, but no luck:
> >>>>> 1) Doing addition instead of multiplication; I get a similar error.
> >>>>> 2) I verified that multiplication of doubles works with a few sample files.
> >>>>> 3) I tried casting to double again before the multiplication.
> >>>>> 4) I tried storing the result before the multiplication and loading it
> >>>>> back; still the same error.
> >>>>>
> >>>>> I am not sure why it's throwing a ClassCastException when the schema has
> >>>>> double as the data type.
> >>>>> Please let me know if you need any further information or if I am missing
> >>>>> something in the above simplified snippet.
> >>>>> Any help is very much appreciated.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>
> >>
> >>
>
>
