[ 
https://issues.apache.org/jira/browse/PIG-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Botta updated PIG-3987:
--------------------------------

    Description: 
I ran into a very strange issue with one of my Pig scripts. I described it in this SO question:
http://stackoverflow.com/questions/24047572/strange-cast-error-in-pig-hadoop

Here it is. I have the following script:

{code}
    br = LOAD 'cfs:///somefile';

    SPLIT br INTO s0 IF (sp == 1), not_s0 OTHERWISE;
    SPLIT not_s0 INTO s1 IF (adp >= 1.0), not_s1 OTHERWISE;
    SPLIT not_s1 INTO s2 IF (p > 1L), not_s2 OTHERWISE;
    SPLIT not_s2 INTO s3 IF (s > 0L), s4 OTHERWISE;

    tmp0 = FOREACH s0 GENERATE b, 'x' as seg;
    tmp1 = FOREACH s1 GENERATE b, 'y' as seg;
    tmp2 = FOREACH s2 GENERATE b, 'z' as seg;
    tmp3 = FOREACH s3 GENERATE b, 'w' as seg;
    tmp4 = FOREACH s4 GENERATE b, 't' as seg;

    out = UNION ONSCHEMA tmp0, tmp1, tmp2, tmp3, tmp4;

    dump out;
{code}

The file loaded into `br` was generated by a previous Pig script and carries an embedded schema (a .pig_schema file):

{code}
    describe br
    br: {b: chararray,p: long,afternoon: long,ddv: long,pa: long,
         t0002: long,t0204: long,t0406: long,t0608: long,t0810: long,
         t1012: long,t1214: long,t1416: long,t1618: long,t1820: long,
         t2022: long,t2200: long,browser_software: chararray,
         first_timestamp: long,last_timestamp: long,os: chararray,
         platform: chararray,sp: int,adp: double}
{code}

Some irrelevant fields have been removed from the output above (I can't fully disclose the nature of the data at this time).
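
For reference, a load with this schema spelled out inline would look roughly as follows. This is only an abridged sketch limited to the fields referenced in the script, and it assumes the default PigStorage loader; the actual script relies on the embedded .pig_schema instead:

{code}
    -- sketch only: explicit inline schema instead of the embedded .pig_schema
    br = LOAD 'cfs:///somefile' USING PigStorage()
         AS (b: chararray, p: long, sp: int, adp: double);
{code}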

The script fails with the following error:

{code}
    ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: java.lang.Integer cannot be cast to java.lang.Long
{code}

However, dumping `s0`, `s1`, `s2`, `s3`, `s4` or `tmp0`, `tmp1`, `tmp2`, `tmp3`, `tmp4` works flawlessly.

The Hadoop job tracker shows the following error 4 times:

{code}
    java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
        at java.lang.Long.compareTo(Long.java:50)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.EqualToExpr.doComparison(EqualToExpr.java:116)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.EqualToExpr.getNext(EqualToExpr.java:83)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.getNext(POSplit.java:214)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.runPipeline(POSplit.java:254)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.processPlan(POSplit.java:236)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.getNext(POSplit.java:228)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:260)
{code}

I also tried this snippet (instead of the original `dump`):

{code}
    x = UNION s1,s2;
    y = FOREACH x GENERATE b;
    dump y;
{code}

and I get a different (but I assume related) error:

{code}
    ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: java.lang.Double cannot be cast to java.lang.Long
{code}

with the job tracker error (repeated 4 times):

{code}
    java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Long
        at java.lang.Long.compareTo(Long.java:50)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GTOrEqualToExpr.doComparison(GTOrEqualToExpr.java:111)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GTOrEqualToExpr.getNext(GTOrEqualToExpr.java:78)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:141)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.runPipeline(POSplit.java:254)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.processPlan(POSplit.java:236)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.getNext(POSplit.java:228)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:260)
{code}

Is this a known issue or a new one? Is there a workaround?
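
One thing I could try, though I have not verified that it helps, is making the comparison types explicit in the SPLIT conditions. This is only a sketch, using casts on the fields from the schema above:

{code}
    -- sketch only: cast the fields so the comparison operands have matching types
    SPLIT br INTO s0 IF ((int)sp == 1), not_s0 OTHERWISE;
    SPLIT not_s0 INTO s1 IF ((double)adp >= 1.0), not_s1 OTHERWISE;
{code}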

  was:
I ran into a very strange issue with one of my pig scripts. I described it in 
this SO: 
http://stackoverflow.com/questions/24047572/strange-cast-error-in-pig-hadoop 
(please don't make me retype it! :)).
Is this known or a new one? Is there a work around?


> Strange cast error with UNION
> -----------------------------
>
>                 Key: PIG-3987
>                 URL: https://issues.apache.org/jira/browse/PIG-3987
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.10.1
>            Reporter: Giovanni Botta
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
