Hongchang Li created PIG-3938:
---------------------------------

             Summary: Type cast doesn't work after flatten result of UDF
                 Key: PIG-3938
                 URL: https://issues.apache.org/jira/browse/PIG-3938
             Project: Pig
          Issue Type: Bug
          Components: internal-udfs
    Affects Versions: 0.11.1, 0.12.0
            Reporter: Hongchang Li


This issue is very similar to
http://stackoverflow.com/questions/8828839/how-can-correct-data-types-on-apache-pig-be-enforced.
To reproduce it, we first need a UDF that converts a map to a bag. The code is almost identical to
http://stackoverflow.com/questions/12476929/group-key-value-of-map-in-pig?answertab=votes#tab-top

{code:title=test.pig}
$ cat test.pig
register polisan/maptobag.jar;
define MAPTOBAG maptobag.MAPTOBAG();
A = load 'polisan/input1.txt' using PigStorage(' ') as (id:chararray, kv:[]);
B = foreach A generate id, MAPTOBAG(kv) as to_bag;
C = foreach B generate id, flatten(to_bag) as (key:chararray, value:chararray);
D = group C by (id, key);
E = foreach D generate group, MIN(C.value);
dump E;
{code}

{code:title=polisan/input1.txt}
1 [x#1,y#ab]
1 [x#2,y#cd]
{code}

Running the script then fails with the following exception:
{noformat}
2014-05-15 19:44:52,944 [Thread-2] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: D: Local Rearrange[tuple]{tuple}(false) - scope-42 Operator Key: scope-42): org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
        at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:81)
        at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:1)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:352)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:391)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
        ... 8 more
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.String
        at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:73)
        ... 15 more
{noformat}
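
The innermost "Caused by" in the trace above suggests that the values reaching StringMin are still DataByteArray objects at runtime, even though the schema after the flatten declares value as chararray. The following is only a minimal, self-contained Java sketch of that failure mode, not Pig code: FakeByteArray is a hypothetical stand-in for org.apache.pig.data.DataByteArray, and blindCast, safeConvert and blindCastFails are illustrative names that do not exist in Pig.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class CastDemo {
    // Hypothetical stand-in for org.apache.pig.data.DataByteArray (NOT the real class).
    static final class FakeByteArray {
        private final byte[] data;

        FakeByteArray(String s) {
            this.data = s.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String toString() {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Trusts the declared type and casts without checking the runtime type.
    static String blindCast(Object value) {
        return (String) value;
    }

    // Converts through toString() instead of relying on the declared type.
    static String safeConvert(Object value) {
        return value == null ? null : value.toString();
    }

    // Returns true when the blind cast throws ClassCastException.
    static boolean blindCastFails(Object value) {
        try {
            blindCast(value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Mirrors the first input row, [x#1,y#ab], with byte-array values.
        Map<String, Object> kv = new LinkedHashMap<>();
        kv.put("x", new FakeByteArray("1"));
        kv.put("y", new FakeByteArray("ab"));

        for (Map.Entry<String, Object> e : kv.entrySet()) {
            System.out.println(e.getKey()
                    + " blind cast fails: " + blindCastFails(e.getValue())
                    + ", converted: " + safeConvert(e.getValue()));
        }
    }
}
```

If this reading of the trace is right, the declared type and the runtime object disagree in the same way: the schema says chararray, but no conversion was applied to the values produced by the UDF.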






--
This message was sent by Atlassian JIRA
(v6.2#6252)
