Casts to complex types do not work as expected
----------------------------------------------

                 Key: PIG-616
                 URL: https://issues.apache.org/jira/browse/PIG-616
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Santhosh Srinivasan
             Fix For: types_branch


When we specify a (complex) type as a column in Pig, the TypeCastInserter 
inserts the appropriate cast for the (complex) type. However, in the 
implementation of POCast.java, when databyte arrays are converted to the 
(complex) types, we invoke the bytesToXXX method. 

For complex types, especially tuples and bags, we do not enforce the typing 
information specified by the user in the AS clause or with the explicit cast 
statement. The implementation solely relies on bytesToXXX to figure out the 
right type.

An example of a query that fails is given below. Wrt the query, the data is a 
single column that is a bag of integers. The user specifies this bag to be a 
bag of chararray. This conversion is allowed in pig but the implementation does 
not perform the actual cast. Instead the bytesToBag is called on the stream. 
The resulting type is a bag of integers and not a bag of chararray. In the 
subsequent statement the user (correctly) assumes that the conversion has been 
performed but in reality it has not been done. At run time when a chararray 
based operation is performed we see a ClassCastException.

The notion of a schema has is absent in the physical operators. The 
schema/fieldSchema in the logical layer has to be passed on to the physical 
layer. The schema can be used to perform additional operations like casting, 
etc.

{code}

grunt> cat bag.data
{(1)}

grunt> a = load 'bag.data' as (b:{t:(c:chararray)});
grunt> b = foreach a generate flatten(b);
grunt> c = foreach b generate CONCAT('Hello ', $0);
grunt> dump c;

2009-01-12 10:44:44,417 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 0% complete
2009-01-12 10:45:09,439 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- Map reduce job failed
2009-01-12 10:45:09,440 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- Job failed!
2009-01-12 10:45:09,443 [main] ERROR 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error 
message from task (map) 
task_200812151518_9681_m_000000java.lang.ClassCastException: java.lang.Integer 
cannot be cast to java.lang.String
        at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:37)
        at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:31)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:185)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:259)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:271)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:187)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:175)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
...

2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
1066: Unable to open iterator for alias c

2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
org.apache.pig.impl.logicalLayer.FrontendException: Unable to open iterator for 
alias c
        at org.apache.pig.PigServer.openIterator(PigServer.java:426)
        at 
org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:271)
        at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:72)
        at org.apache.pig.Main.main(Main.java:302)

Caused by: java.io.IOException: Job terminated with anomalous status FAILED
        at org.apache.pig.PigServer.openIterator(PigServer.java:420)
        ... 5 more

{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to