[ 
https://issues.apache.org/jira/browse/PIG-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1371:
--------------------------------

    Attachment: PIG-1371-partial.patch

partial patch - attaching here for future reference

> Pig should handle deep casting of complex types 
> ------------------------------------------------
>
>                 Key: PIG-1371
>                 URL: https://issues.apache.org/jira/browse/PIG-1371
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Pradeep Kamath
>             Fix For: 0.8.0
>
>         Attachments: PIG-1371-partial.patch
>
>
> Consider input data in BinStorage format that has a field of bag type, 
> bg:{t:(i:int)}. If the schema in the load statement declares this field as 
> bg:{t:(c:chararray)}, Pig currently treats the field as having the declared 
> type (bg:{t:(c:chararray)}), but it makes no deep cast from bag of int (the 
> real data) to bag of chararray (the user-specified schema).
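
To make the mismatch concrete, here is a minimal Java sketch of why a comparison that looks only at the top-level type sees no difference between the two schemas above, while a recursive comparison does. `FieldSketch`, `needsCastShallow`, and `needsCastDeep` are hypothetical names invented for this illustration; they are not Pig's FieldSchema or TypeCastInserter, and the real classes may be structured quite differently.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a field schema:
// a type name plus the schemas of any inner fields.
final class FieldSketch {
    final String type;              // e.g. "bag", "tuple", "int", "chararray"
    final List<FieldSketch> inner;  // empty for simple types

    FieldSketch(String type, FieldSketch... inner) {
        this.type = type;
        this.inner = List.of(inner);
    }

    // Shallow check: compares only the top-level type.
    static boolean needsCastShallow(FieldSketch loaded, FieldSketch declared) {
        return !loaded.type.equals(declared.type);
    }

    // Deep check: also walks the inner schemas recursively.
    static boolean needsCastDeep(FieldSketch loaded, FieldSketch declared) {
        if (!loaded.type.equals(declared.type)) return true;
        if (loaded.inner.size() != declared.inner.size()) return true;
        for (int i = 0; i < loaded.inner.size(); i++) {
            if (needsCastDeep(loaded.inner.get(i), declared.inner.get(i))) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // bg:{t:(i:int)} as presented by the loader
        FieldSketch loaded = new FieldSketch("bag",
                new FieldSketch("tuple", new FieldSketch("int")));
        // bg:{t:(c:chararray)} as declared by the user
        FieldSketch declared = new FieldSketch("bag",
                new FieldSketch("tuple", new FieldSketch("chararray")));

        System.out.println(needsCastShallow(loaded, declared)); // false
        System.out.println(needsCastDeep(loaded, declared));    // true
    }
}
```

Field names (i, c) are deliberately left out of the comparison in this sketch, on the assumption that only the types decide whether a cast is needed.
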
> There are two issues currently:
> 1) The TypeCastInserter only compares the byte 'type' of the loader-presented 
> schema and the user-specified schema to decide whether to introduce a 
> cast. In the above case, since both schemas have the type "bag", no cast 
> is inserted. This check has to be extended to consider the full FieldSchema 
> (with inner subschema) in order to decide whether a cast is needed.
> 2) POCast should be changed to handle casting a complex type to the type 
> specified in the user-supplied FieldSchema. There is one issue to be 
> considered here: if the user specified the cast type to be bg:{t:(i:int, j:int)} 
> and the real data had only one field, what should the result of the cast be?
>  * A bag with two fields - the int field and a null? - In this approach Pig 
> assumes the lone field in the data is the first field, which might be 
> incorrect if it is in fact the second field.
>  * A null bag to indicate that the bag is of unknown value - this is the one 
> I personally prefer
>  * The cast throws an IncompatibleCastException
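
The three candidate outcomes listed above can be compared side by side in a small Java sketch. Everything here is an assumption made for illustration: bags are modeled as `List<List<Object>>`, and `ArityCastSketch`, `Policy`, `castBag`, and the nested `IncompatibleCast` exception are hypothetical names, not Pig's POCast or its exception classes. Only the case from the issue (data with fewer fields than the declared schema) is covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the three options; not Pig's POCast.
final class ArityCastSketch {
    enum Policy { PAD_WITH_NULLS, NULL_BAG, THROW }

    // Local stand-in for the exception named in the issue.
    static final class IncompatibleCast extends RuntimeException {
        IncompatibleCast(String message) { super(message); }
    }

    // Casts a bag whose tuples may have fewer fields than targetArity.
    static List<List<Object>> castBag(List<List<Object>> bag, int targetArity, Policy policy) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> tuple : bag) {
            if (tuple.size() > targetArity) {
                throw new IllegalArgumentException("extra fields are not covered by this sketch");
            }
            if (tuple.size() < targetArity) {
                switch (policy) {
                    case NULL_BAG:
                        return null;  // the whole bag is treated as an unknown value
                    case THROW:
                        throw new IncompatibleCast("tuple has " + tuple.size()
                                + " field(s), schema declares " + targetArity);
                    default:
                        break;        // PAD_WITH_NULLS: fall through to padding below
                }
            }
            // Padding assumes the existing fields are the leading ones.
            List<Object> padded = new ArrayList<>(tuple);
            while (padded.size() < targetArity) {
                padded.add(null);
            }
            out.add(padded);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> bag = List.of(List.<Object>of(1));
        System.out.println(castBag(bag, 2, Policy.PAD_WITH_NULLS)); // [[1, null]]
        System.out.println(castBag(bag, 2, Policy.NULL_BAG));       // null
    }
}
```

The padding branch bakes in the positional assumption that the first bullet warns about, which is one way to see why the null-bag option is the more conservative choice.
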

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
