[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996845#comment-13996845
 ] 

Daniel Dai commented on PIG-3558:
---------------------------------

bq. I don't see hive binary being any different than pig bytearray
Technically it is different. Pig bytearray means the data type is unknown. 
Consider the following UDF and script:
{code}
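import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MapGenerate extends EvalFunc<Map> {
    @Override
    public Map exec(Tuple input) throws IOException {
        // The value stored is an Integer, but the untyped map schema hides that from Pig.
        Map<String, Object> m = new HashMap<String, Object>();
        m.put("key", Integer.valueOf(input.size()));
        return m;
    }

    @Override
    public Schema outputSchema(Schema input) {
        // Declare only "map", with no value type.
        return new Schema(new Schema.FieldSchema(null, DataType.MAP));
    }
}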
{code}
{code}
a = load '1.txt' as (a0);
b = foreach a generate a0, MapGenerate(*) as m:map[];
c = group b by m#'key';
dump c;
{code}
The group key will be of data type bytearray (since its type is unknown), even 
though the actual value is an Integer, and the map output key is a 
NullableBytesWritable. NullableBytesWritable takes any Object instead of just 
DataByteArray to accommodate this case.

It is possible to map Pig bytearray to binary, but we must deal with the fact 
that the data may not be a DataByteArray.
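
A rough sketch of what "dealing with it" could look like, not what the patch 
does. RawBytes is a hypothetical stand-in for Pig's DataByteArray so the 
snippet compiles without Pig on the classpath, BinaryCoercion and toBinary are 
made-up names, and falling back to the UTF-8 bytes of toString() is only one 
assumed policy (rejecting the value would be another):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for Pig's DataByteArray (a thin wrapper around byte[]). */
final class RawBytes {
    private final byte[] data;

    RawBytes(byte[] data) {
        this.data = data;
    }

    byte[] get() {
        return data;
    }
}

/** Hypothetical helper: turns a value declared as Pig bytearray into raw bytes. */
final class BinaryCoercion {
    private BinaryCoercion() {
    }

    static byte[] toBinary(Object value) {
        if (value == null) {
            return null; // keep nulls as nulls
        }
        if (value instanceof RawBytes) {
            return ((RawBytes) value).get(); // the expected case
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        // Assumed fallback: the declared type was "unknown", so the object may
        // be an Integer, a String, etc. Store its string form as UTF-8 bytes.
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

With the UDF above, the value behind m#'key' would reach such a helper as an 
Integer and take the fallback branch instead of failing on a cast.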

> ORC support for Pig
> -------------------
>
>                 Key: PIG-3558
>                 URL: https://issues.apache.org/jira/browse/PIG-3558
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>              Labels: porc
>             Fix For: 0.13.0
>
>         Attachments: PIG-3558-1.patch, PIG-3558-2.patch, PIG-3558-3.patch, 
> PIG-3558-4.patch, PIG-3558-5.patch
>
>
> Adding LoadFunc and StoreFunc for ORC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
