[ https://issues.apache.org/jira/browse/CRUNCH-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083166#comment-14083166 ]
Zhong Wang commented on CRUNCH-450:
-----------------------------------

Hi Josh,

1) The tupleDerived functions act as wrappers when there is only a single object to serialize to an OrcStruct (the basic unit of ORC serialization). They wrap the single object in a tuple and use TupleObjectInspector to serialize it to an OrcStruct. Collections and maps are handled the same way as primitives, except that they need a real deep copier.

2) I am not sure what you mean by executing a MapReduce job purely in OrcTypes. We can combine different type families in a single MapReduce job, for example using OrcTypes for the input, Writables for the shuffle, and Avros for the output. The OrcTypeFamily exists to serialize/deserialize Java objects to/from OrcStructs.

> Adding ORC file format support in Crunch
> ----------------------------------------
>
>                 Key: CRUNCH-450
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-450
>             Project: Crunch
>          Issue Type: New Feature
>          Components: Core, IO
>            Reporter: Zhong Wang
>            Assignee: Josh Wills
>             Fix For: 0.11.0
>
>         Attachments: CRUNCH-450-submodule.1.patch, CRUNCH-450-submodule.2.patch,
>                      CRUNCH-450-submodule.patch, CRUNCH-450.patch
>
>
> This JIRA adds ORC file format support in Crunch by:
> 1. Adding input source and output target for ORC
> 2. Adding a new type family - OrcTypeFamily to serialize / deserialize objects into OrcStruct
> 3. Supporting column pruning optimization

--
This message was sent by Atlassian JIRA
(v6.2#6252)