[jira] [Commented] (CRUNCH-450) Adding ORC file format support in Crunch

Josh Wills (JIRA) Fri, 01 Aug 2014 15:02:28 -0700

    [ https://issues.apache.org/jira/browse/CRUNCH-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083025#comment-14083025 ]


Josh Wills commented on CRUNCH-450:
-----------------------------------

So a couple of questions as I peruse this:

1) How exactly does the tupleDerived stuff in the OrcTypeFamily work, 
especially for collections and maps?
2) Is there any sense in which I could (or would want to) execute a MapReduce 
job purely in terms of OrcTypes for serialization? If so, could we add an 
integration test to that effect? Or is the intent that the TypeFamily exists 
primarily for expressing IO operations on ORC data files?


> Adding ORC file format support in Crunch
> ----------------------------------------
>
>                 Key: CRUNCH-450
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-450
>             Project: Crunch
>          Issue Type: New Feature
>          Components: Core, IO
>            Reporter: Zhong Wang
>            Assignee: Josh Wills
>             Fix For: 0.11.0
>
>         Attachments: CRUNCH-450-submodule.1.patch, 
> CRUNCH-450-submodule.2.patch, CRUNCH-450-submodule.patch, CRUNCH-450.patch
>
>
> This JIRA adds ORC file format support in Crunch by:
> --
> 1. Adding input source and output target for ORC
> 2. Adding a new type family, OrcTypeFamily, to serialize / deserialize 
> objects to and from OrcStruct
> 3. Supporting column pruning optimization



--
This message was sent by Atlassian JIRA
(v6.2#6252)
