[ 
https://issues.apache.org/jira/browse/AVRO-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263424#comment-13263424
 ] 

Doug Cutting commented on AVRO-806:
-----------------------------------

Raymie, thanks for your thoughts.

I agree that benchmarks are needed.  The best benchmarks are real applications.
I provided code that folks can try now in their MapReduce applications.  I 
have not yet had a chance to integrate this with Hive by writing a SerDe, but 
that is an obvious next step.  (I've never written a SerDe.  If someone else 
has, perhaps they can help.)

Do you have any datasets or queries that you'd like to propose as benchmarks?

I'll work on better documenting the type mapping for Avro, since that's been 
implemented.

I'll be offline next week and won't be able to work more on this (or 
respond) until the week after.
                
> add a column-major codec for data files
> ---------------------------------------
>
>                 Key: AVRO-806
>                 URL: https://issues.apache.org/jira/browse/AVRO-806
>             Project: Avro
>          Issue Type: New Feature
>          Components: java, spec
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.7.0
>
>         Attachments: AVRO-806-v2.patch, AVRO-806.patch, avro-file-columnar.pdf
>
>
> Define a codec that, when a data file's schema is a record schema, writes 
> blocks within the file in column-major order.  This would permit better 
> compression and also permit efficient skipping of fields that are not of 
> interest.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
