[jira] [Commented] (TAJO-30) Parquet Integration

David Chen (JIRA) Sat, 22 Mar 2014 00:39:28 -0700

    [ 
https://issues.apache.org/jira/browse/TAJO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943956#comment-13943956
 ]


David Chen commented on TAJO-30:
--------------------------------

[~hyunsik], [~jihoonson] Thanks! I look forward to iterating on your feedback.

[~hjkim] Thanks for running the benchmark! Really excited that we're already 
seeing performance gains with Parquet. I'll add the missing property.

[~hsaputra] Agreed. In the meantime, I'm adding some more detailed Javadoc 
comments in the source code and a package-info.java that will give an overview 
of the Parquet support code, including a mapping of Tajo to Parquet data types.

Yes, I have moved my changes to https://github.com/davidzchen/tajo/tree/parquet 
after seeing the great news that Tajo has graduated to a TLP. :) 
Congratulations!

> Parquet Integration
> -------------------
>
>                 Key: TAJO-30
>                 URL: https://issues.apache.org/jira/browse/TAJO-30
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>            Assignee: David Chen
>              Labels: Parquet
>         Attachments: TAJO-30.patch
>
>
> Parquet is a columnar storage format developed by Twitter and Cloudera. 
> Implement Parquet (http://parquet.io/) support for Tajo.
> The implementation consists of the following:
>  * {{ParquetScanner}} and {{ParquetAppender}} - FileScanner and FileAppender 
> implementations for reading and writing Parquet.
>  * {{TajoParquetReader}} and {{TajoParquetWriter}} - Top-level reader and 
> writer for serializing/deserializing to Tajo Tuples.
>  * {{TajoReadSupport}} and {{TajoWriteSupport}} - Abstractions to perform 
> conversion between Parquet and Tajo records.
>  * {{TajoRecordMaterializer}} - Materializes Tajo Tuples from Parquet's 
> internal representation.
>  * {{TajoRecordConverter}} - Used by {{TajoRecordMaterializer}} to 
> materialize a Tajo Tuple.
>  * {{TajoSchemaConverter}} - Converts between Tajo and Parquet schemas.
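
To give a rough idea of the kind of type mapping a schema converter such as 
{{TajoSchemaConverter}} has to perform, here is a minimal, self-contained 
sketch. The enums and the {{TypeMappingSketch}} class are hypothetical 
stand-ins, not classes from Tajo or Parquet, and the mapping itself is an 
assumption for illustration; the patch is the authority on the real one. The 
only Parquet facts relied on are that its primitive types include BOOLEAN, 
INT32, INT64, FLOAT, DOUBLE and BINARY, and that it has no 16-bit integer 
primitive, so narrower integers have to widen.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a subset of Tajo's data types.
enum TajoType { BOOLEAN, INT2, INT4, INT8, FLOAT4, FLOAT8, TEXT, BLOB }

// Hypothetical stand-in for a subset of Parquet's primitive type names.
enum ParquetPrimitive { BOOLEAN, INT32, INT64, FLOAT, DOUBLE, BINARY }

class TypeMappingSketch {
    // Assumed mapping, for illustration only.
    private static final Map<TajoType, ParquetPrimitive> MAPPING =
            new EnumMap<>(TajoType.class);

    static {
        MAPPING.put(TajoType.BOOLEAN, ParquetPrimitive.BOOLEAN);
        // Parquet has no 16-bit primitive, so INT2 widens to INT32.
        MAPPING.put(TajoType.INT2, ParquetPrimitive.INT32);
        MAPPING.put(TajoType.INT4, ParquetPrimitive.INT32);
        MAPPING.put(TajoType.INT8, ParquetPrimitive.INT64);
        MAPPING.put(TajoType.FLOAT4, ParquetPrimitive.FLOAT);
        MAPPING.put(TajoType.FLOAT8, ParquetPrimitive.DOUBLE);
        // Strings and raw bytes are both stored as BINARY in this sketch.
        MAPPING.put(TajoType.TEXT, ParquetPrimitive.BINARY);
        MAPPING.put(TajoType.BLOB, ParquetPrimitive.BINARY);
    }

    // Looks up the Parquet primitive for a Tajo type, failing loudly if unmapped.
    static ParquetPrimitive toParquet(TajoType type) {
        ParquetPrimitive primitive = MAPPING.get(type);
        if (primitive == null) {
            throw new IllegalArgumentException("No Parquet mapping for " + type);
        }
        return primitive;
    }

    public static void main(String[] args) {
        // Print the whole (assumed) mapping table.
        for (TajoType type : TajoType.values()) {
            System.out.println(type + " -> " + toParquet(type));
        }
    }
}
```

Because the mapping is many-to-one (INT2 and INT4 both land on INT32), a real 
converter going from Parquet back to Tajo would need extra information, such 
as the original Tajo schema, to recover the narrower type.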



--
This message was sent by Atlassian JIRA
(v6.2#6252)
