[ 
https://issues.apache.org/jira/browse/PARQUET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093537#comment-15093537
 ] 

Nicolas Romanetti commented on PARQUET-420:
-------------------------------------------

Thanks [~rdblue] for this tip!
My first goal was to try out Parquet on a few POJOs from an existing application.
I wanted to check whether using Parquet files would save disk space compared to 
our current custom file format (which is not column-based).

On the README page, just after the Avro and Thrift sections, there is a "Create 
your own objects" section. This made me think that it is apparently OK to simply 
not use Avro, Thrift, or protoc. So I started to investigate that "manual" way.

And I confirm, I could feel there was something awkward while implementing it
(writing is quite natural, reading is more complex).

Performance is a key factor in our current application, but my initial goal was 
to check disk space. So you are right, using Avro reflection was the right way 
to go, and the documentation could probably be improved in that regard.

> Provide example to write/save your own object (without thrift, avro, protoc)
> ----------------------------------------------------------------------------
>
>                 Key: PARQUET-420
>                 URL: https://issues.apache.org/jira/browse/PARQUET-420
>             Project: Parquet
>          Issue Type: Wish
>            Reporter: Nicolas Romanetti
>
> I am studying parquet and found that it is not so easy to grasp the basic 
> mechanism for writing/reading your own objects to a parquet file when you are 
> not using protocol buffer, avro or thrift.
> So my wish is to have a module that covers this topic.
> I think for educational purposes it may have some value as it would help 
> people get into the code.
> I did the exercise, the code is here:
> https://github.com/nromanetti/parquet-mr/tree/master/parquet-manual



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
