Hello,
since I'm in Europe, this is a very inconvenient time for me, so I'd
rather write a longer mail than join the call. As a bit of input, here
is what I'm up to at the moment:
* Basic write support for parquet-cpp (no compression, fixed
encodings, excessive memory usage, ...) is nearly done. I hope to open
the final PR for discussion next week.
* Remaining tasks before I open the PR:
  * A bit of code cleanup
  * Going through the API again to make it consistent
  * Metadata for RowGroups and ColumnChunks
Afterwards, I would look into one of the following parquet-cpp tasks:
* WriterProperties to specify compression, encoding, ... on a global
and per-column basis.
* Performance benchmarks for Write
* Integration of Parquet support in Apache Arrow to use it with Python
* Reduce the memory usage of the initial Writer implementation
(for this we will probably need to extend the encoders a bit)
If anyone else is also looking into any of these, I'm happy to collaborate ;)
Cheers
Uwe
On 21.04.16 00:51, Julien Le Dem wrote:
It is happening at 4pm PT on Google Hangouts:
https://plus.google.com/hangouts/_/event/parquet_sync_up