Re: hadoop LZ4 incompatible with open source LZ4

2018-08-08 Thread ALeX Wang
@Wes Okay, I think I figured out why I could not read the LZ4-encoded Parquet file generated by parquet-mr. It turns out Hadoop LZ4 has its own framing format. I summarized the details in the JIRA ticket you posted: https://issues.apache.org/jira/browse/PARQUET-1241 Thanks, Alex Wang, On Tue, 7 Aug

[jira] [Comment Edited] (PARQUET-1241) Use LZ4 frame format

2018-08-08 Thread Alex Wang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574328#comment-16574328 ] Alex Wang edited comment on PARQUET-1241 at 8/9/18 5:57 AM: Hi, I and

[jira] [Comment Edited] (PARQUET-1241) Use LZ4 frame format

2018-08-08 Thread Alex Wang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574328#comment-16574328 ] Alex Wang edited comment on PARQUET-1241 at 8/9/18 5:56 AM: Hi, I and

[jira] [Commented] (PARQUET-1241) Use LZ4 frame format

2018-08-08 Thread Alex Wang (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574328#comment-16574328 ] Alex Wang commented on PARQUET-1241: Hi, I and [~wesmckinn] discussed the issue on parquet-cpp

[jira] [Commented] (PARQUET-1352) [CPP] Trying to write an arrow table with structs to a parquet file

2018-08-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573720#comment-16573720 ] Wes McKinney commented on PARQUET-1352: Either you can contribute to the nested data support

Re: hadoop LZ4 incompatible with open source LZ4

2018-08-08 Thread ALeX Wang
Hi Wes, Thanks again for the pointers. During the investigation I noticed a possible bug: the private variables 'min_' and 'max_' in the 'TypedRowGroupStatistics' class are not initialized in the constructor. I got an abort while trying to read a column using 'parquet_reader'. And gdb breakpoint at

Date and time for next Parquet sync

2018-08-08 Thread Nandor Kollar
Hi All, It has been a while since we had a Parquet sync, so I'd like to propose having one next week on August 15th, at 6 pm CET / 9 am PST. I'll send a meeting invite with the details soon; let me know if this time is not suitable for you! Since the last sync there are a couple of topics

[jira] [Created] (PARQUET-1373) Encryption key management tools

2018-08-08 Thread Gidon Gershinsky (JIRA)
Gidon Gershinsky created PARQUET-1373: Summary: Encryption key management tools Key: PARQUET-1373 URL: https://issues.apache.org/jira/browse/PARQUET-1373 Project: Parquet Issue Type: