Re: Date and time for next parquet sync

2018-01-29 Thread Lars Volker
Thanks all who replied, I sent an invite for Tuesday. Cheers, Lars On Mon, Jan 29, 2018 at 10:56 AM, Marcel Kornacker wrote: > +1 for Tuesday > > On Mon, Jan 29, 2018 at 4:03 AM, Uwe L. Korn wrote: > > +1, Tuesday to Thursday are ok for me but I would

Re: Parquet DeltaLengthByteArrayDecoder question

2018-01-29 Thread Wes McKinney
Cool. Where is this development happening? Would you like to join the Apache Parquet community? - Wes On Mon, Jan 29, 2018 at 4:20 PM, Ivan Sadikov wrote: > Thanks Wes. It is okay, I fixed the issues, so everything is great. > > We are currently pushing parquet-rs to be

Re: Parquet DeltaLengthByteArrayDecoder question

2018-01-29 Thread Ivan Sadikov
Link: https://github.com/sunchao/parquet-rs I think @sunchao is in the Apache community already; there is an email on the GitHub profile page. We are just trying to bring it up to speed with other Parquet implementations, but there is still a lot of work to do :) Would appreciate any help! Currently

Re: Parquet DeltaLengthByteArrayDecoder question

2018-01-29 Thread Wes McKinney
hi Ivan -- as soon as practical it would be great to import the codebase into the Apache project. We would have to conduct an IP clearance process (http://incubator.apache.org/ip-clearance/) because the code was not developed within the Community (i.e. under Apache process / IP oversight /

Re: Parquet DeltaLengthByteArrayDecoder question

2018-01-29 Thread Ivan Sadikov
Thanks Wes. It is okay, I fixed the issues, so everything is great. We are currently pushing parquet-rs to be feature compatible with parquet-mr and parquet-cpp. On Tue, 30 Jan 2018 at 9:57 AM, Wes McKinney wrote: > hi Ivan -- note that this code has not been actively

Re: Date and time for next parquet sync

2018-01-29 Thread Marcel Kornacker
+1 for Tuesday On Mon, Jan 29, 2018 at 4:03 AM, Uwe L. Korn wrote: > +1, Tuesday to Thursday are ok for me but I would prefer Tuesday this week. > > Uwe > > On Mon, Jan 29, 2018, at 12:54 PM, Zoltan Ivanfi wrote: >> +1 for Tuesday, this week I can't attend on Wednesday. >> >>

Breaking changes in parquet-format without a major version bump

2018-01-29 Thread Zoltan Ivanfi
Hi, I have noticed that the recent addition of new compressions to parquet-format happened in a patch version (in Semantic Versioning terminology). I think this is a problem. Consider a theoretical library (or application) that implemented Parquet according to parquet-format 2.3.0. By
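The compatibility concern raised in this message can be illustrated with a small sketch. Everything below is an assumption made for illustration: the set of codecs the reader knows, the writer version "2.3.1", the codec name "NEW_CODEC", and the helper functions are hypothetical and are not taken from parquet-format or any Parquet library. The point is only that a reader built against 2.3.0 sees a writer version that differs in the patch component alone, which Semantic Versioning treats as fully compatible, and still cannot decode the file.

```python
# Illustrative sketch only: codec names, versions, and helpers are assumptions,
# not the API of any real Parquet implementation.

# Codecs a hypothetical reader learned from the format version it was built against.
KNOWN_CODECS = frozenset({"UNCOMPRESSED", "SNAPPY", "GZIP", "LZO"})


class UnsupportedCodecError(ValueError):
    """Raised when a file names a codec the reader was never taught."""


def parse_version(text):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def same_major_minor(reader_version, writer_version):
    """True when the two versions differ at most in the patch component."""
    return parse_version(reader_version)[:2] == parse_version(writer_version)[:2]


def check_codec(codec_name):
    """Return the codec name if the reader supports it, otherwise raise."""
    if codec_name not in KNOWN_CODECS:
        raise UnsupportedCodecError(f"unknown compression codec: {codec_name}")
    return codec_name


def can_read(codec_name):
    """Convenience wrapper that reports support as a boolean."""
    try:
        check_codec(codec_name)
    except UnsupportedCodecError:
        return False
    return True


if __name__ == "__main__":
    # The version check suggests compatibility, yet the new codec is unreadable.
    print(same_major_minor("2.3.0", "2.3.1"))
    print(can_read("GZIP"))
    print(can_read("NEW_CODEC"))
```

Under these assumptions a version comparison alone cannot warn the reader, which is why adding an enum value that old readers cannot interpret is arguably more than a patch-level change.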

[jira] [Updated] (PARQUET-1202) Add differentiation of nested records with the same name

2018-01-29 Thread Benoit Lacelle (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoit Lacelle updated PARQUET-1202: Description: Hello, While reading back a Parquet file produced with Spark, it appears

[jira] [Created] (PARQUET-1202) Add differentiation of nested records with the same name

2018-01-29 Thread Benoit Lacelle (JIRA)
Benoit Lacelle created PARQUET-1202: --- Summary: Add differentiation of nested records with the same name Key: PARQUET-1202 URL: https://issues.apache.org/jira/browse/PARQUET-1202 Project: Parquet

Re: Date and time for next parquet sync

2018-01-29 Thread Uwe L. Korn
+1, Tuesday to Thursday are ok for me but I would prefer Tuesday this week. Uwe On Mon, Jan 29, 2018, at 12:54 PM, Zoltan Ivanfi wrote: > +1 for Tuesday, this week I can't attend on Wednesday. > > Zoltan > > On Mon, Jan 29, 2018 at 7:29 AM Lars Volker wrote: > > > I'm good

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343305#comment-16343305 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on a change in pull request #432:

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343306#comment-16343306 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on a change in pull request #432:

Re: Date and time for next parquet sync

2018-01-29 Thread Zoltan Ivanfi
+1 for Tuesday, this week I can't attend on Wednesday. Zoltan On Mon, Jan 29, 2018 at 7:29 AM Lars Volker wrote: > I'm good with either day. Does anyone prefer Wednesday over Tuesday? > > On Tue, Jan 23, 2018 at 11:27 PM, Gabor Szadovszky < > gabor.szadovs...@cloudera.com>

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343563#comment-16343563 ] ASF GitHub Bot commented on PARQUET-41: --- daedric commented on issue #432: PARQUET-41: Add bloom

[jira] [Commented] (PARQUET-1178) Parquet modular encryption

2018-01-29 Thread Gidon Gershinsky (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343562#comment-16343562 ] Gidon Gershinsky commented on PARQUET-1178: --- Made some changes in wording of the "Meta-data

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343376#comment-16343376 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on a change in pull request #432:

Does Parquet use LZOP compression?

2018-01-29 Thread Hao Luo
Hi, I have a question about LZO compression in Parquet files. Does Parquet use LZOP compression? If it uses LZOP, how does it differentiate between LZO and LZOP? In CompressionCodecName I only see LZO. How do I find information about which codec to use when decompressing a dictionary

[jira] [Commented] (PARQUET-41) Add bloom filters to parquet statistics

2018-01-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343497#comment-16343497 ] ASF GitHub Bot commented on PARQUET-41: --- cjjnjust commented on issue #432: PARQUET-41: Add bloom

Re: Parquet DeltaLengthByteArrayDecoder question

2018-01-29 Thread Lars Volker
Hi Ivan, As you likely saw in the other thread, the Parquet developer community will have our bi-weekly sync tomorrow (Tuesday) morning at 9am PST. Everyone is invited to join, even if you only want to listen in. It's a good opportunity to meet a lot of the folks working on various parts of the