Re: [DISCUSS] Ongoing LZ4 problems with Parquet files

2020-06-29 Thread Wes McKinney
On Thu, Jun 25, 2020 at 3:31 AM Antoine Pitrou wrote: > > > On 25/06/2020 at 00:02, Wes McKinney wrote: > > hi folks, > > > > (cross-posting to dev@arrow and dev@parquet since there are > > stakeholders in both places) > > > > It seems there are still problems at least with the C++

[jira] [Commented] (PARQUET-1643) Use airlift non-native implementations for GZIP, LZO and LZ4 codecs

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148060#comment-17148060 ] ASF GitHub Bot commented on PARQUET-1643: - samarthjain commented on pull request #671: URL:

[GitHub] [parquet-mr] samarthjain commented on pull request #671: PARQUET-1643 Use airlift codecs for LZ4, LZO, GZIP

2020-06-29 Thread GitBox
samarthjain commented on pull request #671: URL: https://github.com/apache/parquet-mr/pull/671#issuecomment-651282816 @nandorKollar, @rdblue, @danielcweeks - if you have cycles, could you please take a look at this PR. This

[jira] [Commented] (PARQUET-1373) Encryption key management tools

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147824#comment-17147824 ] ASF GitHub Bot commented on PARQUET-1373: - gszadovszky commented on a change in pull request

[GitHub] [parquet-mr] gszadovszky commented on a change in pull request #615: PARQUET-1373: Encryption key tools

2020-06-29 Thread GitBox
gszadovszky commented on a change in pull request #615: URL: https://github.com/apache/parquet-mr/pull/615#discussion_r446146320 ## File path: parquet-hadoop/src/main/java/org/apache/parquet/crypto/keytools/KeyMaterial.java ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache

Announcing ApacheCon @Home 2020

2020-06-29 Thread Rich Bowen
Hi, Apache enthusiast! (You’re receiving this because you’re subscribed to one or more dev or user mailing lists for an Apache Software Foundation project.) The ApacheCon Planners and the Apache Software Foundation are pleased to announce that ApacheCon @Home will be held online, September

[jira] [Commented] (PARQUET-1879) Apache Arrow can not read a Parquet File written with Parquet-Avro 1.11.0 with a Map field

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147611#comment-17147611 ] ASF GitHub Bot commented on PARQUET-1879: - maccamlc commented on pull request #798: URL:

[GitHub] [parquet-mr] maccamlc commented on pull request #798: PARQUET-1879 MapKeyValue is not a valid Logical Type

2020-06-29 Thread GitBox
maccamlc commented on pull request #798: URL: https://github.com/apache/parquet-mr/pull/798#issuecomment-651009716 > @maccamlc, > > The main problem I think is that the spec does not say anything about how the thrift objects shall be used. The specification is about the semantics of

[jira] [Commented] (PARQUET-1879) Apache Arrow can not read a Parquet File written with Parquet-Avro 1.11.0 with a Map field

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147599#comment-17147599 ] ASF GitHub Bot commented on PARQUET-1879: - gszadovszky commented on pull request #798: URL:

[GitHub] [parquet-mr] gszadovszky commented on pull request #798: PARQUET-1879 MapKeyValue is not a valid Logical Type

2020-06-29 Thread GitBox
gszadovszky commented on pull request #798: URL: https://github.com/apache/parquet-mr/pull/798#issuecomment-650992678 @maccamlc, The main problem I think is that the spec does not say anything about how the thrift objects shall be used. The specification is about the semantics of

[jira] [Commented] (PARQUET-1643) Use airlift non-native implementations for GZIP, LZO and LZ4 codecs

2020-06-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147587#comment-17147587 ] ASF GitHub Bot commented on PARQUET-1643: - samarthjain commented on pull request #671: URL:

[GitHub] [parquet-mr] samarthjain commented on pull request #671: PARQUET-1643 Use airlift codecs for LZ4, LZO, GZIP

2020-06-29 Thread GitBox
samarthjain commented on pull request #671: URL: https://github.com/apache/parquet-mr/pull/671#issuecomment-650971182 @dbtsai > Since airlift is a pure Java implementation, what are the performance implications for zstd? I saw there is a benchmark for GZIP, but I don't see a benchmark for