Re: Parquet File Meta Data & Compatibility

2020-12-04 Thread Antoine Pitrou
On Fri, 4 Dec 2020 11:21:58 -0800 Tim Armstrong wrote:
> I probably didn't say it very clearly, but my opinion as a consumer of the Parquet spec is that the format needs a reset where encodings, logical types and other metadata that are not widely adopted are removed from the core spec and

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-12-04 Thread Xinli Shang (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244334#comment-17244334 ] Xinli Shang commented on PARQUET-1872: Thanks [~gszadovszky] for working on this! I just created

[jira] [Commented] (PARQUET-1949) Mark Parquet-1872 with note support bloom filter yet

2020-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244333#comment-17244333 ] ASF GitHub Bot commented on PARQUET-1949: shangxinli opened a new pull request #845: URL:

[GitHub] [parquet-mr] shangxinli opened a new pull request #845: PARQUET-1949: Mark Parquet-1872 with note support bloom filter yet

2020-12-04 Thread GitBox
shangxinli opened a new pull request #845: URL: https://github.com/apache/parquet-mr/pull/845 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them

[jira] [Created] (PARQUET-1949) Mark Parquet-1872 with note support bloom filter yet

2020-12-04 Thread Xinli Shang (Jira)
Xinli Shang created PARQUET-1949: Summary: Mark Parquet-1872 with note support bloom filter yet Key: PARQUET-1949 URL: https://issues.apache.org/jira/browse/PARQUET-1949 Project: Parquet

Re: Parquet File Meta Data & Compatibility

2020-12-04 Thread Tim Armstrong
I probably didn't say it very clearly, but my opinion as a consumer of the Parquet spec is that the format needs a reset where encodings, logical types and other metadata that are not widely adopted are removed from the core spec and put in an "experimental" category from which we can later

Re: Parquet File Meta Data & Compatibility

2020-12-04 Thread Tim Armstrong
I think it would be good for the project to define a core set of features that a Parquet implementation must support to be able to correctly read all files written by another compliant writer with the same version. There are then additional extensions like page indices that are not required to

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-04 Thread Daniel Dai (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244209#comment-17244209 ] Daniel Dai commented on PARQUET-1666: This sounds good to me. I can also put into old branches

[jira] [Commented] (PARQUET-1928) Interpret Parquet INT96 type as FIXED[12] AVRO Schema

2020-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244109#comment-17244109 ] ASF GitHub Bot commented on PARQUET-1928: gszadovszky commented on pull request #831: URL:

[GitHub] [parquet-mr] gszadovszky commented on pull request #831: PARQUET-1928: Interpret Parquet INT96 type as FIXED[12] AVRO Schema

2020-12-04 Thread GitBox
gszadovszky commented on pull request #831: URL: https://github.com/apache/parquet-mr/pull/831#issuecomment-738890468 @anantdamle, thank you for the contribution! I will do it. Just usually wait 24 hours (because of the weekend this time a bit more) to give a chance for others to

[jira] [Commented] (PARQUET-1928) Interpret Parquet INT96 type as FIXED[12] AVRO Schema

2020-12-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244081#comment-17244081 ] ASF GitHub Bot commented on PARQUET-1928: anantdamle commented on pull request #831: URL:

[GitHub] [parquet-mr] anantdamle commented on pull request #831: PARQUET-1928: Interpret Parquet INT96 type as FIXED[12] AVRO Schema

2020-12-04 Thread GitBox
anantdamle commented on pull request #831: URL: https://github.com/apache/parquet-mr/pull/831#issuecomment-738864269 @gszadovszky Thanks for the approval. Sorry for the noob question: what happens next? How does this PR get merged into master?

[jira] [Commented] (PARQUET-1666) Remove Unused Modules

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243916#comment-17243916 ] Gabor Szadovszky commented on PARQUET-1666: Thanks, [~daijy]. But you also submitted a PR

[jira] [Commented] (PARQUET-1872) Add TransCompression command

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243881#comment-17243881 ] Gabor Szadovszky commented on PARQUET-1872: [~sha...@uber.com], based on the git logs this

[jira] [Commented] (PARQUET-1948) TransCompressionCommand Inoperable

2020-12-04 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243879#comment-17243879 ] Gabor Szadovszky commented on PARQUET-1948: [~vanhooser], this feature is to be released in