Re: Current status of Data Page V2?

2020-10-08 Thread Micah Kornfield
Thanks for the quick reply, Ryan. > We only use v1 and it still works well. That said, I'd love to make some > progress on better encodings and finalizing v2 so we can use them! Are there JIRAs or other documentation tracking this work? Thanks, Micah On Thu, Oct 8, 2020 at 12:55 PM

Re: Current status of Data Page V2?

2020-10-08 Thread Ryan Blue
While there isn't anything wrong with it, the same challenges have been solved in different ways with v1 pages. The main difference is that v2 pages are broken at record boundaries, and v1 pages weren't guaranteed to be. But, in order to write page indexes near the footer, breaking pages at record

Current status of Data Page V2?

2020-10-08 Thread Micah Kornfield
What is the current status of support for Data Page V2? Is it recommended for production workloads? Thanks, Micah

[GitHub] [parquet-mr] belugabehr commented on a change in pull request #825: PARQUET-1922: Deprecate IOExceptionUtils

2020-10-08 Thread GitBox
belugabehr commented on a change in pull request #825: URL: https://github.com/apache/parquet-mr/pull/825#discussion_r501920811 ## File path: parquet-column/src/main/java/org/apache/parquet/column/values/plain/PlainValuesWriter.java ## @@ -127,7 +127,6 @@ public void reset()

[jira] [Commented] (PARQUET-1922) Deprecate IOExceptionUtils

2020-10-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210372#comment-17210372 ] ASF GitHub Bot commented on PARQUET-1922: - belugabehr commented on a change in pull request

[jira] [Commented] (PARQUET-1922) Deprecate IOExceptionUtils

2020-10-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210369#comment-17210369 ] ASF GitHub Bot commented on PARQUET-1922: - belugabehr commented on a change in pull request

[GitHub] [parquet-mr] belugabehr commented on a change in pull request #825: PARQUET-1922: Deprecate IOExceptionUtils

2020-10-08 Thread GitBox
belugabehr commented on a change in pull request #825: URL: https://github.com/apache/parquet-mr/pull/825#discussion_r501917260 ## File path: parquet-common/src/main/java/org/apache/parquet/bytes/LittleEndianDataOutputStream.java ## @@ -210,8 +207,8 @@ public final void

[jira] [Commented] (PARQUET-1922) Deprecate IOExceptionUtils

2020-10-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210368#comment-17210368 ] ASF GitHub Bot commented on PARQUET-1922: - belugabehr opened a new pull request #825: URL:

[GitHub] [parquet-mr] belugabehr opened a new pull request #825: PARQUET-1922: Deprecate IOExceptionUtils

2020-10-08 Thread GitBox
belugabehr opened a new pull request #825: URL: https://github.com/apache/parquet-mr/pull/825 Make sure you have checked _all_ steps below. ### Jira - [X] My PR addresses the following [PARQUET-1922](https://issues.apache.org/jira/browse/PARQUET-1922) issues and references

[jira] [Created] (PARQUET-1922) Deprecate IOExceptionUtils

2020-10-08 Thread David Mollitor (Jira)
David Mollitor created PARQUET-1922: --- Summary: Deprecate IOExceptionUtils Key: PARQUET-1922 URL: https://issues.apache.org/jira/browse/PARQUET-1922 Project: Parquet Issue Type: Improvement

[jira] [Created] (PARQUET-1921) Use StringBuilder instead of StringBuffer

2020-10-08 Thread David Mollitor (Jira)
David Mollitor created PARQUET-1921: --- Summary: Use StringBuilder instead of StringBuffer Key: PARQUET-1921 URL: https://issues.apache.org/jira/browse/PARQUET-1921 Project: Parquet Issue

[jira] [Commented] (PARQUET-1920) Fix issue with reading parquet files with too large column chunks

2020-10-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210256#comment-17210256 ] ASF GitHub Bot commented on PARQUET-1920: - gszadovszky commented on a change in pull request

[GitHub] [parquet-mr] gszadovszky commented on a change in pull request #824: PARQUET-1920: Fix Parquet writer's memory check interval calculation

2020-10-08 Thread GitBox
gszadovszky commented on a change in pull request #824: URL: https://github.com/apache/parquet-mr/pull/824#discussion_r501773921 ## File path: parquet-common/src/main/java/org/apache/parquet/bytes/BytesInput.java ## @@ -215,6 +215,13 @@ public static BytesInput copy(BytesInput

[jira] [Resolved] (PARQUET-1824) [C++] Fix crashes on invalid input (OSS-Fuzz)

2020-10-08 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved PARQUET-1824. - Resolution: Fixed > [C++] Fix crashes on invalid input (OSS-Fuzz) >