[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407733#comment-17407733 ] ASF GitHub Bot commented on PARQUET-1968: - huaxingao commented on a change in pull request

[GitHub] [parquet-mr] huaxingao commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
huaxingao commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r699767598 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java ## @@ -247,6 +250,80 @@ public int hashCode() { }

[GitHub] [parquet-mr] huaxingao commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
huaxingao commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r699767560 ## File path: parquet-column/src/main/java/org/apache/parquet/internal/column/columnindex/ColumnIndexBuilder.java ## @@ -287,6 +291,27 @@ boolean

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407732#comment-17407732 ] ASF GitHub Bot commented on PARQUET-1968: - huaxingao commented on a change in pull request

[jira] [Commented] (PARQUET-2083) Expose getFieldPath from ColumnIO

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407615#comment-17407615 ] ASF GitHub Bot commented on PARQUET-2083: - sunchao opened a new pull request #926: URL:

[GitHub] [parquet-mr] sunchao opened a new pull request #926: PARQUET-2083: Expose getFieldPath from ColumnIO

2021-08-31 Thread GitBox
sunchao opened a new pull request #926: URL: https://github.com/apache/parquet-mr/pull/926 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in

[jira] [Created] (PARQUET-2083) Expose getFieldPath from ColumnIO

2021-08-31 Thread Chao Sun (Jira)
Chao Sun created PARQUET-2083: - Summary: Expose getFieldPath from ColumnIO Key: PARQUET-2083 URL: https://issues.apache.org/jira/browse/PARQUET-2083 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407610#comment-17407610 ] ASF GitHub Bot commented on PARQUET-1968: - viirya commented on a change in pull request #923:

[GitHub] [parquet-mr] viirya commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
viirya commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r699605818 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java ## @@ -247,6 +250,80 @@ public int hashCode() { }

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407608#comment-17407608 ] ASF GitHub Bot commented on PARQUET-1968: - viirya commented on a change in pull request #923:

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407606#comment-17407606 ] ASF GitHub Bot commented on PARQUET-1968: - viirya commented on a change in pull request #923:

[GitHub] [parquet-mr] viirya commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
viirya commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r699604499 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java ## @@ -247,6 +250,80 @@ public int hashCode() { }

[GitHub] [parquet-mr] viirya commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
viirya commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r699604019 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java ## @@ -247,6 +250,80 @@ public int hashCode() { }

Re: Any Parquet implementations might be impacted by PARQUET-2078

2021-08-31 Thread Chao Sun
Thanks Gabor. The Spark community is in the process of releasing Spark 3.2.0 with Parquet 1.12. Any idea when a new release will be available with the fix? We may need to hold off on the Spark release for that. Chao On Mon, Aug 30, 2021 at 6:31 AM Gabor Szadovszky wrote: > It turned out that

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407450#comment-17407450 ] ASF GitHub Bot commented on PARQUET-2078: - ggershinsky commented on pull request #925: URL:

[GitHub] [parquet-mr] ggershinsky commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
ggershinsky commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-908443796 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[jira] [Commented] (PARQUET-2080) Deprecate RowGroup.file_offset

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407443#comment-17407443 ] ASF GitHub Bot commented on PARQUET-2080: - gszadovszky opened a new pull request #178: URL:

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407440#comment-17407440 ] ASF GitHub Bot commented on PARQUET-1968: - huaxingao commented on pull request #923: URL:

[GitHub] [parquet-mr] huaxingao commented on pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
huaxingao commented on pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#issuecomment-908849008 @gszadovszky @shangxinli @dbtsai Thank you all very much for reviewing! I have changed the code to generate the visit methods for in/notIn and also added the default by

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407429#comment-17407429 ] ASF GitHub Bot commented on PARQUET-2078: - gszadovszky commented on a change in pull request

[GitHub] [parquet-mr] gszadovszky commented on a change in pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
gszadovszky commented on a change in pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#discussion_r698473966 ## File path: parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestParquetFileWriter.java ## @@ -239,6 +248,82 @@ public void testWriteRead()

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407426#comment-17407426 ] ASF GitHub Bot commented on PARQUET-2078: - loudongfeng commented on pull request #925: URL:

[GitHub] [parquet-mr] loudongfeng commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
loudongfeng commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909000698 FYI, maybe we can make use of this information: RowGroup[n].file_offset = RowGroup[n-1].file_offset + RowGroup[n-1].total_compressed_size total_compressed_size always
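
The relation quoted in the comment above can be illustrated with a small sketch. This is not parquet-mr code: `RowGroupMeta`, `derive_file_offsets`, and `find_inconsistent` are hypothetical names made up for illustration, and the sketch assumes row groups are laid out back to back with no padding between them, which the thread only suggests and does not confirm. The starting offset of 4 in the usage lines is likewise just an example value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroupMeta:
    """Hypothetical stand-in holding only the two fields named in the comment."""
    file_offset: int
    total_compressed_size: int


def derive_file_offsets(first_offset: int, compressed_sizes: List[int]) -> List[int]:
    """Apply file_offset[n] = file_offset[n-1] + total_compressed_size[n-1].

    Assumes contiguous row groups (no padding); first_offset is supplied by the caller.
    """
    offsets = []
    current = first_offset
    for size in compressed_sizes:
        offsets.append(current)
        current += size
    return offsets


def find_inconsistent(row_groups: List[RowGroupMeta]) -> List[int]:
    """Return indices n whose stored file_offset disagrees with the relation."""
    bad = []
    for n in range(1, len(row_groups)):
        prev = row_groups[n - 1]
        expected = prev.file_offset + prev.total_compressed_size
        if row_groups[n].file_offset != expected:
            bad.append(n)
    return bad


# Usage: three row groups with example sizes, the last one storing a wrong offset.
offsets = derive_file_offsets(4, [100, 250, 75])
groups = [RowGroupMeta(4, 100), RowGroupMeta(104, 250), RowGroupMeta(999, 75)]
suspect = find_inconsistent(groups)
```

A reader could use a check of this kind to flag row groups whose stored offset looks wrong, provided the contiguity assumption holds for the files in question.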

[jira] [Commented] (PARQUET-1950) Define core features / compliance level

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407420#comment-17407420 ] ASF GitHub Bot commented on PARQUET-1950: - pitrou commented on a change in pull request #164:

[GitHub] [parquet-format] pitrou commented on a change in pull request #164: PARQUET-1950: Define core features

2021-08-31 Thread GitBox
pitrou commented on a change in pull request #164: URL: https://github.com/apache/parquet-format/pull/164#discussion_r699241790 ## File path: CoreFeatures.md ## @@ -0,0 +1,188 @@ + + +# Parquet Core Features + +This document lists the core features for each parquet-format

[GitHub] [parquet-mr] shangxinli commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
shangxinli commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r698598441 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/predicate/Operators.java ## @@ -247,6 +250,80 @@ public int hashCode() {

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407415#comment-17407415 ] ASF GitHub Bot commented on PARQUET-1968: - huaxingao commented on a change in pull request

[jira] [Commented] (PARQUET-1968) FilterApi support In predicate

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407416#comment-17407416 ] ASF GitHub Bot commented on PARQUET-1968: - shangxinli commented on a change in pull request

[GitHub] [parquet-mr] huaxingao commented on a change in pull request #923: [PARQUET-1968] FilterApi support In predicate

2021-08-31 Thread GitBox
huaxingao commented on a change in pull request #923: URL: https://github.com/apache/parquet-mr/pull/923#discussion_r698937308 ## File path: parquet-column/src/main/java/org/apache/parquet/filter2/recordlevel/IncrementallyUpdatedFilterPredicate.java ## @@ -123,6 +124,46 @@

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407403#comment-17407403 ] ASF GitHub Bot commented on PARQUET-2078: - ggershinsky commented on a change in pull request

[GitHub] [parquet-mr] ggershinsky commented on a change in pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
ggershinsky commented on a change in pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#discussion_r699207678 ## File path: parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java ## @@ -1226,12 +1226,25 @@ public

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407394#comment-17407394 ] ASF GitHub Bot commented on PARQUET-2078: - shangxinli commented on pull request #925: URL:

[GitHub] [parquet-mr] shangxinli commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
shangxinli commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-908428874 @ggershinsky Do you want to have a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407389#comment-17407389 ] ASF GitHub Bot commented on PARQUET-2078: - gszadovszky commented on pull request #925: URL:

[GitHub] [parquet-mr] gszadovszky commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
gszadovszky commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-908452452 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[jira] [Commented] (PARQUET-1950) Define core features / compliance level

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407267#comment-17407267 ] ASF GitHub Bot commented on PARQUET-1950: - pitrou commented on a change in pull request #164:

[GitHub] [parquet-format] pitrou commented on a change in pull request #164: PARQUET-1950: Define core features

2021-08-31 Thread GitBox
pitrou commented on a change in pull request #164: URL: https://github.com/apache/parquet-format/pull/164#discussion_r699241790 ## File path: CoreFeatures.md ## @@ -0,0 +1,188 @@ + + +# Parquet Core Features + +This document lists the core features for each parquet-format

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407243#comment-17407243 ] ASF GitHub Bot commented on PARQUET-2078: - ggershinsky commented on a change in pull request

[GitHub] [parquet-mr] ggershinsky commented on a change in pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
ggershinsky commented on a change in pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#discussion_r699207678 ## File path: parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java ## @@ -1226,12 +1226,25 @@ public

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407239#comment-17407239 ] ASF GitHub Bot commented on PARQUET-2078: - ggershinsky commented on pull request #925: URL:

[GitHub] [parquet-mr] ggershinsky commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
ggershinsky commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909119450 > FYI, maybe we can make use of this information: > RowGroup[n].file_offset = RowGroup[n-1].file_offset + RowGroup[n-1].total_compressed_size > total_compressed_size

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407230#comment-17407230 ] ASF GitHub Bot commented on PARQUET-2078: - ggershinsky commented on pull request #925: URL:

[GitHub] [parquet-mr] ggershinsky commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
ggershinsky commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909113979 @gszadovszky No problem at all, thank you for helping with this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407163#comment-17407163 ] ASF GitHub Bot commented on PARQUET-2078: - loudongfeng commented on pull request #925: URL:

[GitHub] [parquet-mr] loudongfeng commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
loudongfeng commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-909000698 FYI, maybe we can make use of this information: RowGroup[n].file_offset = RowGroup[n-1].file_offset + RowGroup[n-1].total_compressed_size total_compressed_size always

[jira] [Commented] (PARQUET-2078) Failed to read parquet file after writing with the same parquet version

2021-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407159#comment-17407159 ] ASF GitHub Bot commented on PARQUET-2078: - gszadovszky commented on pull request #925: URL:

[GitHub] [parquet-mr] gszadovszky commented on pull request #925: PARQUET-2078: Failed to read parquet file after writing with the same …

2021-08-31 Thread GitBox
gszadovszky commented on pull request #925: URL: https://github.com/apache/parquet-mr/pull/925#issuecomment-908986790 @ggershinsky, sorry, I've completely missed the fact that `RowGroup.file_offset` was introduced for the encryption feature and is actually required for it. Somehow we shall