[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684522#comment-17684522 ] ASF GitHub Bot commented on PARQUET-2237: wgtmac commented on PR #1023: URL:

[GitHub] [parquet-mr] wgtmac commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread via GitHub
wgtmac commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1418711576 Unfortunately we cannot modify the signature of any public methods. My suggestion was to make the new enum serve as an internal state of the visitor (and probably use it to terminate

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet

2023-02-06 Thread via GitHub
wgtmac commented on code in PR #1022: URL: https://github.com/apache/parquet-mr/pull/1022#discussion_r1097088924 ## parquet-column/src/test/java/org/apache/parquet/column/impl/TestColumnWriterV1.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Commented] (PARQUET-831) Corrupt Parquet Files

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684524#comment-17684524 ] ASF GitHub Bot commented on PARQUET-831: wgtmac commented on code in PR #1022: URL:

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684539#comment-17684539 ] ASF GitHub Bot commented on PARQUET-2159: jiangjiguang commented on PR #1011: URL:

[jira] [Commented] (PARQUET-2239) Replace log4j1 with reload4j

2023-02-06 Thread Steve Loughran (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684607#comment-17684607 ] Steve Loughran commented on PARQUET-2239: good, but trickier than you think as you have to do

[GitHub] [parquet-mr] jiangjiguang commented on pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
jiangjiguang commented on PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#issuecomment-1418781509 @wgtmac I added doc about how big data applications use Java Vector API -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[jira] [Created] (PARQUET-2239) Replace log4j1 with reload4j

2023-02-06 Thread Akshat Mathur (Jira)
Akshat Mathur created PARQUET-2239: Summary: Replace log4j1 with reload4j Key: PARQUET-2239 URL: https://issues.apache.org/jira/browse/PARQUET-2239 Project: Parquet Issue Type: Improvement

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684798#comment-17684798 ] ASF GitHub Bot commented on PARQUET-2237: yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1419363481 retest this please

[GitHub] [parquet-mr] shangxinli commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread via GitHub
shangxinli commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1419254754 +1, let's not modify the signature.

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684747#comment-17684747 ] ASF GitHub Bot commented on PARQUET-2237: shangxinli commented on PR #1023: URL:

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
wgtmac commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1097538545 ## README.md: ## @@ -83,6 +83,16 @@ Parquet is a very active project, and new features are being added quickly. Here * Column stats * Delta encoding * Index

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684756#comment-17684756 ] ASF GitHub Bot commented on PARQUET-2159: wgtmac commented on code in PR #1011: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1419361716 I thought of a way to use Boolean objects to distinguish the different types, so the visitor return type no longer needs to be modified. But it seems the Jenkins machine was broken?...

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684800#comment-17684800 ] ASF GitHub Bot commented on PARQUET-2237: yabola commented on PR #1023: URL:

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684935#comment-17684935 ] ASF GitHub Bot commented on PARQUET-2159: dongjoon-hyun commented on code in PR #1011: URL:

[GitHub] [parquet-mr] dongjoon-hyun commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
dongjoon-hyun commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1097900504 ## README.md: ## @@ -83,6 +83,16 @@ Parquet is a very active project, and new features are being added quickly. Here * Column stats * Delta encoding *

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684922#comment-17684922 ] ASF GitHub Bot commented on PARQUET-2159: sunchao commented on code in PR #1011: URL:

[GitHub] [parquet-mr] sunchao commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
sunchao commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1097875149 ## README.md: ## @@ -83,6 +83,16 @@ Parquet is a very active project, and new features are being added quickly. Here * Column stats * Delta encoding * Index

[jira] [Commented] (PARQUET-831) Corrupt Parquet Files

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684828#comment-17684828 ] ASF GitHub Bot commented on PARQUET-831: jianchun commented on code in PR #1022: URL:

[GitHub] [parquet-mr] jianchun commented on a diff in pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet

2023-02-06 Thread via GitHub
jianchun commented on code in PR #1022: URL: https://github.com/apache/parquet-mr/pull/1022#discussion_r1097699497 ## parquet-column/src/test/java/org/apache/parquet/column/impl/TestColumnWriterV1.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation

[jira] [Commented] (PARQUET-831) Corrupt Parquet Files

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684829#comment-17684829 ] ASF GitHub Bot commented on PARQUET-831: jianchun commented on code in PR #1022: URL:

[jira] [Commented] (PARQUET-831) Corrupt Parquet Files

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684988#comment-17684988 ] ASF GitHub Bot commented on PARQUET-831: wgtmac commented on code in PR #1022: URL:

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1022: PARQUET-831: fix estimate page size check overflow corrupting parquet

2023-02-06 Thread via GitHub
wgtmac commented on code in PR #1022: URL: https://github.com/apache/parquet-mr/pull/1022#discussion_r1098084801 ## parquet-column/src/test/java/org/apache/parquet/column/impl/TestColumnWriterV1.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685021#comment-17685021 ] ASF GitHub Bot commented on PARQUET-2237: yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-06 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1420123553 @wgtmac @shangxinli I thought of a way to avoid interface modification and distinguish by Boolean objects. Please take a look

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685075#comment-17685075 ] ASF GitHub Bot commented on PARQUET-2159: jiangjiguang commented on code in PR #1011: URL:

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1098239530 ## parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java: ## @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache

[jira] [Commented] (PARQUET-2159) Parquet bit-packing de/encode optimization

2023-02-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17685077#comment-17685077 ] ASF GitHub Bot commented on PARQUET-2159: jiangjiguang commented on code in PR #1011: URL:

[GitHub] [parquet-mr] jiangjiguang commented on a diff in pull request #1011: PARQUET-2159: java17 vector parquet bit-packing decode optimization

2023-02-06 Thread via GitHub
jiangjiguang commented on code in PR #1011: URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1098236516 ## README.md: ## @@ -83,6 +83,16 @@ Parquet is a very active project, and new features are being added quickly. Here * Column stats * Delta encoding *