[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684360#comment-17684360 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1418302885 It seems the code compatibility check error is caused by modifying the visitor return value... Can we remove that restriction? Or should I keep code compatibility and add a new flag to mark

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684358#comment-17684358 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1418295634 > It seems the code compatibility check error is caused by modifying the visitor return value... Can we remove that restriction? I will add more UTs later. If we have to keep the code

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684314#comment-17684314 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1418012997 It seems the code compatibility check error is caused by modifying the visitor return value... -- This is an automated message from the Apache Git Service. To respond to the message, please log

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684313#comment-17684313 ] ASF GitHub Bot commented on PARQUET-2237: - wgtmac commented on PR #1023: URL:

[GitHub] [parquet-mr] wgtmac commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
wgtmac commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1418001192 cc @gszadovszky @ggershinsky @shangxinli

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684311#comment-17684311 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1417985954 Please also cc @gszadovszky if you have time. I will add more UTs and improve my code.

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684270#comment-17684270 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1096651016 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/compat/RowGroupFilter.java: ## @@ -98,16 +99,19 @@ public List visit(FilterCompat.FilterPredicateCompat

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684268#comment-17684268 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1096651016 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/compat/RowGroupFilter.java: ## @@ -98,16 +99,19 @@ public List visit(FilterCompat.FilterPredicateCompat

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684267#comment-17684267 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1096650875 ## parquet-hadoop/src/test/java/org/apache/parquet/filter2/dictionarylevel/DictionaryFilterTest.java: ## @@ -792,6 +793,16 @@ public void

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684266#comment-17684266 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on code in PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on code in PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#discussion_r1096650696 ## parquet-hadoop/src/main/java/org/apache/parquet/filter2/compat/RowGroupFilter.java: ## @@ -98,16 +99,19 @@ public List visit(FilterCompat.FilterPredicateCompat

[jira] [Updated] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread Mars (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mars updated PARQUET-2237: -- Description: If we can accurately judge by the minMax status, we don’t need to load the dictionary from

[jira] [Updated] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread Mars (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mars updated PARQUET-2237: -- Description: The Bloom filter needs to be loaded from the filesystem, which may cost time and memory. If we can exactly
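
The two description updates above are truncated, but the idea they describe (skip the dictionary or Bloom filter lookup when min/max statistics alone already decide the outcome) can be illustrated with a minimal sketch. This is not the code from PR #1023 and does not use parquet-mr's API: the class `ExactMatchSketch`, the `Verdict` enum, and both methods are hypothetical names chosen for illustration, and the sketch assumes a single `column == value` predicate on a `long` column with no nulls.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only; none of these names exist in parquet-mr. */
final class ExactMatchSketch {

    /** Possible outcomes of checking a predicate against min/max statistics. */
    enum Verdict { DROP, KEEP_ALL, MAYBE }

    /** Evaluates "column == value" against a row group's min/max (assumes no nulls). */
    static Verdict statsVerdict(long min, long max, long value) {
        if (value < min || value > max) {
            return Verdict.DROP;      // exact: no row can match
        }
        if (min == max) {
            return Verdict.KEEP_ALL;  // exact: every row equals the value
        }
        return Verdict.MAYBE;         // statistics alone cannot decide
    }

    /**
     * Decides whether to keep a row group. The expensive check (standing in for
     * a dictionary or Bloom filter lookup) runs only when the verdict is MAYBE.
     */
    static boolean keepRowGroup(long min, long max, long value, BooleanSupplier expensiveCheck) {
        switch (statsVerdict(min, max, value)) {
            case DROP:
                return false;
            case KEEP_ALL:
                return true;
            default:
                return expensiveCheck.getAsBoolean();
        }
    }
}
```

With this shape, a caller passes the costly lookup as a lambda, so nothing is read from the filesystem unless the statistics are inconclusive.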

[jira] [Commented] (PARQUET-2237) Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17684261#comment-17684261 ] ASF GitHub Bot commented on PARQUET-2237: - yabola commented on PR #1023: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1023: PARQUET-2237 Improve performance when filters in RowGroupFilter can match exactly

2023-02-05 Thread via GitHub
yabola commented on PR #1023: URL: https://github.com/apache/parquet-mr/pull/1023#issuecomment-1417093967 @wgtmac Thanks for the review. I will address your comments, and I have updated my PR description to explain in more detail.