[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677196#comment-17677196 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070907144 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -394,4 +395,24 @@ public long hash(float value) {

[jira] [Commented] (PARQUET-2223) Parquet Data Masking for Column Encryption

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677173#comment-17677173 ] ASF GitHub Bot commented on PARQUET-2223: - ggershinsky commented on PR #1016: URL:

[GitHub] [parquet-mr] ggershinsky commented on pull request #1016: PARQUET-2223: Parquet Data Masking Enhancement for Column Encryption

2023-01-15 Thread GitBox
ggershinsky commented on PR #1016: URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383552737 As far as I understand, _data masking_ replaces content of sensitive columns; it does not remove the columns (schema and content). The latter is done by _column pruning_ - when
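A minimal sketch of the distinction drawn in the comment above, assuming rows are modelled as plain in-memory maps. The class MaskVsPrune and its methods mask and prune are hypothetical names chosen for illustration; they are not part of the parquet-mr API and do not reflect how PR #1016 implements masking.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not parquet-mr code.
public class MaskVsPrune {

    // Data masking: the column stays in the schema of every row,
    // but its content is replaced with the given value.
    static List<Map<String, Object>> mask(List<Map<String, Object>> rows,
                                          String column, Object replacement) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            if (copy.containsKey(column)) {
                copy.put(column, replacement);
            }
            out.add(copy);
        }
        return out;
    }

    // Column pruning: the column (schema entry and content) is removed.
    static List<Map<String, Object>> prune(List<Map<String, Object>> rows,
                                           String column) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.remove(column);
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("ssn", "123-45-6789");
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);

        System.out.println("masked: " + mask(rows, "ssn", "***"));
        System.out.println("pruned: " + prune(rows, "ssn"));
    }
}
```

Both helpers copy each row instead of mutating it, so the input can be compared against either result.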

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677157#comment-17677157 ] ASF GitHub Bot commented on PARQUET-2226: - wgtmac commented on code in PR #1020: URL:

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
wgtmac commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070853151 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -394,4 +395,24 @@ public long hash(float value) {

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677063#comment-17677063 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070638035 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,83 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677060#comment-17677060 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677061#comment-17677061 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070637510 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677059#comment-17677059 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070637510 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677058#comment-17677058 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070637029 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -394,4 +395,24 @@ public long hash(float value) {

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677057#comment-17677057 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070636680 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -398,18 +398,21 @@ public long hash(Binary value) {

[jira] [Commented] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677039#comment-17677039 ] ASF GitHub Bot commented on PARQUET-2227: - wgtmac commented on PR #1014: URL:

[GitHub] [parquet-mr] wgtmac commented on pull request #1014: PARQUET-2227: Refactor several file rewriters to use a new unified ParquetRewriter implementation

2023-01-15 Thread GitBox
wgtmac commented on PR #1014: URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383160117 > I agree that merging the key-value metadata is not an easy question. Let's discuss it separately as it is not related to this PR. > > I also agree to store the current writer

[jira] [Commented] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677036#comment-17677036 ] ASF GitHub Bot commented on PARQUET-2227: - gszadovszky commented on PR #1014: URL:

[GitHub] [parquet-mr] gszadovszky commented on pull request #1014: PARQUET-2227: Refactor several file rewriters to use a new unified ParquetRewriter implementation

2023-01-15 Thread GitBox
gszadovszky commented on PR #1014: URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383152341 I agree that merging the key-value metadata is not an easy question. Let's discuss it separately as it is not related to this PR. I also agree to store the current writer

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677035#comment-17677035 ] ASF GitHub Bot commented on PARQUET-2226: - wgtmac commented on code in PR #1020: URL:

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
wgtmac commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591970 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -398,18 +398,21 @@ public long hash(Binary value) {

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677033#comment-17677033 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591022 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677031#comment-17677031 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591022 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677029#comment-17677029 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677030#comment-17677030 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591022 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677028#comment-17677028 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591022 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677027#comment-17677027 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070590290 ## parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java: ## @@ -394,4 +395,21 @@ public long hash(float value) {

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677026#comment-17677026 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on code in PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070590238 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677004#comment-17677004 ] ASF GitHub Bot commented on PARQUET-2226: - wgtmac commented on code in PR #1020: URL:

[GitHub] [parquet-mr] wgtmac commented on a diff in pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
wgtmac commented on code in PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070546933 ## parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java: ## @@ -181,6 +182,60 @@ public void

[jira] [Commented] (PARQUET-2226) Support merge Bloom Filter

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676997#comment-17676997 ] ASF GitHub Bot commented on PARQUET-2226: - yabola commented on PR #1020: URL:

[GitHub] [parquet-mr] yabola commented on pull request #1020: PARQUET-2226 Support merge bloom filters

2023-01-15 Thread GitBox
yabola commented on PR #1020: URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383102223 @wgtmac I have added a unit test, please take a look. Thanks!

[jira] [Commented] (PARQUET-2227) Refactor different file rewriters to use single implementation

2023-01-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676996#comment-17676996 ] ASF GitHub Bot commented on PARQUET-2227: - wgtmac commented on PR #1014: URL:

[GitHub] [parquet-mr] wgtmac commented on pull request #1014: PARQUET-2227: Refactor several file rewriters to use a new unified ParquetRewriter implementation

2023-01-15 Thread GitBox
wgtmac commented on PR #1014: URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383101451 > > I am afraid some implementations may drop characters after `'\n'` when displaying the string content. Let me do some investigation. > > I do not have a strong opinion for