wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070233210
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,21 @@ public long hash(float value) {
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676864#comment-17676864
]
ASF GitHub Bot commented on PARQUET-2226:
-
wgtmac commented on code in PR #1020:
URL:
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070267028
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BloomFilter.java:
##
@@ -176,4 +176,10 @@ public String toString() {
* @return
gszadovszky commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1070274495
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/ParquetRewriter.java:
##
@@ -0,0 +1,733 @@
+/*
+ * Licensed to the Apache Software Foundation
[
https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676905#comment-17676905
]
ASF GitHub Bot commented on PARQUET-2075:
-
gszadovszky commented on code in PR #1014:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2226:
--
Summary: Support merge Bloom Filter (was: Support union Bloom Filter)
> Support merge Bloom Filter
>
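For readers unfamiliar with the operation named in the summary: two Bloom filters with the same size and the same hash scheme can be merged by OR-ing their bitsets, and the result reports "might contain" for every value either input held. The sketch below is a toy illustration of that idea only. `ToyBloomFilter`, its three-probe hashing, and `mergedContainsBoth` are hypothetical names invented here; this is not the `BlockSplitBloomFilter` code from PR #1020.

```java
public class BloomMergeSketch {

    /** Toy Bloom filter over pre-hashed 64-bit values (hypothetical, for illustration). */
    static final class ToyBloomFilter {
        private static final int PROBES = 3;
        private final long[] words;

        ToyBloomFilter(int numWords) {
            this.words = new long[numWords];
        }

        // Derive the i-th bit position from one 64-bit hash (simple double hashing).
        private int bitIndex(long hash, int i) {
            long step = (hash >>> 32) | 1L;
            long combined = hash + (long) i * step;
            return (int) Long.remainderUnsigned(combined, (long) words.length * 64L);
        }

        void insert(long hash) {
            for (int i = 0; i < PROBES; i++) {
                int bit = bitIndex(hash, i);
                words[bit >>> 6] |= 1L << (bit & 63);
            }
        }

        boolean mightContain(long hash) {
            for (int i = 0; i < PROBES; i++) {
                int bit = bitIndex(hash, i);
                if ((words[bit >>> 6] & (1L << (bit & 63))) == 0) {
                    return false;
                }
            }
            return true;
        }

        // Merge = bitwise OR; only valid when both filters have identical layout.
        void merge(ToyBloomFilter other) {
            if (other.words.length != words.length) {
                throw new IllegalArgumentException("filters must have the same size");
            }
            for (int i = 0; i < words.length; i++) {
                words[i] |= other.words[i];
            }
        }
    }

    /** Inserts one value into each of two filters, merges them, and checks both survive. */
    static boolean mergedContainsBoth() {
        ToyBloomFilter a = new ToyBloomFilter(4);
        ToyBloomFilter b = new ToyBloomFilter(4);
        a.insert(0x1234_5678_9ABC_DEF0L);
        b.insert(0x0FED_CBA9_8765_4321L);
        a.merge(b);
        return a.mightContain(0x1234_5678_9ABC_DEF0L)
            && a.mightContain(0x0FED_CBA9_8765_4321L);
    }

    public static void main(String[] args) {
        System.out.println("merged filter keeps both values: " + mergedContainsBoth());
    }
}
```

Because OR only ever sets bits, merging cannot introduce false negatives, though the false-positive rate of the merged filter can only stay the same or rise.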
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676900#comment-17676900
]
ASF GitHub Bot commented on PARQUET-2226:
-
gszadovszky commented on PR #1020:
URL:
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1382736990
Thanks to @yabola for working on this and to @wgtmac for reviewing. I do
not have much experience with bloom filters, so I will rely on your review. Ping
me if you have a +1.
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676897#comment-17676897
]
ASF GitHub Bot commented on PARQUET-2226:
-
yabola commented on code in PR #1020:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676898#comment-17676898
]
ASF GitHub Bot commented on PARQUET-2226:
-
yabola commented on code in PR #1020:
URL:
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070267080
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,21 @@ public long hash(float value) {
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1382737603
One more thing, @yabola. The compatibility tests fail because you have added
a new method to a public interface. Even though this interface is not supposed
to be implemented by
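As general background to the point above: adding an abstract method to a public Java interface breaks any class that already implements it, whereas a `default` method leaves existing implementers compiling. The sketch below shows only that language mechanism. `Filter`, `LegacyFilter`, and `merge` are hypothetical names, and the thread does not say whether this is how the PR resolved the failing compatibility tests or whether the project's compatibility check would accept it.

```java
public class DefaultMethodSketch {

    /** Hypothetical public interface that already has implementers. */
    interface Filter {
        boolean mightContain(long hash);

        // Newly added method: the default body keeps old implementers source-compatible.
        default void merge(Filter other) {
            throw new UnsupportedOperationException(
                "merge is not supported by " + getClass().getSimpleName());
        }
    }

    /** Pre-existing implementer that was written before merge() was added. */
    static final class LegacyFilter implements Filter {
        @Override
        public boolean mightContain(long hash) {
            return false;
        }
    }

    /** Calls the new method on an old implementer and reports what happened. */
    static String tryMerge() {
        Filter filter = new LegacyFilter();
        try {
            filter.merge(new LegacyFilter());
            return "merged";
        } catch (UnsupportedOperationException e) {
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryMerge());
    }
}
```

`LegacyFilter` still compiles without change; callers that invoke `merge` on it get a runtime exception instead of a compile-time break, which is the trade-off of this approach.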
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676901#comment-17676901
]
ASF GitHub Bot commented on PARQUET-2226:
-
gszadovszky commented on PR #1020:
URL:
Gang Wu created PARQUET-2228:
Summary: ParquetRewriter supports more than one input file
Key: PARQUET-2228
URL: https://issues.apache.org/jira/browse/PARQUET-2228
Project: Parquet
Issue Type:
Gang Wu created PARQUET-2229:
Summary: ParquetRewriter supports masking and encrypting the same
column
Key: PARQUET-2229
URL: https://issues.apache.org/jira/browse/PARQUET-2229
Project: Parquet
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676907#comment-17676907
]
ASF GitHub Bot commented on PARQUET-2226:
-
yabola commented on code in PR #1020:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676912#comment-17676912
]
ASF GitHub Bot commented on PARQUET-2075:
-
gszadovszky commented on PR #1014:
URL:
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382754916
> * I'd prefer creating a new JIRA for this refactor to be a prerequisite.
Merging multiple files to a single one with customized pruning, encryption, and
codec is also in my mind
[
https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676916#comment-17676916
]
ASF GitHub Bot commented on PARQUET-2075:
-
wgtmac commented on PR #1014:
URL:
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382815489
> > * I'd prefer creating a new JIRA for this refactor to be a prerequisite.
Merging multiple files to a single one with customized pruning, encryption, and
codec is also in my mind
Gang Wu created PARQUET-2230:
Summary: Add a new rewrite command powered by ParquetRewriter
Key: PARQUET-2230
URL: https://issues.apache.org/jira/browse/PARQUET-2230
Project: Parquet
Issue Type:
[
https://issues.apache.org/jira/browse/PARQUET-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676927#comment-17676927
]
ASF GitHub Bot commented on PARQUET-2227:
-
gszadovszky commented on PR #1014:
URL:
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382840526
> I am afraid some implementations may drop characters after `'\n'` when
displaying the string content. Let me do some investigation.
I do not have a strong opinion for
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382752637
> I think it is a great refactor. Thanks a lot for working on it, @wgtmac!
On the other hand, I've thought about PARQUET-2075 as a request for a new
feature in `parquet-cli`
[
https://issues.apache.org/jira/browse/PARQUET-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676909#comment-17676909
]
ASF GitHub Bot commented on PARQUET-2075:
-
wgtmac commented on PR #1014:
URL:
Gang Wu created PARQUET-2227:
Summary: Refactor different file rewriters to use single
implementation
Key: PARQUET-2227
URL: https://issues.apache.org/jira/browse/PARQUET-2227
Project: Parquet
shangxinli commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383006013
@chenjunjiedada Do you still have time to review this change?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and
[
https://issues.apache.org/jira/browse/PARQUET-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676961#comment-17676961
]
ASF GitHub Bot commented on PARQUET-2226:
-
shangxinli commented on PR #1020:
URL:
shangxinli commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383006808
@ggershinsky Do you have time to have a look?
[
https://issues.apache.org/jira/browse/PARQUET-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676963#comment-17676963
]
ASF GitHub Bot commented on PARQUET-2223:
-
shangxinli commented on PR #1016:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676962#comment-17676962
]
ASF GitHub Bot commented on PARQUET-2219:
-
shangxinli merged PR #1018:
URL:
shangxinli merged PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018