pitrou merged PR #189:
URL: https://github.com/apache/parquet-format/pull/189
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@parquet.ap
pitrou commented on code in PR #189:
URL: https://github.com/apache/parquet-format/pull/189#discussion_r1081911430
##
Encodings.md:
##
@@ -299,9 +302,18 @@ For a longer description, see
https://en.wikipedia.org/wiki/Incremental_encoding
This is stored as a sequence of delta-en
wjones127 commented on code in PR #189:
URL: https://github.com/apache/parquet-format/pull/189#discussion_r1081899568
##
Encodings.md:
##
@@ -280,16 +280,19 @@ concatenated back to back. The expected savings is from
the cost of encoding the
and possibly better compression in t
vectorijk commented on PR #1015:
URL: https://github.com/apache/parquet-mr/pull/1015#issuecomment-1385727824
@wgtmac thanks for the review! I will coordinate with
https://github.com/apache/parquet-mr/pull/1014 and address the comments
zhangjiashen commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1384639397
> I found the doc. Could you provide me with "comment" access, so we'll
discuss the goals and design there? Thanks.
@ggershinsky thanks for looking at this, I have added p
ggershinsky commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1384054816
I found the doc. Could you provide me with "comment" access, so we'll
discuss the goals and design there? Thanks.
pitrou commented on PR #189:
URL: https://github.com/apache/parquet-format/pull/189#issuecomment-1383840418
Also cc @rok
pitrou commented on PR #189:
URL: https://github.com/apache/parquet-format/pull/189#issuecomment-1383831257
@emkornfield @gszadovszky @rdblue
pitrou commented on PR #189:
URL: https://github.com/apache/parquet-format/pull/189#issuecomment-1383830870
@wjones127 Could you help review the wording?
pitrou opened a new pull request, #189:
URL: https://github.com/apache/parquet-format/pull/189
DELTA_BYTE_ARRAY has been supported for FIXED_LEN_BYTE_ARRAY by parquet-mr
since 2015 (see PARQUET-152). Update the spec accordingly.
Also improve wording and markup, and add an example.
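For readers unfamiliar with the encoding under discussion, here is a minimal sketch of the incremental (front) encoding idea behind DELTA_BYTE_ARRAY: each value is stored as the length of the prefix it shares with the previous value, plus the remaining suffix. The function names `front_encode` and `front_decode` are hypothetical, and this is not the on-disk layout; in the actual format the prefix lengths and suffixes are themselves delta-encoded, which is left out here. The same logic applies unchanged when every value has the same length, as with FIXED_LEN_BYTE_ARRAY.

```python
from typing import List, Sequence, Tuple


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_encode(values: Sequence[bytes]) -> Tuple[List[int], List[bytes]]:
    """Split each value into (shared prefix length, suffix) relative to the previous value.

    Hypothetical helper for illustration; it only models the logical content
    of the encoding, not the serialized byte layout.
    """
    prefix_lengths: List[int] = []
    suffixes: List[bytes] = []
    previous = b""
    for value in values:
        k = common_prefix_len(previous, value)
        prefix_lengths.append(k)
        suffixes.append(value[k:])
        previous = value
    return prefix_lengths, suffixes


def front_decode(prefix_lengths: Sequence[int], suffixes: Sequence[bytes]) -> List[bytes]:
    """Rebuild the original values from prefix lengths and suffixes."""
    values: List[bytes] = []
    previous = b""
    for k, suffix in zip(prefix_lengths, suffixes):
        value = previous[:k] + suffix
        values.append(value)
        previous = value
    return values


if __name__ == "__main__":
    data = [b"axis", b"axle", b"babble", b"babyhood"]
    lengths, tails = front_encode(data)
    print(lengths, tails)
    print(front_decode(lengths, tails) == data)
```

Sorted or otherwise similar neighbouring values benefit most, since long shared prefixes shrink to a single small integer.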
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383783806
Sure. :)
Please double-check that I assigned the jira to the correct account.
yabola commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383715909
@wgtmac Thank you for your detailed review, and @gszadovszky for your help.
My jira id is miracle
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383689874
@yabola, what is your jira account? I'd like to assign the jira to you
before closing.
gszadovszky merged PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070907144
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,24 @@ public long hash(float value) {
pu
ggershinsky commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383552737
As far as I understand, _data masking_ replaces content of sensitive
columns; it does not remove the columns (schema and content). The latter is
done by _column pruning_ - when re-
wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070853151
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,24 @@ public long hash(float value) {
pu
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070638035
##
parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java:
##
@@ -181,6 +182,83 @@ public void testBloomFilterNDVs()
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070637510
##
parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java:
##
@@ -181,6 +182,60 @@ public void testBloomFilterNDVs()
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070637029
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,24 @@ public long hash(float value) {
pu
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070636680
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -398,18 +398,21 @@ public long hash(Binary value) {
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383160117
> I agree that merging the key-value metadata is not an easy question. Let's
discuss it separately as it is not related to this PR.
>
> I also agree to store the current writer (p
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383152341
I agree that merging the key-value metadata is not an easy question. Let's
discuss it separately as it is not related to this PR.
I also agree to store the current writer (pa
wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591970
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -398,18 +398,21 @@ public long hash(Binary value) {
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070591022
##
parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java:
##
@@ -181,6 +182,60 @@ public void testBloomFilterNDVs()
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070590290
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,21 @@ public long hash(float value) {
pu
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070590238
##
parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java:
##
@@ -181,6 +182,60 @@ public void testBloomFilterNDVs()
wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070546933
##
parquet-column/src/test/java/org/apache/parquet/column/values/bloomfilter/TestBlockSplitBloomFilter.java:
##
@@ -181,6 +182,60 @@ public void testBloomFilterNDVs()
yabola commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383102223
@wgtmac I have added a unit test, please take a look~ thanks!
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1383101451
> > I am afraid some implementations may drop characters after `'\n'` when
displaying the string content. Let me do some investigation.
>
> I do not have a strong opinion for `'\n
shangxinli commented on PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#issuecomment-1383006808
@ggershinsky Do you have time to have a look?
shangxinli merged PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018
shangxinli commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1383006013
@chenjunjiedada Do you still have time to review this change?
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382840526
> I am afraid some implementations may drop characters after `'\n'` when
displaying the string content. Let me do some investigation.
I do not have a strong opinion for `'\n'
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382815489
> > * I'd prefer creating a new JIRA for this refactor to be a prerequisite.
Merging multiple files to a single one with customized pruning, encryption, and
codec is also in my mind and
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382754916
> * I'd prefer creating a new JIRA for this refactor to be a prerequisite.
Merging multiple files to a single one with customized pruning, encryption, and
codec is also in my mind
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1382752637
> I think it is a great refactor. Thanks a lot for working on it, @wgtmac!
On the other hand I've thought about PARQUET-2075 as a request for a new
feature in `parquet-cli`
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070267028
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BloomFilter.java:
##
@@ -176,4 +176,10 @@ public String toString() {
* @return compre
gszadovszky commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1070274495
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/ParquetRewriter.java:
##
@@ -0,0 +1,733 @@
+/*
+ * Licensed to the Apache Software Foundation (
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1382737603
One more thing, @yabola. The compatibility tests fail because you have added
a new method to a public interface. Even though this interface is not supposed
to be implemented by our
gszadovszky commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1382736990
Thanks, @yabola for working on this and also to @wgtmac for reviewing. I do
not have much experience with bloom filters so I will rely on your review. Ping
me if you have a +1.
yabola commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070267080
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,21 @@ public long hash(float value) {
pu
wgtmac commented on code in PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#discussion_r1070233210
##
parquet-column/src/main/java/org/apache/parquet/column/values/bloomfilter/BlockSplitBloomFilter.java:
##
@@ -394,4 +395,21 @@ public long hash(float value) {
pu
yabola commented on PR #1020:
URL: https://github.com/apache/parquet-mr/pull/1020#issuecomment-1382679860
@shangxinli @gszadovszky Can you help take a look and see whether it is suitable~
yabola opened a new pull request, #1020:
URL: https://github.com/apache/parquet-mr/pull/1020
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in
parthchandra commented on PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#issuecomment-1381826030
CI is failing at the pre-build step. Anyone know why?
parthchandra commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1069182893
##
parquet-format-structures/src/main/java/org/apache/parquet/format/BlockCipher.java:
##
@@ -51,17 +52,26 @@
* @param AAD - Additional Authenticated Data
wgtmac commented on code in PR #1019:
URL: https://github.com/apache/parquet-mr/pull/1019#discussion_r1068164918
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/metadata/FileMetaData.java:
##
@@ -71,7 +79,7 @@ public MessageType getSchema() {
@Override
public S
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1068159808
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
ggershinsky commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1067893929
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (A
wgtmac commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1067699349
##
parquet-format-structures/src/main/java/org/apache/parquet/format/BlockCipher.java:
##
@@ -51,17 +52,26 @@
* @param AAD - Additional Authenticated Data for t
Kimahriman commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1379431375
Same, we have certain jobs that can't function without a patched jar. Seems
to get worse with the more columns you read. Our worst offender (table with
thousands of columns), can easi
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1067083773
##
parquet-hadoop/src/test/java/org/apache/parquet/hadoop/rewrite/ParquetRewriterTest.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (A
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1067082884
##
parquet-hadoop/src/test/java/org/apache/parquet/hadoop/rewrite/ParquetRewriterTest.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (A
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1067082250
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
wgtmac commented on code in PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#discussion_r1066592694
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##
@@ -927,7 +925,15 @@ public PageReadStore readRowGroup(int blockIndex) throws
IO
camper42 commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1378181577
Same problem as @alexeykudinkin.
Currently we replace the parquet jar with a patched one in our image, waiting for the
release
shangxinli commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1377840323
Thanks a lot @gszadovszky
gszadovszky commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1377706622
> @gszadovszky I just want to check if you have time to have a look. @wgtmac
was kind enough to take over the work that we discussed earlier to have an
aggregated rewriter.
@sh
gszadovszky commented on PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#issuecomment-1377700950
> @gszadovszky Nice to see you are back!
@shangxinli, I wouldn't say I'm back, unfortunately. I'm a bit closer to
Parquet at Dremio but actually not working on it. We'll see
shangxinli commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1377598327
Thanks @alexeykudinkin for the explanation.
alexeykudinkin commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1377589036
Totally @shangxinli
We have Spark clusters running in production _ingesting_ from 100s of Apache
Hudi tables (using Parquet and Zstd) and writing into other ones. We switc
shangxinli commented on code in PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#discussion_r1066042941
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##
@@ -1038,7 +1044,10 @@ public PageReadStore readNextFilteredRowGroup() throws
shangxinli commented on code in PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#discussion_r1066038932
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##
@@ -927,7 +925,15 @@ public PageReadStore readRowGroup(int blockIndex) throws
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1065962705
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
wgtmac commented on PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#issuecomment-1377485663
> Thank you for fixing this. I've added some comments. Also, could you add
a similar test for the filtered row groups?
Thanks for your review @gszadovszky !
I have address
wgtmac commented on PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#issuecomment-1376928785
> @wgtmac Do you have time to have a look?
@shangxinli Thanks for mentioning me. Sure, I will take a look this week.
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1376796050
Thank you all, @shangxinli , @ggershinsky , @sunchao , @wgtmac .
shangxinli commented on PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#issuecomment-1376755881
@wgtmac Do you have time to have a look?
shangxinli commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1065346689
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java:
##
@@ -133,11 +135,36 @@ public DataPage readPage() {
public Dat
shangxinli commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1376754942
@gszadovszky I just want to check if you have time to have a look. @wgtmac
was kind enough to take over the work that we discussed earlier to have an
aggregated rewriter.
shangxinli merged PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017
shangxinli commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1376754236
Thank you @dongjoon-hyun for working on it!
shangxinli commented on PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#issuecomment-1376751333
@gszadovszky Nice to see you are back!
shangxinli commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1376750703
@alexeykudinkin We might release a new patch in the next 2 or 3 months.
Can you elaborate why "this is a severe problem that does affect our ability
to use Parquet w/ Zstd"? I
alexeykudinkin commented on PR #982:
URL: https://github.com/apache/parquet-mr/pull/982#issuecomment-1376498280
@gszadovszky @ggershinsky @shangxinli
Folks, do we have an approximate timeline for the next patch release that
will be including this patch?
This is a severe probl
anjakefala commented on PR #184:
URL: https://github.com/apache/parquet-format/pull/184#issuecomment-1376199292
Hey @emkornfield! Is it reasonable for me to send a proposal to the mailing
list for a vote? It seems @gszadovszky is not available for insight; is there
anyone else that can pro
gszadovszky commented on code in PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#discussion_r1064374553
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java:
##
@@ -1038,7 +1044,9 @@ public PageReadStore readNextFilteredRowGroup() throws
ggershinsky commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1064348254
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (A
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1375203719
Thank you, @ggershinsky !
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1374985673
Thank you, @wgtmac and @sunchao .
wgtmac commented on PR #1018:
URL: https://github.com/apache/parquet-mr/pull/1018#issuecomment-1374806195
@gszadovszky @ggershinsky @shangxinli @sunchao Could you please take a look
when you have time?
cc @emkornfield
wgtmac opened a new pull request, #1018:
URL: https://github.com/apache/parquet-mr/pull/1018
### Jira
My PR addresses the
[PARQUET-2219](https://issues.apache.org/jira/browse/PARQUET/PARQUET-2219).
### Tests
My PR adds the following unit test to read parquet file with em
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1373908959
FYI, here is the ASF SBOM wikipage.
- https://cwiki.apache.org/confluence/display/COMDEV/SBOM
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1372782088
Also, cc @shangxinli and @gszadovszky
dongjoon-hyun commented on PR #1017:
URL: https://github.com/apache/parquet-mr/pull/1017#issuecomment-1372539919
cc @ggershinsky and @sunchao
dongjoon-hyun opened a new pull request, #1017:
URL: https://github.com/apache/parquet-mr/pull/1017
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
t
wgtmac commented on PR #1015:
URL: https://github.com/apache/parquet-mr/pull/1015#issuecomment-1372241704
BTW, I am working on https://github.com/apache/parquet-mr/pull/1014 to unify
several rewriters and most logic of `class ColumnEncryptor` will be relocated
to `class ParquetRewriter`. Ma
wgtmac commented on code in PR #1015:
URL: https://github.com/apache/parquet-mr/pull/1015#discussion_r1062474157
##
parquet-hadoop/src/main/java/org/apache/parquet/crypto/ColumnDecryptionProperties.java:
##
@@ -1,104 +1,109 @@
-/*
- * Licensed to the Apache Software Foundation (
wgtmac commented on code in PR #1016:
URL: https://github.com/apache/parquet-mr/pull/1016#discussion_r1062349914
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/DataMaskingUtil.java:
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) und
wgtmac commented on PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#issuecomment-1371885483
> If you can add more unit tests, particularly the combinations of prune,
mask, trans-compression etc, it would be better.
I have added some test cases in the `ParquetRewriterTest
wgtmac commented on code in PR #1014:
URL: https://github.com/apache/parquet-mr/pull/1014#discussion_r1062185479
##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u