[
https://issues.apache.org/jira/browse/PARQUET-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated PARQUET-2250:
Labels: pull-request-available (was: )
> Expose column descriptor through RecordReader
fatemah created PARQUET-2250:
Summary: Expose column descriptor through RecordReader
Key: PARQUET-2250
URL: https://issues.apache.org/jira/browse/PARQUET-2250
Project: Parquet
Issue Type:
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693007#comment-17693007
]
ASF GitHub Bot commented on PARQUET-2251:
-
wgtmac commented on code in PR #1033:
URL:
wgtmac commented on code in PR #1033:
URL: https://github.com/apache/parquet-mr/pull/1033#discussion_r1116520642
##
parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestStoreBloomFilter.java:
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693017#comment-17693017
]
ASF GitHub Bot commented on PARQUET-2251:
-
yabola commented on code in PR #1033:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693023#comment-17693023
]
ASF GitHub Bot commented on PARQUET-2149:
-
whcdjj commented on PR #968:
URL:
wgtmac commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1116427502
##
parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java:
##
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache
[
https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692972#comment-17692972
]
ASF GitHub Bot commented on PARQUET-831:
wgtmac commented on code in PR #1022:
URL:
wgtmac commented on code in PR #1022:
URL: https://github.com/apache/parquet-mr/pull/1022#discussion_r1116429546
##
parquet-column/src/test/java/org/apache/parquet/column/impl/TestColumnWriterV1.java:
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692993#comment-17692993
]
ASF GitHub Bot commented on PARQUET-2251:
-
yabola opened a new pull request, #1033:
URL:
yabola opened a new pull request, #1033:
URL: https://github.com/apache/parquet-mr/pull/1033
In Parquet page V1, even when all pages of a column are dictionary-encoded, a
BloomFilter is still generated. Generating it is unnecessary: it costs time and
occupies storage.
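The check this PR description proposes can be sketched roughly as follows. This is a minimal illustration, not the code from PR #1033: the class `BloomFilterDecision`, the `PageEncoding` enum, and the method names are hypothetical stand-ins, not parquet-mr APIs. The sketch assumes the writer knows the set of encodings used by a column chunk's data pages and skips the Bloom filter only when every one of them is a dictionary encoding.

```java
import java.util.EnumSet;
import java.util.Set;

public class BloomFilterDecision {
    /** Hypothetical stand-in for page encodings; not the parquet-mr Encoding enum. */
    enum PageEncoding { PLAIN, PLAIN_DICTIONARY, RLE_DICTIONARY }

    /** True for the encodings this sketch treats as dictionary encodings. */
    static boolean isDictionary(PageEncoding encoding) {
        return encoding == PageEncoding.PLAIN_DICTIONARY
            || encoding == PageEncoding.RLE_DICTIONARY;
    }

    /**
     * Returns true if at least one data page is not dictionary-encoded,
     * i.e. the column chunk fell back to another encoding at some point.
     * An empty set yields false in this sketch.
     */
    static boolean shouldWriteBloomFilter(Set<PageEncoding> dataPageEncodings) {
        for (PageEncoding encoding : dataPageEncodings) {
            if (!isDictionary(encoding)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // All pages dictionary-encoded: the filter is skipped.
        System.out.println(shouldWriteBloomFilter(
            EnumSet.of(PageEncoding.PLAIN_DICTIONARY))); // prints false
        // Mixed encodings: the filter is still written.
        System.out.println(shouldWriteBloomFilter(
            EnumSet.of(PageEncoding.PLAIN_DICTIONARY, PageEncoding.PLAIN))); // prints true
    }
}
```

Returning true on the first non-dictionary page keeps the decision conservative: the filter is dropped only when the whole column chunk is known to be dictionary-encoded.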
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693000#comment-17693000
]
ASF GitHub Bot commented on PARQUET-2251:
-
yabola commented on PR #1033:
URL:
yabola commented on PR #1033:
URL: https://github.com/apache/parquet-mr/pull/1033#issuecomment-1442792170
@wgtmac @gerashegalov Please take a look, thank you~
And I will update [PR](https://github.com/apache/parquet-mr/pull/1023) to
skip bloomfilter when all pages are encoded in
[
https://issues.apache.org/jira/browse/PARQUET-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Micah Kornfield resolved PARQUET-2201.
--
Fix Version/s: cpp-11.0.0
Resolution: Fixed
Issue resolved by pull request
[
https://issues.apache.org/jira/browse/PARQUET-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Micah Kornfield reassigned PARQUET-2201:
Assignee: fatemah
> Add Stress test for RecordReader SkipRecords
>
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692970#comment-17692970
]
ASF GitHub Bot commented on PARQUET-2159:
-
wgtmac commented on code in PR #1011:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2251:
--
Description: In parquet pageV1, even all pages of one column are encoded by
dictionary (was: In parquet
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2251:
--
Summary: Avoid generating Bloomfilter when all pages of a column are
encoded by dictionary (was: Avoid
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2251:
--
Description:
In parquet pageV1, even all pages of a column are encoded by dictionary, it
will still
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692968#comment-17692968
]
ASF GitHub Bot commented on PARQUET-2159:
-
wgtmac commented on code in PR #1011:
URL:
wgtmac commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1116420531
##
pom.xml:
##
@@ -151,6 +151,9 @@
parquet-scala
parquet-thrift
parquet-hadoop-bundle
+
+http://maven.apache.org/POM/4.0.0;
+
Mars created PARQUET-2251:
-
Summary: Avoid generating Bloomfilter when all pages of one column
are encoded by dictionary
Key: PARQUET-2251
URL: https://issues.apache.org/jira/browse/PARQUET-2251
Project:
[
https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2251:
--
Description: In parquet pageV1,
> Avoid generating Bloomfilter when all pages of one column are encoded by
whcdjj commented on PR #968:
URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1442878000
Hi, I am very interested in this optimization and have some questions from
testing on a cluster with 4 nodes/96 cores using Spark 3.1. Unfortunately,
I see little improvement.
I
yabola commented on code in PR #1033:
URL: https://github.com/apache/parquet-mr/pull/1033#discussion_r1116549921
##
parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestStoreBloomFilter.java:
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692604#comment-17692604
]
ASF GitHub Bot commented on PARQUET-2159:
-
gszadovszky commented on code in PR #1011:
URL:
gszadovszky commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115429762
##
parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692616#comment-17692616
]
ASF GitHub Bot commented on PARQUET-2159:
-
jiangjiguang commented on code in PR #1011:
URL:
jiangjiguang commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115471862
##
parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692606#comment-17692606
]
ASF GitHub Bot commented on PARQUET-2159:
-
jatin-bhateja commented on code in PR #1011:
URL:
jatin-bhateja commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115408615
##
parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692614#comment-17692614
]
ASF GitHub Bot commented on PARQUET-2159:
-
jiangjiguang commented on code in PR #1011:
URL:
jiangjiguang commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115471862
##
parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692597#comment-17692597
]
ASF GitHub Bot commented on PARQUET-2159:
-
jatin-bhateja commented on code in PR #1011:
URL:
jatin-bhateja commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115435366
##
parquet-column/src/main/java/org/apache/parquet/column/values/bitpacking/ParquetReadRouter.java:
##
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692644#comment-17692644
]
ASF GitHub Bot commented on PARQUET-2159:
-
jiangjiguang commented on code in PR #1011:
URL:
jiangjiguang commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115317241
##
parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java:
##
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692642#comment-17692642
]
ASF GitHub Bot commented on PARQUET-2159:
-
jiangjiguang commented on code in PR #1011:
URL:
jiangjiguang commented on code in PR #1011:
URL: https://github.com/apache/parquet-mr/pull/1011#discussion_r1115317241
##
parquet-generator/src/main/java/org/apache/parquet/encoding/vectorbitpacking/BitPackingGenerator512Vector.java:
##
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the
[
https://issues.apache.org/jira/browse/PARQUET-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692685#comment-17692685
]
ASF GitHub Bot commented on PARQUET-2246:
-
gszadovszky merged PR #1030:
URL:
[
https://issues.apache.org/jira/browse/PARQUET-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692703#comment-17692703
]
Brais Couce commented on PARQUET-2198:
--
I see that the PR associated with this ticket was merged
[
https://issues.apache.org/jira/browse/PARQUET-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Szadovszky reassigned PARQUET-2246:
-
Assignee: Yujiang Zhong
> Add short circuit logic to column index filter
>
[
https://issues.apache.org/jira/browse/PARQUET-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Szadovszky resolved PARQUET-2246.
---
Resolution: Fixed
> Add short circuit logic to column index filter
>
gszadovszky merged PR #1030:
URL: https://github.com/apache/parquet-mr/pull/1030
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail:
[
https://issues.apache.org/jira/browse/PARQUET-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692816#comment-17692816
]
ASF GitHub Bot commented on PARQUET-831:
jianchun commented on code in PR #1022:
URL:
jianchun commented on code in PR #1022:
URL: https://github.com/apache/parquet-mr/pull/1022#discussion_r1116058380
##
parquet-column/src/test/java/org/apache/parquet/column/impl/TestColumnWriterV1.java:
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation
[
https://issues.apache.org/jira/browse/PARQUET-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692819#comment-17692819
]
ASF GitHub Bot commented on PARQUET-2159:
-
jatin-bhateja commented on code in PR #1011:
URL: