Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65303008
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -578,62 +583,6 @@ private[sql]
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/13371
It is a good idea to add it if parquet supports it (I have an impression
that parquet does not support it. But maybe I am wrong). I think having
benchmark results is a good practice, so we can
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65302925
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -344,6 +344,11 @@ private[sql] class
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65302899
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -344,6 +344,11 @@ private[sql] class
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/13371
BTW, I can't see any reason not to add a row-group level filter for parquet.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/13371
@yhuai As you can see, this is not to fix a bug/problem. So I think it
might be hard to provide a test case for it. I will try to do the benchmark.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65301812
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -578,62 +583,6 @@ private[sql] object
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/13371
Can you provide a test case that shows the problem? Also, can you provide
benchmark results of the performance benefit?
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65301654
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -344,6 +344,11 @@ private[sql] class
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/13371#discussion_r65301661
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -578,62 +583,6 @@ private[sql] object
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-222408752
also cc @yhuai
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-222408282
cc @nongli @liancheng
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-93505
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-93503
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-93479
**[Test build #59550 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59550/consoleFull)**
for PR 13371 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-91012
**[Test build #59550 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59550/consoleFull)**
for PR 13371 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-90971
retest this please.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-90965
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-90964
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-90957
**[Test build #59549 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59549/consoleFull)**
for PR 13371 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13371#issuecomment-90740
**[Test build #59549 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59549/consoleFull)**
for PR 13371 at commit
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/13371
[SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet
reader
## What changes were proposed in this pull request?
When we use the vectorized parquet reader, although the