Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
Thanks @mallman for making this huge contribution. Three years is a really long
time to stay patient while bringing this to a conclusion.
I am attaching the sample parquet file for your reference with which you
Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
> @jainaks What is the value of the Spark SQL configuration setting
spark.sql.caseSensitive when you run this query? Also, are you querying the
parquet file as part of a Hive metastore ta
Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
Hi @mallman ,
I found another major issue after applying this fix.
Schema:
a: struct (nullable = true)
|    |-- b: struct (nullable = true)
|    |    |-- c1: string (nullable
Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
@mallman It does work fine with "name.First".
Github user jainaks commented on the issue:
https://github.com/apache/spark/pull/21320
Hi @mallman, thanks for this PR. It has a huge impact on performance when
querying a nested Parquet schema. I used the original PR #16578 and found
an issue: it does not work well when
Github user jainaks commented on a diff in the pull request:
https://github.com/apache/spark/pull/21320#discussion_r194049288
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala
---
@@ -0,0 +1,153