[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-07-26 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 Thanks @mallman for making this huge contribution. Three years is a really long time to stay patient while seeing something through. I am attaching the sample parquet file for your reference with which you

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-13 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 > @jainaks What is the value of the Spark SQL configuration setting spark.sql.caseSensitive when you run this query? Also, are you querying the parquet file as part of a Hive metastore ta

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 Hi @mallman, I found another major issue after applying this fix. Schema: a: struct (nullable = true) ||-- b: struct (nullable = true) |||-- c1: string (nullable

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 @mallman It does work fine with "name.First".

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-08 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 Hi @mallman, Thanks for this PR. It has a huge impact on performance when querying a nested parquet schema. I had used the original PR#16578 and found an issue: it does not work well when

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-06-08 Thread jainaks
Github user jainaks commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r194049288 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -0,0 +1,153