Github user iddoav commented on the issue:
https://github.com/apache/spark/pull/21070
Our R&D team at SimilarWeb is having a hard time with PARQUET-686, and
merging this PR would help us a lot. Note that, unlike Spark 2.1+ readers,
which have read-time mitigations (SPARK-17213 et al.), other systems such as
CDH 5.x's Spark and AWS Athena (probably Presto as well) do predicate
pushdown on Spark 2.3 Parquet output and return wrong answers when string
columns are involved.
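
For anyone unfamiliar with the failure mode, here is a minimal Python sketch
of how I understand it: min/max statistics computed with signed byte
comparison, then read by an engine that compares bytes as unsigned. The
helper names (`signed_key`, `unsigned_key`, `min_max`, `may_contain`) are
made up for illustration and are not Parquet or Spark APIs; the real readers
may differ in detail.

```python
def signed_key(s):
    # Assumption: legacy ordering, each UTF-8 byte treated as a signed 8-bit value.
    return [b - 256 if b >= 128 else b for b in s.encode("utf-8")]

def unsigned_key(s):
    # Ordering a reader would expect for UTF-8 strings: bytes compared as unsigned.
    return list(s.encode("utf-8"))

def min_max(values, key):
    # Row-group statistics as a (min, max) pair under the given ordering.
    return min(values, key=key), max(values, key=key)

def may_contain(value, stats, key):
    # Pushdown check: the row group is kept only if value lies within [min, max].
    lo, hi = stats
    return key(lo) <= key(value) <= key(hi)

values = ["a", "z", "é"]                      # "é" is 0xC3 0xA9 in UTF-8
writer_stats = min_max(values, signed_key)    # ('é', 'z') under signed order
reader_stats = min_max(values, unsigned_key)  # ('a', 'é') under unsigned order

# A reader applying unsigned comparison to the signed stats skips the row
# group for "a", even though "a" is in it.
print(may_contain("a", writer_stats, unsigned_key))  # False
print(may_contain("a", reader_stats, unsigned_key))  # True
```

With pure ASCII data the two orderings agree, so the problem only shows up
once a string column contains bytes >= 0x80.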
@gatorsmile @rdblue
---