Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/20265
There might be many questions about ORC (or Parquet) performance
benchmarks. We can address those later. We cannot enumerate every case, and
users can run benchmarks for their own workloads. In fact, Apache Spark didn't
show this kind of benchmark when it turned on PPD for Parquet. If there were a
benchmark for Parquet, this PR would be a piece of cake.
I think this PR is enough to show the benefit of ORC PPD and to justify
setting the config to true.