GitHub user omalley commented on the issue:

https://github.com/apache/orc/pull/189

Here is a spreadsheet that is sorted by data set and then by read+hdfs time. The read+hdfs time assumes 15mb/sec from HDFS.

https://docs.google.com/spreadsheets/d/1bE1j-AaUY7Xq_uh1nqX1Jf7qXvfL1crp2Y8Hl2M4Y-E/edit?usp=sharing
---