Michael Armbrust created SPARK-2443:
---------------------------------------
Summary: Reading from Partitioned Tables is Slow
Key: SPARK-2443
URL: https://issues.apache.org/jira/browse/SPARK-2443
Project: Spark
Issue Type: Bug
Components: SQL
Reporter: Michael Armbrust
Assignee: Zongheng Yang
Here are some numbers. Each query returns a count of ~20 million:
SELECT COUNT(*) FROM <non partitioned table>
5.496467726 s
SELECT COUNT(*) FROM <partitioned table stored in parquet>
50.266666947 s
SELECT COUNT(*) FROM <same table as previous, but loaded with parquetFile instead of through Hive>
2 s
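
To put the timings above in relative terms, here is a small sketch that computes the slowdown ratios. The dictionary keys and the slowdown helper are illustrative names, not anything from Spark, and the last timing was reported only as "2s", so any ratio against it is approximate.

```python
# Timings copied from the report above, in seconds.
# The keys are illustrative labels, not table names from the issue.
timings_s = {
    "non_partitioned": 5.496467726,
    "partitioned_parquet_via_hive": 50.266666947,
    "parquet_file_direct": 2.0,  # reported only as "2s", so approximate
}


def slowdown(slow_s: float, fast_s: float) -> float:
    """Return how many times longer the slow run took than the fast run."""
    return slow_s / fast_s


# Partitioned Parquet table read through Hive vs. the non-partitioned table.
vs_non_partitioned = slowdown(
    timings_s["partitioned_parquet_via_hive"],
    timings_s["non_partitioned"],
)

# Same data read through Hive vs. loaded directly with parquetFile.
vs_parquet_file = slowdown(
    timings_s["partitioned_parquet_via_hive"],
    timings_s["parquet_file_direct"],
)

print(f"vs non-partitioned table: {vs_non_partitioned:.1f}x slower")
print(f"vs parquetFile load:      {vs_parquet_file:.1f}x slower")
```

On these numbers, the partitioned read through Hive is roughly 9x slower than the non-partitioned table and roughly 25x slower than loading the same data with parquetFile.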
--
This message was sent by Atlassian JIRA
(v6.2#6252)