Hi all, I ran a simple experiment with Spark SQL. I created a partitioned Parquet table with only one partition (date=20140701). A simple `select count(*) from table where date=20140701` ran very fast (0.1 seconds). However, as I added more partitions, the same query took longer and longer. By the time I had added about 10,000 partitions, it took far too long. I would expect a query on a single partition not to be affected by how many other partitions the table has. Is this a known behaviour? What is Spark trying to do here?
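In case it helps anyone reproduce this, here is a minimal sketch of how the extra partitions could be generated. The table name `events`, the base path `/tmp/events`, and the helper `partition_ddl` are placeholders for illustration, not the exact names from my setup. The script only builds Hive-style `ALTER TABLE ... ADD PARTITION` statements, one per consecutive day; it does not call Spark itself.

```python
from datetime import date, timedelta


def partition_ddl(table, base_path, start, count):
    """Build one ADD PARTITION statement per consecutive day.

    `table` and `base_path` are hypothetical placeholders; the partition
    column is `date`, formatted as YYYYMMDD to match date=20140701.
    """
    statements = []
    for offset in range(count):
        day = (start + timedelta(days=offset)).strftime("%Y%m%d")
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION (date={day}) "
            f"LOCATION '{base_path}/date={day}'"
        )
    return statements


# Roughly the scale described above: about 10,000 partitions,
# starting from the original date=20140701 partition.
ddl = partition_ddl("events", "/tmp/events", date(2014, 7, 1), 10000)
print(ddl[0])
print(len(ddl))
```

Each statement could then be submitted through whatever SQL entry point is in use, timing the same `select count(*) ... where date=20140701` query after every batch to see how the latency grows with the partition count.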
Thanks, Jerrick