Select distinct on partitioned column requires reading all the files?

Stephen Boesch Mon, 23 Feb 2015 22:28:33 -0800

When querying a Hive table on a partitioning column, it would seem
logical that a simple


select count(distinct partitioned_column_name) from my_partitioned_table

would complete almost instantaneously.

But we are seeing that neither Hive nor Impala executes this query
efficiently: both read the entire table!

What do we need to do to ensure the above command executes rapidly?
