When querying a Hive table on its partition column, one would expect a simple
select count(distinct partitioned_column_name) from my_partitioned_table
to complete almost instantly, since the set of partition values is already recorded in the metastore. But we are seeing that neither Hive nor Impala executes this query efficiently: both read the entire table. What do we need to do to make this query run quickly?
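
For concreteness, here is a minimal sketch of the kind of setup in question. The non-partition columns (id, payload), their types, and the Parquet storage format are assumptions made up for illustration; only the table name and the partition column name come from the query above. The last statement is included for comparison, since it lists partitions from metadata without reading data files.

```sql
-- Hypothetical table: id, payload, the column types and the storage format
-- are assumptions for illustration only.
CREATE TABLE my_partitioned_table (
  id      BIGINT,
  payload STRING
)
PARTITIONED BY (partitioned_column_name STRING)
STORED AS PARQUET;

-- The query in question: it touches only the partition column.
SELECT COUNT(DISTINCT partitioned_column_name)
FROM my_partitioned_table;

-- For comparison: a metadata-only listing of the table's partitions.
SHOW PARTITIONS my_partitioned_table;
```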