Benefits of parquet partitioning for non-restrictive, aggregate queries?

Edmon Begoli Sat, 21 Nov 2015 20:06:07 -0800

Hey guys,

Are there any benefits to generic partitioning for non-restrictive count(*)
queries with Drill, when the Parquet files are partitioned on some base
criteria (by state, month, etc.)?


Let's say I am running:

select count(*) from dfs.tmp.`claims_parquet`;

where I have both a plain and a partitioned version of claims_parquet.

For example, is there maybe a scatter-gather parallelisation?
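
As a rough illustration of what scatter-gather would mean for a count(*),
here is a toy sketch in plain Python. It is not a description of how Drill
actually plans or executes the query; the partition names and row counts
are made up, and each list just stands in for the rows of one partition.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: each list stands in for the rows of one
# partition directory/file of claims_parquet (names and sizes are made up).
partitions = {
    "state=TN": ["claim"] * 3,
    "state=CA": ["claim"] * 5,
    "state=NY": ["claim"] * 2,
}

def count_rows(rows):
    # Scatter step: each worker counts only the rows of its own partition.
    return len(rows)

def scatter_gather_count(parts):
    # Fan the per-partition counts out to a pool of workers ...
    with ThreadPoolExecutor() as pool:
        partial_counts = list(pool.map(count_rows, parts.values()))
    # ... then gather: the final count(*) is the sum of the partial counts.
    return sum(partial_counts)

total = scatter_gather_count(partitions)
print(total)
```

The open question is whether partitioning gives the engine more independent
units to count in parallel than a plain, unpartitioned table would.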

(We are about to benchmark this, but I would also like to know the theory
behind it.)

Thank you,
Edmon
