Hi there, I have my data stored in HDFS partitioned by month in Parquet format. The directory looks like this:
- month=201411
- month=201412
- month=201501
- ....

I want to compute some aggregates for every timestamp. How is it possible to achieve that by taking advantage of the existing partitioning?

One naive way I am thinking of is issuing multiple SQL queries:

SELECT * FROM TABLE WHERE month=201411
SELECT * FROM TABLE WHERE month=201412
SELECT * FROM TABLE WHERE month=201501
.....

computing the aggregates on the results of each query, and combining them in the end. I think there should be a better way, right?

Thanks
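
P.S. To make the naive approach concrete, here is a minimal sketch of what I mean. It uses Python's built-in sqlite3 purely as a stand-in for the real engine, so it says nothing about how the Parquet partitions would actually be read. The table name `events`, the columns `ts` and `value`, and the sample rows are all made up for illustration, and "aggregates" here just means a sum and a count per timestamp. The last function is the kind of single query I am hoping could replace the loop.

```python
import sqlite3

# Hypothetical sample rows: (month, ts, value). In the real setup these
# would be the Parquet rows under the month=... directories.
ROWS = [
    (201411, "2014-11-03 10:00", 2.0),
    (201411, "2014-11-03 10:00", 3.0),
    (201412, "2014-12-01 09:00", 1.5),
    (201501, "2015-01-15 12:00", 4.0),
    (201501, "2015-01-15 12:00", 6.0),
]


def make_connection():
    """Build an in-memory table standing in for the partitioned data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (month INTEGER, ts TEXT, value REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", ROWS)
    return conn


def naive_per_month(conn, months):
    """One query per month; aggregate each result and merge as we go."""
    combined = {}
    for month in months:
        rows = conn.execute("SELECT * FROM events WHERE month = ?", (month,))
        for _month, ts, value in rows:
            total, count = combined.get(ts, (0.0, 0))
            combined[ts] = (total + value, count + 1)
    return combined


def single_query(conn):
    """The same sum and count per timestamp, expressed as one GROUP BY."""
    rows = conn.execute(
        "SELECT ts, SUM(value), COUNT(*) FROM events GROUP BY ts"
    )
    return {ts: (total, count) for ts, total, count in rows}


conn = make_connection()
naive = naive_per_month(conn, [201411, 201412, 201501])
grouped = single_query(conn)
print(naive == grouped)
```

On this toy data both approaches give the same per-timestamp result. What I do not know is whether the single-query form would still make use of the month partitioning on the real data.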