Hi, I am trying to load aggregate data from one massive table containing one year of historical data. The historical table is partitioned, but the number of files is huge (>100) and they are gz-compressed.
I am trying to load it using the Tez execution engine. Can someone suggest some quick optimisation parameters to fine-tune the query performance? Something similar to HDFS block size, minimum input split size, map-side aggregation, etc. Thanks, Saurabh
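For reference, these are the kinds of session-level settings I have been experimenting with so far; the values below are placeholders I picked for testing, not recommendations:

```sql
-- Run on the Tez execution engine
SET hive.execution.engine=tez;

-- Control how many input files are grouped into each Tez task
-- (placeholder values: 16 MB min, 1 GB max)
SET tez.grouping.min-size=16777216;
SET tez.grouping.max-size=1073741824;

-- Enable map-side (hash) aggregation
SET hive.map.aggr=true;

-- Allow independent stages of the query to run in parallel
SET hive.exec.parallel=true;
```

Are these the right knobs for this workload, and are there others I should be looking at given that the files are gzip-compressed?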