Hi team, I have gzip-compressed Parquet files created with Apache Drill. The folder is 7.8 GB in total and contains 116,249,263 rows. The query takes 2 min 18 sec, and most of that time is spent in "PARQUET_ROW_GROUP_SCAN". Is there any way to improve this performance? I am using Drill 1.20 on a machine with 8 CPU cores and 16 GB of memory.
I also tried increasing the memory to 32 GB, but it made little difference. I also tried some of the recommendations in the Drill documentation, without success. Any pointers or help would be highly appreciated. Thanks and regards, Prabhakar