Hi All,

We have a distributed Impala cluster with 6 Impala executors and a Kudu table with 3x replication. The table is hash partitioned on 3 columns (say HC1, HC2, HC3), with 2 hash partitions per column. It is also range partitioned on another column (RC1). Ex: [image: image.png]
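
To make the layout concrete, here is a rough Python sketch of how the hash partitioning can be pictured. It assumes each of the three columns is its own hash level with 2 buckets, so every range partition of RC1 is split into 2 x 2 x 2 = 8 hash combinations. It also assumes that a hash level drops to a single bucket when a query fixes its column with an equality condition. The names `HASH_BUCKETS`, `hash_bucket_combinations` and `surviving_combinations` are made up for illustration and are not part of any Impala or Kudu API.

```python
from itertools import product

# Assumed layout: three independent hash levels, 2 buckets each.
HASH_BUCKETS = {"HC1": 2, "HC2": 2, "HC3": 2}


def hash_bucket_combinations(buckets):
    """List every combination of hash buckets within one range partition."""
    names = list(buckets)
    ranges = (range(buckets[name]) for name in names)
    return [dict(zip(names, combo)) for combo in product(*ranges)]


def surviving_combinations(buckets, pinned):
    """Count the combinations left when the columns in `pinned` are each
    fixed to one value, so their hash level contributes a single bucket."""
    remaining = 1
    for name, count in buckets.items():
        if name not in pinned:
            remaining *= count
    return remaining


if __name__ == "__main__":
    print(len(hash_bucket_combinations(HASH_BUCKETS)))
    print(surviving_combinations(HASH_BUCKETS, {"HC1", "HC3"}))
```

Under these assumptions, fixing HC1 and HC3 leaves only the 2 buckets of HC2 in each range partition.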
We are executing the following SQL on this table. [image: image.png]

When checking the query profile for this SQL in the Impala web UI, we noticed that only two executors are used to execute it.

1. Why doesn't Impala use all 6 executors for execution? Since we have given conditions for HC1 and HC3 in the WHERE clause, does Impala use only two executors to scan the two partitions of HC2? Is there a way to distribute the query across all 6 executors?

2. When executing multiple queries of this type in parallel, we noticed that all of them use the same two executors, causing high memory usage. To avoid this, I set the 'replica_preference' query option to 'remote', and the queries were then dispatched to different pairs of executors. However, this introduced a small delay in execution time. Is there a way to avoid the high memory usage without increasing the execution time?

Thank you.

Best Regards,
Vibhath.