I've been testing TPC-DS queries and found that I can get Randomly
Distributed tables to outperform Hash Distributed tables by increasing
hawq_rm_nvseg_perquery_perseg_limit on a per-query basis to as high as 24.

For Hash Distributed tables, 24 is way too high.  It is also not a great
idea to make the default that high, since users may be creating a mix of
Randomly and Hash Distributed tables.

Would it be possible to split this one GUC into two, so that you can
leave it at 6 for Hash Distributed tables but use another value, like 16,
for Randomly Distributed tables?
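
As a rough illustration of the proposed split, here is a minimal Python
sketch. It is not HAWQ code: the two GUC names and the helper function are
made up for the example, and the values 6 and 16 are just the ones
suggested above.

```python
# Hypothetical sketch of the proposed split. Neither GUC name below exists
# in HAWQ today; they are placeholders for the two suggested settings.
HASH_LIMIT_GUC = "hawq_rm_nvseg_perquery_perseg_limit_hash"      # hypothetical
RANDOM_LIMIT_GUC = "hawq_rm_nvseg_perquery_perseg_limit_random"  # hypothetical

# Example values from this message: 6 for hash, 16 for random.
DEFAULT_SETTINGS = {HASH_LIMIT_GUC: 6, RANDOM_LIMIT_GUC: 16}


def vseg_limit_per_segment(distribution, settings=DEFAULT_SETTINGS):
    """Return the per-segment vseg limit for a table's distribution policy."""
    if distribution == "hash":
        return settings[HASH_LIMIT_GUC]
    if distribution == "random":
        return settings[RANDOM_LIMIT_GUC]
    raise ValueError("unknown distribution policy: %r" % (distribution,))
```

With separate settings, raising the limit for Randomly Distributed tables
would leave the limit for Hash Distributed tables untouched.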

This enhancement would also make it possible for later improvements in the
optimizer to determine how many vsegs to use.  For example, some queries
worked best with the limit set to 12, while others benefited greatly when
it was set to 24.


Jon Roberts
