Hi Dev,

Happy New Year. If you run a hash join or a hash group-by on a large
amount of data on the current master, you may see slower performance
than with the binary you have been using. This is because the hash
table size now counts against the compiler.joinmemory and
compiler.groupmemory settings. Previously, the hash table size was not
accounted for, so the whole budget was available for the data itself.
Now that both the data and its hash table consume the budget, you may
see more data spilling to disk (temporary files being created) during
the operation. This is the correct, expected behavior, so please don't
be surprised. To maintain the previous performance, please assign a
larger budget to these settings.
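
To make the effect concrete, here is a rough toy model in Python. It is
not AsterixDB code: the function names, the 32 MB budget, and the
per-tuple and per-entry byte sizes are all made-up assumptions, chosen
only to show why the same budget holds fewer tuples once the hash table
is counted against it.

```python
def tuples_in_memory(budget_bytes, tuple_bytes, hash_entry_bytes,
                     count_hash_table):
    """How many tuples fit in the budget before spilling would start."""
    # When the hash table is counted, each tuple also pays for its entry.
    per_tuple = tuple_bytes + (hash_entry_bytes if count_hash_table else 0)
    return budget_bytes // per_tuple


def budget_to_hold(num_tuples, tuple_bytes, hash_entry_bytes):
    """Budget needed to keep num_tuples in memory with the table counted."""
    return num_tuples * (tuple_bytes + hash_entry_bytes)


# Hypothetical numbers, for illustration only.
BUDGET = 32 * 1024 * 1024   # 32 MB budget
TUPLE_BYTES = 96            # assumed size of one tuple
HASH_ENTRY_BYTES = 32       # assumed hash table overhead per tuple

before = tuples_in_memory(BUDGET, TUPLE_BYTES, HASH_ENTRY_BYTES, False)
after = tuples_in_memory(BUDGET, TUPLE_BYTES, HASH_ENTRY_BYTES, True)
needed = budget_to_hold(before, TUPLE_BYTES, HASH_ENTRY_BYTES)

print("tuples in memory, hash table not counted:", before)
print("tuples in memory, hash table counted:    ", after)
print("budget to keep the old capacity (bytes): ", needed)
```

With these assumed sizes, fewer tuples stay in memory under the same
budget, so the rest would spill. The real ratio depends on your data and
on the actual hash table layout.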

Best,
Taewoo
