Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/24159 )

Change subject: IMPALA-14863: Fixes Potential OOM During Row Batch Processing
......................................................................


Patch Set 14:

(1 comment)

The perf-AB-test doesn't show any performance difference. Is there a way to have an automated test for a problematic SQL?

http://gerrit.cloudera.org:8080/#/c/24159/14/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/24159/14/be/src/exec/partitioned-hash-join-node-ir.cc@60
PS14, Line 60:       ClearExprResultsPool();
Things push in different directions:

On the one hand, finding a single-row match in the build side hash table is very common. There are lots of joins that have a FK/PK relationship where every probe row matches a single build row. Having a per-row check is a bit overkill for what we are doing. Fewer SQL queries have duplicates on the build side. Skipping calling this on the single-row case would help. If we kept a counter for the for loop, we could call it every few rows.

On the other hand, maybe with codegen we can detect that nothing in the loop will impact expr_results_pool_ (e.g. if there are no other conjuncts) and optimize this away for a lot of SQLs.


-- 
To view, visit http://gerrit.cloudera.org:8080/24159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic54b5c39e1388681275681f22e61b27728dba5af
Gerrit-Change-Number: 24159
Gerrit-PatchSet: 14
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Comment-Date: Wed, 27 May 2026 00:26:40 +0000
Gerrit-HasComments: Yes
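
For reference, the counter idea from the comment on line 60 could look roughly like the sketch below. This is a standalone illustration, not the code in partitioned-hash-join-node-ir.cc: `FakeJoinNode`, `EvalConjuncts()`, `ProcessMatches()`, the 16-byte allocation, and `kClearInterval = 64` are all assumptions made up for the example. Only the name `ClearExprResultsPool()` comes from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the join node. It only models how much memory
// the expression results pool is holding at any point in time.
struct FakeJoinNode {
  std::size_t pool_bytes = 0;  // bytes currently held by the (modelled) pool
  std::size_t peak_bytes = 0;  // high-water mark of pool_bytes
  int clear_calls = 0;         // how often the pool was cleared

  // Assumed behaviour: each conjunct evaluation allocates 16 bytes.
  void EvalConjuncts() {
    pool_bytes += 16;
    if (pool_bytes > peak_bytes) peak_bytes = pool_bytes;
  }

  void ClearExprResultsPool() {
    pool_bytes = 0;
    ++clear_calls;
  }

  // Walks the duplicate build rows that match one probe row. The pool is
  // cleared every kClearInterval iterations instead of on every row, so the
  // common single-match (FK/PK) case never pays for a clear.
  void ProcessMatches(int num_matches) {
    assert(num_matches >= 0);
    constexpr int kClearInterval = 64;  // assumed value, not from the patch
    int since_clear = 0;
    for (int i = 0; i < num_matches; ++i) {
      EvalConjuncts();
      if (++since_clear == kClearInterval) {
        ClearExprResultsPool();
        since_clear = 0;
      }
    }
  }
};
```

With this shape, a probe row with one match never triggers a clear, while a probe row with many duplicate matches holds at most `kClearInterval` rows' worth of allocations at a time. Whether the interval should be a fixed row count or a byte threshold is a separate question.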
