Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24159 )

Change subject: IMPALA-14863: Fixes Potential OOM During Row Batch Processing
......................................................................


Patch Set 14:

(1 comment)

The perf-AB-test doesn't show any performance difference.

Is there a way to add an automated test that runs a problematic SQL query?

http://gerrit.cloudera.org:8080/#/c/24159/14/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/24159/14/be/src/exec/partitioned-hash-join-node-ir.cc@60
PS14, Line 60:     ClearExprResultsPool();
Things push in different directions:

On the one hand, finding a single-row match in the build-side hash table is 
very common. Many joins have a FK/PK relationship where every probe row matches 
a single build row, and fewer queries have duplicates on the build side. A 
per-row call is overkill for that common case, so skipping it when there is 
only a single match would help. If we kept a counter in the for loop, we could 
also call it only every few rows.
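
To make the counter idea concrete, here is a rough standalone sketch. It is not 
the code in partitioned-hash-join-node-ir.cc: FakeResultsPool, ProcessProbeRows 
and clear_interval are hypothetical names, and the pool is a toy stand-in for 
expr_results_pool_. It only shows the trade-off between the number of clear 
calls and the peak memory held between clears.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the expression results pool (not Impala's MemPool).
struct FakeResultsPool {
  int64_t bytes_in_use = 0;
  int clear_calls = 0;
  void Allocate(int64_t n) { bytes_in_use += n; }
  void Clear() {
    bytes_in_use = 0;
    ++clear_calls;
  }
};

// Walks the probe rows; matches_per_probe_row[i] is the number of build-side
// matches for probe row i. Each match allocates bytes_per_match from the pool,
// as evaluating other join conjuncts might. The pool is cleared once every
// clear_interval matches instead of after every match.
// Returns the peak number of bytes the pool held at any point.
inline int64_t ProcessProbeRows(const std::vector<int>& matches_per_probe_row,
    int64_t bytes_per_match, int clear_interval, FakeResultsPool* pool) {
  assert(clear_interval > 0);
  int64_t peak = 0;
  int since_clear = 0;
  for (int num_matches : matches_per_probe_row) {
    for (int i = 0; i < num_matches; ++i) {
      pool->Allocate(bytes_per_match);
      if (pool->bytes_in_use > peak) peak = pool->bytes_in_use;
      if (++since_clear >= clear_interval) {
        pool->Clear();
        since_clear = 0;
      }
    }
  }
  // Release whatever accumulated since the last periodic clear.
  if (since_clear > 0) pool->Clear();
  return peak;
}
```

With clear_interval = 1 this degenerates to the per-row call; a larger interval 
trades fewer calls for a bounded amount of extra memory held between clears.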

On the other hand, maybe codegen can detect that nothing in the loop touches 
expr_results_pool_ (e.g. if there are no other conjuncts) and optimize the 
call away for a lot of queries.
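
As a loose analogy for the codegen idea, the sketch below uses a C++ template 
parameter in place of a constant that codegen would substitute. CountingPool, 
ProbeLoop and kHasOtherConjuncts are hypothetical names, and this is not how 
Impala's codegen is actually wired up; it only shows that when the flag is a 
compile-time false, the clear call becomes dead code in that specialization.

```cpp
#include <cassert>

// Hypothetical pool that only counts how often it is cleared.
struct CountingPool {
  int clear_calls = 0;
  void Clear() { ++clear_calls; }
};

// kHasOtherConjuncts plays the role of a value known at codegen time. When it
// is false, the branch below is a compile-time constant and the compiler can
// drop the Clear() call from the loop entirely.
template <bool kHasOtherConjuncts>
int ProbeLoop(int num_matches, CountingPool* pool) {
  assert(pool != nullptr);
  int rows_returned = 0;
  for (int i = 0; i < num_matches; ++i) {
    ++rows_returned;  // Stand-in for emitting the joined row.
    if (kHasOtherConjuncts) pool->Clear();
  }
  return rows_returned;
}
```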



--
To view, visit http://gerrit.cloudera.org:8080/24159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic54b5c39e1388681275681f22e61b27728dba5af
Gerrit-Change-Number: 24159
Gerrit-PatchSet: 14
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Comment-Date: Wed, 27 May 2026 00:26:40 +0000
Gerrit-HasComments: Yes
