> On Aug 16, 2025, at 00:52, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Chao Li <li.evan.c...@gmail.com> writes: >> With this patch, “isnull” now becomes true because of the change of strict >> op. Then the outer null join key tuple must be stored in a tuplestore. When >> an outer table contains a lot of null join key tuples, then the tuplestore >> could bump to very large, in that case, it would be hard to say this patch >> really benefits. > > What's your point? If we don't divert those tuples into the > tuplestore, then they will end up in the main hash table instead, > and the consequences of bloat there are far worse.
I might not state clearly. For this comments, I meant the outer table. For example: SELECT a.*, b.* from a RIGHT JOIN b on a.id = b.a_id; Let’s say table a is used to build hash, table b is the outer table. And say, table b has 1000 tuples whose a_id are NULL. Before this patch, when fetching such a tuple (a_id is null) from table b, the tuple will be returned to parent node immediately. With this tuple, all of such tuples will be put into hj_NullOuterTupleStore, and only be returned after all non-null tuples are processed. My comment was trying to say that if there are a lot of null join key tuples in outer table, then hj_NullOuterTupleStore might use a lot of memory or swap data to disk, which might lead to performance burden. So, I was thinking we could keep the original logic for outer table, and return null join key tuples immediately. > >> Based on this patch, if we are doing a left join, and outer table is empty, >> then all tuples from the inner table should be returned. In that case, we >> can skip building a hash table, instead, we can put all inner table tuples >> into hashtable.innerNullTupleStore. Building a tuplestore should be cheaper >> than building a hash table, so this way makes a little bit more performance >> improvement. > > I think that would make the logic completely unintelligible. Also, > a totally-empty input relation is not a common situation. We try to > optimize such cases when it's simple to do so, but we shouldn't let > that drive the fundamental design. > I absolutely agree we should not touch the fundamental design for the tiny optimization, that’s why I mentioned “based on this patch”. With this patch, you have introduced a change in MultiExecPrivateHash(): else if (node->keep_null_tuples) { /* null join key, but we must save tuple to be emitted later */ if (node->null_tuple_store == NULL) node->null_tuple_store = ExecHashBuildNullTupleStore(hashtable); tuplestore_puttupleslot(node->null_tuple_store, slot); } We can simply added a new flag to HashTable, say named skip_building_hash. Upon right join (join to the hash side), and outer table is empty, set the flag to true, then in the MultiExecPrivateHash(), if skip_building_hash is true, directly put all tuples into node->null_tuple_store without building a hash table. Then in ExecHashJoinImpl(), after "(void) MultiExecProcNode()" is called, if hashtable->skip_building_hash is true, directly set node->hj_JoinState = HJ_FILL_INNER_NULL_TUPLES. So, the tiny optimization is totally based on this patch, it depends on the HashTable.null_tuple_store (if you take this comment, then maybe rename this variable) and the new state HJ_FILL_INNER_NULL_TUPLES. Best regards, == Chao Li (Evan) -------------------- HighGo Software Co., Ltd. https://www.highgo.com/