Hello, Michael!

> So, here is attached a counter-proposal, where we can simply add a
> counter tracking a node count in _jumbleNode() to add more entropy to
> the mix, incrementing it as well for NULL nodes.
It definitely looks like a more reliable solution than my variant, which
only counts NULL nodes. However, we already knew about the overhead of
adding `\0` bytes for every NULL field:

> So that adds about 9.1% overhead to jumbling, on average.

See:
https://www.postgresql.org/message-id/flat/5ac172e0b77a4baba50671cd1a15285f%40localhost.localdomain#6c43f354f5f42d2a27e6824faa660a86

Is it really worth spending extra execution time to increase entropy when
we already have non-NULL nodes? Maybe we should instead choose to add
node_count to the hash only when we visit NULL nodes, or only when we
visit non-NULL nodes (see the alternatives below). We could also add
entropy when we see a change in the node->type value across non-NULL
nodes.

Your Variant
------------
< node_count = 1 >
< node 1 >
< node_count = 2 >
/* node 2 = NULL */
< node_count = 3 >
< node 3 >

Alternative 1 (mark only NULL nodes)
------------------------------------
/* node_count = 1 */
< node 1 >
< node_count = 2 >
/* node 2 = NULL */
/* node_count = 3 */
< node 3 >

Alternative 2 (mark only non-NULL nodes)
----------------------------------------
This could address concerns about visiting nodes with the same content
placed in different branches of the query tree.

< node_count = 1 >
< node 1 >
/* node_count = 2 */
/* node 2 = NULL */
< node_count = 3 >
< node 3 >
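
To make Alternative 2 a bit more concrete, here is a rough sketch of how
it could sit at the top of _jumbleNode(). It assumes a new node_count
field in JumbleState (the name is just taken from the diagrams above) and
reuses the AppendJumble() helper from queryjumblefuncs.c; the exact shape
of that helper may differ in your patched tree, so please read this as an
illustration rather than a patch:

static void
_jumbleNode(JumbleState *jstate, Node *node)
{
	if (node == NULL)
	{
		/*
		 * Alternative 2: do not touch node_count for NULL nodes, so the
		 * common NULL fields cost nothing extra.  (Your variant would
		 * increment it here as well; Alternative 1 would increment and
		 * jumble it only here.)
		 */

		/* ... existing handling of NULL fields stays as-is ... */
		return;
	}

	/*
	 * Mix the running node count into the jumble for each non-NULL node,
	 * so identical nodes placed in different branches of the query tree
	 * still produce different jumbles.
	 */
	jstate->node_count++;
	AppendJumble(jstate, (const unsigned char *) &jstate->node_count,
				 sizeof(jstate->node_count));

	/* ... existing per-node jumbling (switch on nodeTag(node)) ... */
}

With that shape the extra work is confined to nodes we already jumble
anyway, which is the main point of Alternative 2.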