Hello, Michael!

> So, here is attached a counter-proposal, where we can simply add a
> counter tracking a node count in _jumbleNode() to add more entropy to
> the mix, incrementing it as well for NULL nodes.
It definitely looks like a more reliable solution than my variant, which
only counts NULL nodes. However, we already knew about the overhead of
adding `\0` bytes for every NULL field:

> So that adds about 9.1% overhead to jumbling, on average.

See:
https://www.postgresql.org/message-id/flat/5ac172e0b77a4baba50671cd1a15285f%40localhost.localdomain#6c43f354f5f42d2a27e6824faa660a86

Is it really worth spending extra execution time to increase entropy when
we already have non-NULL nodes? Maybe we should instead choose to add
node_count to the hash only when we visit NULL nodes, or only when we
visit non-NULL nodes (see the alternatives below). We could also add
entropy when we see a change in the node->type value across non-NULL
nodes.

Your Variant
------------
< node_count = 1 >
< node 1 >
< node_count = 2 >
/* node 2 = NULL */
< node_count = 3 >
< node 3 >

Alternative 1 (mark only NULL nodes)
------------------------------------
/* node_count = 1 */
< node 1 >
< node_count = 2 >
/* node 2 = NULL */
/* node_count = 3 */
< node 3 >

Alternative 2 (mark only non-NULL nodes)
----------------------------------------
This could address concerns about visiting nodes with the same content
placed in different branches of the query tree.

< node_count = 1 >
< node 1 >
/* node_count = 2 */
/* node 2 = NULL */
< node_count = 3 >
< node 3 >
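
To make Alternative 2 a bit more concrete, here is a rough sketch of how
it could sit at the top of _jumbleNode(). It assumes a new node_count
field in JumbleState (the name is just taken from the diagrams above) and
reuses the AppendJumble() helper from queryjumblefuncs.c; the exact shape
of that helper may differ in your patched tree, so please read this as an
illustration rather than a patch:

static void
_jumbleNode(JumbleState *jstate, Node *node)
{
	if (node == NULL)
	{
		/*
		 * Alternative 2: do not touch node_count for NULL nodes, so the
		 * common NULL fields cost nothing extra.  (Your variant would
		 * increment it here as well; Alternative 1 would increment and
		 * jumble it only here.)
		 */

		/* ... existing handling of NULL fields stays as-is ... */
		return;
	}

	/*
	 * Mix the running node count into the jumble for each non-NULL node,
	 * so identical nodes placed in different branches of the query tree
	 * still produce different jumbles.
	 */
	jstate->node_count++;
	AppendJumble(jstate, (const unsigned char *) &jstate->node_count,
				 sizeof(jstate->node_count));

	/* ... existing per-node jumbling (switch on nodeTag(node)) ... */
}

With that shape the extra work is confined to nodes we already jumble
anyway, which is the main point of Alternative 2.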