On Tue, 11 Mar 2025 at 20:48, David Rowley <dgrowle...@gmail.com> wrote: > Since then, I see that Ivan > has already submitted a patch that accounts for NULL nodes and adds a > byte to the jumble buffer to account for NULLs. This seems quite clean > and simple. However, Sami seems to have concerns about the overhead of > doing this. Is that warranted at all? Potentially, there could be a > decent number of NULL fields. It'll probably be much cheaper than the > offsetof idea I came up with.
To try and get some timing information about the overhead added for jumbling the NULLs, I compared master and the patch at [1] using the attached jumbletime.patch.txt. The patch just adds an additional call to JumbleQuery and times how long it takes and outputs the result in nanoseconds. Running make check has the advantage of trying out many different queries. On an AMD 3990x machine, the results for jumbling all make check queries are: master: 73.66 milliseconds 0001-Query-ID-Calculation-Fix-Variant-B.patch: 80.36 milliseconds So that adds about 9.1% overhead to jumbling, on average. See the attached jumble_results.txt for full results and the code to extract the results. David [1] https://www.postgresql.org/message-id/attachment/173439/0001-Query-ID-Calculation-Fix-Variant-B.patch
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 55ab2da299b..cc03458fcba 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -873,6 +873,19 @@ pg_rewrite_query(Query *query) return querytree_list; } +#define NANOSEC_PER_SEC 1000000000 + +#include <time.h> + +/* Returns difference between t1 and t2 in nanoseconds */ +static int64 +get_clock_diff(struct timespec *t1, struct timespec *t2) +{ + int64 nanosec = (t1->tv_sec - t2->tv_sec) * NANOSEC_PER_SEC; + nanosec += (t1->tv_nsec - t2->tv_nsec); + + return nanosec; +} /* * Generate a plan for a single already-rewritten query. @@ -883,6 +896,8 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions, ParamListInfo boundParams) { PlannedStmt *plan; + struct timespec start, end; + int64 jumbletime; /* Utility commands have no plans. */ if (querytree->commandType == CMD_UTILITY) @@ -896,6 +911,12 @@ pg_plan_query(Query *querytree, const char *query_string, int cursorOptions, if (log_planner_stats) ResetUsage(); + clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start); + JumbleQuery(querytree); + clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end); + jumbletime = get_clock_diff(&end, &start); + elog(NOTICE, "QueryJumble in %ld nanoseconds", jumbletime); + /* call the optimizer */ plan = planner(querytree, query_string, cursorOptions, boundParams);
AMD3990x master for i in {1..10}; do make check 2>&1 > /dev/null || cat src/test/regress/regression.diffs | grep "+NOTICE: QueryJumble" | awk '{sum+=$4;} END{print sum;}'; done 73465622 73996983 73274602 73038683 73296263 72378037 72797576 72868935 75244516 76289167 wget https://www.postgresql.org/message-id/attachment/173439/0001-Query-ID-Calculation-Fix-Variant-B.patch patch -p1 < 0001-Query-ID-Calculation-Fix-Variant-B.patch make -j -s for i in {1..10}; do make check 2>&1 > /dev/null || cat src/test/regress/regression.diffs | grep "+NOTICE: QueryJumble" | awk '{sum+=$4;} END{print sum;}'; done 79860948 79848310 80034482 78989011 79237854 80003277 81465066 80826526 82407462 81020999