Hello Ashutosh and Alvaro, I appreciate you replying to the email.
On Tue, Dec 3, 2024 at 7:38 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > > I don't think planning time in assert-enabled builds is something we > should worry about, at all. Planning time in production builds is the > important one. Thank you for your reply. Making debug builds too slow is not good for developers, so I'd like to see how these patches behave with assert-enabled builds. On Tue, Dec 3, 2024 at 1:12 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote: > > Hi Yuya, > For one of the earlier versions, I had reported a large memory > consumption in all cases and increase in planning time for Assert > enabled builds. How does the latest version perform in those aspects? 1. Experimental results I ran experiments to measure memory consumption during planning. These are done with the release build. In the experiments, I used the rebased version of your patch [1], which is attached to this email. Table 1: Memory consumption when planning query A (without partition-wise join (PWJ), MiB) --------------------------------- n | Master | v23 | v28 --------------------------------- 1 | 0.132 | 0.137 | 0.137 2 | 0.158 | 0.166 | 0.166 4 | 0.220 | 0.234 | 0.235 8 | 0.347 | 0.375 | 0.375 16 | 0.596 | 0.652 | 0.653 32 | 1.104 | 1.221 | 1.223 64 | 2.126 | 2.392 | 2.396 128 | 4.245 | 4.917 | 4.925 256 | 8.742 | 10.651 | 10.663 384 | 13.603 | 17.159 | 17.176 512 | 18.758 | 24.827 | 24.850 640 | 23.924 | 32.223 | 32.253 768 | 30.050 | 41.843 | 41.879 896 | 36.224 | 51.937 | 51.978 1024 | 42.923 | 64.058 | 64.105 --------------------------------- Table 2: Memory consumption when planning query A (with PWJ, MiB) ------------------------------------ n | Master | v23 | v28 ------------------------------------ 1 | 0.190 | 0.194 | 0.195 2 | 0.276 | 0.284 | 0.284 4 | 0.461 | 0.475 | 0.475 8 | 0.844 | 0.871 | 0.871 16 | 1.584 | 1.640 | 1.641 32 | 3.085 | 3.202 | 3.204 64 | 6.099 | 6.365 | 6.369 128 | 12.261 | 12.934 | 12.941 256 | 25.061 | 26.970 | 26.982 384 | 38.542 | 42.098 | 42.116 512 | 52.541 | 58.610 | 58.633 640 | 66.579 | 74.878 | 74.908 768 | 82.421 | 94.214 | 94.250 896 | 98.483 | 114.196 | 114.237 1024 | 115.074 | 136.208 | 136.255 ------------------------------------ Table 3: Memory consumption when planning query B (without PWJ, MiB) ---------------------------------- n | Master | v23 | v28 ---------------------------------- 1 | 16.019 | 16.404 | 16.404 2 | 15.288 | 15.708 | 15.708 4 | 15.674 | 16.360 | 16.360 8 | 16.554 | 17.784 | 17.786 16 | 18.221 | 19.954 | 19.958 32 | 21.630 | 25.609 | 25.617 64 | 28.913 | 39.419 | 39.427 128 | 45.331 | 77.015 | 77.030 256 | 86.127 | 192.884 | 192.916 ---------------------------------- Table 4: Memory consumption when planning query B (with PWJ, MiB) -------------------------------------- n | Master | v23 | v28 -------------------------------------- 1 | 33.623 | 34.008 | 34.008 2 | 50.285 | 50.705 | 50.705 4 | 85.562 | 86.247 | 86.247 8 | 156.465 | 157.695 | 157.697 16 | 298.692 | 300.424 | 300.428 32 | 585.713 | 589.692 | 589.699 64 | 1169.396 | 1179.901 | 1179.909 128 | 2375.592 | 2407.275 | 2407.291 256 | 4942.295 | 5049.053 | 5049.084 -------------------------------------- Next, I measured the planning times using the debug build with assertions. In this experiment, I set CFLAGS to "-O0" and also used the attached patch that removes assertions in Bitmapset-based indexes. Table 5: Planning time of query A (debug build, ms) ----------------------------------------- n | Master | v28 | v28 w/ patch ----------------------------------------- 1 | 0.648 | 0.664 | 0.665 2 | 0.788 | 0.810 | 0.800 4 | 0.891 | 0.936 | 0.931 8 | 1.202 | 1.301 | 1.268 16 | 1.973 | 2.145 | 2.042 32 | 3.668 | 4.000 | 3.638 64 | 8.093 | 8.597 | 7.167 128 | 20.015 | 19.641 | 14.274 256 | 57.634 | 51.008 | 29.930 384 | 114.280 | 94.760 | 46.449 512 | 196.492 | 154.230 | 63.758 640 | 315.037 | 240.142 | 82.476 768 | 466.149 | 338.043 | 101.318 896 | 679.029 | 511.097 | 134.854 1024 | 897.806 | 592.823 | 141.852 ----------------------------------------- Table 6: Planning time of query B (debug build, ms) ------------------------------------------ n | Master | v28 | v28 w/ patch ------------------------------------------ 1 | 43.788 | 46.364 | 45.418 2 | 42.637 | 45.750 | 44.093 4 | 43.842 | 48.109 | 45.000 8 | 47.504 | 54.410 | 48.199 16 | 55.682 | 67.242 | 53.895 32 | 77.736 | 98.507 | 66.877 64 | 144.772 | 185.697 | 96.591 128 | 411.967 | 503.644 | 166.437 256 | 1653.681 | 1610.697 | 337.940 ------------------------------------------ 2. Discussion Tables 1, 2, 3, and 4 show that the proposed patches increase memory consumption. There seems to be no difference between v23 and v28. The increase is more significant for query B, which is 2x or more. For debug builds, I observed regressions compared to the master. The regressions were reduced with the attached patch. This indicates that the get_ec_[source|derive]_indexes[_strict]() functions (quoted below) have time-consuming assertions. I think these assertions are helpful, but it might be better to remove them to avoid slowing down debug builds. What do you think? ===== Bitmapset * get_ec_derive_indexes_strict(PlannerInfo *root, EquivalenceClass *ec, Relids relids) { Bitmapset *edis = NULL; int i = bms_next_member(relids, -1); if (i >= 0) { EquivalenceClassIndexes *index = &root->eclass_indexes_array[i]; /* * bms_intersect to the first relation to try to keep the resulting * Bitmapset as small as possible. This saves having to make a * complete bms_copy() of one of them. One may contain significantly * more words than the other. */ edis = bms_intersect(ec->ec_derive_indexes, index->derive_indexes); while ((i = bms_next_member(relids, i)) >= 0) { index = &root->eclass_indexes_array[i]; edis = bms_int_members(edis, index->derive_indexes); } } #ifdef USE_ASSERT_CHECKING /* verify the results look sane */ i = -1; while ((i = bms_next_member(edis, i)) >= 0) { RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_derives, i); Assert(bms_is_subset(relids, rinfo->clause_relids)); } #endif return edis; } ===== [1] https://www.postgresql.org/message-id/caexhw5stmouobe55pmt83r8uxvfcph+pvo5dnpdrvcsbgxe...@mail.gmail.com -- Best regards, Yuya Watari
From 1c5bdc574b9ab136ec29a2bd86b43fb484a57345 Mon Sep 17 00:00:00 2001 From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> Date: Wed, 12 Jul 2023 14:34:14 +0530 Subject: [PATCH] Report memory used for planning a query in EXPLAIN ANALYZE The memory used in the CurrentMemoryContext and its children is sampled before and after calling pg_plan_query() from ExplainOneQuery(). The difference in the two samples is reported as the memory consumed while planning the query. This may not account for the memory allocated in memory contexts which are not children of CurrentMemoryContext. These contexts are usually other long lived contexts, e.g. CacheMemoryContext, which are shared by all the queries run in a session. The consumption in those can not be attributed only to a given query and hence should not be reported any way. The memory consumption is reported as "Planning Memory" property in EXPLAIN ANALYZE output. Ashutosh Bapat --- src/backend/commands/explain.c | 12 ++++++++++-- src/backend/commands/prepare.c | 5 ++++- src/backend/utils/mmgr/mcxt.c | 19 +++++++++++++++++++ src/include/commands/explain.h | 3 ++- src/include/utils/memutils.h | 1 + 5 files changed, 36 insertions(+), 4 deletions(-) diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a3f1d53d7a..42fe11b3ce 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -471,6 +471,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions, MemoryContextCounters mem_counters; MemoryContext planner_ctx = NULL; MemoryContext saved_ctx = NULL; + Size mem_consumed; if (es->memory) { @@ -491,12 +492,14 @@ standard_ExplainOneQuery(Query *query, int cursorOptions, if (es->buffers) bufusage_start = pgBufferUsage; INSTR_TIME_SET_CURRENT(planstart); + mem_consumed = MemoryContextMemUsed(CurrentMemoryContext); /* plan the query */ plan = pg_plan_query(query, queryString, cursorOptions, params); INSTR_TIME_SET_CURRENT(planduration); INSTR_TIME_SUBTRACT(planduration, planstart); + mem_consumed = MemoryContextMemUsed(CurrentMemoryContext) - mem_consumed; if (es->memory) { @@ -514,7 +517,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions, /* run it (if needed) and produce output */ ExplainOnePlan(plan, into, es, queryString, params, queryEnv, &planduration, (es->buffers ? &bufusage : NULL), - es->memory ? &mem_counters : NULL); + es->memory ? &mem_counters : NULL, &mem_consumed); } /* @@ -638,7 +641,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, const char *queryString, ParamListInfo params, QueryEnvironment *queryEnv, const instr_time *planduration, const BufferUsage *bufusage, - const MemoryContextCounters *mem_counters) + const MemoryContextCounters *mem_counters, + const Size *mem_consumed) { DestReceiver *dest; QueryDesc *queryDesc; @@ -769,6 +773,10 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, double plantime = INSTR_TIME_GET_DOUBLE(*planduration); ExplainPropertyFloat("Planning Time", "ms", 1000.0 * plantime, 3, es); + + if (mem_consumed) + ExplainPropertyUInteger("Planning Memory", "bytes", + (uint64) *mem_consumed, es); } /* Print info about runtime of triggers */ diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c index a93f970a29..e5a49e825e 100644 --- a/src/backend/commands/prepare.c +++ b/src/backend/commands/prepare.c @@ -582,6 +582,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es, MemoryContextCounters mem_counters; MemoryContext planner_ctx = NULL; MemoryContext saved_ctx = NULL; + Size mem_consumed; if (es->memory) { @@ -596,6 +597,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es, if (es->buffers) bufusage_start = pgBufferUsage; INSTR_TIME_SET_CURRENT(planstart); + mem_consumed = MemoryContextMemUsed(CurrentMemoryContext); /* Look it up in the hash table */ entry = FetchPreparedStatement(execstmt->name, true); @@ -632,6 +634,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es, INSTR_TIME_SET_CURRENT(planduration); INSTR_TIME_SUBTRACT(planduration, planstart); + mem_consumed = MemoryContextMemUsed(CurrentMemoryContext) - mem_consumed; if (es->memory) { @@ -656,7 +659,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es, if (pstmt->commandType != CMD_UTILITY) ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv, &planduration, (es->buffers ? &bufusage : NULL), - es->memory ? &mem_counters : NULL); + es->memory ? &mem_counters : NULL, &mem_consumed); else ExplainOneUtility(pstmt->utilityStmt, into, es, pstate, paramLI); diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 70d33226cb..e0a119a886 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -865,6 +865,25 @@ MemoryContextStatsDetail(MemoryContext context, } } +/* + * Compute the memory used in the given context and its children. + * + * XXX: Instead of this separate function we could modify + * MemoryContextStatsDetail() to report used memory and disable printing the + * detailed stats. + */ +extern Size +MemoryContextMemUsed(MemoryContext context) +{ + MemoryContextCounters grand_totals; + + memset(&grand_totals, 0, sizeof(grand_totals)); + + MemoryContextStatsInternal(context, 0, false, 100, &grand_totals, false); + + return grand_totals.totalspace - grand_totals.freespace; +} + /* * MemoryContextStatsInternal * One recursion level for MemoryContextStats diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h index aa5872bc15..4fb267afef 100644 --- a/src/include/commands/explain.h +++ b/src/include/commands/explain.h @@ -108,7 +108,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ParamListInfo params, QueryEnvironment *queryEnv, const instr_time *planduration, const BufferUsage *bufusage, - const MemoryContextCounters *mem_counters); + const MemoryContextCounters *mem_counters, + const Size *mem_consumed); extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc); extern void ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc); diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h index bf93433b78..9afb0bfb2a 100644 --- a/src/include/utils/memutils.h +++ b/src/include/utils/memutils.h @@ -92,6 +92,7 @@ extern void MemoryContextStatsDetail(MemoryContext context, bool print_to_stderr); extern void MemoryContextAllowInCriticalSection(MemoryContext context, bool allow); +extern Size MemoryContextMemUsed(MemoryContext context); #ifdef MEMORY_CONTEXT_CHECKING extern void MemoryContextCheck(MemoryContext context); -- 2.45.2.windows.1
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 86057538c5..9bbe11a9de 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -4014,18 +4014,6 @@ get_ec_source_indexes(PlannerInfo *root, EquivalenceClass *ec, Relids relids) rel_esis = bms_add_members(rel_esis, index->source_indexes); } -#ifdef USE_ASSERT_CHECKING - /* verify the results look sane */ - i = -1; - while ((i = bms_next_member(rel_esis, i)) >= 0) - { - RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_sources, - i); - - Assert(bms_overlap(relids, rinfo->clause_relids)); - } -#endif - /* bitwise-AND to leave only the ones for this EquivalenceClass */ return bms_int_members(rel_esis, ec->ec_source_indexes); } @@ -4065,18 +4053,6 @@ get_ec_source_indexes_strict(PlannerInfo *root, EquivalenceClass *ec, } } -#ifdef USE_ASSERT_CHECKING - /* verify the results look sane */ - i = -1; - while ((i = bms_next_member(esis, i)) >= 0) - { - RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_sources, - i); - - Assert(bms_is_subset(relids, rinfo->clause_relids)); - } -#endif - return esis; } @@ -4103,18 +4079,6 @@ get_ec_derive_indexes(PlannerInfo *root, EquivalenceClass *ec, Relids relids) rel_edis = bms_add_members(rel_edis, index->derive_indexes); } -#ifdef USE_ASSERT_CHECKING - /* verify the results look sane */ - i = -1; - while ((i = bms_next_member(rel_edis, i)) >= 0) - { - RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_derives, - i); - - Assert(bms_overlap(relids, rinfo->clause_relids)); - } -#endif - /* bitwise-AND to leave only the ones for this EquivalenceClass */ return bms_int_members(rel_edis, ec->ec_derive_indexes); } @@ -4154,18 +4118,6 @@ get_ec_derive_indexes_strict(PlannerInfo *root, EquivalenceClass *ec, } } -#ifdef USE_ASSERT_CHECKING - /* verify the results look sane */ - i = -1; - while ((i = bms_next_member(edis, i)) >= 0) - { - RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_derives, - i); - - Assert(bms_is_subset(relids, rinfo->clause_relids)); - } -#endif - return edis; }