Hello Ashutosh and Alvaro,
I appreciate you replying to the email.
On Tue, Dec 3, 2024 at 7:38 PM Alvaro Herrera <[email protected]> wrote:
>
> I don't think planning time in assert-enabled builds is something we
> should worry about, at all. Planning time in production builds is the
> important one.
Thank you for your reply. Making debug builds too slow is not good for
developers, so I'd like to see how these patches behave with
assert-enabled builds.
On Tue, Dec 3, 2024 at 1:12 PM Ashutosh Bapat
<[email protected]> wrote:
>
> Hi Yuya,
> For one of the earlier versions, I had reported a large memory
> consumption in all cases and increase in planning time for Assert
> enabled builds. How does the latest version perform in those aspects?
1. Experimental results
I ran experiments to measure memory consumption during planning. These
are done with the release build. In the experiments, I used the
rebased version of your patch [1], which is attached to this email.
Table 1: Memory consumption when planning query A (without
partition-wise join (PWJ), MiB)
---------------------------------
n | Master | v23 | v28
---------------------------------
1 | 0.132 | 0.137 | 0.137
2 | 0.158 | 0.166 | 0.166
4 | 0.220 | 0.234 | 0.235
8 | 0.347 | 0.375 | 0.375
16 | 0.596 | 0.652 | 0.653
32 | 1.104 | 1.221 | 1.223
64 | 2.126 | 2.392 | 2.396
128 | 4.245 | 4.917 | 4.925
256 | 8.742 | 10.651 | 10.663
384 | 13.603 | 17.159 | 17.176
512 | 18.758 | 24.827 | 24.850
640 | 23.924 | 32.223 | 32.253
768 | 30.050 | 41.843 | 41.879
896 | 36.224 | 51.937 | 51.978
1024 | 42.923 | 64.058 | 64.105
---------------------------------
Table 2: Memory consumption when planning query A (with PWJ, MiB)
------------------------------------
n | Master | v23 | v28
------------------------------------
1 | 0.190 | 0.194 | 0.195
2 | 0.276 | 0.284 | 0.284
4 | 0.461 | 0.475 | 0.475
8 | 0.844 | 0.871 | 0.871
16 | 1.584 | 1.640 | 1.641
32 | 3.085 | 3.202 | 3.204
64 | 6.099 | 6.365 | 6.369
128 | 12.261 | 12.934 | 12.941
256 | 25.061 | 26.970 | 26.982
384 | 38.542 | 42.098 | 42.116
512 | 52.541 | 58.610 | 58.633
640 | 66.579 | 74.878 | 74.908
768 | 82.421 | 94.214 | 94.250
896 | 98.483 | 114.196 | 114.237
1024 | 115.074 | 136.208 | 136.255
------------------------------------
Table 3: Memory consumption when planning query B (without PWJ, MiB)
----------------------------------
n | Master | v23 | v28
----------------------------------
1 | 16.019 | 16.404 | 16.404
2 | 15.288 | 15.708 | 15.708
4 | 15.674 | 16.360 | 16.360
8 | 16.554 | 17.784 | 17.786
16 | 18.221 | 19.954 | 19.958
32 | 21.630 | 25.609 | 25.617
64 | 28.913 | 39.419 | 39.427
128 | 45.331 | 77.015 | 77.030
256 | 86.127 | 192.884 | 192.916
----------------------------------
Table 4: Memory consumption when planning query B (with PWJ, MiB)
--------------------------------------
n | Master | v23 | v28
--------------------------------------
1 | 33.623 | 34.008 | 34.008
2 | 50.285 | 50.705 | 50.705
4 | 85.562 | 86.247 | 86.247
8 | 156.465 | 157.695 | 157.697
16 | 298.692 | 300.424 | 300.428
32 | 585.713 | 589.692 | 589.699
64 | 1169.396 | 1179.901 | 1179.909
128 | 2375.592 | 2407.275 | 2407.291
256 | 4942.295 | 5049.053 | 5049.084
--------------------------------------
Next, I measured the planning times using the debug build with
assertions. In this experiment, I set CFLAGS to "-O0" and also used
the attached patch that removes assertions in Bitmapset-based indexes.
Table 5: Planning time of query A (debug build, ms)
-----------------------------------------
n | Master | v28 | v28 w/ patch
-----------------------------------------
1 | 0.648 | 0.664 | 0.665
2 | 0.788 | 0.810 | 0.800
4 | 0.891 | 0.936 | 0.931
8 | 1.202 | 1.301 | 1.268
16 | 1.973 | 2.145 | 2.042
32 | 3.668 | 4.000 | 3.638
64 | 8.093 | 8.597 | 7.167
128 | 20.015 | 19.641 | 14.274
256 | 57.634 | 51.008 | 29.930
384 | 114.280 | 94.760 | 46.449
512 | 196.492 | 154.230 | 63.758
640 | 315.037 | 240.142 | 82.476
768 | 466.149 | 338.043 | 101.318
896 | 679.029 | 511.097 | 134.854
1024 | 897.806 | 592.823 | 141.852
-----------------------------------------
Table 6: Planning time of query B (debug build, ms)
------------------------------------------
n | Master | v28 | v28 w/ patch
------------------------------------------
1 | 43.788 | 46.364 | 45.418
2 | 42.637 | 45.750 | 44.093
4 | 43.842 | 48.109 | 45.000
8 | 47.504 | 54.410 | 48.199
16 | 55.682 | 67.242 | 53.895
32 | 77.736 | 98.507 | 66.877
64 | 144.772 | 185.697 | 96.591
128 | 411.967 | 503.644 | 166.437
256 | 1653.681 | 1610.697 | 337.940
------------------------------------------
2. Discussion
Tables 1, 2, 3, and 4 show that the proposed patches increase memory
consumption. There seems to be no difference between v23 and v28. The
increase is more significant for query B, which is 2x or more.
For debug builds, I observed regressions compared to the master. The
regressions were reduced with the attached patch. This indicates that
the get_ec_[source|derive]_indexes[_strict]() functions (quoted below)
have time-consuming assertions. I think these assertions are helpful,
but it might be better to remove them to avoid slowing down debug
builds. What do you think?
=====
Bitmapset *
get_ec_derive_indexes_strict(PlannerInfo *root, EquivalenceClass *ec,
Relids relids)
{
Bitmapset *edis = NULL;
int i = bms_next_member(relids, -1);
if (i >= 0)
{
EquivalenceClassIndexes *index = &root->eclass_indexes_array[i];
/*
* bms_intersect to the first relation to try to keep the resulting
* Bitmapset as small as possible. This saves having to make a
* complete bms_copy() of one of them. One may contain significantly
* more words than the other.
*/
edis = bms_intersect(ec->ec_derive_indexes,
index->derive_indexes);
while ((i = bms_next_member(relids, i)) >= 0)
{
index = &root->eclass_indexes_array[i];
edis = bms_int_members(edis, index->derive_indexes);
}
}
#ifdef USE_ASSERT_CHECKING
/* verify the results look sane */
i = -1;
while ((i = bms_next_member(edis, i)) >= 0)
{
RestrictInfo *rinfo = list_nth_node(RestrictInfo, root->eq_derives,
i);
Assert(bms_is_subset(relids, rinfo->clause_relids));
}
#endif
return edis;
}
=====
[1]
https://www.postgresql.org/message-id/caexhw5stmouobe55pmt83r8uxvfcph+pvo5dnpdrvcsbgxe...@mail.gmail.com
--
Best regards,
Yuya Watari
From 1c5bdc574b9ab136ec29a2bd86b43fb484a57345 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Wed, 12 Jul 2023 14:34:14 +0530
Subject: [PATCH] Report memory used for planning a query in EXPLAIN ANALYZE
The memory used in the CurrentMemoryContext and its children is sampled
before and after calling pg_plan_query() from ExplainOneQuery(). The
difference in the two samples is reported as the memory consumed while
planning the query. This may not account for the memory allocated in
memory contexts which are not children of CurrentMemoryContext. These
contexts are usually other long lived contexts, e.g.
CacheMemoryContext, which are shared by all the queries run in a
session. The consumption in those can not be attributed only to a given
query and hence should not be reported any way.
The memory consumption is reported as "Planning Memory" property in
EXPLAIN ANALYZE output.
Ashutosh Bapat
---
src/backend/commands/explain.c | 12 ++++++++++--
src/backend/commands/prepare.c | 5 ++++-
src/backend/utils/mmgr/mcxt.c | 19 +++++++++++++++++++
src/include/commands/explain.h | 3 ++-
src/include/utils/memutils.h | 1 +
5 files changed, 36 insertions(+), 4 deletions(-)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a3f1d53d7a..42fe11b3ce 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -471,6 +471,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ Size mem_consumed;
if (es->memory)
{
@@ -491,12 +492,14 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
if (es->buffers)
bufusage_start = pgBufferUsage;
INSTR_TIME_SET_CURRENT(planstart);
+ mem_consumed = MemoryContextMemUsed(CurrentMemoryContext);
/* plan the query */
plan = pg_plan_query(query, queryString, cursorOptions, params);
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
+ mem_consumed = MemoryContextMemUsed(CurrentMemoryContext) -
mem_consumed;
if (es->memory)
{
@@ -514,7 +517,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage :
NULL),
- es->memory ? &mem_counters : NULL);
+ es->memory ? &mem_counters : NULL,
&mem_consumed);
}
/*
@@ -638,7 +641,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time
*planduration,
const BufferUsage *bufusage,
- const MemoryContextCounters *mem_counters)
+ const MemoryContextCounters *mem_counters,
+ const Size *mem_consumed)
{
DestReceiver *dest;
QueryDesc *queryDesc;
@@ -769,6 +773,10 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
ExplainState *es,
double plantime = INSTR_TIME_GET_DOUBLE(*planduration);
ExplainPropertyFloat("Planning Time", "ms", 1000.0 * plantime,
3, es);
+
+ if (mem_consumed)
+ ExplainPropertyUInteger("Planning Memory", "bytes",
+
(uint64) *mem_consumed, es);
}
/* Print info about runtime of triggers */
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index a93f970a29..e5a49e825e 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -582,6 +582,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause
*into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ Size mem_consumed;
if (es->memory)
{
@@ -596,6 +597,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause
*into, ExplainState *es,
if (es->buffers)
bufusage_start = pgBufferUsage;
INSTR_TIME_SET_CURRENT(planstart);
+ mem_consumed = MemoryContextMemUsed(CurrentMemoryContext);
/* Look it up in the hash table */
entry = FetchPreparedStatement(execstmt->name, true);
@@ -632,6 +634,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause
*into, ExplainState *es,
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
+ mem_consumed = MemoryContextMemUsed(CurrentMemoryContext) -
mem_consumed;
if (es->memory)
{
@@ -656,7 +659,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause
*into, ExplainState *es,
if (pstmt->commandType != CMD_UTILITY)
ExplainOnePlan(pstmt, into, es, query_string, paramLI,
pstate->p_queryEnv,
&planduration, (es->buffers
? &bufusage : NULL),
- es->memory ? &mem_counters :
NULL);
+ es->memory ? &mem_counters :
NULL, &mem_consumed);
else
ExplainOneUtility(pstmt->utilityStmt, into, es, pstate,
paramLI);
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 70d33226cb..e0a119a886 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -865,6 +865,25 @@ MemoryContextStatsDetail(MemoryContext context,
}
}
+/*
+ * Compute the memory used in the given context and its children.
+ *
+ * XXX: Instead of this separate function we could modify
+ * MemoryContextStatsDetail() to report used memory and disable printing the
+ * detailed stats.
+ */
+extern Size
+MemoryContextMemUsed(MemoryContext context)
+{
+ MemoryContextCounters grand_totals;
+
+ memset(&grand_totals, 0, sizeof(grand_totals));
+
+ MemoryContextStatsInternal(context, 0, false, 100, &grand_totals,
false);
+
+ return grand_totals.totalspace - grand_totals.freespace;
+}
+
/*
* MemoryContextStatsInternal
* One recursion level for MemoryContextStats
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index aa5872bc15..4fb267afef 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -108,7 +108,8 @@ extern void ExplainOnePlan(PlannedStmt *plannedstmt,
IntoClause *into,
ParamListInfo params,
QueryEnvironment *queryEnv,
const instr_time
*planduration,
const BufferUsage *bufusage,
- const MemoryContextCounters
*mem_counters);
+ const MemoryContextCounters
*mem_counters,
+ const Size *mem_consumed);
extern void ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc);
extern void ExplainPrintTriggers(ExplainState *es, QueryDesc *queryDesc);
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index bf93433b78..9afb0bfb2a 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -92,6 +92,7 @@ extern void MemoryContextStatsDetail(MemoryContext context,
bool
print_to_stderr);
extern void MemoryContextAllowInCriticalSection(MemoryContext context,
bool allow);
+extern Size MemoryContextMemUsed(MemoryContext context);
#ifdef MEMORY_CONTEXT_CHECKING
extern void MemoryContextCheck(MemoryContext context);
--
2.45.2.windows.1
diff --git a/src/backend/optimizer/path/equivclass.c
b/src/backend/optimizer/path/equivclass.c
index 86057538c5..9bbe11a9de 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -4014,18 +4014,6 @@ get_ec_source_indexes(PlannerInfo *root,
EquivalenceClass *ec, Relids relids)
rel_esis = bms_add_members(rel_esis, index->source_indexes);
}
-#ifdef USE_ASSERT_CHECKING
- /* verify the results look sane */
- i = -1;
- while ((i = bms_next_member(rel_esis, i)) >= 0)
- {
- RestrictInfo *rinfo = list_nth_node(RestrictInfo,
root->eq_sources,
-
i);
-
- Assert(bms_overlap(relids, rinfo->clause_relids));
- }
-#endif
-
/* bitwise-AND to leave only the ones for this EquivalenceClass */
return bms_int_members(rel_esis, ec->ec_source_indexes);
}
@@ -4065,18 +4053,6 @@ get_ec_source_indexes_strict(PlannerInfo *root,
EquivalenceClass *ec,
}
}
-#ifdef USE_ASSERT_CHECKING
- /* verify the results look sane */
- i = -1;
- while ((i = bms_next_member(esis, i)) >= 0)
- {
- RestrictInfo *rinfo = list_nth_node(RestrictInfo,
root->eq_sources,
-
i);
-
- Assert(bms_is_subset(relids, rinfo->clause_relids));
- }
-#endif
-
return esis;
}
@@ -4103,18 +4079,6 @@ get_ec_derive_indexes(PlannerInfo *root,
EquivalenceClass *ec, Relids relids)
rel_edis = bms_add_members(rel_edis, index->derive_indexes);
}
-#ifdef USE_ASSERT_CHECKING
- /* verify the results look sane */
- i = -1;
- while ((i = bms_next_member(rel_edis, i)) >= 0)
- {
- RestrictInfo *rinfo = list_nth_node(RestrictInfo,
root->eq_derives,
-
i);
-
- Assert(bms_overlap(relids, rinfo->clause_relids));
- }
-#endif
-
/* bitwise-AND to leave only the ones for this EquivalenceClass */
return bms_int_members(rel_edis, ec->ec_derive_indexes);
}
@@ -4154,18 +4118,6 @@ get_ec_derive_indexes_strict(PlannerInfo *root,
EquivalenceClass *ec,
}
}
-#ifdef USE_ASSERT_CHECKING
- /* verify the results look sane */
- i = -1;
- while ((i = bms_next_member(edis, i)) >= 0)
- {
- RestrictInfo *rinfo = list_nth_node(RestrictInfo,
root->eq_derives,
-
i);
-
- Assert(bms_is_subset(relids, rinfo->clause_relids));
- }
-#endif
-
return edis;
}