Hi Ashutosh,

On Mon, Aug 26, 2024 at 19:05, Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
> Hi Shawn,
> It will be good to document usage of this function. Please add
> document changes in your patch. We need to document the impact of this
> function so that users can judiciously decide whether or not to use
> this function and under what conditions. Also they would know what to
> expect when they use this function.

I have incorporated the usage of this function into the new patch.

Currently, no single memory statistic shows precisely whether a trim
operation should be performed. Here are two conditions that can be used
as references:

1. Compare the process's memory usage (for example, from the top command;
   because of shared memory, SHR must be subtracted from RES) with the
   memory context statistics. If the difference is very large, this
   function should be used to release memory;
2. Execute malloc_stats(). If the system bytes are greater than the
   in-use bytes, this function can be used to release memory.

>
> Running it after a query finishes is one thing but that can't be
> guaranteed because of the asynchronous nature of signal handlers.
> malloc_trim() may be called while a query is being executed. We need
> to assess that impact as well.
>
> Can you please share some numbers - TPS, latency etc. with and without
> this function invoked during a benchmark run?
>

To measure the impact, I placed malloc_trim() at the end of the
exec_simple_query function, so that malloc_trim() is executed once for
each SQL statement. I used pgbench to reproduce the performance impact,
and the results are as follows.

*Database preparation:*

> create database testc;
> create user t1;
> alter database testc owner to t1;
> ./pgbench testc -U t1 -i -s 100
> ./pgbench testc -U t1 -S -c 100 -j 100 -T 600

*Without Trim:*

> $./pgbench testc -U t1 -S -c 100 -j 100 -T 600
> pgbench (18devel)
> starting vacuum...end.
> transaction type: <builtin: select only>
> scaling factor: 100
> query mode: simple
> number of clients: 100
> number of threads: 100
> maximum number of tries: 1
> duration: 600 s
> number of transactions actually processed: 551984376
> number of failed transactions: 0 (0.000%)
> latency average = 0.109 ms
> initial connection time = 23.569 ms
> tps = 920001.842189 (without initial connection time)

*With Trim:*

> $./pgbench testc -U t1 -S -c 100 -j 100 -T 600
> pgbench (18devel)
> starting vacuum...end.
> transaction type: <builtin: select only>
> scaling factor: 100
> query mode: simple
> number of clients: 100
> number of threads: 100
> maximum number of tries: 1
> duration: 600 s
> number of transactions actually processed: 470690787
> number of failed transactions: 0 (0.000%)
> latency average = 0.127 ms
> initial connection time = 23.632 ms
> tps = 784511.901558 (without initial connection time)

With malloc_trim() called after every statement, TPS drops from about
920,002 to about 784,512 (roughly 14.7%), and the average latency rises
from 0.109 ms to 0.127 ms.
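To make the second condition above (system bytes versus in-use bytes) a bit more concrete, here is a minimal standalone sketch. It is not part of the patch. It assumes glibc 2.33 or later for mallinfo2(), and the helper names heap_idle_bytes() and trim_if_idle_exceeds() are made up for illustration. In glibc, the fordblks field is the total free space inside the heap, which corresponds to "system bytes" minus "in use bytes" as printed by malloc_stats().

```c
#include <malloc.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes sitting in free chunks inside the malloc heap.
 * glibc-specific; mallinfo2() requires glibc 2.33 or later.
 */
size_t
heap_idle_bytes(void)
{
    struct mallinfo2 mi = mallinfo2();

    return mi.fordblks;         /* total free space, in bytes */
}

/*
 * Hypothetical helper: call malloc_trim(0) only when the idle heap space
 * exceeds threshold_bytes.  Returns 1 if malloc_trim() reported that memory
 * was released to the OS, 0 otherwise (including when no trim was tried).
 */
int
trim_if_idle_exceeds(size_t threshold_bytes)
{
    if (heap_idle_bytes() <= threshold_bytes)
        return 0;

    return malloc_trim(0);
}
```

A caller could invoke trim_if_idle_exceeds() with a threshold of a few megabytes at a point where the process is idle. The threshold value is a tuning choice, and the numbers above suggest that trimming after every statement is too frequent.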
From 6d286d506ba5ed2dff012537766e7874952413ac Mon Sep 17 00:00:00 2001
From: Shawn Wang <shawn.wang.pg@gmail.com>
Date: Wed, 28 Aug 2024 18:20:57 +0800
Subject: [PATCH] Trim the free heap Memory.

All processes in PostgreSQL ultimately use malloc to allocate and free
memory. On long-lived connections that run application queries for
extended periods, the layout of heap memory can become extremely complex.

Under certain circumstances, malloc's caching strategy causes a common
problem: even after memory is released with free(), it may not be
returned to the OS in a timely manner. This can lead to high system
memory usage, affect performance and the operation of other applications,
and may even result in Out-Of-Memory (OOM) errors.

To decide whether a process needs trimming, compare the memory usage of
the process (for example, from the top command; because of shared memory,
SHR must be subtracted from RES) with the memory context statistics. If
the difference is very large, or if malloc_stats() shows that the system
bytes are greater than the in-use bytes, the process needs to release its
free heap memory.
---
 doc/src/sgml/func.sgml                       | 22 +++++++
 src/backend/catalog/system_functions.sql     |  2 +
 src/backend/postmaster/autovacuum.c          |  4 ++
 src/backend/postmaster/checkpointer.c        |  4 ++
 src/backend/postmaster/interrupt.c           |  4 ++
 src/backend/postmaster/pgarch.c              |  4 ++
 src/backend/postmaster/startup.c             |  4 ++
 src/backend/postmaster/walsummarizer.c       |  4 ++
 src/backend/storage/ipc/procsignal.c         |  3 +
 src/backend/tcop/postgres.c                  |  3 +
 src/backend/utils/adt/mcxtfuncs.c            | 54 +++++++++++++++
 src/backend/utils/init/globals.c             |  1 +
 src/backend/utils/mmgr/Makefile              |  1 +
 src/backend/utils/mmgr/memtrim.c             | 69 ++++++++++++++++++++
 src/include/catalog/pg_proc.dat              |  6 ++
 src/include/miscadmin.h                      |  2 +
 src/include/storage/procsignal.h             |  2 +
 src/include/utils/memutils.h                 |  3 +
 src/test/regress/expected/misc_functions.out | 11 ++++
 src/test/regress/sql/misc_functions.sql      |  8 +++
 20 files changed, 211 insertions(+)
 create mode 100644 src/backend/utils/mmgr/memtrim.c

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 461fc3f437..8b396371c8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28321,6 +28321,28 @@ acl | {postgres=arwdDxtm/postgres,foo=r/postgres}
         <literal>false</literal> is returned.
        </para></entry>
       </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_trim_backend_heap_free_memory</primary>
+        </indexterm>
+        <function>pg_trim_backend_heap_free_memory</function> ( <parameter>pid</parameter> <type>integer</type> )
+        <returnvalue>boolean</returnvalue>
+       </para>
+       <para>
+        Release the free heap memory in a specified process that has not been reclaimed by the Operating System.
+        When a process's memory usage significantly exceeds its Memory Context size, there may be memory that has
+        been freed but not yet returned to the operating system. In such cases, this function can be called to
+        forcibly release this memory. This function will send requests to backend and auxiliary processes,
+        excluding the logging process. The relevant processes will execute the malloc_trim() function during
+        interrupt handling to forcibly release free memory. If the signal is successfully sent, the function
+        will return <literal>true</literal>; if the sending fails, it will return <literal>false</literal>.
+        Additionally, the usage of this function will be logged to facilitate tracking of the process's memory
+        trimming operations.
+       </para></entry>
+      </row>
+
      </tbody>
     </tgroup>
    </table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 623b9539b1..3aa3b60288 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -754,6 +754,8 @@ REVOKE EXECUTE ON FUNCTION pg_ls_dir(text,boolean,boolean) FROM public;
 
 REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_trim_backend_heap_free_memory(integer) FROM PUBLIC;
+
 REVOKE EXECUTE ON FUNCTION pg_ls_logicalsnapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 7d0877c95e..d90d1649a1 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -765,6 +765,10 @@ HandleAutoVacLauncherInterrupts(void)
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
 
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
+
 	/* Process sinval catchup interrupts that happened while sleeping */
 	ProcessCatchupInterrupt();
 }
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 199f008bcd..2109f9e822 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -605,6 +605,10 @@ HandleCheckpointerInterrupts(void)
 	/* Perform logging of memory contexts of this process */
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
+
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
 }
 
 /*
diff --git a/src/backend/postmaster/interrupt.c b/src/backend/postmaster/interrupt.c
index eedc0980cf..4389ff3d48 100644
--- a/src/backend/postmaster/interrupt.c
+++ b/src/backend/postmaster/interrupt.c
@@ -48,6 +48,10 @@ HandleMainLoopInterrupts(void)
 	/* Perform logging of memory contexts of this process */
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
+
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
 }
 
 /*
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 02f91431f5..cd72412614 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -865,6 +865,10 @@ HandlePgArchInterrupts(void)
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
 
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
+
 	if (ConfigReloadPending)
 	{
 		char	   *archiveLib = pstrdup(XLogArchiveLibrary);
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index ef6f98ebcd..5eb8d168e7 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -192,6 +192,10 @@ HandleStartupProcInterrupts(void)
 	/* Perform logging of memory contexts of this process */
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
+
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
 }
 
 
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index daa7909382..d167e41291 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -874,6 +874,10 @@ HandleWalSummarizerInterrupts(void)
 	/* Perform logging of memory contexts of this process */
 	if (LogMemoryContextPending)
 		ProcessLogMemoryContextInterrupt();
+
+	/* Perform trimming heap free memory of this process */
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
 }
 
 /*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..3251d2823e 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -712,6 +712,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
 		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
 
+	if (CheckProcSignal(PROCSIG_TRIM_HEAP_FREE_MEMORY))
+		HandleTrimHeapFreeMemoryInterrupt();
+
 	SetLatch(MyLatch);
 }
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..f90a557bbe 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3479,6 +3479,9 @@ ProcessInterrupts(void)
 
 	if (ParallelApplyMessagePending)
 		HandleParallelApplyMessages();
+
+	if (TrimHeapFreeMemoryPending)
+		ProcessTrimHeapFreeMemoryInterrupt();
 }
 
 /*
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 6a6634e1cd..c213c4163a 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -305,3 +305,57 @@ pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(true);
 }
+
+/*
+ * pg_trim_backend_heap_free_memory
+ *		Signal a backend or an auxiliary process to trim heap free memory.
+ *
+ * On receipt of this signal, a backend or an auxiliary process sets the flag
+ * in the signal handler, which causes the next CHECK_FOR_INTERRUPTS()
+ * or process-specific interrupt handler to trim the heap free memory.
+ */
+Datum
+pg_trim_backend_heap_free_memory(PG_FUNCTION_ARGS)
+{
+	int			pid = PG_GETARG_INT32(0);
+	PGPROC	   *proc;
+	ProcNumber	procNumber = INVALID_PROC_NUMBER;
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 */
+	proc = BackendPidGetProc(pid);
+	if (proc == NULL)
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used on a backend or an auxiliary process
+	 * that is running and consuming lots of memory, that it might end on its
+	 * own first, before its memory is trimmed, is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+				(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		PG_RETURN_BOOL(false);
+	}
+
+	procNumber = GetNumberFromPGProc(proc);
+	if (SendProcSignal(pid, PROCSIG_TRIM_HEAP_FREE_MEMORY, procNumber) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+				(errmsg("could not send signal to process %d: %m", pid)));
+		PG_RETURN_BOOL(false);
+	}
+
+	PG_RETURN_BOOL(true);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 03a54451ac..e90a61affe 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -39,6 +39,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
 volatile sig_atomic_t ProcSignalBarrierPending = false;
 volatile sig_atomic_t LogMemoryContextPending = false;
 volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t TrimHeapFreeMemoryPending = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 QueryCancelHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/mmgr/Makefile b/src/backend/utils/mmgr/Makefile
index 01a1fb8527..395f119d77 100644
--- a/src/backend/utils/mmgr/Makefile
+++ b/src/backend/utils/mmgr/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	generation.o \
 	mcxt.o \
 	memdebug.o \
+	memtrim.o \
 	portalmem.o \
 	slab.o
 
diff --git a/src/backend/utils/mmgr/memtrim.c b/src/backend/utils/mmgr/memtrim.c
new file mode 100644
index 0000000000..5664eb9f80
--- /dev/null
+++ b/src/backend/utils/mmgr/memtrim.c
@@ -0,0 +1,69 @@
+/*-------------------------------------------------------------------------
+ *
+ * memtrim.c
+ *	  Routines to release free heap memory back to the operating system.
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/mmgr/memtrim.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <malloc.h>
+
+#include "miscadmin.h"
+#include "utils/memutils.h"
+
+/*
+ * HandleTrimHeapFreeMemoryInterrupt
+ *		Handle receipt of an interrupt indicating trimming of heap free
+ *		memory.
+ *
+ * All the actual work is deferred to ProcessTrimHeapFreeMemoryInterrupt(),
+ * because we cannot safely perform the trim inside the signal handler.
+ */
+void
+HandleTrimHeapFreeMemoryInterrupt(void)
+{
+	InterruptPending = true;
+	TrimHeapFreeMemoryPending = true;
+	/* latch will be set by procsignal_sigusr1_handler */
+}
+
+/*
+ * ProcessTrimHeapFreeMemoryInterrupt
+ *		Perform trimming of heap free memory of this process.
+ *
+ * Any backend that participates in ProcSignal signaling must arrange
+ * to call this function if we see TrimHeapFreeMemoryPending set.
+ * It is called from CHECK_FOR_INTERRUPTS() in backends and from the
+ * process-specific interrupt handlers in auxiliary processes.
+ */
+void
+ProcessTrimHeapFreeMemoryInterrupt(void)
+{
+	TrimHeapFreeMemoryPending = false;
+
+	/*
+	 * Use LOG_SERVER_ONLY to prevent this message from being sent to the
+	 * connected client.
+	 */
+	ereport(LOG_SERVER_ONLY,
+			(errhidestmt(true),
+			 errhidecontext(true),
+			 errmsg("trimming heap free memory of PID %d", MyProcPid)));
+
+	/*
+	 * The malloc_trim() function attempts to release free memory from
+	 * the heap (by calling sbrk(2) or madvise(2) with suitable
+	 * arguments).
+	 * With a pad argument of 0, only the minimum amount of memory is kept
+	 * at the top of the heap (i.e., one page or less).
+	 */
+	malloc_trim(0);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4abc6d9526..6186d9247d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8348,6 +8348,12 @@
   prorettype => 'bool', proargtypes => 'int4',
   prosrc => 'pg_log_backend_memory_contexts' },
 
+# trim heap free memory of the specified backend
+{ oid => '4551', descr => 'Trim Heap free memory of the specified backend',
+  proname => 'pg_trim_backend_heap_free_memory', provolatile => 'v',
+  prorettype => 'bool', proargtypes => 'int4',
+  prosrc => 'pg_trim_backend_heap_free_memory' },
+
 # non-persistent series generator
 { oid => '1066', descr => 'non-persistent series generator',
   proname => 'generate_series', prorows => '1000',
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 25348e71eb..5a676eeb9e 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -100,6 +100,8 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
 extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
 extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
 
+extern PGDLLIMPORT volatile sig_atomic_t TrimHeapFreeMemoryPending;
+
 /* these are marked volatile because they are examined by signal handlers: */
 extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
 extern PGDLLIMPORT volatile uint32 QueryCancelHoldoffCount;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index f94c11a9a8..0d4a3a42a2 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -48,6 +48,8 @@ typedef enum
 	PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
 	PROCSIG_RECOVERY_CONFLICT_LAST = PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
 
+	PROCSIG_TRIM_HEAP_FREE_MEMORY, /* ask backend to release free memory from the heap */
+
 	NUM_PROCSIGNALS				/* Must be last! */
 } ProcSignalReason;
 
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index cd9596ff21..61d4d6d252 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -104,6 +104,9 @@ extern void MemoryContextCheck(MemoryContext context);
 extern void HandleLogMemoryContextInterrupt(void);
 extern void ProcessLogMemoryContextInterrupt(void);
 
+extern void HandleTrimHeapFreeMemoryInterrupt(void);
+extern void ProcessTrimHeapFreeMemoryInterrupt(void);
+
 /*
  * Memory-context-type-specific functions
  */
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 35fb72f302..9bde10b2b1 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -365,6 +365,17 @@ RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
   FROM regress_log_memory;
 DROP ROLE regress_log_memory;
+--
+-- pg_trim_backend_heap_free_memory()
+--
+-- Trim the heap free memory.
+--
+SELECT pg_trim_backend_heap_free_memory(pg_backend_pid());
+ pg_trim_backend_heap_free_memory 
+----------------------------------
+ t
+(1 row)
+
 --
 -- Test some built-in SRFs
 --
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index e570783453..d1cd2440ee 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -143,6 +143,14 @@ REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
 
 DROP ROLE regress_log_memory;
 
+--
+-- pg_trim_backend_heap_free_memory()
+--
+-- Trim the heap free memory.
+--
+
+SELECT pg_trim_backend_heap_free_memory(pg_backend_pid());
+
 --
 -- Test some built-in SRFs
 --
-- 
2.39.3
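
For readers less familiar with the Handle/Process split used in the patch, the following standalone sketch shows the same deferred pattern outside PostgreSQL: the signal handler only sets a flag, and the trim itself runs later from normal code. This is an illustration under assumptions, not the patch's code. It uses plain SIGUSR1 and sigaction() rather than PostgreSQL's ProcSignal machinery, assumes glibc for malloc_trim(), and the names trim_pending, handle_trim_request(), install_trim_handler() and process_trim_request() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <malloc.h>
#include <signal.h>
#include <stddef.h>

/* Set by the signal handler; consumed at the next safe point. */
static volatile sig_atomic_t trim_pending = 0;

/*
 * Signal handler: only records the request.  malloc_trim() is not
 * async-signal-safe, so it must not be called from here.
 */
static void
handle_trim_request(int signo)
{
    (void) signo;
    trim_pending = 1;
}

/* Install the handler for SIGUSR1.  Returns 0 on success, -1 on failure. */
int
install_trim_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = handle_trim_request;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Call from the main loop at a point where using the allocator is safe.
 * Returns 1 if a request was pending and malloc_trim() was run, 0 otherwise.
 */
int
process_trim_request(void)
{
    if (!trim_pending)
        return 0;

    trim_pending = 0;
    malloc_trim(0);
    return 1;
}
```

Because the trim runs only when the main loop reaches process_trim_request(), it can still happen in the middle of a longer operation, which is the timing concern raised in the review.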