Hi -hackers,

From time to time we hit memory leaks in PostgreSQL, and on one such occasion Tomas wrote: "I certainly agree it's annoying that when OOM hits, we end up with no information about what used the memory. Maybe we could have a threshold triggering a call to MemoryContextStats?" - see [1].

I have had this on my memory/TODO list for quite some time now. Attached is a simplistic patch providing a new "debug_palloc_context_threshold" GUC, intended to help investigate such memory leaks in contexts (think: mostly on the customer side of things) and to start a discussion about whether there are better alternatives. The aim is to see which code path contributes to a single context using e.g. more than 5GB: once the threshold is crossed, the backend starts dumping stats/stack traces every 100 ms. The patch assumes that most memory leaks happen within a single context, so it dumps the context stats and a backtrace into the log, which can later be analyzed to build a stack trace profile (a rough sketch of the idea follows the list below).

This patch is mainly meant to start a discussion, because I found that:
- the pg_backend_memory_contexts view is often not an option: we frequently cannot modify the application to query it from time to time;
- scripting gdb to call MemoryContextStats(), or using pg_log_backend_memory_contexts() on 14+, often seems impossible too, as we do not know in advance which PID is going to cause the problem (the usual pattern is: some app runs some SQL, sometimes OOMs, rinse and repeat).
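To make the mechanism a bit more concrete, here is a rough sketch of how such a check could be wired into the allocation path. This is only an illustration of the idea, not the attached patch itself: the function and variable names are made up, the GUC is assumed to be in plain bytes, and in reality one has to be careful about ereport() itself allocating memory while we are inside the allocator.

/*
 * Illustrative only -- not the attached patch.  Imagine something like
 * this being called from MemoryContextAlloc() in mcxt.c.
 */
#include "postgres.h"
#include "utils/memutils.h"
#include "utils/timestamp.h"

/* stands in for the new GUC; assumed to be bytes, -1 = disabled */
static int64 debug_palloc_context_threshold = -1;

static TimestampTz last_dump_time = 0;

static void
MaybeDumpContextStats(MemoryContext context)
{
    if (debug_palloc_context_threshold < 0 ||
        context->mem_allocated < (Size) debug_palloc_context_threshold)
        return;

    /* rate-limit to one dump per 100 ms, as described above */
    if (last_dump_time != 0 &&
        !TimestampDifferenceExceeds(last_dump_time,
                                    GetCurrentTimestamp(), 100))
        return;
    last_dump_time = GetCurrentTimestamp();

    /* log where we are (backtrace) and which context tripped the check */
    ereport(LOG,
            (errmsg("dumping memory context stats"),
             errdetail("Context \"%s\".", context->name),
             errbacktrace()));

    /*
     * MemoryContextStats() writes to stderr; the real patch presumably
     * routes the per-context stats to the log instead, similarly to what
     * pg_log_backend_memory_contexts() does.
     */
    MemoryContextStats(context);
}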
My doubts are the following:

1. Is dumping on a per-context threshold enough? That is, it is not a per-backend RSS threshold. My problem is that palloc() would have to recursively collect context->mem_allocated for every context out there, starting from TopMemoryContext, on every call, right? (Or some global form of memory accounting would have to be introduced to avoid such a slowdown of palloc().) A sketch of what that walk looks like is right below. What's worse, there seems to be no standardized, OS-agnostic way of getting the RSS in C/POSIX.
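Just to illustrate what I mean by "recursively collect": a minimal sketch of walking the context tree and summing mem_allocated. This is essentially what MemoryContextMemAllocated(TopMemoryContext, true) already does; the point is that doing it on every palloc() touches the whole tree every time.

/* Sketch only: a rough per-backend figure by summing the whole tree. */
#include "postgres.h"
#include "utils/memutils.h"

static Size
BackendMemAllocated(MemoryContext context)
{
    Size        total = context->mem_allocated;
    MemoryContext child;

    for (child = context->firstchild; child != NULL; child = child->nextchild)
        total += BackendMemAllocated(child);

    return total;
}

/* e.g.: if (BackendMemAllocated(TopMemoryContext) > threshold) ... dump */

As for the RSS itself, the closest thing to portable that I know of is getrusage(RUSAGE_SELF) and its ru_maxrss field, but that reports the peak rather than the current RSS, and the units differ between platforms (kilobytes on Linux, bytes on macOS), so it does not really solve the problem either.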
2. Is this kind of stack trace reporting - see the example in [2] - good enough? Right now it is multi-line, but maybe we could have our own version of the relevant routines so that everything is reported on a single line, e.g.:

2024-08-19 09:11:04.664 CEST [9317] LOG: dumping memory context stats for context "Caller tuples": Grand total: 25165712 bytes in 13 blocks; 1705672 free (0 chunks); 23460040 used; errbacktrace+0x46 <- +0x66c7e5 <- palloc+0x4a <- datumCopy+0x30 <- tuplesort_putdatum+0x8f <- +0x37bf26 <- [..long list..]

The advantage would be that, with sampling every XXX ms, we could statistically investigate which code paths request most of the memory, even with simple grep(1) / sort(1) / uniq(1). (A rough sketch of such a one-line backtrace helper is at the very end of this mail, after the sample log.)

3. In an off-list discussion Tomas pointed out to me that "it's not very clear the place where we exceed the limit is necessarily the place where we leak memory" (the palloc() that trips the threshold might later be pfree()'d).

4. Another idea (also by Tomas) relates to the static threshold. Maybe it should be dynamic: bump it by e.g. +1..50% after every such dump, so that we only catch growing memory usage rather than everything above debug_palloc_context_threshold.

Anything else? What other experiences may be relevant here?

-J.

[1] https://www.postgresql.org/message-id/9574e4b6-3334-777b-4287-29d81151963a%40enterprisedb.com

[2] - sample use and log:

create table zz as select 'AAAAAAAAAAA' || (i % 100) as id from generate_series(1000000, 3000000) i;
set work_mem to '1GB';
set max_parallel_workers_per_gather to 0;
set enable_hashagg to off;
set debug_palloc_context_threshold to '1 MB';
select id, count(*) from zz group by id having count(*) > 1;

2024-08-19 09:11:04.664 CEST [9317] LOG: dumping memory context stats
2024-08-19 09:11:04.664 CEST [9317] DETAIL: Context "Caller tuples".
2024-08-19 09:11:04.664 CEST [9317] BACKTRACE:
    postgres: postgres postgres [local] SELECT(errbacktrace+0x46) [0x55ac9ab06e56]
    postgres: postgres postgres [local] SELECT(+0x66c7e5) [0x55ac9ab2f7e5]
    postgres: postgres postgres [local] SELECT(palloc+0x4a) [0x55ac9ab2fd3a]
    postgres: postgres postgres [local] SELECT(datumCopy+0x30) [0x55ac9aa10cf0]
    postgres: postgres postgres [local] SELECT(tuplesort_putdatum+0x8f) [0x55ac9ab4032f]
    postgres: postgres postgres [local] SELECT(+0x37bf26) [0x55ac9a83ef26]
    [..]
2024-08-19 09:11:04.664 CEST [9317] STATEMENT: select id, count(*) from zz group by id having count(*) > 1;
2024-08-19 09:11:04.664 CEST [9317] LOG: level: 0; Caller tuples: 25165712 total in 13 blocks; 1705672 free; 23460040 used
2024-08-19 09:11:04.664 CEST [9317] LOG: Grand total: 25165712 bytes in 13 blocks; 1705672 free (0 chunks); 23460040 used

// 100 ms later:

2024-08-19 09:11:04.764 CEST [9317] LOG: dumping memory context stats
2024-08-19 09:11:04.764 CEST [9317] DETAIL: Context "Caller tuples".
2024-08-19 09:11:04.764 CEST [9317] BACKTRACE:
    postgres: postgres postgres [local] SELECT(errbacktrace+0x46) [0x55ac9ab06e56]
    postgres: postgres postgres [local] SELECT(+0x66c7e5) [0x55ac9ab2f7e5]
    postgres: postgres postgres [local] SELECT(palloc+0x4a) [0x55ac9ab2fd3a]
    postgres: postgres postgres [local] SELECT(datumCopy+0x30) [0x55ac9aa10cf0]
    postgres: postgres postgres [local] SELECT(tuplesort_putdatum+0x8f) [0x55ac9ab4032f]
    postgres: postgres postgres [local] SELECT(+0x37bf26) [0x55ac9a83ef26]
    [..]
2024-08-19 09:11:04.764 CEST [9317] STATEMENT: select id, count(*) from zz group by id having count(*) > 1;
2024-08-19 09:11:04.765 CEST [9317] LOG: level: 0; Caller tuples: 41942928 total in 15 blocks; 7015448 free; 34927480 used
2024-08-19 09:11:04.765 CEST [9317] LOG: Grand total: 41942928 bytes in 15 blocks; 7015448 free (0 chunks); 34927480 used
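For completeness, here is the rough sketch of the one-line backtrace helper mentioned in point 2 above. It uses the same backtrace()/backtrace_symbols() calls that set_backtrace() in elog.c uses, but glues the frames together with " <- " so that one dump ends up as one greppable log line. The function name is made up, and trimming each frame down to just "symbol+offset" (as in the one-line example in point 2) would need some extra string surgery on the backtrace_symbols() output, which I left out.

#include "postgres.h"

#ifdef HAVE_EXECINFO_H
#include <execinfo.h>
#endif

#include "lib/stringinfo.h"

/* Sketch only: append all stack frames to "buf" as a single line. */
static void
append_backtrace_oneline(StringInfo buf)
{
#ifdef HAVE_BACKTRACE_SYMBOLS
    void       *frames[64];
    int         nframes;
    char      **symbols;

    nframes = backtrace(frames, lengthof(frames));
    symbols = backtrace_symbols(frames, nframes);
    if (symbols == NULL)
        return;

    /* skip frame 0 (this function) so the line starts at the caller */
    for (int i = 1; i < nframes; i++)
        appendStringInfo(buf, "%s%s", (i > 1) ? " <- " : "", symbols[i]);

    free(symbols);
#else
    appendStringInfoString(buf, "backtrace generation not supported");
#endif
}

With that, every dump becomes a single log line, and building a "which code path allocates the most" profile is a matter of grep / sort / uniq -c over the server log.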
Attachment: debug_palloc_context_threshold_v1.patch