Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 8/19/16 10:42 AM, Tom Lane wrote:
> It appeared to me after collecting some stats about the functions present
> in the regression tests that for larger functions, the extra space eaten
> is just some small multiple (like 2x-3x) of the function body string
> length. So it's not *that* much data, even for big functions, and it's
> only going to be eaten on the first usage of a function within a session.

I wonder if freeing that memory would help speed up further allocations, and how much of the expense of the final free for the context gets reduced by an early reset. Probably not enough to negate the cost of resetting on every call, though...

-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Jim Nasby writes:
> On 7/25/16 1:50 PM, Tom Lane wrote:
>> There's a glibc-dependent hack in aset.c that reports any
>> plpgsql-driven palloc or pfree against a context named "SPI Proc", as
>> well as changes in pl_comp.c so that transient junk created during initial
>> parsing of a plpgsql function body doesn't end up in the SPI Proc context.
>> (I did that just to cut the amount of noise I had to chase down. I suppose
>> in large functions it might be worth adopting such a change so that that
>> junk can be released immediately after parsing; but I suspect for most
>> functions it'd just be an extra context without much gain.)

> Some folks do create very large plpgsql functions, so if there's a handy
> way to estimate the size of the function (pg_proc.prosrc's varlena size
> perhaps) then it might be worth doing that for quite large functions.

I poked at that a bit and realized that during a normal plpgsql function call, there isn't anything in the SPI Proc context when do_compile is entered, which means that we could flush all the transient cruft by just resetting the context afterwards. The sanest place to put that seems to be in plpgsql_call_handler, as per attached. We could put it in plpgsql_compile so it's not hit in the main line code path, but that would require plpgsql_compile to know that it's called in an empty context, which seems a bit fragile (and indeed probably false in the forValidator case).

There isn't any equivalently easy way to clean up in the case of a DO block, but I'm not sure that we care as much in that case.

I'm not sure that this is really a net win. MemoryContextReset is pretty cheap (at least in non-cassert builds) but it's not zero cost, and in most scenarios we're not going to care that much about the extra space.
It appeared to me after collecting some stats about the functions present in the regression tests that for larger functions, the extra space eaten is just some small multiple (like 2x-3x) of the function body string length. So it's not *that* much data, even for big functions, and it's only going to be eaten on the first usage of a function within a session.

			regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c
index 36868fb..2a36913 100644
*** a/src/pl/plpgsql/src/pl_handler.c
--- b/src/pl/plpgsql/src/pl_handler.c
***************
*** 23,28 ****
--- 23,29 ----
  #include "utils/builtins.h"
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
  #include "utils/syscache.h"
*************** plpgsql_call_handler(PG_FUNCTION_ARGS)
*** 230,235 ****
--- 231,239 ----
  	/* Find or compile the function */
  	func = plpgsql_compile(fcinfo, false);
  
+ 	/* Flush any transient junk allocated during compile */
+ 	MemoryContextReset(CurrentMemoryContext);
+ 
  	/* Must save and restore prior value of cur_estate */
  	save_cur_estate = func->cur_estate;
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 7/25/16 1:50 PM, Tom Lane wrote:
> There's a glibc-dependent hack in aset.c that reports any plpgsql-driven
> palloc or pfree against a context named "SPI Proc", as well as changes in
> pl_comp.c so that transient junk created during initial parsing of a
> plpgsql function body doesn't end up in the SPI Proc context. (I did that
> just to cut the amount of noise I had to chase down. I suppose in large
> functions it might be worth adopting such a change so that that junk can
> be released immediately after parsing; but I suspect for most functions
> it'd just be an extra context without much gain.)

Some folks do create very large plpgsql functions, so if there's a handy way to estimate the size of the function (pg_proc.prosrc's varlena size perhaps) then it might be worth doing that for quite large functions.

-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Pavel Stehule writes:
> I am sending a review of this patch:
> ...
> I'll mark this patch as ready for committer

Pushed, thanks for the review.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
2016-08-10 11:25 GMT+02:00 Pavel Stehule:
> Hi
>
> 2016-07-27 16:49 GMT+02:00 Tom Lane:
>
>> Robert Haas writes:
>> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
>> >> I suppose that a fix based on putting PG_TRY blocks into all the affected
>> >> functions might be simple enough that we'd risk back-patching it, but
>> >> I don't really want to go that way.
>>
>> > try/catch blocks aren't completely free, either, and PL/pgsql is not
>> > suffering from a deplorable excess of execution speed.
>>
>> BTW, just to annotate that a bit: I did some measurements and found out
>> that on my Linux box, creating/deleting a memory context
>> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
>> more expensive than a PG_TRY block. This means that the PG_TRY approach
>> would actually be faster for cases involving only a small number of
>> statements-needing-local-storage within a single plpgsql function
>> execution. However, the memory context creation cost is amortized across
>> repeated executions of a statement, whereas of course PG_TRY won't be.
>> We can roughly estimate that PG_TRY would lose any time we loop through
>> the statement in question more than circa ten times. So I believe the
>> way I want to do it will win speed-wise in cases where it matters, but
>> it's not entirely an open-and-shut conclusion.
>>
>> Anyway, there are enough other reasons not to go the PG_TRY route.
>
> I did some synthetic benchmarks related to plpgsql speed - bubble sort and
> a loop with error handling - and I don't see any slowdown.
>
> Handling exceptions is a little bit faster with your patch:
>
> CREATE OR REPLACE FUNCTION public.loop_test(a integer)
>  RETURNS void
>  LANGUAGE plpgsql
> AS $function$
> declare x int;
> begin
>   for i in 1..a
>   loop
>     declare s text;
>     begin
>       s := 'AHOJ';
>       x := (random()*1000)::int;
>       raise exception '*';
>     exception when others then
>       x := 0; --raise notice 'handled';
>     end;
>   end loop;
> end;
> $function$
>
> head - 100K loops 640ms, patched 610ms
>
> Regards
> Pavel

Hi

I am sending a review of this patch:

1. There were no problems with patching and compilation.
2. All tests passed without any problem.
3. The advantages and disadvantages were discussed in detail in this thread - the selected way is good
   + the code is a little bit reduced and cleaned
   + the special context can be used in the future
4. New regression tests or documentation are not necessary for this patch.
5. I didn't find a performance slowdown in special tests - the impact on performance in real code should be insignificant.

I'll mark this patch as ready for committer.

Regards

Pavel
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Hi

2016-07-27 16:49 GMT+02:00 Tom Lane:
> Robert Haas writes:
> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
> >> I suppose that a fix based on putting PG_TRY blocks into all the affected
> >> functions might be simple enough that we'd risk back-patching it, but
> >> I don't really want to go that way.
>
> > try/catch blocks aren't completely free, either, and PL/pgsql is not
> > suffering from a deplorable excess of execution speed.
>
> BTW, just to annotate that a bit: I did some measurements and found out
> that on my Linux box, creating/deleting a memory context
> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
> more expensive than a PG_TRY block. This means that the PG_TRY approach
> would actually be faster for cases involving only a small number of
> statements-needing-local-storage within a single plpgsql function
> execution. However, the memory context creation cost is amortized across
> repeated executions of a statement, whereas of course PG_TRY won't be.
> We can roughly estimate that PG_TRY would lose any time we loop through
> the statement in question more than circa ten times. So I believe the
> way I want to do it will win speed-wise in cases where it matters, but
> it's not entirely an open-and-shut conclusion.
>
> Anyway, there are enough other reasons not to go the PG_TRY route.

I did some synthetic benchmarks related to plpgsql speed - bubble sort and a loop with error handling - and I don't see any slowdown.

Handling exceptions is a little bit faster with your patch:

CREATE OR REPLACE FUNCTION public.loop_test(a integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare x int;
begin
  for i in 1..a
  loop
    declare s text;
    begin
      s := 'AHOJ';
      x := (random()*1000)::int;
      raise exception '*';
    exception when others then
      x := 0; --raise notice 'handled';
    end;
  end loop;
end;
$function$

head - 100K loops 640ms, patched 610ms

Regards

Pavel

> 			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Robert Haas writes:
> On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
>> I suppose that a fix based on putting PG_TRY blocks into all the affected
>> functions might be simple enough that we'd risk back-patching it, but
>> I don't really want to go that way.

> try/catch blocks aren't completely free, either, and PL/pgsql is not
> suffering from a deplorable excess of execution speed.

BTW, just to annotate that a bit: I did some measurements and found out that on my Linux box, creating/deleting a memory context (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x more expensive than a PG_TRY block. This means that the PG_TRY approach would actually be faster for cases involving only a small number of statements-needing-local-storage within a single plpgsql function execution. However, the memory context creation cost is amortized across repeated executions of a statement, whereas of course PG_TRY won't be. We can roughly estimate that PG_TRY would lose any time we loop through the statement in question more than circa ten times. So I believe the way I want to do it will win speed-wise in cases where it matters, but it's not entirely an open-and-shut conclusion.

Anyway, there are enough other reasons not to go the PG_TRY route.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
> I suppose that a fix based on putting PG_TRY blocks into all the affected
> functions might be simple enough that we'd risk back-patching it, but
> I don't really want to go that way.

try/catch blocks aren't completely free, either, and PL/pgsql is not suffering from a deplorable excess of execution speed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Alvaro Herrera writes:
> Tom Lane wrote:
>> Although this is in principle a bug fix, it's invasive enough that I'd
>> be pretty hesitant to back-patch it, or even to stick it into HEAD
>> post-beta. I'm inclined to sign it up for the next commitfest instead.

> Do we have a backpatchable fix for the reported problem? If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches. Otherwise, is the proposal to
> leave the bug unfixed in back branches?

The latter is what I was thinking. Given that issues like this have been there, and gone unreported, since we invented subtransactions, I do not feel too awful about not fixing it in existing branches.

It's possible that we could fix just the one issue originally complained of with a smaller patch, but I don't find that approach attractive, because that was a fairly small leak anyway. The case that seemed nastiest to me is that a FOREACH ... IN ARRAY loop will leak a copy of the entire array if an error causes control to be thrown out of exec_stmt_foreach_a. That could be a lot more leaked data than the querystring of an EXECUTE.

I suppose that a fix based on putting PG_TRY blocks into all the affected functions might be simple enough that we'd risk back-patching it, but I don't really want to go that way.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Alvaro Herrera wrote:
> Tom Lane wrote:
>> Although this is in principle a bug fix, it's invasive enough that I'd
>> be pretty hesitant to back-patch it, or even to stick it into HEAD
>> post-beta. I'm inclined to sign it up for the next commitfest instead.
>
> Do we have a backpatchable fix for the reported problem? If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches. Otherwise, is the proposal to
> leave the bug unfixed in back branches?

I meant "install in HEAD only, after the 10.0 branch", which is what you proposed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Tom Lane wrote:
> Although this is in principle a bug fix, it's invasive enough that I'd
> be pretty hesitant to back-patch it, or even to stick it into HEAD
> post-beta. I'm inclined to sign it up for the next commitfest instead.

Do we have a backpatchable fix for the reported problem? If so, then it seems a good tradeoff to install this more invasive fix in HEAD only and apply the simpler fix to back branches. Otherwise, is the proposal to leave the bug unfixed in back branches?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Attached is a draft patch for creating statement-level temporary memory contexts in plpgsql. It creates such contexts only as needed, and there are a lot of simpler plpgsql statements that don't need them, so I think the performance impact should be pretty minimal --- though I've not tried to measure it, since I don't have a credible benchmark for overall plpgsql performance.

This fixes the originally complained-of leak and several others besides, including at least one path that leaked function-lifespan memory even without an error :-(.

In addition to the patch proper, I attach for amusement's sake some additional hacking I did for testing purposes, to make sure I'd found and accounted for all the places that were allocating memory in the SPI Proc context. There's a glibc-dependent hack in aset.c that reports any plpgsql-driven palloc or pfree against a context named "SPI Proc", as well as changes in pl_comp.c so that transient junk created during initial parsing of a plpgsql function body doesn't end up in the SPI Proc context. (I did that just to cut the amount of noise I had to chase down. I suppose in large functions it might be worth adopting such a change so that that junk can be released immediately after parsing; but I suspect for most functions it'd just be an extra context without much gain.)

Although this is in principle a bug fix, it's invasive enough that I'd be pretty hesitant to back-patch it, or even to stick it into HEAD post-beta. I'm inclined to sign it up for the next commitfest instead.
			regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 586ff1f..e23ca96 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** typedef struct
*** 48,54 ****
  	Oid		   *types;		/* types of arguments */
  	Datum	   *values;		/* evaluated argument values */
  	char	   *nulls;		/* null markers (' '/'n' style) */
- 	bool	   *freevals;	/* which arguments are pfree-able */
  } PreparedParamsData;
  
  /*
--- 48,53 ----
*************** static EState *shared_simple_eval_estate
*** 88,93 ****
--- 87,122 ----
  static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
  
  /*
+  * Memory management within a plpgsql function generally works with three
+  * contexts:
+  *
+  * 1. Function-call-lifespan data, such as variable values, is kept in the
+  * "main" context, a/k/a the "SPI Proc" context established by SPI_connect().
+  * This is usually the CurrentMemoryContext while running code in this module
+  * (which is not good, because careless coding can easily cause
+  * function-lifespan memory leaks, but we live with it for now).
+  *
+  * 2. Some statement-execution routines need statement-lifespan workspace.
+  * A suitable context is created on-demand by get_stmt_mcontext(), and must
+  * be reset at the end of the requesting routine. Error recovery will clean
+  * it up automatically. Nested statements requiring statement-lifespan
+  * workspace will result in a stack of such contexts, see push_stmt_mcontext().
+  *
+  * 3. We use the eval_econtext's per-tuple memory context for expression
+  * evaluation, and as a general-purpose workspace for short-lived allocations.
+  * Such allocations usually aren't explicitly freed, but are left to be
+  * cleaned up by a context reset, typically done by exec_eval_cleanup().
+  *
+  * These macros are for use in making short-lived allocations:
+  */
+ #define get_eval_mcontext(estate) \
+ 	((estate)->eval_econtext->ecxt_per_tuple_memory)
+ #define eval_mcontext_alloc(estate, sz) \
+ 	MemoryContextAlloc(get_eval_mcontext(estate), sz)
+ #define eval_mcontext_alloc0(estate, sz) \
+ 	MemoryContextAllocZero(get_eval_mcontext(estate), sz)
+ 
+ /*
   * We use a session-wide hash table for caching cast information.
   *
   * Once built, the compiled expression trees (cast_expr fields) survive for
*************** static HTAB *shared_cast_hash = NULL;
*** 128,133 ****
--- 157,165 ----
   ************************************************************/
  static void plpgsql_exec_error_callback(void *arg);
  static PLpgSQL_datum *copy_plpgsql_datum(PLpgSQL_datum *datum);
+ static MemoryContext get_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void push_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void pop_stmt_mcontext(PLpgSQL_execstate *estate);
  static int	exec_stmt_block(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_block *block);
*************** static void exec_eval_cleanup(PLpgSQL_ex
*** 191,197 ****
  static void exec_prepare_plan(PLpgSQL_execstate *estate,
  				  PLpgSQL_expr *expr, int cursorOptions);
  static bool exec_simple_check_node(Node *node);
! static void exec_simple_check_plan(PLpgSQL_expr *expr);
  static void exec_simple_recheck_plan(PLpgSQL_expr *expr,
  						 CachedPlan *cplan);
  static void exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno);
  static bool contains_target_param(Node *node, int *target_dno);
---
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Sun, Jul 24, 2016 at 9:17 PM, Tom Lane wrote:
> Amit Kapila writes:
>> Wouldn't it be better, if each nested sub-block (which is having an
>> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
>> be destroyed at sub-block end (either commit or rollback)?
>
> AFAICS that would just confuse matters. In the first place, plpgsql
> variable values are not subtransaction-local, so they'd have to live in
> the outermost SPI Proc context anyway. In the second place, spi.c
> contains a whole lot of assumptions that actions like saving a plan are
> tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql
> statements that were nested inside a BEGIN/EXCEPT would probably break:
> state they expect to remain valid from one execution to the next would
> disappear.

I think for all such usage, we can always switch to the top-level SPI Proc/Exec context. To do so, we might need to invent a notion of something like Sub SPI Proc/Exec contexts, and that sounds quite tricky.

>> In short, why do you think it is better to create a new context rather
>> than using "SPI Exec"?
>
> "SPI Exec" has the same problem as the eval_econtext: there are already
> points at which it will be reset, and those can't necessarily be delayed
> till end of statement. In particular, _SPI_end_call will delete whatever
> is in that context.

That's right, and I think it might not even be feasible to do so, mainly because that is done in exposed SPI calls.

I have checked some other usage of SPI; it seems plperl is handling some similar problems either by creating a temporary work context or by using PG_TRY/PG_CATCH. I think it might be better if whatever we are trying here could be of help for other similar usage.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Amit Kapila writes:
> On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
>> The problem is that exec_stmt_dynexecute() loses control to the error
>> thrown from the bogus command, and therefore leaks its "querystr" local
>> variable --- which is not much of a leak, but it adds up if the loop
>> iterates enough times. There are similar problems in many other places in
>> plpgsql. Basically the issue is that while running a plpgsql function,
>> CurrentMemoryContext points to a function-lifespan context (the same as
>> the SPI procCxt the function is using). We also store things such as
>> values of the function's variables there, so just resetting that context
>> is not an option. plpgsql does have an expression-evaluation-lifespan
>> context for short-lived stuff, but anything that needs to live for more
>> or less the duration of a statement is put into the procedure-lifespan
>> context, where it risks becoming a leak.

> Wouldn't it be better, if each nested sub-block (which is having an
> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
> be destroyed at sub-block end (either commit or rollback)?

AFAICS that would just confuse matters. In the first place, plpgsql variable values are not subtransaction-local, so they'd have to live in the outermost SPI Proc context anyway. In the second place, spi.c contains a whole lot of assumptions that actions like saving a plan are tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql statements that were nested inside a BEGIN/EXCEPT would probably break: state they expect to remain valid from one execution to the next would disappear.

> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

"SPI Exec" has the same problem as the eval_econtext: there are already points at which it will be reset, and those can't necessarily be delayed till end of statement. In particular, _SPI_end_call will delete whatever is in that context. Also, spi.c does not consider the execCxt to be an exported part of its abstraction, and I'm pretty loath to punch another hole in that API.

Also, as I've been working through this, I've found that only a rather small minority of plpgsql statements actually need statement-lifetime storage. So I'm thinking that it will be faster to create such a context only on-demand, not unconditionally; which knocks out any thought of changing plpgsql's coding conventions so much that statement-lifespan storage would become the normal place for CurrentMemoryContext to point.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Sun, Jul 24, 2016 at 12:40 PM, Amit Kapila wrote:
> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

I think the lifespan of memory allocated from "SPI Exec" is only within the executor; after that, "SPI Exec" will be reset. But in many places we need such memory beyond the executor (including the one reported in the above issue). Consider this example from exec_stmt_dynexecute:

exec_stmt_dynexecute
{
	querystr = pstrdup(querystr);

	SPI_execute  --> inside this, the "SPI Exec" context will be reset.

	/* After this, querystr is still being used in this function. */
}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
> LOOP
>   BEGIN
>     EXECUTE 'bogus command';
>   EXCEPTION WHEN OTHERS THEN
>   END;
> END LOOP;
>
> The problem is that exec_stmt_dynexecute() loses control to the error
> thrown from the bogus command, and therefore leaks its "querystr" local
> variable --- which is not much of a leak, but it adds up if the loop
> iterates enough times. There are similar problems in many other places in
> plpgsql. Basically the issue is that while running a plpgsql function,
> CurrentMemoryContext points to a function-lifespan context (the same as
> the SPI procCxt the function is using). We also store things such as
> values of the function's variables there, so just resetting that context
> is not an option. plpgsql does have an expression-evaluation-lifespan
> context for short-lived stuff, but anything that needs to live for more
> or less the duration of a statement is put into the procedure-lifespan
> context, where it risks becoming a leak. (That design was fine
> originally, because any error would result in abandoning function
> execution and thereby cleaning up that context. But once we invented
> plpgsql exception blocks, it's not so good.)

Wouldn't it be better if each nested sub-block (that has an exception handler) had separate "SPI Proc" and "SPI Exec" contexts, which would be destroyed at sub-block end (either commit or rollback)? I think one difficulty could be that we need some way to propagate the information that is required by outer blocks, if any such information exists.

> One way we could resolve the problem is to require all plpgsql code to
> use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables
> are explicitly released. That's undesirable on pretty much every front
> though: it'd be notationally ugly, prone to omissions, and not very
> speedy.
>
> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.

This sounds better than spreading PG_TRY/PG_CATCH everywhere.

I think if this allocation had been done in the executor context "SPI Exec", it wouldn't have leaked. One way could have been that by default all exec_stmt* functions execute in the "SPI Exec" context and we then switch to "SPI Proc" for memory that is required for a longer duration. That might not be good if we have to switch at many places, but OTOH the same will be required for a new statement-level execution context as well. In short, why do you think it is better to create a new context rather than using "SPI Exec"?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Craig Ringer writes:
> On 22 July 2016 at 13:24, Craig Ringer wrote:
>> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
>> context is reset after each call even when not cleaning up after an error.
>> That'll help catch mistakes and leaks.

> That is, after each statement.

Well, the issue is with nested statements: if some outer statement has data in such a context, you can't just clobber it because of a failure in an inner statement. To make that work, you need a new context for each nesting level. Once we've gone to that much trouble, I'm inclined to think we should just do the resets always (ie, not only in debug builds). That will let us get rid of retail pfree's and buy back at least some of the added cost.

I've not done any detail work on this idea yet, but at least some of plpgsql's loop control statements don't have any pass-by-ref local data; exec_stmt_fori, for example. So it seems possible for that function not to bother with creating a new context for its child statements. That would save some more cycles, at the cost of a possibly-confusing-and-fragile difference in the way different kinds of loops work.

In any case, I'm thinking this is too invasive, relative to its value, for a back-patch; but it might be okay to sneak it into beta4 if I can get it done soon. Or we could let it slide till 9.7^H^H^H10. Comments?

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 22 July 2016 at 13:24, Craig Ringer wrote:
> On 22 July 2016 at 07:02, Tom Lane wrote:
>> In
>> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
>> it was pointed out that you could get an intra-function-call memory leak
>> from something like
>>
>> LOOP
>>   BEGIN
>>     EXECUTE 'bogus command';
>>   EXCEPTION WHEN OTHERS THEN
>>   END;
>> END LOOP;
>
> ...
>
>> Another answer is to invent a third per-function memory context intended
>> to hold statement-lifespan variables.
>
> I've wanted this in the past and been surprised we don't have it, so +1
> from me.
>
> I don't think a few MemoryContextAlloc's are too ugly.
>
> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
> context is reset after each call even when not cleaning up after an error.
> That'll help catch mistakes and leaks.

That is, after each statement.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 22 July 2016 at 07:02, Tom Lane wrote:

> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
>     LOOP
>       BEGIN
>         EXECUTE 'bogus command';
>       EXCEPTION WHEN OTHERS THEN
>       END;
>     END LOOP;

...

> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.

I've wanted this in the past and been surprised we don't have it, so +1 from me.

I don't think a few MemoryContextAlloc's are too ugly.

I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the context is reset after each call even when not cleaning up after an error. That'll help catch mistakes and leaks.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
In
https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
it was pointed out that you could get an intra-function-call memory leak from something like

    LOOP
      BEGIN
        EXECUTE 'bogus command';
      EXCEPTION WHEN OTHERS THEN
      END;
    END LOOP;

The problem is that exec_stmt_dynexecute() loses control to the error thrown from the bogus command, and therefore leaks its "querystr" local variable --- which is not much of a leak, but it adds up if the loop iterates enough times. There are similar problems in many other places in plpgsql.

Basically the issue is that while running a plpgsql function, CurrentMemoryContext points to a function-lifespan context (the same as the SPI procCxt the function is using). We also store things such as values of the function's variables there, so just resetting that context is not an option. plpgsql does have an expression-evaluation-lifespan context for short-lived stuff, but anything that needs to live for more or less the duration of a statement is put into the procedure-lifespan context, where it risks becoming a leak. (That design was fine originally, because any error would result in abandoning function execution and thereby cleaning up that context. But once we invented plpgsql exception blocks, it's not so good.)

One way we could resolve the problem is to require all plpgsql code to use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables are explicitly released. That's undesirable on pretty much every front though: it'd be notationally ugly, prone to omissions, and not very speedy.

Another answer is to invent a third per-function memory context intended to hold statement-lifespan variables. We could say that statements are still expected to clean up their mess in normal cases, and this context is only cleared when BEGIN/EXCEPT catches an error.
Each BEGIN/EXCEPT would need to create and install a new such context, since otherwise it might be clobbering statement-lifespan variables from a surrounding statement.

A more aggressive variant is to arrange things so we can automatically reset that context after each statement, avoiding the need for explicit pfree's of statement-lifespan variables. I think that would mean that not only BEGINs but also looping statements would need to install new contexts for their controlled statements, unless they had no pass-by-reference local values. The traffic for creating and deleting such contexts might be too expensive, even though it'd buy back some retail pfree's.

I looked into whether we could actually install such a context as CurrentMemoryContext, allowing palloc's to go there by default. This seems not workable: a lot of the existing palloc's really need to be creating function-lifespan values, and there's also a problem that SPI expects CurrentMemoryContext to be the same as its procCxt, because many SPI functions will just set CurrentMemoryContext to equal procCxt at exit. So in any case, use of such a context would require explicit MemoryContextAllocs or MemoryContextSwitchTos, which is a bit annoying notationally, but there seems little help for it.

Thoughts, better ideas?

			regards, tom lane