Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 8/19/16 10:42 AM, Tom Lane wrote:
> It appeared to me after collecting some stats about the functions present
> in the regression tests that for larger functions, the extra space eaten
> is just some small multiple (like 2x-3x) of the function body string
> length. So it's not *that* much data, even for big functions, and it's
> only going to be eaten on the first usage of a function within a session.

I wonder if freeing that memory would help speed up further allocations, and how much of the expense of the final free for the context gets reduced by an early reset. Probably not enough to negate the cost of resetting on every call, though...

-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Jim Nasby writes:
> On 7/25/16 1:50 PM, Tom Lane wrote:
>> There's a glibc-dependent hack in aset.c that reports any
>> plpgsql-driven palloc or pfree against a context named "SPI Proc", as
>> well as changes in pl_comp.c so that transient junk created during initial
>> parsing of a plpgsql function body doesn't end up in the SPI Proc context.
>> (I did that just to cut the amount of noise I had to chase down. I suppose
>> in large functions it might be worth adopting such a change so that that
>> junk can be released immediately after parsing; but I suspect for most
>> functions it'd just be an extra context without much gain.)

> Some folks do create very large plpgsql functions, so if there's a handy
> way to estimate the size of the function (pg_proc.prosrc's varlena size
> perhaps) then it might be worth doing that for quite large functions.

I poked at that a bit and realized that during a normal plpgsql function call, there isn't anything in the SPI Proc context when do_compile is entered, which means that we could flush all the transient cruft by just resetting the context afterwards. The sanest place to put that seems to be in plpgsql_call_handler, as per attached. We could put it in plpgsql_compile so it's not hit in the main line code path, but that would require plpgsql_compile to know that it's called in an empty context, which seems a bit fragile (and indeed probably false in the forValidator case).

There isn't any equivalently easy way to clean up in the case of a DO block, but I'm not sure that we care as much in that case.

I'm not sure that this is really a net win. MemoryContextReset is pretty cheap (at least in non-cassert builds) but it's not zero cost, and in most scenarios we're not going to care that much about the extra space.
It appeared to me after collecting some stats about the functions present in the regression tests that for larger functions, the extra space eaten is just some small multiple (like 2x-3x) of the function body string length. So it's not *that* much data, even for big functions, and it's only going to be eaten on the first usage of a function within a session.

			regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c
index 36868fb..2a36913 100644
*** a/src/pl/plpgsql/src/pl_handler.c
--- b/src/pl/plpgsql/src/pl_handler.c
***************
*** 23,28 ****
--- 23,29 ----
  #include "utils/builtins.h"
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
  #include "utils/syscache.h"
*************** plpgsql_call_handler(PG_FUNCTION_ARGS)
*** 230,235 ****
--- 231,239 ----
  	/* Find or compile the function */
  	func = plpgsql_compile(fcinfo, false);
  
+ 	/* Flush any transient junk allocated during compile */
+ 	MemoryContextReset(CurrentMemoryContext);
+ 
  	/* Must save and restore prior value of cur_estate */
  	save_cur_estate = func->cur_estate;
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 7/25/16 1:50 PM, Tom Lane wrote:
> There's a glibc-dependent hack in aset.c that reports any plpgsql-driven
> palloc or pfree against a context named "SPI Proc", as well as changes in
> pl_comp.c so that transient junk created during initial parsing of a
> plpgsql function body doesn't end up in the SPI Proc context. (I did that
> just to cut the amount of noise I had to chase down. I suppose in large
> functions it might be worth adopting such a change so that that junk can
> be released immediately after parsing; but I suspect for most functions
> it'd just be an extra context without much gain.)

Some folks do create very large plpgsql functions, so if there's a handy way to estimate the size of the function (pg_proc.prosrc's varlena size perhaps) then it might be worth doing that for quite large functions.

-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Pavel Stehule writes:
> I am sending a review of this patch:
> ...
> I'll mark this patch as ready for committer

Pushed, thanks for the review.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
2016-08-10 11:25 GMT+02:00 Pavel Stehule:
> Hi
>
> 2016-07-27 16:49 GMT+02:00 Tom Lane:
>
>> Robert Haas writes:
>> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
>> >> I suppose that a fix based on putting PG_TRY blocks into all the affected
>> >> functions might be simple enough that we'd risk back-patching it, but
>> >> I don't really want to go that way.
>>
>> > try/catch blocks aren't completely free, either, and PL/pgsql is not
>> > suffering from a deplorable excess of execution speed.
>>
>> BTW, just to annotate that a bit: I did some measurements and found out
>> that on my Linux box, creating/deleting a memory context
>> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
>> more expensive than a PG_TRY block. This means that the PG_TRY approach
>> would actually be faster for cases involving only a small number of
>> statements-needing-local-storage within a single plpgsql function
>> execution. However, the memory context creation cost is amortized across
>> repeated executions of a statement, whereas of course PG_TRY won't be.
>> We can roughly estimate that PG_TRY would lose any time we loop through
>> the statement in question more than circa ten times. So I believe the
>> way I want to do it will win speed-wise in cases where it matters, but
>> it's not entirely an open-and-shut conclusion.
>>
>> Anyway, there are enough other reasons not to go the PG_TRY route.
>
> I did some synthetic benchmarks related to plpgsql speed - bubble sort and
> a loop with error handling - and I don't see any slowdown.
>
> Handling exceptions is a little bit faster with your patch:
>
> CREATE OR REPLACE FUNCTION public.loop_test(a integer)
>  RETURNS void
>  LANGUAGE plpgsql
> AS $function$
> declare x int;
> begin
>   for i in 1..a
>   loop
>     declare s text;
>     begin
>       s := 'AHOJ';
>       x := (random()*1000)::int;
>       raise exception '*';
>     exception when others then
>       x := 0; --raise notice 'handled';
>     end;
>   end loop;
> end;
> $function$
>
> head - 100K loops 640ms, patched 610ms
>
> Regards
> Pavel

Hi

I am sending a review of this patch:

1. There were no problems with patching and compilation.
2. All tests passed without any problem.
3. The advantages and disadvantages were discussed in detail in this thread - the selected way is good
   + the code is a little bit reduced and cleaned
   + the special context can be used in the future
4. New regression tests or documentation are not necessary for this patch.
5. I didn't find a performance slowdown in special tests - the impact on performance in real code should be insignificant.

I'll mark this patch as ready for committer.

Regards

Pavel
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Hi

2016-07-27 16:49 GMT+02:00 Tom Lane:
> Robert Haas writes:
> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
> >> I suppose that a fix based on putting PG_TRY blocks into all the affected
> >> functions might be simple enough that we'd risk back-patching it, but
> >> I don't really want to go that way.
>
> > try/catch blocks aren't completely free, either, and PL/pgsql is not
> > suffering from a deplorable excess of execution speed.
>
> BTW, just to annotate that a bit: I did some measurements and found out
> that on my Linux box, creating/deleting a memory context
> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
> more expensive than a PG_TRY block. This means that the PG_TRY approach
> would actually be faster for cases involving only a small number of
> statements-needing-local-storage within a single plpgsql function
> execution. However, the memory context creation cost is amortized across
> repeated executions of a statement, whereas of course PG_TRY won't be.
> We can roughly estimate that PG_TRY would lose any time we loop through
> the statement in question more than circa ten times. So I believe the
> way I want to do it will win speed-wise in cases where it matters, but
> it's not entirely an open-and-shut conclusion.
>
> Anyway, there are enough other reasons not to go the PG_TRY route.

I did some synthetic benchmarks related to plpgsql speed - bubble sort and a loop with error handling - and I don't see any slowdown.

Handling exceptions is a little bit faster with your patch:

CREATE OR REPLACE FUNCTION public.loop_test(a integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare x int;
begin
  for i in 1..a
  loop
    declare s text;
    begin
      s := 'AHOJ';
      x := (random()*1000)::int;
      raise exception '*';
    exception when others then
      x := 0; --raise notice 'handled';
    end;
  end loop;
end;
$function$

head - 100K loops 640ms, patched 610ms

Regards

Pavel

> 			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Robert Haas writes:
> On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
>> I suppose that a fix based on putting PG_TRY blocks into all the affected
>> functions might be simple enough that we'd risk back-patching it, but
>> I don't really want to go that way.

> try/catch blocks aren't completely free, either, and PL/pgsql is not
> suffering from a deplorable excess of execution speed.

BTW, just to annotate that a bit: I did some measurements and found out that on my Linux box, creating/deleting a memory context (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x more expensive than a PG_TRY block. This means that the PG_TRY approach would actually be faster for cases involving only a small number of statements-needing-local-storage within a single plpgsql function execution. However, the memory context creation cost is amortized across repeated executions of a statement, whereas of course PG_TRY won't be. We can roughly estimate that PG_TRY would lose any time we loop through the statement in question more than circa ten times. So I believe the way I want to do it will win speed-wise in cases where it matters, but it's not entirely an open-and-shut conclusion.

Anyway, there are enough other reasons not to go the PG_TRY route.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane wrote:
> I suppose that a fix based on putting PG_TRY blocks into all the affected
> functions might be simple enough that we'd risk back-patching it, but
> I don't really want to go that way.

try/catch blocks aren't completely free, either, and PL/pgsql is not suffering from a deplorable excess of execution speed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Alvaro Herrera writes:
> Tom Lane wrote:
>> Although this is in principle a bug fix, it's invasive enough that I'd
>> be pretty hesitant to back-patch it, or even to stick it into HEAD
>> post-beta. I'm inclined to sign it up for the next commitfest instead.

> Do we have a backpatchable fix for the reported problem? If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches. Otherwise, is the proposal to
> leave the bug unfixed in back branches?

The latter is what I was thinking. Given that issues like this have been there, and gone unreported, since we invented subtransactions, I do not feel too awful about not fixing it in existing branches.

It's possible that we could fix just the one issue originally complained of with a smaller patch, but I don't find that approach attractive, because that was a fairly small leak anyway. The case that seemed nastiest to me is that a FOREACH ... IN ARRAY loop will leak a copy of the entire array if an error causes control to be thrown out of exec_stmt_foreach_a. That could be a lot more leaked data than the querystring of an EXECUTE.

I suppose that a fix based on putting PG_TRY blocks into all the affected functions might be simple enough that we'd risk back-patching it, but I don't really want to go that way.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Alvaro Herrera wrote:
> Tom Lane wrote:
>> Although this is in principle a bug fix, it's invasive enough that I'd
>> be pretty hesitant to back-patch it, or even to stick it into HEAD
>> post-beta. I'm inclined to sign it up for the next commitfest instead.
>
> Do we have a backpatchable fix for the reported problem? If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches. Otherwise, is the proposal to
> leave the bug unfixed in back branches?

I meant "install in HEAD only, after the 10.0 branch", which is what you proposed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Tom Lane wrote:
> Although this is in principle a bug fix, it's invasive enough that I'd
> be pretty hesitant to back-patch it, or even to stick it into HEAD
> post-beta. I'm inclined to sign it up for the next commitfest instead.

Do we have a backpatchable fix for the reported problem? If so, then it seems a good tradeoff to install this more invasive fix in HEAD only and apply the simpler fix to back branches. Otherwise, is the proposal to leave the bug unfixed in back branches?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Attached is a draft patch for creating statement-level temporary memory contexts in plpgsql. It creates such contexts only as needed, and there are a lot of simpler plpgsql statements that don't need them, so I think the performance impact should be pretty minimal --- though I've not tried to measure it, since I don't have a credible benchmark for overall plpgsql performance.

This fixes the originally complained-of leak and several others besides, including at least one path that leaked function-lifespan memory even without an error :-(.

In addition to the patch proper, I attach for amusement's sake some additional hacking I did for testing purposes, to make sure I'd found and accounted for all the places that were allocating memory in the SPI Proc context. There's a glibc-dependent hack in aset.c that reports any plpgsql-driven palloc or pfree against a context named "SPI Proc", as well as changes in pl_comp.c so that transient junk created during initial parsing of a plpgsql function body doesn't end up in the SPI Proc context. (I did that just to cut the amount of noise I had to chase down. I suppose in large functions it might be worth adopting such a change so that that junk can be released immediately after parsing; but I suspect for most functions it'd just be an extra context without much gain.)

Although this is in principle a bug fix, it's invasive enough that I'd be pretty hesitant to back-patch it, or even to stick it into HEAD post-beta. I'm inclined to sign it up for the next commitfest instead.
			regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 586ff1f..e23ca96 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** typedef struct
*** 48,54 ****
  	Oid		   *types;		/* types of arguments */
  	Datum	   *values;		/* evaluated argument values */
  	char	   *nulls;		/* null markers (' '/'n' style) */
- 	bool	   *freevals;	/* which arguments are pfree-able */
  } PreparedParamsData;
  
  /*
--- 48,53 ----
*************** static EState *shared_simple_eval_estate
*** 88,93 ****
--- 87,122 ----
  static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
  
  /*
+  * Memory management within a plpgsql function generally works with three
+  * contexts:
+  *
+  * 1. Function-call-lifespan data, such as variable values, is kept in the
+  * "main" context, a/k/a the "SPI Proc" context established by SPI_connect().
+  * This is usually the CurrentMemoryContext while running code in this module
+  * (which is not good, because careless coding can easily cause
+  * function-lifespan memory leaks, but we live with it for now).
+  *
+  * 2. Some statement-execution routines need statement-lifespan workspace.
+  * A suitable context is created on-demand by get_stmt_mcontext(), and must
+  * be reset at the end of the requesting routine. Error recovery will clean
+  * it up automatically. Nested statements requiring statement-lifespan
+  * workspace will result in a stack of such contexts, see push_stmt_mcontext().
+  *
+  * 3. We use the eval_econtext's per-tuple memory context for expression
+  * evaluation, and as a general-purpose workspace for short-lived allocations.
+  * Such allocations usually aren't explicitly freed, but are left to be
+  * cleaned up by a context reset, typically done by exec_eval_cleanup().
+  *
+  * These macros are for use in making short-lived allocations:
+  */
+ #define get_eval_mcontext(estate) \
+ 	((estate)->eval_econtext->ecxt_per_tuple_memory)
+ #define eval_mcontext_alloc(estate, sz) \
+ 	MemoryContextAlloc(get_eval_mcontext(estate), sz)
+ #define eval_mcontext_alloc0(estate, sz) \
+ 	MemoryContextAllocZero(get_eval_mcontext(estate), sz)
+ 
+ /*
   * We use a session-wide hash table for caching cast information.
   *
   * Once built, the compiled expression trees (cast_expr fields) survive for
*************** static HTAB *shared_cast_hash = NULL;
*** 128,133 ****
--- 157,165 ----
   ************************************************************/
  static void plpgsql_exec_error_callback(void *arg);
  static PLpgSQL_datum *copy_plpgsql_datum(PLpgSQL_datum *datum);
+ static MemoryContext get_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void push_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void pop_stmt_mcontext(PLpgSQL_execstate *estate);
  static int	exec_stmt_block(PLpgSQL_execstate *estate,
  				PLpgSQL_stmt_block *block);
*************** static void exec_eval_cleanup(PLpgSQL_ex
*** 191,197 ****
  static void exec_prepare_plan(PLpgSQL_execstate *estate,
  				  PLpgSQL_expr *expr, int cursorOptions);
  static bool exec_simple_check_node(Node *node);
! static void exec_simple_check_plan(PLpgSQL_expr *expr);
  static void exec_simple_recheck_plan(PLpgSQL_expr *expr,
  						 CachedPlan *cplan);
  static void exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno);
  static bool contains_target_param(Node *node, int *target_dno);
---
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Sun, Jul 24, 2016 at 9:17 PM, Tom Lane wrote:
> Amit Kapila writes:
>> Wouldn't it be better, if each nested sub-block (which is having an
>> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
>> be destroyed at sub-block end (either commit or rollback)?
>
> AFAICS that would just confuse matters. In the first place, plpgsql
> variable values are not subtransaction-local, so they'd have to live in
> the outermost SPI Proc context anyway. In the second place, spi.c
> contains a whole lot of assumptions that actions like saving a plan are
> tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql
> statements that were nested inside a BEGIN/EXCEPT would probably break:
> state they expect to remain valid from one execution to the next would
> disappear.

I think for all such usage, we can always switch to the top-level SPI Proc/Exec context. To do so, we might need to invent a notion of something like Sub SPI Proc/Exec contexts, and that sounds quite tricky.

>> In short, why do you think it is better to create a new context rather
>> than using "SPI Exec"?
>
> "SPI Exec" has the same problem as the eval_econtext: there are already
> points at which it will be reset, and those can't necessarily be delayed
> till end of statement. In particular, _SPI_end_call will delete whatever
> is in that context.

That's right, and I think it might not even be feasible to do so, mainly because that is done in exposed SPI calls.

I have checked some other usage of SPI; it seems plperl is handling some similar problems either by creating a temporary work context or by using PG_TRY/PG_CATCH. I think it might be better if whatever we are trying here could be of help for other similar usage.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Amit Kapila writes:
> On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
>> The problem is that exec_stmt_dynexecute() loses control to the error
>> thrown from the bogus command, and therefore leaks its "querystr" local
>> variable --- which is not much of a leak, but it adds up if the loop
>> iterates enough times. There are similar problems in many other places in
>> plpgsql. Basically the issue is that while running a plpgsql function,
>> CurrentMemoryContext points to a function-lifespan context (the same as
>> the SPI procCxt the function is using). We also store things such as
>> values of the function's variables there, so just resetting that context
>> is not an option. plpgsql does have an expression-evaluation-lifespan
>> context for short-lived stuff, but anything that needs to live for more
>> or less the duration of a statement is put into the procedure-lifespan
>> context, where it risks becoming a leak.

> Wouldn't it be better, if each nested sub-block (which is having an
> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
> be destroyed at sub-block end (either commit or rollback)?

AFAICS that would just confuse matters. In the first place, plpgsql variable values are not subtransaction-local, so they'd have to live in the outermost SPI Proc context anyway. In the second place, spi.c contains a whole lot of assumptions that actions like saving a plan are tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql statements that were nested inside a BEGIN/EXCEPT would probably break: state they expect to remain valid from one execution to the next would disappear.

> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

"SPI Exec" has the same problem as the eval_econtext: there are already points at which it will be reset, and those can't necessarily be delayed till end of statement. In particular, _SPI_end_call will delete whatever is in that context. Also, spi.c does not consider the execCxt to be an exported part of its abstraction, and I'm pretty loath to punch another hole in that API.

Also, as I've been working through this, I've found that only a rather small minority of plpgsql statements actually need statement-lifetime storage. So I'm thinking that it will be faster to create such a context only on-demand, not unconditionally; which knocks out any thought of changing plpgsql's coding conventions so much that statement-lifespan storage would become the normal place for CurrentMemoryContext to point.

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Sun, Jul 24, 2016 at 12:40 PM, Amit Kapila wrote:
> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

I think the lifespan of memory allocated from "SPI Exec" is only within the executor; after that, "SPI Exec" will be reset. But in many places we need such memory beyond the executor (including the one reported in the above issue). Consider this example from exec_stmt_dynexecute:

exec_stmt_dynexecute
{
	querystr = pstrdup(querystr);

	SPI_execute  --> inside this, the "SPI Exec" context will be reset.

	/* After this, querystr is still being used in this function. */
}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane wrote:
> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
> LOOP
>   BEGIN
>     EXECUTE 'bogus command';
>   EXCEPTION WHEN OTHERS THEN
>   END;
> END LOOP;
>
> The problem is that exec_stmt_dynexecute() loses control to the error
> thrown from the bogus command, and therefore leaks its "querystr" local
> variable --- which is not much of a leak, but it adds up if the loop
> iterates enough times. There are similar problems in many other places in
> plpgsql. Basically the issue is that while running a plpgsql function,
> CurrentMemoryContext points to a function-lifespan context (the same as
> the SPI procCxt the function is using). We also store things such as
> values of the function's variables there, so just resetting that context
> is not an option. plpgsql does have an expression-evaluation-lifespan
> context for short-lived stuff, but anything that needs to live for more
> or less the duration of a statement is put into the procedure-lifespan
> context, where it risks becoming a leak. (That design was fine
> originally, because any error would result in abandoning function
> execution and thereby cleaning up that context. But once we invented
> plpgsql exception blocks, it's not so good.)

Wouldn't it be better if each nested sub-block (that has an exception handler) had separate "SPI Proc" and "SPI Exec" contexts, which would be destroyed at sub-block end (either commit or rollback)? I think one difficulty could be that we need some way to propagate the information that is required by outer blocks, if any such information exists.

> One way we could resolve the problem is to require all plpgsql code to
> use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables
> are explicitly released. That's undesirable on pretty much every front
> though: it'd be notationally ugly, prone to omissions, and not very
> speedy.
>
> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.

This sounds better than spreading PG_TRY/PG_CATCH everywhere.

I think if this allocation had been done in the executor context "SPI Exec", it wouldn't have leaked. One way could have been that by default all exec_stmt* functions execute in the "SPI Exec" context and we then switch to "SPI Proc" for memory that is required for a longer duration. That might not be good if we have to switch at many places, but OTOH the same will be required for a new statement-level execution context as well. In short, why do you think it is better to create a new context rather than using "SPI Exec"?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
Craig Ringer writes:
> On 22 July 2016 at 13:24, Craig Ringer wrote:
>> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
>> context is reset after each call even when not cleaning up after an error.
>> That'll help catch mistakes and leaks.

> That is, after each statement.

Well, the issue is with nested statements: if some outer statement has data in such a context, you can't just clobber it because of a failure in an inner statement. To make that work, you need a new context for each nesting level. Once we've gone to that much trouble, I'm inclined to think we should just do the resets always (ie, not only in debug builds). That will let us get rid of retail pfree's and buy back at least some of the added cost.

I've not done any detail work on this idea yet, but at least some of plpgsql's loop control statements don't have any pass-by-ref local data; exec_stmt_fori, for example. So it seems possible for that function not to bother with creating a new context for its child statements. That would save some more cycles, at the cost of a possibly-confusing-and-fragile difference in the way different kinds of loops work.

In any case, I'm thinking this is too invasive, relative to its value, for a back-patch; but it might be okay to sneak it into beta4 if I can get it done soon. Or we could let it slide till 9.7^H^H^H10. Comments?

			regards, tom lane
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 22 July 2016 at 13:24, Craig Ringer wrote:
> On 22 July 2016 at 07:02, Tom Lane wrote:
>> In
>> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
>> it was pointed out that you could get an intra-function-call memory leak
>> from something like
>>
>> LOOP
>>   BEGIN
>>     EXECUTE 'bogus command';
>>   EXCEPTION WHEN OTHERS THEN
>>   END;
>> END LOOP;
>
> ...
>
>> Another answer is to invent a third per-function memory context intended
>> to hold statement-lifespan variables.
>
> I've wanted this in the past and been surprised we don't have it, so +1
> from me.
>
> I don't think a few MemoryContextAlloc's are too ugly.
>
> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
> context is reset after each call even when not cleaning up after an error.
> That'll help catch mistakes and leaks.

That is, after each statement.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
On 22 July 2016 at 07:02, Tom Lane wrote:

> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
>     LOOP
>       BEGIN
>         EXECUTE 'bogus command';
>       EXCEPTION WHEN OTHERS THEN
>       END;
>     END LOOP;

...

> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.

I've wanted this in the past and been surprised we don't have it, so +1 from me.

I don't think a few MemoryContextAlloc's are too ugly.

I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the context is reset after each call even when not cleaning up after an error. That'll help catch mistakes and leaks.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Curing plpgsql's memory leaks for statement-lifespan values
In
https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
it was pointed out that you could get an intra-function-call memory leak from something like

    LOOP
      BEGIN
        EXECUTE 'bogus command';
      EXCEPTION WHEN OTHERS THEN
      END;
    END LOOP;

The problem is that exec_stmt_dynexecute() loses control to the error thrown from the bogus command, and therefore leaks its "querystr" local variable --- which is not much of a leak, but it adds up if the loop iterates enough times. There are similar problems in many other places in plpgsql.

Basically the issue is that while running a plpgsql function, CurrentMemoryContext points to a function-lifespan context (the same as the SPI procCxt the function is using). We also store things such as values of the function's variables there, so just resetting that context is not an option. plpgsql does have an expression-evaluation-lifespan context for short-lived stuff, but anything that needs to live for more or less the duration of a statement is put into the procedure-lifespan context, where it risks becoming a leak. (That design was fine originally, because any error would result in abandoning function execution and thereby cleaning up that context. But once we invented plpgsql exception blocks, it's not so good.)

One way we could resolve the problem is to require all plpgsql code to use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables are explicitly released. That's undesirable on pretty much every front though: it'd be notationally ugly, prone to omissions, and not very speedy.

Another answer is to invent a third per-function memory context intended to hold statement-lifespan variables. We could say that statements are still expected to clean up their mess in normal cases, and this context is only cleared when BEGIN/EXCEPT catches an error.
Each BEGIN/EXCEPT would need to create and install a new such context, since otherwise it might be clobbering statement-lifespan variables from a surrounding statement.

A more aggressive variant is to arrange things so we can automatically reset that context after each statement, avoiding the need for explicit pfree's of statement-lifespan variables. I think that would mean that not only BEGINs but also looping statements would need to install new contexts for their controlled statements, unless they had no pass-by-reference local values. The traffic for creating and deleting such contexts might be too expensive, even though it'd buy back some retail pfree's.

I looked into whether we could actually install such a context as CurrentMemoryContext, allowing palloc's to go there by default. This seems not workable: a lot of the existing palloc's really need to be creating function-lifespan values, and there's also a problem that SPI expects CurrentMemoryContext to be the same as its procCxt, because many SPI functions will just set CurrentMemoryContext to equal procCxt at exit. So in any case, use of such a context would require explicit MemoryContextAllocs or MemoryContextSwitchTos, which is a bit annoying notationally, but there seems little help for it.

Thoughts, better ideas?

			regards, tom lane