Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-19 Thread Jim Nasby

On 8/19/16 10:42 AM, Tom Lane wrote:

It appeared to me after collecting some stats about the functions present
in the regression tests that for larger functions, the extra space eaten
is just some small multiple (like 2x-3x) of the function body string
length.  So it's not *that* much data, even for big functions, and it's
only going to be eaten on the first usage of a function within a session.


I wonder if freeing that memory would help speed up further allocations, 
and how much of the expense of the final free for the context gets 
reduced by an early reset. Probably not enough to negate the cost of 
resetting on every call though...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-19 Thread Tom Lane
Jim Nasby  writes:
> On 7/25/16 1:50 PM, Tom Lane wrote:
>> There's a glibc-dependent hack in aset.c that reports any
>> plpgsql-driven palloc or pfree against a context named "SPI Proc", as
>> well as changes in pl_comp.c so that transient junk created during initial
>> parsing of a plpgsql function body doesn't end up in the SPI Proc context.
>> (I did that just to cut the amount of noise I had to chase down.  I suppose
>> in large functions it might be worth adopting such a change so that that
>> junk can be released immediately after parsing; but I suspect for most
>> functions it'd just be an extra context without much gain.)

> Some folks do create very large plpgsql functions, so if there's a handy 
> way to estimate the size of the function (pg_proc.prosrc's varlena size 
> perhaps) then it might be worth doing that for quite large functions.

I poked at that a bit and realized that during a normal plpgsql function
call, there isn't anything in the SPI Proc context when do_compile is
entered, which means that we could flush all the transient cruft by just
resetting the context afterwards.  The sanest place to put that seems to
be in plpgsql_call_handler, as per attached.  We could put it in
plpgsql_compile so it's not hit in the main line code path, but that would
require plpgsql_compile to know that it's called in an empty context,
which seems a bit fragile (and indeed probably false in the forValidator
case).

There isn't any equivalently easy way to clean up in the case of a DO
block, but I'm not sure that we care as much in that case.

I'm not sure that this is really a net win.  MemoryContextReset is pretty
cheap (at least in non-cassert builds) but it's not zero cost, and in
most scenarios we're not going to care that much about the extra space.
It appeared to me after collecting some stats about the functions present
in the regression tests that for larger functions, the extra space eaten
is just some small multiple (like 2x-3x) of the function body string
length.  So it's not *that* much data, even for big functions, and it's
only going to be eaten on the first usage of a function within a session.

regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c
index 36868fb..2a36913 100644
*** a/src/pl/plpgsql/src/pl_handler.c
--- b/src/pl/plpgsql/src/pl_handler.c
***
*** 23,28 
--- 23,29 
  #include "utils/builtins.h"
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
  #include "utils/syscache.h"
  
  
*** plpgsql_call_handler(PG_FUNCTION_ARGS)
*** 230,235 
--- 231,239 
  	/* Find or compile the function */
  	func = plpgsql_compile(fcinfo, false);
  
+ 	/* Flush any transient junk allocated during compile */
+ 	MemoryContextReset(CurrentMemoryContext);
+ 
  	/* Must save and restore prior value of cur_estate */
  	save_cur_estate = func->cur_estate;
  



Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-18 Thread Jim Nasby

On 7/25/16 1:50 PM, Tom Lane wrote:

There's a glibc-dependent hack in aset.c that reports any
plpgsql-driven palloc or pfree against a context named "SPI Proc", as
well as changes in pl_comp.c so that transient junk created during initial
parsing of a plpgsql function body doesn't end up in the SPI Proc context.
(I did that just to cut the amount of noise I had to chase down.  I suppose
in large functions it might be worth adopting such a change so that that
junk can be released immediately after parsing; but I suspect for most
functions it'd just be an extra context without much gain.)


Some folks do create very large plpgsql functions, so if there's a handy 
way to estimate the size of the function (pg_proc.prosrc's varlena size 
perhaps) then it might be worth doing that for quite large functions.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-17 Thread Tom Lane
Pavel Stehule  writes:
> I am sending a review of this patch:
> ...
> I'll mark this patch as ready for committer

Pushed, thanks for the review.

regards, tom lane




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-10 Thread Pavel Stehule
2016-08-10 11:25 GMT+02:00 Pavel Stehule :

> Hi
>
> 2016-07-27 16:49 GMT+02:00 Tom Lane :
>
>> Robert Haas  writes:
>> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane  wrote:
>> >> I suppose that a fix based on putting PG_TRY blocks into all the
>> affected
>> >> functions might be simple enough that we'd risk back-patching it, but
>> >> I don't really want to go that way.
>>
>> > try/catch blocks aren't completely free, either, and PL/pgsql is not
>> > suffering from a deplorable excess of execution speed.
>>
>> BTW, just to annotate that a bit: I did some measurements and found out
>> that on my Linux box, creating/deleting a memory context
>> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
>> more expensive than a PG_TRY block.  This means that the PG_TRY approach
>> would actually be faster for cases involving only a small number of
>> statements-needing-local-storage within a single plpgsql function
>> execution.  However, the memory context creation cost is amortized across
>> repeated executions of a statement, whereas of course PG_TRY won't be.
>> We can roughly estimate that PG_TRY would lose any time we loop through
>> the statement in question more than circa ten times.  So I believe the
>> way I want to do it will win speed-wise in cases where it matters, but
>> it's not entirely an open-and-shut conclusion.
>>
>> Anyway, there are enough other reasons not to go the PG_TRY route.
>>
>
> I did some synthetic benchmarks of plpgsql speed - a bubble sort and a
> loop that handles errors - and I don't see any slowdown
>
> handling exceptions is a little bit faster with your patch
>
>  CREATE OR REPLACE FUNCTION public.loop_test(a integer)
>  RETURNS void
>  LANGUAGE plpgsql
> AS $function$
> declare x int;
> begin
>   for i in 1..a
>   loop
> declare s text;
> begin
>   s := 'AHOJ';
>   x := (random()*1000)::int;
>   raise exception '*';
> exception when others then
>   x := 0; --raise notice 'handled';
> end;
>   end loop;
> end;
> $function$
>
> HEAD - 100K loops 640 ms, patched 610 ms
>
> Regards
>

Hi

I am sending a review of this patch:

1. There was no problem with patching and compilation
2. All tests passed without any problem
3. The advantages and disadvantages were discussed in detail in this thread -
the selected approach is good

+ the code is somewhat reduced and cleaned up
+ the special context can be reused in the future

4. No new regression tests or documentation are needed for this patch
5. I didn't find any performance slowdown in targeted tests - the impact on
performance in real code should be insignificant

I'll mark this patch as ready for committer

Regards

Pavel


>
> Pavel
>
>
>
>> regards, tom lane
>>
>>
>>
>
>


Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-08-10 Thread Pavel Stehule
Hi

2016-07-27 16:49 GMT+02:00 Tom Lane :

> Robert Haas  writes:
> > On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane  wrote:
> >> I suppose that a fix based on putting PG_TRY blocks into all the
> affected
> >> functions might be simple enough that we'd risk back-patching it, but
> >> I don't really want to go that way.
>
> > try/catch blocks aren't completely free, either, and PL/pgsql is not
> > suffering from a deplorable excess of execution speed.
>
> BTW, just to annotate that a bit: I did some measurements and found out
> that on my Linux box, creating/deleting a memory context
> (AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
> more expensive than a PG_TRY block.  This means that the PG_TRY approach
> would actually be faster for cases involving only a small number of
> statements-needing-local-storage within a single plpgsql function
> execution.  However, the memory context creation cost is amortized across
> repeated executions of a statement, whereas of course PG_TRY won't be.
> We can roughly estimate that PG_TRY would lose any time we loop through
> the statement in question more than circa ten times.  So I believe the
> way I want to do it will win speed-wise in cases where it matters, but
> it's not entirely an open-and-shut conclusion.
>
> Anyway, there are enough other reasons not to go the PG_TRY route.
>

I did some synthetic benchmarks of plpgsql speed - a bubble sort and a
loop that handles errors - and I don't see any slowdown

handling exceptions is a little bit faster with your patch

 CREATE OR REPLACE FUNCTION public.loop_test(a integer)
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare x int;
begin
  for i in 1..a
  loop
declare s text;
begin
  s := 'AHOJ';
  x := (random()*1000)::int;
  raise exception '*';
exception when others then
  x := 0; --raise notice 'handled';
end;
  end loop;
end;
$function$

HEAD - 100K loops 640 ms, patched 610 ms

Regards

Pavel



> regards, tom lane
>
>
>


Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-27 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane  wrote:
>> I suppose that a fix based on putting PG_TRY blocks into all the affected
>> functions might be simple enough that we'd risk back-patching it, but
>> I don't really want to go that way.

> try/catch blocks aren't completely free, either, and PL/pgsql is not
> suffering from a deplorable excess of execution speed.

BTW, just to annotate that a bit: I did some measurements and found out
that on my Linux box, creating/deleting a memory context
(AllocSetContextCreate + MemoryContextDelete) is somewhere around 10x
more expensive than a PG_TRY block.  This means that the PG_TRY approach
would actually be faster for cases involving only a small number of
statements-needing-local-storage within a single plpgsql function
execution.  However, the memory context creation cost is amortized across
repeated executions of a statement, whereas of course PG_TRY won't be.
We can roughly estimate that PG_TRY would lose any time we loop through
the statement in question more than circa ten times.  So I believe the
way I want to do it will win speed-wise in cases where it matters, but
it's not entirely an open-and-shut conclusion.

Anyway, there are enough other reasons not to go the PG_TRY route.

regards, tom lane




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Robert Haas
On Mon, Jul 25, 2016 at 6:04 PM, Tom Lane  wrote:
> I suppose that a fix based on putting PG_TRY blocks into all the affected
> functions might be simple enough that we'd risk back-patching it, but
> I don't really want to go that way.

try/catch blocks aren't completely free, either, and PL/pgsql is not
suffering from a deplorable excess of execution speed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Although this is in principle a bug fix, it's invasive enough that I'd
>> be pretty hesitant to back-patch it, or even to stick it into HEAD
>> post-beta.  I'm inclined to sign it up for the next commitfest instead.

> Do we have a backpatchable fix for the reported problem?  If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches.  Otherwise, is the proposal to
> leave the bug unfixed in back branches?

The latter is what I was thinking.  Given that issues like this have been
there, and gone unreported, since we invented subtransactions, I do not
feel too awful about not fixing it in existing branches.  It's possible
that we could fix just the one issue originally complained of with a
smaller patch, but I don't find that approach attractive, because it was
also a small leak.  The case that seemed nastiest to me is that a FOREACH
... IN ARRAY loop will leak a copy of the entire array if an error causes
control to be thrown out of exec_stmt_foreach_a.  That could be a lot more
leaked data than the querystring of an EXECUTE.

I suppose that a fix based on putting PG_TRY blocks into all the affected
functions might be simple enough that we'd risk back-patching it, but
I don't really want to go that way.

regards, tom lane




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Tom Lane wrote:
> 
> > Although this is in principle a bug fix, it's invasive enough that I'd
> > be pretty hesitant to back-patch it, or even to stick it into HEAD
> > post-beta.  I'm inclined to sign it up for the next commitfest instead.
> 
> Do we have a backpatchable fix for the reported problem?  If so, then it
> seems a good tradeoff to install this more invasive fix in HEAD only and
> apply the simpler fix to back branches.  Otherwise, is the proposal to
> leave the bug unfixed in back branches?

I meant "install in HEAD only, after the 10.0 branch", which is what you
proposed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Alvaro Herrera
Tom Lane wrote:

> Although this is in principle a bug fix, it's invasive enough that I'd
> be pretty hesitant to back-patch it, or even to stick it into HEAD
> post-beta.  I'm inclined to sign it up for the next commitfest instead.

Do we have a backpatchable fix for the reported problem?  If so, then it
seems a good tradeoff to install this more invasive fix in HEAD only and
apply the simpler fix to back branches.  Otherwise, is the proposal to
leave the bug unfixed in back branches?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Tom Lane
Attached is a draft patch for creating statement-level temporary memory
contexts in plpgsql.  It creates such contexts only as needed, and there
are a lot of simpler plpgsql statements that don't need them, so I think
the performance impact should be pretty minimal --- though I've not tried
to measure it, since I don't have a credible benchmark for overall plpgsql
performance.

This fixes the originally complained-of leak and several others besides,
including at least one path that leaked function-lifespan memory even
without an error :-(.

In addition to the patch proper, I attach for amusement's sake some
additional hacking I did for testing purposes, to make sure I'd found and
accounted for all the places that were allocating memory in the SPI Proc
context.  There's a glibc-dependent hack in aset.c that reports any
plpgsql-driven palloc or pfree against a context named "SPI Proc", as
well as changes in pl_comp.c so that transient junk created during initial
parsing of a plpgsql function body doesn't end up in the SPI Proc context.
(I did that just to cut the amount of noise I had to chase down.  I suppose
in large functions it might be worth adopting such a change so that that
junk can be released immediately after parsing; but I suspect for most
functions it'd just be an extra context without much gain.)

Although this is in principle a bug fix, it's invasive enough that I'd
be pretty hesitant to back-patch it, or even to stick it into HEAD
post-beta.  I'm inclined to sign it up for the next commitfest instead.

regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 586ff1f..e23ca96 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*** typedef struct
*** 48,54 
  	Oid		   *types;			/* types of arguments */
  	Datum	   *values;			/* evaluated argument values */
  	char	   *nulls;			/* null markers (' '/'n' style) */
- 	bool	   *freevals;		/* which arguments are pfree-able */
  } PreparedParamsData;
  
  /*
--- 48,53 
*** static EState *shared_simple_eval_estate
*** 88,93 
--- 87,122 
  static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
  
  /*
+  * Memory management within a plpgsql function generally works with three
+  * contexts:
+  *
+  * 1. Function-call-lifespan data, such as variable values, is kept in the
+  * "main" context, a/k/a the "SPI Proc" context established by SPI_connect().
+  * This is usually the CurrentMemoryContext while running code in this module
+  * (which is not good, because careless coding can easily cause
+  * function-lifespan memory leaks, but we live with it for now).
+  *
+  * 2. Some statement-execution routines need statement-lifespan workspace.
+  * A suitable context is created on-demand by get_stmt_mcontext(), and must
+  * be reset at the end of the requesting routine.  Error recovery will clean
+  * it up automatically.  Nested statements requiring statement-lifespan
+  * workspace will result in a stack of such contexts, see push_stmt_mcontext().
+  *
+  * 3. We use the eval_econtext's per-tuple memory context for expression
+  * evaluation, and as a general-purpose workspace for short-lived allocations.
+  * Such allocations usually aren't explicitly freed, but are left to be
+  * cleaned up by a context reset, typically done by exec_eval_cleanup().
+  *
+  * These macros are for use in making short-lived allocations:
+  */
+ #define get_eval_mcontext(estate) \
+ 	((estate)->eval_econtext->ecxt_per_tuple_memory)
+ #define eval_mcontext_alloc(estate, sz) \
+ 	MemoryContextAlloc(get_eval_mcontext(estate), sz)
+ #define eval_mcontext_alloc0(estate, sz) \
+ 	MemoryContextAllocZero(get_eval_mcontext(estate), sz)
+ 
+ /*
   * We use a session-wide hash table for caching cast information.
   *
   * Once built, the compiled expression trees (cast_expr fields) survive for
*** static HTAB *shared_cast_hash = NULL;
*** 128,133 
--- 157,165 
   /
  static void plpgsql_exec_error_callback(void *arg);
  static PLpgSQL_datum *copy_plpgsql_datum(PLpgSQL_datum *datum);
+ static MemoryContext get_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void push_stmt_mcontext(PLpgSQL_execstate *estate);
+ static void pop_stmt_mcontext(PLpgSQL_execstate *estate);
  
  static int exec_stmt_block(PLpgSQL_execstate *estate,
  PLpgSQL_stmt_block *block);
*** static void exec_eval_cleanup(PLpgSQL_ex
*** 191,197 
  static void exec_prepare_plan(PLpgSQL_execstate *estate,
    PLpgSQL_expr *expr, int cursorOptions);
  static bool exec_simple_check_node(Node *node);
! static void exec_simple_check_plan(PLpgSQL_expr *expr);
  static void exec_simple_recheck_plan(PLpgSQL_expr *expr, CachedPlan *cplan);
  static void exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno);
  static bool contains_target_param(Node *node, int *target_dno);
--- 

Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-25 Thread Amit Kapila
On Sun, Jul 24, 2016 at 9:17 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>
>> Wouldn't it be better, if each nested sub-block (which is having an
>> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
>> be destroyed at sub-block end (either commit or rollback)?
>
> AFAICS that would just confuse matters.  In the first place, plpgsql
> variable values are not subtransaction-local, so they'd have to live in
> the outermost SPI Proc context anyway.  In the second place, spi.c
> contains a whole lot of assumptions that actions like saving a plan are
> tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql
> statements that were nested inside a BEGIN/EXCEPT would probably break:
> state they expect to remain valid from one execution to the next would
> disappear.
>

I think for all such usages we could always switch to the top-level SPI
Proc/Exec context.  To do so, we might need to invent a notion of
something like sub-SPI Proc/Exec contexts, and that sounds quite
tricky.

>> In short, why do you think it is better to create a new context rather
>> than using "SPI Exec"?
>
> "SPI Exec" has the same problem as the eval_econtext: there are already
> points at which it will be reset, and those can't necessarily be delayed
> till end of statement.  In particular, _SPI_end_call will delete whatever
> is in that context.
>

That's right, and I think it might not even be feasible to do so,
mainly because that reset is done in exposed SPI calls.  I have checked
some other users of SPI; it seems plperl handles similar problems either
by creating a temporary work context or by using PG_TRY/PG_CATCH.  It
might be better if whatever we are trying here could also help those
other similar usages.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-24 Thread Tom Lane
Amit Kapila  writes:
> On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane  wrote:
>> The problem is that exec_stmt_dynexecute() loses control to the error
>> thrown from the bogus command, and therefore leaks its "querystr" local
>> variable --- which is not much of a leak, but it adds up if the loop
>> iterates enough times.  There are similar problems in many other places in
>> plpgsql.  Basically the issue is that while running a plpgsql function,
>> CurrentMemoryContext points to a function-lifespan context (the same as
>> the SPI procCxt the function is using).  We also store things such as
>> values of the function's variables there, so just resetting that context
>> is not an option.  plpgsql does have an expression-evaluation-lifespan
>> context for short-lived stuff, but anything that needs to live for more
>> or less the duration of a statement is put into the procedure-lifespan
>> context, where it risks becoming a leak.

> Wouldn't it be better, if each nested sub-block (which is having an
> exception) has a separate "SPI Proc", "SPI Exec" contexts which would
> be destroyed at sub-block end (either commit or rollback)?

AFAICS that would just confuse matters.  In the first place, plpgsql
variable values are not subtransaction-local, so they'd have to live in
the outermost SPI Proc context anyway.  In the second place, spi.c
contains a whole lot of assumptions that actions like saving a plan are
tied to the current SPI Proc/Exec contexts, so SPI-using plpgsql
statements that were nested inside a BEGIN/EXCEPT would probably break:
state they expect to remain valid from one execution to the next would
disappear.

> In short, why do you think it is better to create a new context rather
> than using "SPI Exec"?

"SPI Exec" has the same problem as the eval_econtext: there are already
points at which it will be reset, and those can't necessarily be delayed
till end of statement.  In particular, _SPI_end_call will delete whatever
is in that context.  Also, spi.c does not consider the execCxt to be
an exported part of its abstraction, and I'm pretty loath to punch
another hole in that API.

Also, as I've been working through this, I've found that only a rather
small minority of plpgsql statements actually need statement-lifetime
storage.  So I'm thinking that it will be faster to create such a context
only on-demand, not unconditionally; which knocks out any thought of
changing plpgsql's coding conventions so much that statement-lifespan
storage would become the normal place for CurrentMemoryContext to point.

regards, tom lane




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-24 Thread Dilip Kumar
On Sun, Jul 24, 2016 at 12:40 PM, Amit Kapila 
wrote:

> In short, why do you think
> it is better to create a new context rather than using "SPI Exec"?
>

I think the lifespan of memory allocated from "SPI Exec" is only within
the executor, and after that "SPI Exec" will be reset.  But in many places
we need such memory beyond the executor (including the case reported in
the above issue).

Consider this sketch of exec_stmt_dynexecute:

exec_stmt_dynexecute
{
    querystr = pstrdup(querystr);

    SPI_execute(...);   /* the "SPI Exec" context is reset in here */

    /* querystr is still used after this point */
}
-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-24 Thread Amit Kapila
On Fri, Jul 22, 2016 at 4:32 AM, Tom Lane  wrote:
> In
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
> LOOP
>   BEGIN
> EXECUTE 'bogus command';
>   EXCEPTION WHEN OTHERS THEN
>   END;
> END LOOP;
>
> The problem is that exec_stmt_dynexecute() loses control to the error
> thrown from the bogus command, and therefore leaks its "querystr" local
> variable --- which is not much of a leak, but it adds up if the loop
> iterates enough times.  There are similar problems in many other places in
> plpgsql.  Basically the issue is that while running a plpgsql function,
> CurrentMemoryContext points to a function-lifespan context (the same as
> the SPI procCxt the function is using).  We also store things such as
> values of the function's variables there, so just resetting that context
> is not an option.  plpgsql does have an expression-evaluation-lifespan
> context for short-lived stuff, but anything that needs to live for more
> or less the duration of a statement is put into the procedure-lifespan
> context, where it risks becoming a leak.  (That design was fine
> originally, because any error would result in abandoning function
> execution and thereby cleaning up that context.  But once we invented
> plpgsql exception blocks, it's not so good.)
>

Wouldn't it be better if each nested sub-block that has an exception
handler had separate "SPI Proc" and "SPI Exec" contexts, which would be
destroyed at sub-block end (either commit or rollback)?  I think one
difficulty could be that we would need some way to propagate any
information required by outer blocks, if such information exists.

> One way we could resolve the problem is to require all plpgsql code to
> use PG_TRY/PG_CATCH blocks to ensure that statement-lifespan variables
> are explicitly released.  That's undesirable on pretty much every front
> though: it'd be notationally ugly, prone to omissions, and not very
> speedy.
>
> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.
>

This sounds better than spreading PG_TRY/PG_CATCH everywhere.  I think
if this allocation had been done in the executor context "SPI Exec", it
wouldn't have leaked.  One option would have been for all exec_stmt*
functions to execute in the "SPI Exec" context by default, switching to
"SPI Proc" only for memory that must survive longer.  That might not be
good if we have to switch in many places, but OTOH the same will be
required for a new statement-level execution context as well.  In short,
why do you think it is better to create a new context rather than using
"SPI Exec"?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-22 Thread Tom Lane
Craig Ringer  writes:
> On 22 July 2016 at 13:24, Craig Ringer  wrote:
>> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
>> context is reset after each call even when not cleaning up after an error.
>> That'll help catch mistakes and leaks.

> That is, after each statement.

Well, the issue is with nested statements: if some outer statement has
data in such a context, you can't just clobber it because of a failure in
an inner statement.  To make that work, you need a new context for each
nesting level.  Once we've gone to that much trouble, I'm inclined to
think we should just do the resets always (ie, not only in debug builds).
That will let us get rid of retail pfree's and buy back at least some of
the added cost.

I've not done any detail work on this idea yet, but at least some of
plpgsql's loop control statements don't have any pass-by-ref local data;
exec_stmt_fori, for example.  So it seems possible for that function not
to bother with creating a new context for its child statements.  That
would save some more cycles, at the cost of a possibly-confusing-and-
fragile difference in the way different kinds of loops work.

In any case, I'm thinking this is too invasive, relative to its value,
for a back-patch; but it might be okay to sneak it into beta4 if I can
get it done soon.  Or we could let it slide till 9.7^H^H^H10.  Comments?

regards, tom lane




Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-21 Thread Craig Ringer
On 22 July 2016 at 13:24, Craig Ringer  wrote:

>
> On 22 July 2016 at 07:02, Tom Lane  wrote:
>
>> In
>>
>> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
>> it was pointed out that you could get an intra-function-call memory leak
>> from something like
>>
>> LOOP
>>   BEGIN
>> EXECUTE 'bogus command';
>>   EXCEPTION WHEN OTHERS THEN
>>   END;
>> END LOOP;
>
>
> ...
>
>
>>
>> Another answer is to invent a third per-function memory context intended
>> to hold statement-lifespan variables.
>>
>>
> I've wanted this in the past and been surprised we don't have it, so +1
> from me.
>
> I don't think a few MemoryContextAlloc's are too ugly.
>
> I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
> context is reset after each call even when not cleaning up after an error.
> That'll help catch mistakes and leaks.
>


That is, after each statement.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Curing plpgsql's memory leaks for statement-lifespan values

2016-07-21 Thread Craig Ringer
On 22 July 2016 at 07:02, Tom Lane  wrote:

> In
>
> https://www.postgresql.org/message-id/tencent_5c738eca65bad6861aa43...@qq.com
> it was pointed out that you could get an intra-function-call memory leak
> from something like
>
> LOOP
>   BEGIN
> EXECUTE 'bogus command';
>   EXCEPTION WHEN OTHERS THEN
>   END;
> END LOOP;


...


>
> Another answer is to invent a third per-function memory context intended
> to hold statement-lifespan variables.
>
>
I've wanted this in the past and been surprised we don't have it, so +1
from me.

I don't think a few MemoryContextAlloc's are too ugly.

I suggest that in cassert builds, or maybe just CLOBBER_CACHE_ALWAYS, the
context is reset after each call even when not cleaning up after an error.
That'll help catch mistakes and leaks.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services