Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread MauMau

From: "Tom Lane" 

In the first query, the MemoryContextReset is nearly free since there's
nothing to free (and we've special-cased that path in aset.c).  It's
certainly a measurement artifact that it measures out faster, which says
to me that these numbers can only be trusted within a couple percent;
but at least we're not taking much hit in cases where the patch isn't
actually conferring any benefit.  For the second query, losing 1% to avoid
memory bloat seems well worthwhile.

Barring objections I'll apply and back-patch this.


So this patch would solve memory leak issues if other modules had similar 
bugs, in addition to the original dblink problem, wouldn't it?  Definitely 
+1.  The original OP wants to use 9.2.  I'll report to him when you've 
committed this nice patch.  Thanks, Tom-san.


Regards
MauMau







Re: [HACKERS] Proposal for CSN based snapshots

2014-06-19 Thread Amit Langote
Hi,

On Fri, Jun 13, 2014 at 7:24 PM, Heikki Linnakangas
 wrote:
> Yeah, that seems like a better design, after all.
>
> Attached is a new patch. It now keeps the current pg_clog unchanged, but
> adds a new pg_csnlog besides it. pg_csnlog is more similar to pg_subtrans
> than pg_clog: it's not WAL-logged, is reset at startup, and segments older
> than GlobalXmin can be truncated.
>
> This addresses the disk space consumption, and simplifies pg_upgrade.
>
> There are no other significant changes in this new version, so it's still
> very much WIP. But please take a look!
>

Thanks for working on this important patch. I know this patch is still
largely a WIP but I would like to report an observation.

I applied this patch and did a few pgbench runs with 32 clients (this
is on a not-so-powerful VM, by the way).  Perhaps you suspect such a
thing already, but I observed a relatively large percentage of time
being spent in XLogInsert().

Perhaps XLogCtlInsert.insertpos_lck contention via GetXLogInsertRecPtr()?
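
For reference, this is roughly the kind of run and profile behind that
observation (a sketch only, not the exact commands used here; database name,
scale factor and durations are arbitrary):

# write-heavy pgbench workload with 32 clients
pgbench -i -s 100 bench
pgbench -c 32 -j 4 -T 120 bench &

# sample where the backends spend their time while the run is in progress
perf record -a -g -- sleep 30
perf report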

--
Amit




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-19 Thread Noah Misch
On Tue, Jun 17, 2014 at 04:39:51PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > We could do better by accounting for memory usage ourselves, inside
> > the memory-context system, but that'd probably impose some overhead we
> > don't have today.

> I wonder how practical it would be to forestall Noah's scenario by
> preallocating all the stack space we want before enabling the rlimit.

I think that's worth a closer look.  Compared to doing our own memory usage
tracking, it has the major advantage of isolating the added CPU overhead at
backend start.
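
A minimal sketch of that idea, assuming the limit would be applied with
setrlimit(RLIMIT_AS) and the stack pre-touched to the configured depth before
the limit takes effect (the helper names here are hypothetical, not proposed
code):

#include <string.h>
#include <sys/resource.h>

/*
 * Touch the stack down to "depth" bytes so those pages are already mapped
 * before an address-space limit is installed; otherwise later stack growth
 * could fail somewhere that can't report the error cleanly.
 */
static void
prefault_stack(size_t depth)
{
	volatile char buf[8192];

	memset((void *) buf, 0, sizeof(buf));
	if (depth > sizeof(buf))
		prefault_stack(depth - sizeof(buf));
	buf[0] = 0;			/* keep the frame live; defeats tail-call optimization */
}

/* Hypothetical helper: pre-touch the stack, then enable the memory ulimit. */
static void
enable_memory_ulimit(size_t stack_depth, rlim_t max_address_space)
{
	struct rlimit rl;

	prefault_stack(stack_depth);

	rl.rlim_cur = max_address_space;
	rl.rlim_max = max_address_space;
	(void) setrlimit(RLIMIT_AS, &rl);	/* error handling omitted in this sketch */
}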

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Andrew Dunstan


On 06/19/2014 06:33 PM, Josh Berkus wrote:

Tom,


ISTM our realistic options are for seconds or msec as the unit.  If it's
msec, we'd be limited to INT_MAX msec or around 600 hours at the top end,
which seems like enough to me but maybe somebody thinks differently?
Seconds are probably OK but I'm worried about somebody complaining that
that's not enough resolution, especially as machines get faster.

I can picture a 500ms timeout more readily than I can picture a 1000hr
timeout.




As long as we can specify the units, and don't have to say 1000 to mean 
1 second, I agree. I would normally expect this to be set in terms of 
minutes rather than milliseconds.
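
For what it's worth, a time GUC stored internally in milliseconds can still be
given with any of the usual unit suffixes, so nobody has to write 1000 to mean
one second.  A postgresql.conf sketch, using the setting name proposed in this
thread purely as an example:

idle_in_transaction_timeout = 45s     # ms, s, min, h, and d suffixes all work
#idle_in_transaction_timeout = 500ms  # or sub-second
#idle_in_transaction_timeout = 10min  # or minutes, as Andrew expects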


cheers

andrew





Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Alvaro Herrera
Josh Berkus wrote:
> Tom,
> 
> > ISTM our realistic options are for seconds or msec as the unit.  If it's
> > msec, we'd be limited to INT_MAX msec or around 600 hours at the top end,
> > which seems like enough to me but maybe somebody thinks differently?
> > Seconds are probably OK but I'm worried about somebody complaining that
> > that's not enough resolution, especially as machines get faster.
> 
> I can picture a 500ms timeout more readily than I can picture a 1000hr
> timeout.

Agreed.  600 hours is upwards of 25 days.  Dead tuples accumulated for
that long would be a really serious problem, unless your database is
almost totally idle.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Joe Conway

On 06/19/2014 04:36 PM, Tom Lane wrote:

> In the first query, the MemoryContextReset is nearly free since
> there's nothing to free (and we've special-cased that path in
> aset.c).  It's certainly a measurement artifact that it measures
> out faster, which says to me that these numbers can only be trusted
> within a couple percent; but at least we're not taking much hit in
> cases where the patch isn't actually conferring any benefit.  For
> the second query, losing 1% to avoid memory bloat seems well
> worthwhile.
> 
> Barring objections I'll apply and back-patch this.

Nice work! +1

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support




Re: [HACKERS] How about a proper TEMPORARY TABLESPACE?

2014-06-19 Thread Fabrízio de Royes Mello
On Wed, Jun 18, 2014 at 10:39 PM, Matheus de Oliveira <
matioli.math...@gmail.com> wrote:
>
> I was going to answer each message, but Fabrízio summarized everything so
far really nicely. Thanks Fabrízio.
>

You're welcome. :-)


>
> On Wed, Jun 18, 2014 at 5:25 PM, Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
>>
>> Then, to summarize Matheus must do:
>> * use an option instead of changing the syntax and catalog to indicate
that a tablespace is used to store temp objects
>
>
> Yes. I myself wasn't sure the TEMPORARY syntax would be good, but I would
just like to hear more about it. Storing it as an option seems nice to me.
>

Is there somebody who rejects this idea? Thoughts?


>>
>> * throw an exception if we try ALTER the option "only_temp_relations"
>
>
> I think we should postpone the ALTER option, perhaps as another patch.
Wasn't that what was decided?
>

Don't worry about that, we can help you. ;-)


>
> I'll investigate the options better before going this route. Let me work
on CREATE first.
>

See these files:
- src/include/commands/tablespace.h (TableSpaceOpts struct at line 35)
- src/backend/access/common/reloptions.c (tablespace_reloptions at line
1333)
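
Roughly, the new option would follow the pattern of the existing
random_page_cost/seq_page_cost tablespace options in those two files.  A
sketch only, assuming the 9.4-era struct layout (the option would also need an
entry in reloptions.c's boolRelOpts array under RELOPT_KIND_TABLESPACE,
omitted here):

/* src/include/commands/tablespace.h: extend the parsed-options struct */
typedef struct TableSpaceOpts
{
	int32		vl_len_;		/* varlena header (do not touch directly!) */
	float8		random_page_cost;
	float8		seq_page_cost;
	bool		only_temp_relations;	/* proposed: accept only temp objects */
} TableSpaceOpts;

/* src/backend/access/common/reloptions.c, inside tablespace_reloptions(): */
static const relopt_parse_elt tab[] = {
	{"random_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, random_page_cost)},
	{"seq_page_cost", RELOPT_TYPE_REAL, offsetof(TableSpaceOpts, seq_page_cost)},
	{"only_temp_relations", RELOPT_TYPE_BOOL, offsetof(TableSpaceOpts, only_temp_relations)}
};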


>>
>> * add regression tests
>> * add documentation
>>
>> And, of course, register to the next open commitfest [1] to get detailed
feedback and review.
>
>
> Noted. It will be on the next commitfest. Just knowing some people do
want this makes me more comfortable about working on it.
>

Nice... I would like to be a reviewer.

Att,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Tom Lane
Joe Conway  writes:
> Not the most scientific of tests, but I think a reasonable one:
> ...
> 2.7% performance penalty

I made a slightly different test based on the original example.
This uses generate_series() which is surely faster than a SQL
function, so it shows the problem to a larger degree:

create table t1 as select x from generate_series(1,1000) x;
vacuum t1;

-- time repeated executions of this:
select count(*) from t1
where exists (select 1 from generate_series(x,x+ (random()*10)::int));

-- and this:
select count(*) from t1
where exists (select 1 from
  generate_series(x,x+ (random()*10)::int::text::int));

The first of these test queries doesn't leak memory because the argument
expressions are all pass-by-value; the second one adds some useless text
conversions just to provoke the leak (it eats about 1GB in HEAD).

HEAD, with asserts off:

first query:
Time: 21694.557 ms
Time: 21629.107 ms
Time: 21593.510 ms

second query:
Time: 26698.445 ms
Time: 26699.408 ms
Time: 26751.742 ms

With yesterday's patch:

first query:
Time: 23919.438 ms
Time: 24053.108 ms
Time: 23884.989 ms
... so about 10.7% slower, not good

second query no longer bloats, but:
Time: 29986.850 ms
Time: 29951.179 ms
Time: 29771.533 ms
... almost 12% slower

With the attached patch instead, I get:

first query:
Time: 21017.503 ms
Time: 21113.656 ms
Time: 21107.617 ms
... 2.5% faster??

second query:
Time: 27033.626 ms
Time: 26997.907 ms
Time: 26984.397 ms
... about 1% slower

In the first query, the MemoryContextReset is nearly free since there's
nothing to free (and we've special-cased that path in aset.c).  It's
certainly a measurement artifact that it measures out faster, which says
to me that these numbers can only be trusted within a couple percent;
but at least we're not taking much hit in cases where the patch isn't
actually conferring any benefit.  For the second query, losing 1% to avoid
memory bloat seems well worthwhile.

Barring objections I'll apply and back-patch this.

regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index f162e92..bc79e3a 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
*** ExecMakeFunctionResultNoSets(FuncExprSta
*** 2002,2007 
--- 2002,2008 
  Tuplestorestate *
  ExecMakeTableFunctionResult(ExprState *funcexpr,
  			ExprContext *econtext,
+ 			MemoryContext argContext,
  			TupleDesc expectedDesc,
  			bool randomAccess)
  {
*** ExecMakeTableFunctionResult(ExprState *f
*** 2083,2094 
  		/*
  		 * Evaluate the function's argument list.
  		 *
! 		 * Note: ideally, we'd do this in the per-tuple context, but then the
! 		 * argument values would disappear when we reset the context in the
! 		 * inner loop.  So do it in caller context.  Perhaps we should make a
! 		 * separate context just to hold the evaluated arguments?
  		 */
  		argDone = ExecEvalFuncArgs(&fcinfo, fcache->args, econtext);
  		/* We don't allow sets in the arguments of the table function */
  		if (argDone != ExprSingleResult)
  			ereport(ERROR,
--- 2084,2101 
  		/*
  		 * Evaluate the function's argument list.
  		 *
! 		 * We can't do this in the per-tuple context: the argument values
! 		 * would disappear when we reset that context in the inner loop.  And
! 		 * the caller's CurrentMemoryContext is typically a query-lifespan
! 		 * context, so we don't want to leak memory there.  We require the
! 		 * caller to pass a separate memory context that can be used for this,
! 		 * and can be reset each time through to avoid bloat.
  		 */
+ 		MemoryContextReset(argContext);
+ 		oldcontext = MemoryContextSwitchTo(argContext);
  		argDone = ExecEvalFuncArgs(&fcinfo, fcache->args, econtext);
+ 		MemoryContextSwitchTo(oldcontext);
+ 
  		/* We don't allow sets in the arguments of the table function */
  		if (argDone != ExprSingleResult)
  			ereport(ERROR,
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index da5d8c1..945a414 100644
*** a/src/backend/executor/nodeFunctionscan.c
--- b/src/backend/executor/nodeFunctionscan.c
***
*** 28,33 
--- 28,34 
  #include "nodes/nodeFuncs.h"
  #include "parser/parsetree.h"
  #include "utils/builtins.h"
+ #include "utils/memutils.h"
  
  
  /*
*** FunctionNext(FunctionScanState *node)
*** 94,99 
--- 95,101 
  			node->funcstates[0].tstore = tstore =
  ExecMakeTableFunctionResult(node->funcstates[0].funcexpr,
  			node->ss.ps.ps_ExprContext,
+ 			node->argcontext,
  			node->funcstates[0].tupdesc,
  		  node->eflags & EXEC_FLAG_BACKWARD);
  
*** FunctionNext(FunctionScanState *node)
*** 152,157 
--- 154,160 
  			fs->tstore =
  ExecMakeTableFunctionResult(fs->funcexpr,
  			node->ss.ps.ps_ExprContext,
+ 			node->argcontext,
  			fs->tupdesc,
  		

Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-19 Thread Shigeru Hanada
Fujii-san,

I found the right description in REL9_3_STABLE branch, thanks for
pointing out the commit.

Sorry for the noise.

2014-06-18 12:39 GMT+09:00 Fujii Masao :
> On Tue, Jun 17, 2014 at 2:11 PM, Shigeru Hanada
>  wrote:
>> Fujii-san,
>>
>> I agree not to backpatch, but I noticed that the 9.3 document about
>> stats collector doesn't mention $PGDATA/pg_stat.
>>
>> http://www.postgresql.org/docs/current/static/monitoring-stats.html
>>
>> It just says:
>>> When the server shuts down, a permanent copy of the statistics data is 
>>> stored in the global subdirectory, so that statistics can be retained 
>>> across server restarts.
>>
>> I'm not sure whether we should modify the 9.3 document at the moment,
>> but actually the description made me confused a bit.
>
> Could you check 8dc90b9c4c45fa30a8f59313e21d294529ef7182 in REL9_3_STABLE
> branch? You can see that I added the description of pg_stat directory
> into the document
> in 9.3.
>
> Regards,
>
> --
> Fujii Masao



-- 
Shigeru HANADA




Re: [HACKERS] -DDISABLE_ENABLE_ASSERT

2014-06-19 Thread Tom Lane
Andres Freund  writes:
> Hm, that missed a couple things. Updated patch attached. Besides adding
> the missed documentation adjustments this also removes the -A
> commandline/startup packet parameter since it doesn't serve a purpose
> anymore.
> Does anyone think we should rather keep -A and have it not do anything?
> It seems fairly unlikely, but not impossible, that someone had the
> great idea to add -A off in their connection options.

I think we could delete it.  If anyone is using that, I think they'd
be happier getting an error than having it silently not do what they
want (and more slowly than before ...)

regards, tom lane




Re: [HACKERS] -DDISABLE_ENABLE_ASSERT

2014-06-19 Thread Andres Freund
On 2014-06-19 19:31:28 +0200, Andres Freund wrote:
> On 2014-05-23 10:01:37 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > The next question is whether to wait till after the branching with this?
> > 
> > +1 for waiting (it's only a couple weeks anyway).  This isn't a
> > user-facing feature in any way, so I feel no urgency to ship it in 9.4.
> 
> Here's my patch for this that I plan to apply later.

Hm, that missed a couple things. Updated patch attached. Besides adding
the missed documentation adjustments this also removes the -A
commandline/startup packet parameter since it doesn't serve a purpose
anymore.
Does anyone think we should rather keep -A and have it not do anything?
It seems fairly unlikely, but not impossible, that someone had the
great idea to add -A off in their connection options.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From dd658b4a7e4aa7fdf43d659286dc1e21822ef5c6 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 19 Jun 2014 19:16:49 +0200
Subject: [PATCH] Don't allow to disable backend assertions via the
 debug_assertions GUC.

The existence of the assert_enabled variable (backing the
debug_assertions GUC) reduced the amount of knowledge some static code
checkers (like coverity and various compilers) could infer from the
existence of the assertion. That could have been solved by optionally
removing the assert_enabled variable from the Assert() et al macros
at compile time when some special macro is defined, but the resulting
complication doesn't seem to be worth the gain from having
debug_assertions. Recompiling is fast enough.

The debug_assertions GUC is still available, but read-only, as it's
useful when diagnosing problems. The commandline/client startup option
-A, which previously also allowed enabling/disabling assertions, has
been removed as it doesn't serve a purpose anymore.

While at it, reduce code duplication in bufmgr.c and localbuf.c
assertions checking for spurious buffer pins. That code had to be
reindented anyway to cope with the assert_enabled removal.
---
 doc/src/sgml/config.sgml | 46 +++-
 src/backend/access/gin/ginpostinglist.c  |  1 -
 src/backend/commands/event_trigger.c |  1 -
 src/backend/postmaster/postmaster.c  |  6 +---
 src/backend/storage/buffer/bufmgr.c  | 62 ++--
 src/backend/storage/buffer/localbuf.c| 51 --
 src/backend/storage/lmgr/proc.c  |  3 --
 src/backend/tcop/postgres.c  |  6 +---
 src/backend/utils/cache/catcache.c   | 51 +-
 src/backend/utils/cache/relfilenodemap.c |  1 -
 src/backend/utils/misc/guc.c | 30 
 src/include/c.h  |  4 +--
 src/include/postgres.h   |  9 ++---
 13 files changed, 106 insertions(+), 165 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 697cf99..8f0ca4c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6722,6 +6722,26 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  debug_assertions (boolean)
+  
+   debug_assertions configuration parameter
+  
+  
+  
+   
+Reports whether PostgreSQL has been built
+with assertions enabled. That is the case if the
+macro USE_ASSERT_CHECKING is defined
+when PostgreSQL is built (accomplished
+e.g. by the configure option
+--enable-cassert). By
+default PostgreSQL is built without
+assertions.
+   
+  
+ 
+
  
   integer_datetimes (boolean)
   
@@ -6973,28 +6993,6 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
- 
-  debug_assertions (boolean)
-  
-   debug_assertions configuration parameter
-  
-  
-  
-   
-Turns on various assertion checks. This is a debugging aid. If
-you are experiencing strange problems or crashes you might want
-to turn this on, as it might expose programming mistakes. To use
-this parameter, the macro USE_ASSERT_CHECKING
-must be defined when PostgreSQL is
-built (accomplished by the configure option
---enable-cassert). Note that
-debug_assertions defaults to on
-if PostgreSQL has been built with
-assertions enabled.
-   
-  
- 
-
  
   ignore_system_indexes (boolean)
   
@@ -7355,10 +7353,6 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
 
   

--A x
-debug_assertions = x
-   
-   
 -B x
 shared_buffers = x

diff --git a/src/backend/access/gin/ginpostinglist.c b/src/backend/access/gin/ginpostinglist.c
index 606a824..ea59dea 100644
--- a/

Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread MauMau

From: "Joe Conway" 

Fair enough -- this patch does it at that level in
materializeQueryResult()


I'm in favor of this.  TBH, I did this at first, but I was afraid this would 
be rejected due to the reason mentioned earlier.


With the code written like this, an if statement in the PG_TRY block is not
necessary, because sinfo is zero-cleared:



  PG_TRY();
  {
+   /* Create short-lived memory context for data conversions */
+   sinfo.tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+"dblink temporary context",

Regards
MauMau





Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Josh Berkus
Tom,

> ISTM our realistic options are for seconds or msec as the unit.  If it's
> msec, we'd be limited to INT_MAX msec or around 600 hours at the top end,
> which seems like enough to me but maybe somebody thinks differently?
> Seconds are probably OK but I'm worried about somebody complaining that
> that's not enough resolution, especially as machines get faster.

I can picture a 500ms timeout more readily than I can picture a 1000hr
timeout.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Tom Lane
Josh Berkus  writes:
> Now ... can we decrease CPU overhead (wakeups) if we only check once per
> minute?  If so, there's a good argument for making 1min the minimum.

Polling wakeups are right out, and are unnecessary anyway.  The 
utils/misc/timeout.c infrastructure calculates the time left till the
closest timeout event.  So I don't see a need to worry about that end of
it.
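
For context, this is roughly how a new timeout would hook into that
infrastructure (a sketch only; the GUC, handler, and wrapper function names
below are illustrative, not the patch):

#include "postgres.h"
#include "utils/timeout.h"

static TimeoutId idle_xact_timeout_id;
static int	IdleInTransactionTimeout = 0;	/* hypothetical GUC, in msec */

static void
IdleInTransactionTimeoutHandler(void)
{
	/* just set a flag here; the real work happens at the next safe point */
}

static void
register_idle_timeout(void)		/* once, at backend start */
{
	idle_xact_timeout_id = RegisterTimeout(USER_TIMEOUT,
										   IdleInTransactionTimeoutHandler);
}

static void
backend_goes_idle_in_xact(void)	/* when we go idle inside a transaction */
{
	if (IdleInTransactionTimeout > 0)
		enable_timeout_after(idle_xact_timeout_id, IdleInTransactionTimeout);
}

static void
next_command_arrives(void)		/* cancel it as soon as a command shows up */
{
	disable_timeout(idle_xact_timeout_id, false);
}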

ISTM our realistic options are for seconds or msec as the unit.  If it's
msec, we'd be limited to INT_MAX msec or around 600 hours at the top end,
which seems like enough to me but maybe somebody thinks differently?
Seconds are probably OK but I'm worried about somebody complaining that
that's not enough resolution, especially as machines get faster.

Whichever the unit, I don't see a reason to set a lower bound different
from "one".  You ask for a 1ms timeout, we'll give it to you, it's your
problem whether that's sane in your environment.

regards, tom lane




Re: [HACKERS] change alter user to be a true alias for alter role

2014-06-19 Thread Vik Fearing
On 06/19/2014 07:18 PM, Tom Lane wrote:
> Jov  writes:
>> the doc say:
>>> ALTER USER is now an alias for ALTER 
>>> ROLE
> 
>> but alter user lack the following format:
>> ...
> 
> If we're going to have a policy that these commands be exactly equivalent,
> it seems like this patch is just sticking a finger into the dike.  It does
> nothing to prevent anyone from making the same mistake again in future.
> 
> What about collapsing both sets of productions into one, along the lines
> of
> 
> role_or_user: ROLE | USER;
> 
> AlterRoleSetStmt:
>   ALTER role_or_user RoleId opt_in_database SetResetClause
> 
> (and similarly to the latter for every existing ALTER ROLE variant).

I thought about suggesting that, and it seems that I should have.  I
also thought about suggesting adding GROUP as an alias; that's
probably not as good an idea.

> Because MAPPING is an unreserved keyword, I think that this approach
> might force us to also change ALTER USER MAPPING to ALTER role_or_user
> MAPPING, which is not contemplated by the SQL standard.  But hey,
> it would satisfy the principle of least surprise no?  Anyway we don't
> have to document that that would work.

That's a small price to pay, so I'm all for accepting it.  I agree that
it doesn't need to be documented.
-- 
Vik




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Josh Berkus
On 06/19/2014 02:44 PM, Kevin Grittner wrote:
> Josh Berkus  wrote:
> 
>> Also, I really hope that nobody is setting this to 10s ...
> 
> Well, we get to decide what the minimum allowed value is.  What do
> you think that should be?

1s?

I don't think that setting it to 1s would ever be a good idea, but I
don't want to tie users' hands if they know better than I do. Also,
unless we use 1s granularity, users can't set it to values like 45s,
which is more reasonable. I can't see any circumstance under which
sub-second values would ever make sense.

Now ... can we decrease CPU overhead (wakeups) if we only check once per
minute?  If so, there's a good argument for making 1min the minimum.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Kevin Grittner
Josh Berkus  wrote:

> Also, I really hope that nobody is setting this to 10s ...

Well, we get to decide what the minimum allowed value is.  What do
you think that should be?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Josh Berkus
On 06/19/2014 10:39 AM, David G Johnston wrote:
> Advocating for the Devil (or a more robust, if probably complicated,
> solution):
> "idle_in_transaction_timeout=10s"
> "idle_in_transaction_target=session|transaction"

Per Kevin Grittner's assessment, aborting the transaction is currently a
nonstarter. So no need for a 2nd GUC.

Also, I really hope that nobody is setting this to 10s ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] WIP patch for multiple column assignment in UPDATE

2014-06-19 Thread Pavel Stehule
2014-06-19 15:37 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > I did some tests and It looks so it allows only some form of nested loop.
>
> [ shrug... ]  It's a subplan.  One evaluation per outer row is what
> people are expecting.
>

ok

regards

Pavel

>
> regards, tom lane
>


Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

2014-06-19 Thread Fujii Masao
On Mon, Jun 9, 2014 at 11:12 AM, johnlumby  wrote:
> updated version of patch compatible with git head of 140608,
> (adjusted proc oid and a couple of minor fixes)

Compilation of patched version on MacOS failed. The error messages were

gcc -O0 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -g -I../../../../src/include   -c -o heapam.o heapam.c
heapam.c: In function 'heap_unread_add':
heapam.c:362: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:363: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_count'
heapam.c:369: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_count'
heapam.c:375: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
heapam.c:381: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
heapam.c:387: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:405: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_count'
heapam.c: In function 'heap_unread_subtract':
heapam.c:419: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:425: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_count'
heapam.c:434: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
heapam.c:442: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
heapam.c:452: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:453: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:454: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_next'
heapam.c:456: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_count'
heapam.c: In function 'heapgettup':
heapam.c:944: error: 'struct HeapScanDescData' has no member named
'rs_pfchblock'
heapam.c:716: warning: unused variable 'ix'
heapam.c: In function 'heapgettup_pagemode':
heapam.c:1243: error: 'struct HeapScanDescData' has no member named
'rs_pfchblock'
heapam.c:1029: warning: unused variable 'ix'
heapam.c: In function 'heap_endscan':
heapam.c:1808: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
heapam.c:1809: error: 'struct HeapScanDescData' has no member named
'rs_Unread_Pfetched_base'
make[4]: *** [heapam.o] Error 1
make[3]: *** [heap-recursive] Error 2
make[2]: *** [access-recursive] Error 2
make[1]: *** [install-backend-recurse] Error 2
make: *** [install-src-recurse] Error 2


A huge patch is basically not easy to review. What about simplifying the patch
by excluding non-core parts, like the change to pg_stat_statements, so that
reviewers can read the patch more easily? We can add such non-core parts later.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Minmax indexes

2014-06-19 Thread Claudio Freire
On Wed, Jun 18, 2014 at 8:51 AM, Heikki Linnakangas
 wrote:
>
> I liked Greg's sketch of what the opclass support functions would be. It
> doesn't seem significantly more complicated than what's in the patch now.

Which was


On Tue, Jun 17, 2014 at 8:48 PM, Greg Stark  wrote:
> An aggregate to generate a min/max "bounding box" from several values
> A function which takes an "bounding box" and a new value and returns
> the new "bounding box"
> A function which tests if a value is in a "bounding box"
> A function which tests if a "bounding box" overlaps a "bounding box"

Which I'd generalize a bit further by renaming "bounding box" to
"compressed set", and allowing it to be parameterized.

So, you have:

An aggregate to generate a "compressed set" from several values
A function which adds a new value to the "compressed set" and returns
the new "compressed set"
A function which tests if a value is in a "compressed set"
A function which tests if a "compressed set" overlaps another
"compressed set" of equal type

If you can define different compressed sets, you can use this to
generate both min/max indexes as well as bloom filter indexes. Whether
we'd want to have both is perhaps questionable, but having the ability
to is probably desirable.
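
To make the shape of that interface concrete, here is a rough sketch (not the
patch's actual API; all type and function names are hypothetical) of the four
support functions above, expressed as an opclass-provided set of callbacks
over an opaque "compressed set":

#include "postgres.h"

typedef struct CompressedSet CompressedSet;		/* opaque, opclass-defined */

typedef struct CompressedSetOps
{
	/* the "aggregate": build a set summarizing an array of values */
	CompressedSet *(*build) (Datum *values, int nvalues);

	/* fold one more value into a set, returning the (possibly grown) set */
	CompressedSet *(*add_value) (CompressedSet *set, Datum value);

	/* may the set contain this value?  false negatives are not allowed */
	bool		(*maybe_contains) (const CompressedSet *set, Datum value);

	/* may two sets of the same type overlap? */
	bool		(*overlaps) (const CompressedSet *a, const CompressedSet *b);
} CompressedSetOps;

A min/max opclass would implement these over a (min, max) pair, a bloom-filter
opclass over a bitmap, and the index machinery itself wouldn't need to know
which.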

One problem with such a generalized implementation would be that I'm
not sure in-place modification of the "compressed set" on disk can be
assumed to be safe in all cases. Surely, for strictly-enlarging sets
it would be, but while min/max and bloom filters both fit the bill, it's
not clear that one can assume this for all structures.

Also, adding an "in-place updateable" bit to the "type" would perhaps
inflate the complexity of the patch due to the need to provide both
code paths?




Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

2014-06-19 Thread Claudio Freire
On Thu, Jun 19, 2014 at 2:49 PM, Greg Stark  wrote:
> I don't think the threaded implementation on Linux is the one to use
> though. I find this *super* confusing but the kernel definitely
> supports aio syscalls, glibc also has a threaded implementation it
> uses if run on a kernel that doesn't implement the syscalls, and I
> think there are existing libaio and librt libraries from outside glibc
> that do one or the other. Which you build against seems to make a big
> difference. My instinct is that anything but the kernel native
> implementation will be worthless. The overhead of thread communication
> will completely outweigh any advantage over posix_fadvise's partial
> win.

What overhead?

The only communication is through a "done" bit and associated
synchronization structure when *and only when* you want to wait on it.

Furthermore, posix_fadvise is braindead for this use case; been there,
done that. What you win with threads is a better postgres-kernel
interaction; even if you lose some CPU performance, it's going to beat
posix_fadvise by a large margin.


> The main advantage of posix aio is that we can actually receive the
> data out of order. With posix_fadvise we can get the i/o and cpu
> overlap but we will never process the later blocks until the earlier
> requests are satisfied and processed in order. With aio you could do a
> sequential scan, initiating i/o on 1,000 blocks and then processing
> them as they arrive, initiating new requests as those blocks are
> handled.

And each and every I/O you issue with it counts as such on the kernel side.

It's not the case with posix_fadvise, mind you, and that's terribly
damaging for performance.

> When I investigated this I found the buffer manager's I/O bits seemed
> to already be able to represent the state we needed (i/o initiated on
> this buffer but not completed). The problem was in ensuring that a
> backend would process the i/o completion promptly when it might be in
> the midst of handling other tasks and might even get an elog() stack
> unwinding. The interface that actually fits Postgres best might be the
> threaded interface (orthogonal to the threaded implementation
> question) which is you give aio a callback which gets called on a
> separate thread when the i/o completes. The alternative is you give
> aio a list of operation control blocks and it tells you the state of
> all the i/o operations. But it's not clear to me how you arrange to do
> that regularly, promptly, and reliably.

Indeed we've been musing about using librt's support of completion
callbacks for that.

> The other gotcha here is that the kernel implementation only does
> anything useful on DIRECT_IO files. That means you have to do *all*
> the prefetching and i/o scheduling yourself.

If that's the case, we should discard kernel-based implementations and
stick to thread-based ones. Postgres cannot do scheduling as
efficiently as the kernel, and it shouldn't try.

> You would be doing that
> anyways for sequential scans and bitmap scans -- and we already do it
> with things like synchronised scans and posix_fadvise

That only works because the patterns are semi-sequential. If you try
to schedule random access, it becomes effectively impossible without
hardware info.

The kernel is the one with hardware info.

> Finally, when I did the posix_fadvise work I wrote a synthetic
> benchmark for testing the equivalent i/o pattern of a bitmap scan. It
> let me simulate bitmap scans of varying densities with varying
> parameters, notably how many i/o to keep in flight at once. It
> supported posix_fadvise or aio. You should look it up in the archives,
> it made for some nice looking graphs. IIRC I could not find any build
> environment where aio offered any performance boost at all. I think
> this means I just didn't know how to build it against the right
> libraries or wasn't using the right kernel or there was some skew
> between them at the time.

If it's old, it's probable you didn't hit the kernel's braindeadness,
since it was introduced somewhat recently (before 3.x, I believe).

Even if you did hit it, bitmap heap scans are blessed with sequential
ordering. The technique doesn't work nearly as well with random I/O
that might be sorted from time to time.

When traversing an index, you do a mostly sequential pattern due to
physical correlation, but not completely sequential. Not only that,
with the assumption of random I/O, and the uncertainty of when the
scan will be aborted, you don't read ahead as much as you could if
you knew it was sequential or a full scan. That kills performance. You
don't fetch enough ahead of time to avoid stalls, and the kernel
doesn't do read-ahead either because posix_fadvise effectively
disables it, resulting in the equivalent of direct I/O with bad
scheduling.

Solving this for index scans isn't just a little more complex. It's
insanely more complex, because you need hardware information to do it
right. How many spindles, how many

Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

2014-06-19 Thread Greg Stark
On Wed, May 28, 2014 at 2:19 PM, Heikki Linnakangas
 wrote:
> How portable is POSIX aio nowadays? Googling around, it still seems that on
> Linux, it's implemented using threads. Does the thread-emulation
> implementation cause problems with the rest of the backend, which assumes
> that there is only a single thread? In any case, I think we'll want to
> encapsulate the AIO implementation behind some kind of an API, to allow
> other implementations to co-exist.

I think POSIX aio is pretty damn standard and it's a pretty fiddly
interface. If we abstract it behind an i/o interface we're going to
lose a lot of the power. Abstracting it behind a set of buffer manager
operations (initiate i/o on buffer, complete i/o on buffer, abort i/o
on buffer) should be fine but that's basically what we have, no?

I don't think the threaded implementation on Linux is the one to use
though. I find this *super* confusing but the kernel definitely
supports aio syscalls, glibc also has a threaded implementation it
uses if run on a kernel that doesn't implement the syscalls, and I
think there are existing libaio and librt libraries from outside glibc
that do one or the other. Which you build against seems to make a big
difference. My instinct is that anything but the kernel native
implementation will be worthless. The overhead of thread communication
will completely outweigh any advantage over posix_fadvise's partial
win.

The main advantage of posix aio is that we can actually receive the
data out of order. With posix_fadvise we can get the i/o and cpu
overlap but we will never process the later blocks until the earlier
requests are satisfied and processed in order. With aio you could do a
sequential scan, initiating i/o on 1,000 blocks and then processing
them as they arrive, initiating new requests as those blocks are
handled.
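
To illustrate that pattern, a minimal standalone sketch using the portable
<aio.h> interface: queue several reads at once, then consume completions in
whatever order they finish.  The file name and block count are arbitrary;
link with -lrt on glibc.

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NREQ	4
#define BLKSZ	8192

int
main(void)
{
	int				fd = open("datafile", O_RDONLY);
	struct aiocb	cb[NREQ];
	char			buf[NREQ][BLKSZ];
	int				done = 0;

	if (fd < 0)
	{
		perror("open");
		return 1;
	}

	/* queue several reads at once; they may complete in any order */
	for (int i = 0; i < NREQ; i++)
	{
		memset(&cb[i], 0, sizeof(cb[i]));
		cb[i].aio_fildes = fd;
		cb[i].aio_offset = (off_t) i * BLKSZ;
		cb[i].aio_buf = buf[i];
		cb[i].aio_nbytes = BLKSZ;
		if (aio_read(&cb[i]) != 0)
		{
			perror("aio_read");
			return 1;
		}
	}

	/* consume completions as they arrive, not in submission order */
	while (done < NREQ)
	{
		for (int i = 0; i < NREQ; i++)
		{
			if (cb[i].aio_fildes == -1)
				continue;		/* already consumed */
			if (aio_error(&cb[i]) == EINPROGRESS)
				continue;		/* still in flight */
			printf("block %d ready: %zd bytes\n", i, aio_return(&cb[i]));
			cb[i].aio_fildes = -1;
			done++;
		}
		/* a real program would block in aio_suspend() here instead of spinning */
	}

	close(fd);
	return 0;
}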

When I investigated this I found the buffer manager's I/O bits seemed
to already be able to represent the state we needed (i/o initiated on
this buffer but not completed). The problem was in ensuring that a
backend would process the i/o completion promptly when it might be in
the midst of handling other tasks and might even get an elog() stack
unwinding. The interface that actually fits Postgres best might be the
threaded interface (orthogonal to the threaded implementation
question) which is you give aio a callback which gets called on a
separate thread when the i/o completes. The alternative is you give
aio a list of operation control blocks and it tells you the state of
all the i/o operations. But it's not clear to me how you arrange to do
that regularly, promptly, and reliably.

The other gotcha here is that the kernel implementation only does
anything useful on DIRECT_IO files. That means you have to do *all*
the prefetching and i/o scheduling yourself. You would be doing that
anyways for sequential scans and bitmap scans -- and we already do it
with things like synchronised scans and posix_fadvise -- but index
scans would need to get some intelligence for when it makes sense to
read more than one page at a time.  It might be possible to do
something fairly coarse like having our i/o operators keep track of
how often i/o on a relation falls within a certain number of blocks of
an earlier i/o and autotune number of blocks to read based on that. It
might not be hard to do better than the kernel with even basic info
like what level of the index we're reading or what type of pointer
we're following.

Finally, when I did the posix_fadvise work I wrote a synthetic
benchmark for testing the equivalent i/o pattern of a bitmap scan. It
let me simulate bitmap scans of varying densities with varying
parameters, notably how many i/o to keep in flight at once. It
supported posix_fadvise or aio. You should look it up in the archives,
it made for some nice looking graphs. IIRC I could not find any build
environment where aio offered any performance boost at all. I think
this means I just didn't know how to build it against the right
libraries or wasn't using the right kernel or there was some skew
between them at the time.

-- 
greg




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread David G Johnston
On Thu, Jun 19, 2014 at 12:40 PM, Abhijit Menon-Sen-2 [via PostgreSQL] <
ml-node+s1045698n5808016...@n5.nabble.com> wrote:

> At 2014-06-19 17:53:17 +0200, [hidden email]
>  wrote:
> >
> > I much prefer with "in" but it doesn't much matter.
>
> If you look at similar settings like statement_timeout, lock_timeout,
> etc., what comes before the _timeout is a concrete *thing* that can
> timeout, or that a timeout can be applied to (e.g. wal_receiver).
>
> "What's timing out?" "A statement."
>
> But in "idle_in_transaction_timeout", "idle_in_transaction" is not a
> thing. It's a description of the state of a thing (the thing being a
> session in the FATAL variant of your patch, or a transaction in the
> ERROR variant).
>

> "What's timing out?" "An idle in transaction." "Huh?"
>
> Strictly speaking, by this logic, the consistent name for the setting in
> the FATAL variant would be "idle_in_transaction_session_timeout",


+1; I even suggested something similar (I omitted the "in") - though we
hadn't come to a firm conclusion on what behavior we were going to
implement at the time.  Adding session reasonably precludes us from easily
changing our mind later (or giving the user an option) but that doesn't
seem likely anyway.

Advocating for the Devil (or a more robust, if probably complicated,
solution):
"idle_in_transaction_timeout=10s"
"idle_in_transaction_target=session|transaction"

Would be a valid pair since the first intentionally would not want to
specify a target - and thus does not fit within the pattern you defined.

"idle_transaction_timeout" is bad since idle is a valid state that is not
being affected by this patch; whereas pgbouncer indeed drops truly idle
connections.

David J.





Re: [HACKERS] -DDISABLE_ENABLE_ASSERT

2014-06-19 Thread Andres Freund
On 2014-05-23 10:01:37 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > The next question is whether to wait till after the branching with this?
> 
> +1 for waiting (it's only a couple weeks anyway).  This isn't a
> user-facing feature in any way, so I feel no urgency to ship it in 9.4.

Here's my patch for this that I plan to apply later.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 2a1e793479bbc038aef0580a339fef38518ad17f Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 19 Jun 2014 19:16:49 +0200
Subject: [PATCH] Don't allow to disable backend assertions via assert_enabled
 GUC anymore.

The existence of the assert_enabled variable reduced the amount of
knowledge some static code checkers (like coverity and various
compilers) could infer from the existence of the assertion. That could
have been solved by optionally removing assert_enabled at
compile time, but the resulting complication doesn't seem to be worth
it. Recompiling is fast enough.

While at it, reduce code duplication in bufmgr.c and localbuf.c in
assertions checking for spurious buffer pins.
---
 src/backend/access/gin/ginpostinglist.c  |  1 -
 src/backend/commands/event_trigger.c |  1 -
 src/backend/storage/buffer/bufmgr.c  | 62 ++--
 src/backend/storage/buffer/localbuf.c| 51 --
 src/backend/storage/lmgr/proc.c  |  3 --
 src/backend/utils/cache/catcache.c   | 51 +-
 src/backend/utils/cache/relfilenodemap.c |  1 -
 src/backend/utils/misc/guc.c | 30 
 src/include/c.h  |  4 +--
 src/include/postgres.h   |  9 ++---
 10 files changed, 84 insertions(+), 129 deletions(-)

diff --git a/src/backend/access/gin/ginpostinglist.c b/src/backend/access/gin/ginpostinglist.c
index 606a824..ea59dea 100644
--- a/src/backend/access/gin/ginpostinglist.c
+++ b/src/backend/access/gin/ginpostinglist.c
@@ -250,7 +250,6 @@ ginCompressPostingList(const ItemPointer ipd, int nipd, int maxsize,
 	 * Check that the encoded segment decodes back to the original items.
 	 */
 #if defined (CHECK_ENCODING_ROUNDTRIP)
-	if (assert_enabled)
 	{
 		int			ndecoded;
 		ItemPointer tmp = ginPostingListDecode(result, &ndecoded);
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 6d4e091..110fe00 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -635,7 +635,6 @@ EventTriggerCommonSetup(Node *parsetree,
 	 * relevant command tag.
 	 */
 #ifdef USE_ASSERT_CHECKING
-	if (assert_enabled)
 	{
 		const char *dbgtag;
 
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index c070278..07ea665 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -109,6 +109,7 @@ static volatile BufferDesc *BufferAlloc(SMgrRelation smgr,
 			bool *foundPtr);
 static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln);
 static void AtProcExit_Buffers(int code, Datum arg);
+static void CheckForBufferLeaks(void);
 static int	rnode_comparator(const void *p1, const void *p2);
 
 
@@ -1699,34 +1700,13 @@ SyncOneBuffer(int buf_id, bool skip_recently_used)
 	return result | BUF_WRITTEN;
 }
 
-
 /*
  *		AtEOXact_Buffers - clean up at end of transaction.
- *
- *		As of PostgreSQL 8.0, buffer pins should get released by the
- *		ResourceOwner mechanism.  This routine is just a debugging
- *		cross-check that no pins remain.
  */
 void
 AtEOXact_Buffers(bool isCommit)
 {
-#ifdef USE_ASSERT_CHECKING
-	if (assert_enabled)
-	{
-		int			RefCountErrors = 0;
-		Buffer		b;
-
-		for (b = 1; b <= NBuffers; b++)
-		{
-			if (PrivateRefCount[b - 1] != 0)
-			{
-PrintBufferLeakWarning(b);
-RefCountErrors++;
-			}
-		}
-		Assert(RefCountErrors == 0);
-	}
-#endif
+	CheckForBufferLeaks();
 
 	AtEOXact_LocalBuffers(isCommit);
 }
@@ -1756,26 +1736,36 @@ AtProcExit_Buffers(int code, Datum arg)
 	AbortBufferIO();
 	UnlockBuffers();
 
+	CheckForBufferLeaks();
+
+	/* localbuf.c needs a chance too */
+	AtProcExit_LocalBuffers();
+}
+
+/*
+ *		CheckForBufferLeaks - ensure this backend holds no buffer pins
+ *
+ *		As of PostgreSQL 8.0, buffer pins should get released by the
+ *		ResourceOwner mechanism.  This routine is just a debugging
+ *		cross-check that no pins remain.
+ */
+static void
+CheckForBufferLeaks(void)
+{
 #ifdef USE_ASSERT_CHECKING
-	if (assert_enabled)
-	{
-		int			RefCountErrors = 0;
-		Buffer		b;
+	int			RefCountErrors = 0;
+	Buffer		b;
 
-		for (b = 1; b <= NBuffers; b++)
+	for (b = 1; b <= NBuffers; b++)
+	{
+		if (PrivateRefCount[b - 1] != 0)
 		{
-			if (PrivateRefCount[b - 1] != 0)
-			{
-PrintBufferLeakWarning(b);
-RefCountErrors++;
-			}
+			PrintBufferLeakWarning(b);
+			RefCountErrors++;
 		}
-		Assert(RefCountErrors == 0);
 	}
+	Ass

Re: [HACKERS] Minmax indexes

2014-06-19 Thread Claudio Freire
On Thu, Jun 19, 2014 at 10:06 AM, Heikki Linnakangas
 wrote:
> On 06/18/2014 06:09 PM, Claudio Freire wrote:
>>
>> On Tue, Jun 17, 2014 at 8:48 PM, Greg Stark  wrote:
>>>
>>> An aggregate to generate a min/max "bounding box" from several values
>>> A function which takes an "bounding box" and a new value and returns
>>> the new "bounding box"
>>> A function which tests if a value is in a "bounding box"
>>> A function which tests if a "bounding box" overlaps a "bounding box"
>>
>>
>> Which I'd generalize a bit further by renaming "bounding box" with
>> "compressed set", and allow it to be parameterized.
>
>
> What do you mean by parameterized?

Bloom filters can be parameterized with the number of hashes, the number
of bit positions, and the hash function, so it's not a simple bloom filter
index, but a bloom filter index with N SHA-1-based hashes spread over a
K-length bitmap.
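
As a toy illustration of those parameters (assumed values for N and K, and a
cheap integer mixer standing in for the SHA-1-based hashes, just to keep the
sketch short):

#include <stdbool.h>
#include <stdint.h>

#define BLOOM_BITS   1024          /* K: bitmap length in bits (assumed) */
#define BLOOM_HASHES 4             /* N: number of hash positions (assumed) */

typedef struct { uint8_t bits[BLOOM_BITS / 8]; } BloomFilter;

/* cheap 64-bit mixer used here instead of SHA-1, purely for brevity */
static uint64_t
mix(uint64_t x, uint64_t seed)
{
	x ^= seed;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	x *= 0xc4ceb9fe1a85ec53ULL;
	x ^= x >> 33;
	return x;
}

static void
bloom_add(BloomFilter *bf, uint64_t value)
{
	for (int i = 0; i < BLOOM_HASHES; i++)
	{
		uint64_t h = mix(value, i) % BLOOM_BITS;
		bf->bits[h / 8] |= (uint8_t) (1 << (h % 8));
	}
}

/* no false negatives: returns false only if the value was never added */
static bool
bloom_maybe_contains(const BloomFilter *bf, uint64_t value)
{
	for (int i = 0; i < BLOOM_HASHES; i++)
	{
		uint64_t h = mix(value, i) % BLOOM_BITS;
		if (!(bf->bits[h / 8] & (uint8_t) (1 << (h % 8))))
			return false;
	}
	return true;
}

int
main(void)
{
	BloomFilter bf = {{0}};

	bloom_add(&bf, 42);
	return bloom_maybe_contains(&bf, 42) ? 0 : 1;
}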

>> So, you have:
>>
>> An aggregate to generate a "compressed set" from several values
>> A function which adds a new value to the "compressed set" and returns
>> the new "compressed set"
>> A function which tests if a value is in a "compressed set"
>> A function which tests if a "compressed set" overlaps another
>> "compressed set" of equal type
>
>
> Yeah, something like that. I'm not sure I like the "compressed set" term any
> more than bounding box, though. GiST seems to have avoided naming the thing,
> and just talks about "index entries". But if we can come up with a good
> name, that would be more clear.

I don't want to use the term bloom filter since it's tied to one
specific technique, but it's basically that - an approximate set
without false negatives. I.e.: a compressed set.

It's not a bounding box either when using bloom filters. So...

>> One problem with such a generalized implementation would be, that I'm
>> not sure in-place modification of the "compressed set" on-disk can be
>> assumed to be safe on all cases. Surely, for strictly-enlarging sets
>> it would, but while min/max and bloom filters both fit the bill, it's
>> not clear that one can assume this for all structures.
>
>
> I don't understand what you mean. It's a fundamental property of minmax
> indexes that you can always replace the "min" or "max" or "compressing set"
> or "bounding box" or whatever with another datum that represents all the
> keys that the old one did, plus some.

Yes, and bloom filters happen to fall on that category too.

Never mind what I said. I was thinking of other potential, imaginary
implementations that support removal or updates and might
need care with transaction lifetimes, but that's easily fixed by
letting vacuum or some lazy process do the deleting, just as happens
with other indexes anyway.

So, I guess the interface must include also the invariant that
compressed sets only grow, never shrink unless from a rebuild or a
vacuum operation.




Re: [HACKERS] change alter user to be a true alias for alter role

2014-06-19 Thread Tom Lane
Jov  writes:
> the doc say:
>> ALTER USER is now an alias for ALTER 
>> ROLE

> but alter user lack the following format:
> ...

If we're going to have a policy that these commands be exactly equivalent,
it seems like this patch is just sticking a finger into the dike.  It does
nothing to prevent anyone from making the same mistake again in future.

What about collapsing both sets of productions into one, along the lines
of

role_or_user: ROLE | USER;

AlterRoleSetStmt:
ALTER role_or_user RoleId opt_in_database SetResetClause

(and similarly to the latter for every existing ALTER ROLE variant).

Because MAPPING is an unreserved keyword, I think that this approach
might force us to also change ALTER USER MAPPING to ALTER role_or_user
MAPPING, which is not contemplated by the SQL standard.  But hey,
it would satisfy the principle of least surprise no?  Anyway we don't
have to document that that would work.

regards, tom lane




Re: [HACKERS] Crash on backend exit w/ OpenLDAP [2.4.24, 2.4.31]

2014-06-19 Thread Tom Lane
Noah Misch  writes:
> On Thu, Jun 12, 2014 at 05:02:19PM -0400, Noah Misch wrote:
>> You can cause the at-exit crash by building PostgreSQL against OpenLDAP
>> 2.4.31, connecting with LDAP authentication, and issuing "LOAD 'dblink'".

>> 4. Detect older OpenLDAP versions at runtime, just before we would otherwise
>> initialize OpenLDAP, and raise an error.  Possibly make the same check at
>> compile time, for packager convenience.

> Having pondered this some more, I lean toward the following conservative fix.
> Add to all supported branches a test case that triggers the crash and a
> configure-time warning if the OpenLDAP version falls in the vulnerable range.
> So long as those who build from source monitor either "configure" output or
> test suite failures, they'll have the opportunity to head off the problem.

+1 for a configure warning, but I share your concern that it's likely to
go unnoticed (sometimes I wish autoconf were not so chatty...).

Keep in mind that some distros patch bugs without changing the reported
version number, so I'm afraid we couldn't adopt the easy solution of
making configure give a hard error when the version is suspicious; and
for the same reason your #4 above is unworkable.

I'm not sure about the practicality of adding a test case --- how will we
test that if no LDAP server is at hand?

I concur with not working much harder than this, in any case.  It's really
OpenLDAP's bug to fix.

regards, tom lane




Re: [HACKERS] Crash on backend exit w/ OpenLDAP [2.4.24, 2.4.31]

2014-06-19 Thread Noah Misch
On Thu, Jun 12, 2014 at 05:02:19PM -0400, Noah Misch wrote:
> You can cause the at-exit crash by building PostgreSQL against OpenLDAP
> 2.4.31, connecting with LDAP authentication, and issuing "LOAD 'dblink'".

> 4. Detect older OpenLDAP versions at runtime, just before we would otherwise
> initialize OpenLDAP, and raise an error.  Possibly make the same check at
> compile time, for packager convenience.

Having pondered this some more, I lean toward the following conservative fix.
Add to all supported branches a test case that triggers the crash and a
configure-time warning if the OpenLDAP version falls in the vulnerable range.
So long as those who build from source monitor either "configure" output or
test suite failures, they'll have the opportunity to head off the problem.
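
A rough configure.in sketch of such a warning (not the committed form; it
assumes OpenLDAP's LDAP_VENDOR_VERSION_MAJOR/MINOR/PATCH macros, which its
headers provide):

if test "$with_ldap" = yes ; then
  AC_MSG_CHECKING([whether OpenLDAP is a version with the at-exit crash])
  AC_PREPROC_IFELSE([AC_LANG_PROGRAM([[
#include <ldap.h>
#if defined(LDAP_API_FEATURE_X_OPENLDAP) && \
    LDAP_VENDOR_VERSION_MAJOR == 2 && LDAP_VENDOR_VERSION_MINOR == 4 && \
    LDAP_VENDOR_VERSION_PATCH >= 24 && LDAP_VENDOR_VERSION_PATCH <= 31
#error vulnerable OpenLDAP version
#endif
]], [])],
    [AC_MSG_RESULT([no])],
    [AC_MSG_RESULT([yes])
     AC_MSG_WARN([OpenLDAP 2.4.24 through 2.4.31 can crash backends at exit when LDAP authentication is combined with loading extra libraries])])
fi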

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Abhijit Menon-Sen
At 2014-06-19 17:53:17 +0200, vik.fear...@dalibo.com wrote:
>
> I much prefer with "in" but it doesn't much matter.

If you look at similar settings like statement_timeout, lock_timeout,
etc., what comes before the _timeout is a concrete *thing* that can
timeout, or that a timeout can be applied to (e.g. wal_receiver).

"What's timing out?" "A statement."

But in "idle_in_transaction_timeout", "idle_in_transaction" is not a
thing. It's a description of the state of a thing (the thing being a
session in the FATAL variant of your patch, or a transaction in the
ERROR variant).

"What's timing out?" "An idle in transaction." "Huh?"

Strictly speaking, by this logic, the consistent name for the setting in
the FATAL variant would be "idle_in_transaction_session_timeout", while
for the ERROR variant it would be "idle_transaction_timeout".

Both those names pass the "What's timing out?" test. But
"idle_in_transaction_timeout" alone doesn't, and that's why I can't
bring myself to like it. And pgbouncer's use of
"idle_transaction_timeout" is a weak precedent to continue to use the
same name for the same functionality.

Anyway, as you say, it doesn't matter so much. I promise I won't beat
the nomenclature horse any more. I just wanted to explain my thinking
once. Sorry for dragging it out.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [REVIEW] Re: Re: BUG #9578: Undocumented behaviour for temp tables created inside query language (SQL) functions

2014-06-19 Thread Tom Lane
Haribabu Kommi  writes:
> Updated patch attached.

I revised this a bit more and committed it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2014-06-19 Thread Greg Stark
On Wed, Jun 18, 2014 at 4:51 AM, Heikki Linnakangas
 wrote:
> Implementing something is a good way to demonstrate how it would look like.
> But no, I don't insist on implementing every possible design whenever a new
> feature is proposed.
>
> I liked Greg's sketch of what the opclass support functions would be. It
> doesn't seem significantly more complicated than what's in the patch now.

As a counter-point to my own point, there will be nothing stopping us
in the future from generalizing things. Dealing with catalogs is
mostly book-keeping headaches and careful work. It's something that
might be well-suited for a GSoC project or a first patch from someone
looking to familiarize themselves with the system architecture. It's
hard to invent a whole new underlying infrastructure at the same time
as dealing with all that book-keeping, and it's hard for someone
familiarizing themselves with the system to also have a great new
idea. Having tasks like this that are easy to explain and that a mentor
understands well can be easier to manage than tasks where the newcomer
has some radical new idea.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Joe Conway

On 06/19/2014 08:18 AM, Tom Lane wrote:
> The code is really designed to put all the setup for storeRow into
> one place; but I concur with Joe that having teardown in a
> different place from setup isn't very nice.
> 
> An alternative that might be worth thinking about is putting both
> the context creation and deletion at the outermost level where the
> storeInfo struct is defined.  That seems possibly a shade less
> surprising than having it at the intermediate level.

Fair enough -- this patch does it at that level in
materializeQueryResult()

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index a81853f..957fe01 100644
*** a/contrib/dblink/dblink.c
--- b/contrib/dblink/dblink.c
*** materializeQueryResult(FunctionCallInfo
*** 977,982 
--- 977,990 
  
  	PG_TRY();
  	{
+ 		/* Create short-lived memory context for data conversions */
+ 		if (!sinfo.tmpcontext)
+ 			sinfo.tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
+ "dblink temporary context",
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
+ 
  		/* execute query, collecting any tuples into the tuplestore */
  		res = storeQueryResult(&sinfo, conn, sql);
  
*** materializeQueryResult(FunctionCallInfo
*** 1041,1046 
--- 1049,1060 
  			PQclear(res);
  			res = NULL;
  		}
+ 
+ 		/* clean up data conversion short-lived memory context */
+ 		if (sinfo.tmpcontext != NULL)
+ 			MemoryContextDelete(sinfo.tmpcontext);
+ 		sinfo.tmpcontext = NULL;
+ 
  		PQclear(sinfo.last_res);
  		sinfo.last_res = NULL;
  		PQclear(sinfo.cur_res);
*** storeRow(storeInfo *sinfo, PGresult *res
*** 1204,1218 
  		if (sinfo->cstrs)
  			pfree(sinfo->cstrs);
  		sinfo->cstrs = (char **) palloc(nfields * sizeof(char *));
- 
- 		/* Create short-lived memory context for data conversions */
- 		if (!sinfo->tmpcontext)
- 			sinfo->tmpcontext =
- AllocSetContextCreate(CurrentMemoryContext,
- 	  "dblink temporary context",
- 	  ALLOCSET_DEFAULT_MINSIZE,
- 	  ALLOCSET_DEFAULT_INITSIZE,
- 	  ALLOCSET_DEFAULT_MAXSIZE);
  	}
  
  	/* Should have a single-row result if we get here */
--- 1218,1223 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Tom Lane
Joe Conway  writes:
>> Probably so. I'll try to scrounge up some time to test the
>> performance impact of your patch.

> Not the most scientific of tests, but I think a reasonable one:
> ...
> 2.7% performance penalty

Thanks.  While that's not awful, it's enough to be annoying.

I think we could mitigate this by allocating the argument context once in
nodeFunctionscan setup, and passing it into ExecMakeTableFunctionResult;
so we'd only do a memory context reset not a create/delete in each cycle.
That would make the patch a bit more invasive, but not much.

Back-patchability would depend on whether you think there's any third
party code calling ExecMakeTableFunctionResult; I kinda doubt that,
but I wonder if anyone has a different opinion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Joe Conway

On 06/19/2014 09:18 AM, Tom Lane wrote:
> I think we could mitigate this by allocating the argument context
> once in nodeFunctionscan setup, and passing it into
> ExecMakeTableFunctionResult; so we'd only do a memory context reset
> not a create/delete in each cycle. That would make the patch a bit
> more invasive, but not much.
> 
> Back-patchability would depend on whether you think there's any
> third party code calling ExecMakeTableFunctionResult; I kinda doubt
> that, but I wonder if anyone has a different opinion.

Seems unlikely to me.

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] change alter user to be a true alias for alter role

2014-06-19 Thread Vik Fearing
On 01/20/2014 03:31 PM, Jov wrote:
> the doc says:
> 
> ALTER USER is now an alias for ALTER ROLE.
> 
> but alter user lacks the following format:
> 
> [...]
> 
> this makes alter user a true alias for alter role.

This is a straightforward patch that does exactly what it says on the
tin.  I have marked it ready for committer.
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-19 Thread Andres Freund
On 2014-06-19 11:00:51 -0400, Alvaro Herrera wrote:
> Merlin Moncure wrote:
> > On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen  
> > wrote:
> > > Let's not pretend to support platforms we have no practical way of
> > > verifying.
> > 
> > This is key.  The buildfarm defines the set of platforms we support.
> 
> This means that either Renesas should supply us with a bunch of
> buildfarm members for their SuperH and M32R stuff, or they should be
> considered unsupported as well.

Hm. I've missed SuperH in my comparison - it's not actually documented
to be a supported platform...

> I don't really care all that much about Linux and the glibc situation on
> M32R; I mean that platform is weird enough that it might well have its
> own, unheard-of Unix clone.

They have their own compiler for some realtime OS (not posix-y and
discontinued!) - but that doesn't support the gcc syntax used by
s_lock.h. So it's already not supported.

> But if people expect to use Postgres on it, we need BF members.

Yes. But I think it's exceedingly unlikely.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Joe Conway

On 06/19/2014 08:50 AM, Alvaro Herrera wrote:
> Joe Conway wrote:
>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA1
>> 
>> On 06/18/2014 08:50 PM, Joe Conway wrote:
>>> On 06/18/2014 08:45 PM, Tom Lane wrote:
 Well, we usually think memory leaks are back-patchable bugs.
 I'm a bit worried about the potential performance impact of
 an extra memory context creation/deletion though.  It's
 probably not noticeable in this test case, but that's just
 because dblink() is such a spectacularly expensive function.
>>> 
>>> Probably so. I'll try to scrounge up some time to test the 
>>> performance impact of your patch.
>> 
>> Not the most scientific of tests, but I think a reasonable one:
> 
> Is this an assert-enabled build?

No:

CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g

CONFIGURE = '--prefix=/usr/local/pgsql-head' '--with-pgport=55437'
'--enable-debug' '--enable-nls' '--with-pam' '--enable-thread-safety'
'--with-openssl' '--with-krb5'
'--with-includes=/usr/include/kerberosIV'
'--with-includes=/usr/include/et' '--enable-integer-datetimes'
'--with-libxml' '' 'build_alias=' 'host_alias=' 'target_alias='
'DOCBOOKSTYLE=/usr/share/sgml/docbook/stylesheet/dsssl/modular'


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Abhijit Menon-Sen
At 2014-06-19 08:42:01 -0700, kgri...@ymail.com wrote:
>
> I'm not sure whether using the same name as pgbouncer for the same
> functionality makes it less confusing or might lead to confusion
> about which layer is disconnecting the client.

I don't think the names of the respective configuration settings will
significantly affect that confusion either way.

(I can imagine people being confused if they're using pgbouncer without
the timeout in front of a server with the timeout. But since they would
have to have turned it on explicitly on the server, it can't be all THAT
much of a surprise. Or at least it's not that hard to figure out.)

> As long as BEGIN causes a connection to show as "idle in transaction"
> I think the BEGIN should start the clock on this timeout, based on
> POLA.

Yes, agreed. I don't think it's worth doing anything more complicated.

> Also, it seems like most are ok with the FATAL approach (as in v1
> of this patch).

I don't have a strong opinion, but I think FATAL is the pragmatic
approach.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-06-19 Thread Kevin Grittner
Kevin Grittner  wrote:

> I've already said that I now think we should use the standard
> CREATE TRIGGER syntax to specify the transition tables, and that
> if we do that we don't need the reloption that's in the patch. 
> If you ignore the 19 lines of new code to add that reloption,
> absolutely 100% of the code changes in the patch so far are in
> trigger.c and trigger.h.

Although nobody has actually framed their feedback as a review, I
feel that I have enough to work with to throw the patch into
Waiting on Author status.  Since I started with the assumption that
I was going to be using standard syntax and got a ways into that
before convincing myself it was a bad idea, I should have a new
version of the patch working that way in a couple days.
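
As a rough sketch of the SQL-standard REFERENCING clause being referred to,
grafted onto PostgreSQL's CREATE TRIGGER syntax (not accepted by PostgreSQL
at the time of writing; the table, trigger, and function names are invented):

8<---
CREATE TRIGGER orders_audit
    AFTER UPDATE ON orders
    REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows
    FOR EACH STATEMENT
    EXECUTE PROCEDURE orders_audit_func();
8<---

The trigger function would then see the before and after row sets as the
old_rows and new_rows transition tables.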

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Vik Fearing
On 06/19/2014 05:42 PM, Kevin Grittner wrote:
> Abhijit Menon-Sen  wrote:
>> At 2014-06-19 10:58:34 +0200, pavel.steh...@gmail.com wrote:
> 
>>> pgbouncer has idle_transaction_timeout defined some years
>>> without any problems, so we can take this design
>>>
>>> idle_transaction_timeout
>>
>> Let's steal the name too. :-)
>>
>> (I didn't realise there was precedent in pgbouncer, but given
>> that the name is in use already, it'll be less confusing to use
>> that name for Postgres as well.)
> 
> I'm not sure whether using the same name as pgbouncer for the same
> functionality makes it less confusing or might lead to confusion
> about which layer is disconnecting the client.  I'm leaning toward
> using the same name, but I'm really not sure.  Does anyone else
> want to offer an opinion?

I much prefer it with "in", but it doesn't much matter.

> Somehow I had thought that pg_stat_activity didn't show a
> connection as "idle in transaction" as soon as BEGIN was run -- I
> thought some subsequent statement was needed to put it into that
> state; but I see that I was wrong about that.  As long as BEGIN
> causes a connection to show as "idle in transaction" I think the
> BEGIN should start the clock on this timeout, based on POLA.  

For me, this is a separate patch.  Whether it goes before or after this
one doesn't make a big difference, I don't think.

> It wouldn't bother me to see us distinguish among early transaction
> states, but I'm not sure whether it's really worth the effort.  We
> couldn't base it just on whether the transaction has acquired a
> snapshot, since it is not unusual for explicit LOCK statements at
> the front of the transaction to run before a snapshot is acquired,
> and we would want to terminate a transaction sitting idle and
> holding locks.
> 
> Also, it seems like most are ok with the FATAL approach (as in v1
> of this patch).  Does anyone want to make an argument against
> proceeding with that, in light of discussion so far?

From my view, we have these outstanding issues:
  - the name
  - begin-only transactions
  - aborted transactions

Those last two could arguably be considered identical.  If the client
issued a BEGIN, then the client is in a transaction and I don't think we
should report otherwise.  So then we need to decide why "idle in
transaction (aborted)" gets killed but "idle in transaction (initiated)"
doesn't.  The v1 patch doesn't currently kill the former, but there was
a question of whether it should.  I still believe it shouldn't.

Anyway, I'm willing to modify the patch in any way that consensus dictates.
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Alvaro Herrera
Joe Conway wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 06/18/2014 08:50 PM, Joe Conway wrote:
> > On 06/18/2014 08:45 PM, Tom Lane wrote:
> >> Well, we usually think memory leaks are back-patchable bugs.  I'm
> >> a bit worried about the potential performance impact of an extra
> >>  memory context creation/deletion though.  It's probably not 
> >> noticeable in this test case, but that's just because dblink()
> >> is such a spectacularly expensive function.
> > 
> > Probably so. I'll try to scrounge up some time to test the
> > performance impact of your patch.
> 
> Not the most scientific of tests, but I think a reasonable one:

Is this an assert-enabled build?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Kevin Grittner
Abhijit Menon-Sen  wrote:
> At 2014-06-19 10:58:34 +0200, pavel.steh...@gmail.com wrote:

>> pgbouncer has idle_transaction_timeout defined some years
>> without any problems, so we can take this design
>>
>> idle_transaction_timeout
>
> Let's steal the name too. :-)
>
> (I didn't realise there was precedent in pgbouncer, but given
> that the name is in use already, it'll be less confusing to use
> that name for Postgres as well.)

I'm not sure whether using the same name as pgbouncer for the same
functionality makes it less confusing or might lead to confusion
about which layer is disconnecting the client.  I'm leaning toward
using the same name, but I'm really not sure.  Does anyone else
want to offer an opinion?

Somehow I had thought that pg_stat_activity didn't show a
connection as "idle in transaction" as soon as BEGIN was run -- I
thought some subsequent statement was needed to put it into that
state; but I see that I was wrong about that.  As long as BEGIN
causes a connection to show as "idle in transaction" I think the
BEGIN should start the clock on this timeout, based on POLA.  It
wouldn't bother me to see us distinguish among early transaction
states, but I'm not sure whether it's really worth the effort.  We
couldn't base it just on whether the transaction has acquired a
snapshot, since it is not unusual for explicit LOCK statements at
the front of the transaction to run before a snapshot is acquired,
and we would want to terminate a transaction sitting idle and
holding locks.
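
For concreteness, a minimal psql sketch of the state in question (the GUC
shown is still only a proposal and its final name is not settled):

8<---
-- session 1: open a transaction and then go idle
BEGIN;

-- session 2: session 1 already shows up as "idle in transaction"
SELECT pid, state, xact_start, state_change
FROM pg_stat_activity
WHERE state = 'idle in transaction';

-- with the proposed setting, e.g.
--   SET idle_in_transaction_timeout = '10min';
-- the clock would start at BEGIN and session 1 would be terminated
-- once it stays idle past the timeout
8<---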

Also, it seems like most are ok with the FATAL approach (as in v1
of this patch).  Does anyone want to make an argument against
proceeding with that, in light of discussion so far?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Joe Conway

On 06/18/2014 08:50 PM, Joe Conway wrote:
> On 06/18/2014 08:45 PM, Tom Lane wrote:
>> Well, we usually think memory leaks are back-patchable bugs.  I'm
>> a bit worried about the potential performance impact of an extra
>>  memory context creation/deletion though.  It's probably not 
>> noticeable in this test case, but that's just because dblink()
>> is such a spectacularly expensive function.
> 
> Probably so. I'll try to scrounge up some time to test the
> performance impact of your patch.

Not the most scientific of tests, but I think a reasonable one:

8<---
create function testcontext(arg int) returns setof int as
$$select $1$$ language sql;

do $$
declare
i int;
rec record;
begin
for i in 1..100 loop
  select * into rec from testcontext(42);
end loop;
end
$$;
8<---

I threw out the first run (not shown) in each group below.

without patch
- -
Time: 9930.320 ms
Time: 9963.114 ms
Time: 10048.067 ms
Time: 9899.810 ms
Time: 9926.066 ms
Time: 9996.044 ms
Time: 9919.095 ms
Time: 9921.482 ms
Time: 9904.839 ms
Time: 9990.285 ms
=
Avg:  9949.912 ms

with patch
- -
Time: 10172.148 ms
Time: 10203.585 ms
Time: 10142.779 ms
Time: 10211.159 ms
Time: 10109.001 ms
Time: 10216.619 ms
Time: 10221.820 ms
Time: 10153.220 ms
Time: 10147.540 ms
Time: 10176.478 ms
==
Avg:  10175.435 ms

without patch
- -
Time: 9897.838 ms
Time: 9848.947 ms
Time: 9830.405 ms
Time: 9824.837 ms
Time: 9828.743 ms
Time: 9990.998 ms
Time: 9820.717 ms
Time: 9853.545 ms
Time: 9850.613 ms
Time: 9836.452 ms
=
Avg:  9858.310 ms

2.7% performance penalty

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread Tom Lane
"MauMau"  writes:
> From: "Joe Conway" 
>> Any objections to me back-patching it this way?

> I thought the same at first before creating the patch, but I reconsidered. 
> If the query executed by dblink() doesn't return any row, the context 
> creation and deletion are a waste of processing.  I think the original author 
> wanted to eliminate this waste by creating the context only when dblink() 
> actually returns a row.  I'd like to respect his thought.

I don't think speed is really worth worrying about.  The remote query will
take orders of magnitude more time than the possibly-useless context
creation.

The code is really designed to put all the setup for storeRow into one
place; but I concur with Joe that having teardown in a different place
from setup isn't very nice.

An alternative that might be worth thinking about is putting both the
context creation and deletion at the outermost level where the storeInfo
struct is defined.  That seems possibly a shade less surprising than
having it at the intermediate level.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-19 Thread Alvaro Herrera
Merlin Moncure wrote:
> On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen  
> wrote:
> > Let's not pretend to support platforms we have no practical way of
> > verifying.
> 
> This is key.  The buildfarm defines the set of platforms we support.

This means that either Renesas should supply us with a bunch of
buildfarm members for their SuperH and M32R stuff, or they should be
considered unsupported as well.

I don't really care all that much about Linux and the glibc situation on
M32R; I mean that platform is weird enough that it might well have its
own, unheard-of Unix clone.  But if people expect to use Postgres on it,
we need BF members.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] Memory leak in dblink

2014-06-19 Thread MauMau

From: "Joe Conway" 

I think the context deletion was missed in the first place because
storeRow() is the wrong place to create the context. Rather than
creating the context in storeRow() and deleting it two levels up in
materializeQueryResult(), I think it should be created and deleted in
the interim layer, storeQueryResult(). Patch along those lines attached.

Any objections to me back-patching it this way?


I thought the same at first before creating the patch, but I reconsidered. 
If the query executed by dblink() doesn't return any row, the context 
creation and deletion are a waste of processing.  I think the original author 
wanted to eliminate this waste by creating the context only when dblink() 
actually returns a row.  I'd like to respect his thought.
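
To make the two code paths concrete (the connection string and queries below
are just placeholders):

8<---
-- zero-row result: storeRow() never sees a row, so with the original coding
-- the per-row conversion context is never created
SELECT * FROM dblink('dbname=postgres',
                     'SELECT 1 WHERE false') AS t(x int);

-- many-row result: every row passes through storeRow(), which is where the
-- short-lived conversion context gets used
SELECT * FROM dblink('dbname=postgres',
                     'SELECT generate_series(1, 100000)') AS t(x int);
8<---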


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-19 Thread Merlin Moncure
On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen  wrote:
> Let's not pretend to support platforms we have no practical way of
> verifying.

This is key.  The buildfarm defines the set of platforms we support.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-06-19 Thread Michael Paquier
On Thu, Jun 19, 2014 at 11:13 PM, MauMau  wrote:
> * The patch applies cleanly.  Running "grep -lir wsock32 ." after applying
> the patch shows that wsock32.lib is no longer used.
> However, I wonder if some DLL entries in dlls[] in
> src/interfaces/libpq/win32.c should be removed.  winsock.dll should
> definitely be removed, because it is for 16-bit applications.  I don't know
> about the rest.  Especially,  wsock32n.dll and mswsock.dll may also be
> obsolete libraries from their names.  Could you look into this and revise
> the patch if possible?  But I don't mind if you do so.
(google-sensei..) wsock32n stands for Hummingbird Socks V4 Winsock,
while mswsock implements the Windows socket SPI interface. I think we
should keep them and not risk opening a can of worms, but perhaps a
Windows guru has a different opinion on the matter.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] calculation for NUM_FIXED_LWLOCKS

2014-06-19 Thread Amit Kapila
On Thu, Jun 19, 2014 at 7:41 PM, Kevin Grittner  wrote:
>
> Kevin Grittner  wrote:
>
> > Will commit a fix shortly.
>
> Fixed exactly as Amit suggested.  Pushed to master and 9.4 branches.

Thanks for looking into it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-19 Thread Amit Kapila
On Thu, Jun 19, 2014 at 5:21 PM, Fujii Masao  wrote:
>
> On Wed, Jun 18, 2014 at 1:07 PM, Abhijit Menon-Sen 
wrote:
> > At 2014-06-18 12:55:36 +0900, masao.fu...@gmail.com wrote:
> >>
> >> Also I'm thinking to backpatch this to 9.4 because this is a bugfix
> >> rather than new feature.
> >
> > That sounds reasonable, thanks.
>
> Committed!

Thank you.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] calculation for NUM_FIXED_LWLOCKS

2014-06-19 Thread Kevin Grittner
Kevin Grittner  wrote:

> Will commit a fix shortly.

Fixed exactly as Amit suggested.  Pushed to master and 9.4 branches.

Thanks!

--

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-06-19 Thread MauMau

From: "Michael Paquier" 

You are right, I only though about the MS scripts when working on this
patch. Updated patch for a complete cleanup is attached (note I
updated manually ./configure for test purposes and did not re-run
autoconf).


I marked this patch as ready for committer.  I confirmed:

* The patch content is correct.

* The patch applies cleanly.  Running "grep -lir wsock32 ." after applying 
the patch shows that wsock32.lib is no longer used.


* The binaries can be built with MSVC 2008 Express.  I didn't try to build 
on MinGW.


* The regression tests pass ("vcregress check").

However, I wonder if some DLL entries in dlls[] in 
src/interfaces/libpq/win32.c should be removed.  winsock.dll should 
definitely be removed, because it is for 16-bit applications.  I don't know 
about the rest.  Especially,  wsock32n.dll and mswsock.dll may also be 
obsolete libraries from their names.  Could you look into this and revise 
the patch if possible?  But I don't mind if you do so.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-19 Thread Michael Paquier
>> This seems to be related to re-using the table-name between invocations. The
>> attached patch should fix point 2. As for point 1, I don't know the cause for
>> it. Do you have a reproducible test case ?
> Sure. I'll try harder when looking in more detail at the patch for
> postgres_fdw :)
With v2, I randomly tried some of the scenarios from v1 plus some
extras, but was not able to reproduce it.

> I'll look into the patch for postgres_fdw later.
And here are some comments about it, when applied on top of the other 2 patches.
1) Code compiles without warning, regression tests pass.
2) In fetch_remote_tables, the palloc for the parameters should be
done after the number of parameters is determined. In the case of
IMPORT_ALL, some memory is wasted for nothing.
3) The current code is not able to import default expressions for a table.
A little bit of processing with pg_get_expr would be necessary:
select pg_catalog.pg_get_expr(adbin, adrelid) from pg_attrdef;
There are of course bonus cases like SERIAL columns that come immediately
to mind, but it would be possible to determine whether a given column is
serial with pg_get_serial_sequence (see the sketch after this list).
This would be a good addition for the FDW IMO.
4) The same applies of course to the constraint name: CREATE FOREIGN
TABLE foobar (a int CONSTRAINT toto NOT NULL) for example.
5) A bonus idea: retrieve default expressions and the null/not null
flag by default, but add options to disable each of them.
6) Defining PGFDW_IMPORTRESULT_NUMCOLS once at the top of
postgres_fdw.c, without undefining it, would be perfectly fine.
7) In postgresImportForeignSchema, the palloc calls and the
definitions of the variables used to save the results should be done
within the for loop.
8) At a quick glance, the logic of postgresImportForeignSchema looks
awkward... I'll have a second look with a fresher mind later on this
one.
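
As a sketch of the catalog lookups mentioned in point 3 (the table and column
names are placeholders):

8<---
-- default expressions for the columns of one table
SELECT a.attname,
       pg_catalog.pg_get_expr(d.adbin, d.adrelid) AS default_expr
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_attrdef d
  ON d.adrelid = a.attrelid AND d.adnum = a.attnum
WHERE a.attrelid = 'some_table'::regclass
  AND a.attnum > 0
  AND NOT a.attisdropped;

-- a non-NULL result suggests the column is backed by a serial sequence
SELECT pg_catalog.pg_get_serial_sequence('some_table', 'id');
8<---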
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_receivexlog add synchronous mode

2014-06-19 Thread Fujii Masao
On Mon, Jun 16, 2014 at 7:03 PM,   wrote:
>> You introduced the state machine using the flag "flush_flg" into
>> pg_receivexlog.
>> That's complicated and would reduce the readability of the source code.
>> I think that the logic should be simpler like walreceiver's one.
>>
>> Maybe I found one problematic path as follows:
>>
>> 1. WAL is written and flush_flag is set to 1 2. PQgetCopyData() returns
>> 0 and flush_flg is incremented to 2 3. PQconsumeInput() is executed 4.
>> PQgetCopyData() reads keepalive message 5. After processing keepalive
>> message, PQgetCopyDate() returns 0 6. Since flush_flg is 2, WAL is
>> flushed and flush_flg is reset to 0
>>
>> But a new message can arrive while the keepalive message is being processed.
>> Before flushing WAL, such a new message should be processed.
> Together with the readability concern, I changed this to use the same
> processing loop as walreceiver.
>
>> +Enables synchronous mode. If replication slot is disabled then
>> +this setting is irrelevant.
>>
>> Why is that irrelevant in that case?
>>
>> Even when a replication slot is not used, thanks to this feature,
>> pg_receivexlog can flush WAL more proactively, which may improve the
>> durability of the WAL which pg_receivexlog writes.
> It means whether to report the flush position or not.
> If the slot is not used, it is not reported.
> Fixed so that the flush position is reported only when using a slot.
>
>> +printf(_("  -m, --sync-modesynchronous mode\n"));
>>
>> I think that calling this feature "synchronous mode" is confusing.
> Modified the "synchronous mode" to "this mode is written some records, flush 
> them to disk.".

I found that this patch breaks the --status-interval option of pg_receivexlog
when the -m option which the patch introduced is supplied. When -m is set,
pg_receivexlog tries to send the feedback message as soon as it flushes the
WAL file, even if the status interval timeout has not passed yet. If you
want to send the feedback as soon as WAL is written or flushed, like
walreceiver does, you need to extend the --status-interval option, for example,
so that it accepts the value "-1", which means enabling that behavior.

Including this change in your original patch would make it more difficult
to review. I think that you should implement this as a separate patch.
Thoughts?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2014-06-19 Thread Tom Lane
Vik Fearing  writes:
> On 06/18/2014 12:46 PM, Andres Freund wrote:
>> So to implement a feature one now has to implement the most generic
>> variant as a prototype first? Really?

> Well, there is the inventor's paradox to consider.

I have not seen anyone demanding a different implementation in this
thread.  What *has* been asked for, and not supplied, is a concrete
defense of the particular level of generality that's been selected
in this implementation.  It's not at all clear to the rest of us
whether it was the right choice, and that is something that ought
to be asked now not later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for multiple column assignment in UPDATE

2014-06-19 Thread Tom Lane
Pavel Stehule  writes:
> I did some tests and It looks so it allows only some form of nested loop.

[ shrug... ]  It's a subplan.  One evaluation per outer row is what
people are expecting.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] calculation for NUM_FIXED_LWLOCKS

2014-06-19 Thread Kevin Grittner
Amit Kapila  wrote:

> #define PREDICATELOCK_MANAGER_LWLOCK_OFFSET \
> (NUM_INDIVIDUAL_LWLOCKS + NUM_LOCK_PARTITIONS)

> In this PREDICATELOCK_MANAGER_LWLOCK_OFFSET
> should consider NUM_BUFFER_PARTITIONS which means
> it should be:
>
> #define PREDICATELOCK_MANAGER_LWLOCK_OFFSET \
> (LOCK_MANAGER_LWLOCK_OFFSET + NUM_LOCK_PARTITIONS)

Agreed.  It looks like a pasto in commit ea9df81.

Will commit a fix shortly.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2014-06-19 Thread Heikki Linnakangas

On 06/18/2014 06:09 PM, Claudio Freire wrote:

On Tue, Jun 17, 2014 at 8:48 PM, Greg Stark  wrote:

An aggregate to generate a min/max "bounding box" from several values
A function which takes a "bounding box" and a new value and returns
the new "bounding box"
A function which tests if a value is in a "bounding box"
A function which tests if a "bounding box" overlaps a "bounding box"


Which I'd generalize a bit further by renaming "bounding box" with
"compressed set", and allow it to be parameterized.


What do you mean by parameterized?


So, you have:

An aggregate to generate a "compressed set" from several values
A function which adds a new value to the "compressed set" and returns
the new "compressed set"
A function which tests if a value is in a "compressed set"
A function which tests if a "compressed set" overlaps another
"compressed set" of equal type


Yeah, something like that. I'm not sure I like the "compressed set" term 
any more than bounding box, though. GiST seems to have avoided naming 
the thing, and just talks about "index entries". But if we can come up 
with a good name, that would be more clear.



One problem with such a generalized implementation would be that I'm
not sure in-place modification of the "compressed set" on-disk can be
assumed to be safe in all cases. Surely, for strictly-enlarging sets
it would, but while min/max and bloom filters both fit the bill, it's
not clear that one can assume this for all structures.


I don't understand what you mean. It's a fundamental property of minmax 
indexes that you can always replace the "min" or "max" or "compressed 
set" or "bounding box" or whatever with another datum that represents 
all the keys that the old one did, plus some.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-19 Thread Abhijit Menon-Sen
At 2014-06-19 13:33:03 +0200, p...@2ndquadrant.com wrote:
>
> I think quite the opposite: it's better to say we don't support the
> obscure platform than to say that we do while having no active testing or
> proof that it indeed works, leaving somebody to find out the hard way that
> there are issues.

Yes, I strongly agree. I've been in the position of having to get things
working on obscure platforms, and was definitely not fond of finding old
rotten code that was kept around "just in case", which nobody actually
cared about (or was familiar with) any more.

Having been on that side of the fence, I don't feel guilty saying that
if someone *really* cares about running the very latest Postgres on an
unsupported platform, they can do some digging around in the archives
and repository and do the necessary legwork.

Let's not pretend to support platforms we have no practical way of
verifying.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-19 Thread Fujii Masao
On Wed, Jun 18, 2014 at 1:07 PM, Abhijit Menon-Sen  wrote:
> At 2014-06-18 12:55:36 +0900, masao.fu...@gmail.com wrote:
>>
>> Also I'm thinking to backpatch this to 9.4 because this is a bugfix
>> rather than new feature.
>
> That sounds reasonable, thanks.

Committed!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-19 Thread Petr Jelinek

On 18/06/14 17:15, Robert Haas wrote:

6) armv-v5


I think this is also a bit less dead than the other ones; Red Hat's
Bugzilla shows people filing bugs for platform-specific problems
as recently as January of 2013:

https://bugzilla.redhat.com/show_bug.cgi?id=892378


Closed as WONTFIX :P.

Joking aside, I think there are still use cases for ARMv5 - but it's
embedded stuff without a real OS and such. Nothing you'd install PG
on. There are distributions that are dropping ARMv6 support already... My
biggest problem is that it's not even documented whether v5 has atomic
4-byte stores - while it is documented for v6.


I think in doubtful cases we might as well keep the support in.  If
you've got the fallback to non-atomics, keeping the other code around
doesn't hurt much, and might make it easier for someone who is
interested in one of those platforms.  It's fine and good to kill
things that are totally dead, but I think it's better for a user of
some obscure platform to find that it doesn't *quite* work than that
we've deliberately broken it.  But maybe I am being too conservative.



I think quite the opposite: it's better to say we don't support the
obscure platform than to say that we do while having no active testing or
proof that it indeed works, leaving somebody to find out the hard way that
there are issues.


I also have a hard time imagining somebody in 2016 installing a brand new 
Postgres 9.5 on their 20-year-old 200MHz (or something) machine and 
doing something meaningful with it. It's not like we are removing 
supported platforms from old releases, but this basically means we are 
going to support some of these obscure, almost dead platforms until at least 
2020, in some cases longer than their creators and even their OSes do.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] calculation for NUM_FIXED_LWLOCKS

2014-06-19 Thread Amit Kapila
Currently the calculation for NUM_FIXED_LWLOCKS is as
below, which I think has some problem:

/* Offsets for various chunks of preallocated lwlocks. */
#define BUFFER_MAPPING_LWLOCK_OFFSET NUM_INDIVIDUAL_LWLOCKS

#define LOCK_MANAGER_LWLOCK_OFFSET \
    (BUFFER_MAPPING_LWLOCK_OFFSET + NUM_BUFFER_PARTITIONS)

#define PREDICATELOCK_MANAGER_LWLOCK_OFFSET \
    (NUM_INDIVIDUAL_LWLOCKS + NUM_LOCK_PARTITIONS)

#define NUM_FIXED_LWLOCKS \
    (PREDICATELOCK_MANAGER_LWLOCK_OFFSET + NUM_PREDICATELOCK_PARTITIONS)

Here, PREDICATELOCK_MANAGER_LWLOCK_OFFSET should take
NUM_BUFFER_PARTITIONS into account, which means it should be:

#define PREDICATELOCK_MANAGER_LWLOCK_OFFSET \
    (LOCK_MANAGER_LWLOCK_OFFSET + NUM_LOCK_PARTITIONS)

If we don't consider the buffer partitions in the above calculation,
then the total shared memory allocated for fixed locks will
be wrong and can cause problems.

I have noticed this during my work on scaling shared
buffers, where increasing NUM_BUFFER_PARTITIONS leads to a
hang in the tpc-b test, and as per my analysis
the reason is the above calculation.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Minmax indexes

2014-06-19 Thread Vik Fearing
On 06/18/2014 12:46 PM, Andres Freund wrote:
>>> Isn't 'simpler implementation' a valid reason that's already been
>>> discussed onlist? Obviously simpler implementation doesn't trump
>>> everything, but it's one valid reason...
>>> Note that I have zap to do with the design of this feature. I work for
>>> the same company as Alvaro, but that's pretty much it...
>>
>> Without some analysis (e.g implementing it and comparing), I don't buy that
>> it makes the implementation simpler to restrict it in this way. Maybe it
>> does, but often it's actually simpler to solve the general case.
>
> So to implement a feature one now has to implement the most generic
> variant as a prototype first? Really?

Well, there is the inventor's paradox to consider.
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC on WAL-logging hash indexes

2014-06-19 Thread Vik Fearing
On 04/30/2014 11:41 PM, Tom Lane wrote:
> We really oughta fix the WAL situation, not just band-aid around it.

After re-reading this thread, it is not clear that anyone is going to
work on it so I'll just ask:

Is anyone working on this?

If not, I'd like to put it on my plate.

-- 
Vik



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Abhijit Menon-Sen
At 2014-06-19 10:58:34 +0200, pavel.steh...@gmail.com wrote:
>
> Hello
> 
> pgbouncer has idle_transaction_timeout defined some years without any
> problems, so we can take this design
> 
> idle_transaction_timeout

Let's steal the name too. :-)

(I didn't realise there was precedent in pgbouncer, but given that the
name is in use already, it'll be less confusing to use that name for
Postgres as well.)

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-19 Thread Pavel Stehule
Hello

pgbouncer has idle_transaction_timeout defined some years without any
problems, so we can take this design

idle_transaction_timeout

If client has been in "idle in transaction" state longer, it will be
disconnected. [seconds]

Default: 0.0 (disabled)

This feature can be very important, and I have seen a few databases that were
unavailable due to leaked transactions.

Regards

Pavel



2014-06-19 1:46 GMT+02:00 Josh Berkus :

> On 06/18/2014 02:52 PM, Bruce Momjian wrote:
> > On Wed, Jun 18, 2014 at 04:41:30PM -0400, Robert Haas wrote:
> >> The only problem I see is that it makes the semantics kind of weird
> >> and confusing.  "Kill connections that are idle in transaction for too
> >> long" is a pretty clear spec; "kill connections that are idle in
> >> transaction except if they haven't executed any commands yet because
> >> we think you don't care about that case" is not quite as clear, and
> >> not really what the GUC name says, and maybe not what everybody wants,
> >> and maybe masterminding.
> >
> > "Kill connections that are idle in non-empty transaction block for too
> > long"
>
> Here's the POLS violation in this:
>
> "I have idle_in_transaction_timeout set to 10min, but according to
> pg_stat_activity I have six connections which are IIT for over an hour.
>  What gives?"
>
> Robert's right, not killing the "BEGIN;" only transactions is liable to
> result in user confusion unless we label those sessions differently in
> pg_stat_activity. Tom is right in that killing them will cause some
> users to not use IIT_timeout when they should, or will set the timeout
> too high to be useful.
>
> So it seems like what we should do is NOT call sessions IIT if they've
> merely executed a BEGIN; and not done anything else.  Then the timeout
> and pg_stat_activity would be consistent.
>
> Counter-argument: most app frameworks which do an automatic BEGIN; also
> do other stuff like SET TIMEZONE each time as well.  Is this really a
> case worth worrying about?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] replication commands and log_statements

2014-06-19 Thread Ian Barwick

On 12/06/14 20:37, Fujii Masao wrote:

On Wed, Jun 11, 2014 at 11:55 PM, Tom Lane  wrote:

Andres Freund  writes:

Your wish just seems like a separate feature to me. Including
replication commands in 'all' seems correct independent of the desire
for a more granular control.


No, I think I've got to vote with the other side on that.

The reason we can have log_statement as a scalar progression
"none < ddl < mod < all" is that there's little visible use-case
for logging DML but not DDL, nor for logging SELECTS but not
INSERT/UPDATE/DELETE.  However, logging replication commands seems
like something people would reasonably want an orthogonal control for.
There's no nice way to squeeze such a behavior into log_statement.

I guess you could say that log_statement treats replication commands
as if they were DDL, but is that really going to satisfy users?

I think we should consider log_statement to control logging of
SQL only, and invent a separate GUC (or, in the future, likely
more than one GUC?) for logging of replication activity.


Seems reasonable. OK. The attached patch adds a log_replication_command
parameter which causes replication commands to be logged. I added this to
the next CF.


A quick review:

- Compiles against HEAD
- Works as advertised
- Code style looks fine


A couple of suggestions:

- minor rewording for the description, mentioning that statements will
  still be logged at DEBUG1 even if the parameter is set to 'off' (might
  prevent reports of the kind "I set it to 'off', why am I still seeing
  log entries?"); a usage sketch follows these suggestions.

   
Causes each replication command to be logged in the server log.
See  for more information about
replication commands. The default value is off. When set to
off, commands will be logged at log level DEBUG1.
Only superusers can change this setting.
   

- I feel it would be more consistent to use the plural form
  for this parameter, i.e. "log_replication_commands", in line with
  "log_lock_waits", "log_temp_files", "log_disconnections" etc.

- It might be an idea to add a cross-reference to this parameter from
  the "Streaming Replication Protocol" page:
  http://www.postgresql.org/docs/devel/static/protocol-replication.html
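
A usage sketch for the parameter under review (only meaningful with the patch
applied, and the name may still become plural as suggested above):

8<---
-- as a superuser:
SET log_replication_command = on;
SHOW log_replication_command;
-- or set it cluster-wide in postgresql.conf:
--   log_replication_command = on
8<---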


Regards


Ian Barwick

--
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-06-19 Thread Michael Paquier
On Wed, Jun 18, 2014 at 8:37 PM, MauMau  wrote:
> Doing "grep -lir wsock32 ." shows the following files.  Could you modify and
> test these files as well for code cleanup?
> ./configure
> ./configure.in
> ./contrib/pgcrypto/Makefile
> ./src/interfaces/libpq/Makefile
> ./src/interfaces/libpq/win32.c
> ./src/interfaces/libpq/win32.mak
> ./src/test/thread/README
> ./src/tools/msvc/Mkvcbuild.pm
You are right, I only though about the MS scripts when working on this
patch. Updated patch for a complete cleanup is attached (note I
updated manually ./configure for test purposes and did not re-run
autoconf).
Regards,
-- 
Michael
From f429372f402057591ede3239e69e4b70fb846751 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 19 Jun 2014 16:26:38 +0900
Subject: [PATCH] Remove usage of wsock32 in Windows builds

MS scripts, as well as Windows port Makefile are cleaned up. wsock32.dll
is an obsolete WinSock 1.1 library, while ws2_32.dll is a successor
WinSock 2.0 library which is fully compatible with the old one.
---
 configure  |  2 +-
 configure.in   |  2 +-
 contrib/pgcrypto/Makefile  |  2 +-
 src/interfaces/libpq/Makefile  |  2 +-
 src/interfaces/libpq/win32.c   |  3 ---
 src/interfaces/libpq/win32.mak |  2 +-
 src/test/thread/README |  2 +-
 src/tools/msvc/Mkvcbuild.pm| 13 +
 8 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/configure b/configure
index da89c69..892f880 100755
--- a/configure
+++ b/configure
@@ -7661,7 +7661,7 @@ return socket ();
   return 0;
 }
 _ACEOF
-for ac_lib in '' socket wsock32; do
+for ac_lib in '' socket ws2_32; do
   if test -z "$ac_lib"; then
 ac_res="none required"
   else
diff --git a/configure.in b/configure.in
index 7dfbe9f..01bbe05 100644
--- a/configure.in
+++ b/configure.in
@@ -892,7 +892,7 @@ fi
 AC_CHECK_LIB(m, main)
 AC_SEARCH_LIBS(setproctitle, util)
 AC_SEARCH_LIBS(dlopen, dl)
-AC_SEARCH_LIBS(socket, [socket wsock32])
+AC_SEARCH_LIBS(socket, [socket ws2_32])
 AC_SEARCH_LIBS(shl_load, dld)
 # We only use libld in port/dynloader/aix.c
 case $host_os in
diff --git a/contrib/pgcrypto/Makefile b/contrib/pgcrypto/Makefile
index 1c85c98..4d91a06 100644
--- a/contrib/pgcrypto/Makefile
+++ b/contrib/pgcrypto/Makefile
@@ -54,7 +54,7 @@ SHLIB_LINK += $(filter -lcrypto -lz, $(LIBS))
 ifeq ($(PORTNAME), win32)
 SHLIB_LINK += $(filter -leay32, $(LIBS))
 # those must be at the end
-SHLIB_LINK += -lwsock32 -lws2_32
+SHLIB_LINK += -lws2_32
 endif
 
 rijndael.o: rijndael.tbl
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index 15ea648..718ecd6 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -70,7 +70,7 @@ else
 SHLIB_LINK += $(filter -lcrypt -ldes -lcom_err -lcrypto -lk5crypto -lkrb5 -lgssapi32 -lssl -lsocket -lnsl -lresolv -lintl $(PTHREAD_LIBS), $(LIBS)) $(LDAP_LIBS_FE)
 endif
 ifeq ($(PORTNAME), win32)
-SHLIB_LINK += -lshfolder -lwsock32 -lws2_32 -lsecur32 $(filter -leay32 -lssleay32 -lcomerr32 -lkrb5_32, $(LIBS))
+SHLIB_LINK += -lshfolder -lws2_32 -lsecur32 $(filter -leay32 -lssleay32 -lcomerr32 -lkrb5_32, $(LIBS))
 endif
 
 SHLIB_EXPORTS = exports.txt
diff --git a/src/interfaces/libpq/win32.c b/src/interfaces/libpq/win32.c
index 88d12d6..c0fe0fb 100644
--- a/src/interfaces/libpq/win32.c
+++ b/src/interfaces/libpq/win32.c
@@ -248,9 +248,6 @@ struct MessageDLL
 		"winsock.dll", 0, 0
 	},
 	{
-		"wsock32.dll", 0, 0
-	},
-	{
 		"ws2_32.dll", 0, 0
 	},
 	{
diff --git a/src/interfaces/libpq/win32.mak b/src/interfaces/libpq/win32.mak
index 23e09e9..99fef27 100644
--- a/src/interfaces/libpq/win32.mak
+++ b/src/interfaces/libpq/win32.mak
@@ -208,7 +208,7 @@ CPP_SBRS=.
 RSC_PROJ=/l 0x409 /fo"$(INTDIR)\libpq.res"
 
 LINK32=link.exe
-LINK32_FLAGS=kernel32.lib user32.lib advapi32.lib shfolder.lib wsock32.lib ws2_32.lib secur32.lib $(SSL_LIBS)  $(KFW_LIB) $(ADD_SECLIB) \
+LINK32_FLAGS=kernel32.lib user32.lib advapi32.lib shfolder.lib ws2_32.lib secur32.lib $(SSL_LIBS)  $(KFW_LIB) $(ADD_SECLIB) \
  /nologo /subsystem:windows /dll $(LOPT) /incremental:no \
  /pdb:"$(OUTDIR)\libpqdll.pdb" /machine:$(CPU) \
  /out:"$(OUTDIR)\$(OUTFILENAME).dll"\
diff --git a/src/test/thread/README b/src/test/thread/README
index 00ec2ff..4da2344 100644
--- a/src/test/thread/README
+++ b/src/test/thread/README
@@ -40,7 +40,7 @@ to test your system however, you can do so as follows:
 -D_POSIX_PTHREAD_SEMANTICS \
 -I../../../src/include/port/win32 \
 thread_test.c \
--lwsock32 \
+-lws2_32 \
 -lpthreadgc2
 
 3) Run thread_test.exe. You should see output like:
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 8002f23..0f63491 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -44,7 +44,7 @@ my @contrib_uselibpgcommon = (
 	'pg_test_fsync', 'pg_test_timing',
 	'pg_upgrade','pg_xlogdump',
 	'vacuumlo');
-my $contrib_extralibs = { 'pgbench' => ['wsock32.lib'] };
+my $contrib_extralibs = { 'pgbench' 

Re: [HACKERS] WIP patch for multiple column assignment in UPDATE

2014-06-19 Thread Pavel Stehule
Hello

I did some tests, and it looks like it only allows some form of nested loop.

postgres=# explain (analyze, timing off, buffers) update a1 set b = (select b from a2 where a1.a = a2.a);
                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Update on a1  (cost=0.00..8456925.00 rows=100 width=10) (actual rows=0 loops=1)
   Buffers: shared hit=9017134 read=14376 dirtied=58170 written=1014
   ->  Seq Scan on a1  (cost=0.00..8456925.00 rows=100 width=10) (actual rows=100 loops=1)
         Buffers: shared hit=4005465 read=4424 written=971
         SubPlan 1
           ->  Index Scan using a2_a_idx on a2  (cost=0.42..8.44 rows=1 width=4) (actual rows=1 loops=100)
                 Index Cond: (a1.a = a)
                 Buffers: shared hit=4005464
 Planning time: 0.212 ms
 Execution time: 30114.101 ms
(10 rows)
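
For comparison, here is a minimal sketch (using the same a1/a2 tables as in the test above) of a
join-style formulation with UPDATE ... FROM, which the planner can typically run as a hash or merge
join instead of executing the subplan once per outer row. It is not strictly equivalent: rows of a1
with no match in a2 are left untouched rather than having b set to NULL.

    -- join form: one join over a2 instead of one subplan execution per a1 row
    UPDATE a1 SET b = a2.b
    FROM a2
    WHERE a1.a = a2.a;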

Do you plan some more sophisticated mechanism, like MERGE or something similar?

Regards

Pavel


2014-06-16 17:17 GMT+02:00 Tom Lane :

> Attached is a very-much-WIP patch for supporting
>  UPDATE foo SET ..., (a,b,c) = (select x,y,z from ...), ...
>
> It lacks documentation, ruleutils.c support, or any solution for the
> rule NEW.* expansion problem I mentioned Saturday.  The reason I'm
> posting it now is to get feedback about an implementation choice that
> feels a bit klugy to me; but I don't see any clearly better idea.
>
> The post-parse-analysis representation I've chosen here is that the
> output columns of the sub-select are represented by PARAM_MULTIEXEC
> Params, and the sub-select itself appears in a resjunk entry at the end
> of the targetlist; that is, the UPDATE tlist is more or less like
>
>$1,  -- to be assigned to a
>$2,  -- to be assigned to b
>$3,  -- to be assigned to c
>(select x,y,z from ...),  -- resjunk entry, value will be discarded
>
> If the sub-select is uncorrelated with the outer query, the planner
> turns it into an InitPlan, replacing the resjunk tlist item with a
> NULL constant, and then everything happens normally at execution.
>
> But more usually, the sub-select will be correlated with the outer
> query.  In this case, the subquery turns into a SubPlan tree that
> stays in the resjunk item.  At the start of execution, the ParamExecData
> structs for each of its output Params are marked with execPlan pointers
> to the subplan, just as would happen for an InitPlan.  This causes the
> subplan to get executed when the first of the output Params is evaluated;
> it loads the ParamExecData structs for all its output Params, and then
> the later Params just take data from the structs.  When execution reaches
> the MULTIEXEC SubPlan in the resjunk tlist item, no evaluation of the
> subplan is needed; but instead we re-mark all the output ParamExecData
> structs with non-null execPlan pointers, so that a fresh execution of
> the subplan will happen in the next evaluation of the targetlist.
>
> The klugy aspect of this is that it assumes that the SubPlan item will
> appear later in the tlist than all the Params referencing it.  This is
> true at the moment because resjunk tlist items always appear after
> non-resjunk ones.  There are a few other places that already depend on
> this ordering, but we've mostly tried to avoid introducing new
> dependencies on it.
>
> The alternative that I'd originally had in mind, before put-it-in-a-
> resjunk-item occurred to me, was to invent a new "secondary tlist"
> field of Query and of ModifyTable plan nodes, as I sketched back in
> http://www.postgresql.org/message-id/1783.1399054...@sss.pgh.pa.us
> We'd put the MULTIEXEC SubPlans in this secondary tlist and expect
> the executor to evaluate it just before evaluating the main tlist.
> However, that way isn't terribly pretty either, because it extends
> knowledge of this feature to a *whole lot* of places that don't have
> to know about it in the attached approach; in particular, just about
> every place that manipulates targetlist contents would have to also
> manipulate the secondary tlist.
>
> Another approach is to explicitly identify which of the Params will
> be evaluated first and replace it with a node tree that evaluates
> the subplan (and sets output Params for the remaining columns).
> This is a bit messy because the first-to-be-evaluated is dependent
> on the targetlist reordering that the planner does; so we don't want
> parse analysis to try to do it.  (If we allowed parse analysis to
> know that the planner will sort the tlist items into physical column
> order, we could do it like that; but then it would break if we ever
> get around to allowing ALTER TABLE to change the physical order.)
> We could safely have setrefs.c determine the first-to-be-evaluated
> Param, though, since it will traverse the tlist in final order.
> So if we went with this approach I'd have setrefs.c replace the first
> Param with a SubPlan node.  That