Re: [HACKERS] Raising our compiler requirements for 9.6

2015-08-08 Thread Noah Misch
On Sat, Aug 08, 2015 at 02:30:47AM +0200, Andres Freund wrote:
 On 2015-08-07 20:16:20 -0400, Noah Misch wrote:
  I agree that lock.h offers little to frontend code.  Headers that the
  lockdefs.h patch made usable in the frontend, particularly genam.h and 
  hash.h,
  are no better.
 
 It's not that simple. Those two, and tuptoaster.h, are actually somewhat
 validly included by frontend code via the rmgr descriptor routines.

genam.h is a dependency of the non-frontend-relevant content of some
frontend-relevant headers, _exactly_ as lock.h has been.  I count zero things
in genam.h that a frontend program could harness.  The frontend includes
hash.h for two hashdesc.c prototypes, less than the material you moved out of
lock.h for frontend benefit.  Yes, it is that simple.

  The lock.h/lockdefs.h unprincipled split would do more harm
  than letting frontends continue to pull in lock.h.
 
 Why?

Your header comment for lockdefs.h sums up the harm nicely.  Additionally, the
term "defs" does nothing to explain the split.  lock2.h would be no less
evocative.

 Consider what happens when lock.h/c gets more complicated and
 e.g. grows some atomics. None of the frontend code should see that, and
 it's not much effort to keep it that way. Allowing client code to see
 LOCKMODE isn't something that's going to cause any harm.

Readying the headers for that day brings some value, but you added a worse
mess to achieve it.  The overall achievement has negative value.

nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Assert in pg_stat_statements

2015-08-08 Thread Satoshi Nagayasu
Hi,

I just found that pg_stat_statements causes an assertion failure when
queryId is set by another module that is loaded before pg_stat_statements
in the shared_preload_libraries parameter.

Theoretically, queryId in shared memory could be set by another module by
design.

So, IMHO, pg_stat_statements should not trigger an assertion failure even
if queryId already has a non-zero value --- my module has this problem now.

Is my understanding correct? Any comments?

Regards,

-- 
NAGAYASU Satoshi sn...@uptime.jp
commit b975d7c2fe1b36a3ded1e0960be191466704e0fc
Author: Satoshi Nagayasu sn...@uptime.jp
Date:   Sat Aug 8 08:51:45 2015 +

Fix pg_stat_statements to avoid assert failure when queryId already has 
non-zero value.

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 59b8a2e..84c5200 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -776,8 +776,9 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
if (prev_post_parse_analyze_hook)
prev_post_parse_analyze_hook(pstate, query);
 
-   /* Assert we didn't do this already */
-   Assert(query->queryId == 0);
+   /* Assume that other module has calculated and set queryId */
+   if (query->queryId > 0)
+       return;
 
/* Safety check... */
if (!pgss || !pgss_hash)



[HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
Hi everyone, I'm seeing some strange behavior and wanted to confirm it.

When executing a query that selects a long result set, if the code
processing the results takes its time (e.g. more than statement_timeout),
a timeout occurs. My expectation was that statement_timeout only affects
query *processing*, and does not cover the frontend actually processing the
result.

First, I wanted to confirm that this is the case (and not some sort of bug
in my code).

If this is the case, is this intended? Here are some points:
* It makes statement_timeout very difficult to use; clients do very diverse
things with database results, it may be very difficult (or impossible) to
estimate how much time they should take (e.g. random load factors on the
client machine or on some other machine receiving results).
* It makes it impossible to specifically detect problematic *queries* which
take too long to execute (as opposed to the time taken to process their
results).

If you do insist that this behavior is correct, a documentation update for
statement_timeout might make this clearer.

Thanks,

Shay


[HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins

2015-08-08 Thread Andreas Seltenreich
Hi,

there's a 1/1e6 chance that a sqlsmith query on the regression db of
master (c124cef) fails with

ERROR:  failed to build any {4..8}-way joins

They all appear to work fine on REL9_5_STABLE.

sample query:

select 1 from
  information_schema.collations as rel_60113935
  inner join information_schema.sql_sizing_profiles as rel_60113936
  on (rel_60113935.pad_attribute = rel_60113936.sizing_name )
inner join information_schema.foreign_tables as rel_60113937
on (rel_60113936.profile_id = rel_60113937.foreign_table_catalog )
  right join information_schema.constraint_table_usage as rel_60113938
  on (rel_60113935.collation_schema = rel_60113938.table_catalog )
left join public.btree_tall_tbl as rel_60113939
on (rel_60113936.required_value = rel_60113939.id )
where rel_60113938.table_name is not NULL;

regards,
Andreas




Re: [HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins

2015-08-08 Thread Andreas Seltenreich
I wrote:

 They all appear to work fine on REL9_5_STABLE.

oops, that branch was slightly stale.  Updating it with the latest
planner changes (8dccf03..0853388), the queries fail there as well.

regards,
Andreas




[HACKERS] pg_filedump patch for 9.5

2015-08-08 Thread Satoshi Nagayasu
Hi,

I have created a patch for pg_filedump to work with 9.5.
Here is a list of changes.

 * Fix to rename CRC32 macros to work with 9.5.
 * Fix to add missing DBState: DB_SHUTDOWNED_IN_RECOVERY.
 * Fix to add missing page flags for Btree and GIN.
 * Update copyright date.

Please take a look. Any comments are welcome.

Regards,

-- 
NAGAYASU Satoshi sn...@uptime.jp
diff --git a/Makefile b/Makefile
index c64dfbf..d80a3a3 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 # View README.pg_filedump first
 
 # note this must match version macros in pg_filedump.h
-FD_VERSION=9.3.0
+FD_VERSION=9.5.0
 
 CC=gcc
 CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations
diff --git a/README.pg_filedump b/README.pg_filedump
index 3a04d59..13d51ff 100644
--- a/README.pg_filedump
+++ b/README.pg_filedump
@@ -2,7 +2,7 @@ pg_filedump - Display formatted contents of a PostgreSQL heap, index,
 	  or control file.
 
 Copyright (c) 2002-2010 Red Hat, Inc.
-Copyright (c) 2011-2014, PostgreSQL Global Development Group
+Copyright (c) 2011-2015, PostgreSQL Global Development Group
 
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
diff --git a/pg_filedump.c b/pg_filedump.c
index 6cd02a9..1dfc524 100644
--- a/pg_filedump.c
+++ b/pg_filedump.c
@@ -3,7 +3,7 @@
  *	  formatting heap (data), index and control files.
  *
  * Copyright (c) 2002-2010 Red Hat, Inc.
- * Copyright (c) 2011-2014, PostgreSQL Global Development Group
+ * Copyright (c) 2011-2015, PostgreSQL Global Development Group
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -24,7 +24,7 @@
 
 #include "pg_filedump.h"
 
-#include "utils/pg_crc_tables.h"
+#include "utils/pg_crc.h"
 
 /* checksum_impl.h uses Assert, which doesn't work outside the server */
 #undef Assert
@@ -91,7 +91,7 @@ DisplayOptions(unsigned int validOptions)
 	printf
 		("\nVersion %s (for %s)"
 		 "\nCopyright (c) 2002-2010 Red Hat, Inc."
-		 "\nCopyright (c) 2011-2014, PostgreSQL Global Development Group\n",
+		 "\nCopyright (c) 2011-2015, PostgreSQL Global Development Group\n",
 		 FD_VERSION, FD_PG_VERSION);
 
printf
@@ -1132,6 +1132,8 @@ FormatSpecial()
 				strcat(flagString, "SPLITEND|");
 			if (btreeSection->btpo_flags & BTP_HAS_GARBAGE)
 				strcat(flagString, "HASGARBAGE|");
+			if (btreeSection->btpo_flags & BTP_INCOMPLETE_SPLIT)
+				strcat(flagString, "INCOMPLETESPLIT|");
 			if (strlen(flagString))
 				flagString[strlen(flagString) - 1] = '\0';
 
@@ -1216,6 +1218,10 @@ FormatSpecial()
 				strcat(flagString, "LIST|");
 			if (ginSection->flags & GIN_LIST_FULLROW)
 				strcat(flagString, "FULLROW|");
+			if (ginSection->flags & GIN_INCOMPLETE_SPLIT)
+				strcat(flagString, "INCOMPLETESPLIT|");
+			if (ginSection->flags & GIN_COMPRESSED)
+				strcat(flagString, "COMPRESSED|");
 			if (strlen(flagString))
 				flagString[strlen(flagString) - 1] = '\0';
 			printf(" GIN Index Section:\n"
@@ -1340,9 +1346,9 @@ FormatControl()
char   *dbState;
 
/* Compute a local copy of the CRC to verify the one on disk */
-   INIT_CRC32(crcLocal);
-   COMP_CRC32(crcLocal, buffer, offsetof(ControlFileData, crc));
-   FIN_CRC32(crcLocal);
+   INIT_CRC32C(crcLocal);
+   COMP_CRC32C(crcLocal, buffer, offsetof(ControlFileData, crc));
+   FIN_CRC32C(crcLocal);
 
/* Grab a readable version of the database state */
 	switch (controlData->state)
@@ -1353,6 +1359,9 @@ FormatControl()
 		case DB_SHUTDOWNED:
 			dbState = "SHUTDOWNED";
 			break;
+		case DB_SHUTDOWNED_IN_RECOVERY:
+			dbState = "SHUTDOWNED_IN_RECOVERY";
+			break;
 		case DB_SHUTDOWNING:
 			dbState = "SHUTDOWNING";
 			break;
@@ -1400,7 +1409,7 @@ FormatControl()
  Maximum Index Keys: %u\n
TOAST Chunk Size: %u\n
  Date and Time Type Storage: 

Re: [HACKERS] 9.5 release notes

2015-08-08 Thread Bruce Momjian
On Thu, Aug  6, 2015 at 10:35:50PM -0400, Bruce Momjian wrote:
 On Thu, Aug  6, 2015 at 06:42:38PM -0700, Peter Geoghegan wrote:
  On Thu, Aug 6, 2015 at 6:08 PM, Bruce Momjian br...@momjian.us wrote:
   I thought about this, and it is really an issue for FDW authors, not for
   end users, so I put this text in the Source Code changes section:
  
  I carefully considered where to put it, and chose the compatibility
  section based on the precedent of this 9.4 item:
  
  Foreign data wrappers that support updating foreign tables must
  consider the possible presence of AFTER ROW triggers (Noah Misch)
  
  I don't suppose it matters much, though. I can close out that open item now.
 
 Oh, I think that 9.4 item is in the wrong section, but it's good you researched it.

Actually, 9.4 and 9.5 are both in the right sections.  Users create
triggers, but only people modifying the source code create FDWs, so they
do belong in different sections.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.5 release notes

2015-08-08 Thread David Rowley
On 7 August 2015 at 14:24, Bruce Momjian br...@momjian.us wrote:

 On Tue, Jun 30, 2015 at 09:00:44PM +0200, Andres Freund wrote:
  * 2014-12-08 [519b075] Simon ..: Use GetSystemTimeAsFileTime directly in
 win32
2014-12-08 [8001fe6] Simon ..: Windows: use
 GetSystemTimePreciseAsFileTime if ..
Timer resolution isn't an unimportant thing for people using explain?

 This all seemed very internals-only, e.g.:

 On most Windows systems this change will actually have no significant
 effect on
 timestamp resolution as the system timer tick is typically between 1ms
 and 15ms
 depending on what timer resolution currently running applications have
 requested. You can check this with clockres.exe from sysinternals.
 Despite the
 platform limitation this change still permits capture of finer
 timestamps where
 the system is capable of producing them and it gets rid of an
 unnecessary
 syscall.

 Was I wrong?


This does have a user-visible change: timestamps are now likely to have 6
digits after the decimal point, if they're on a version of Windows which
supports GetSystemTimePreciseAsFileTime().

Master:

postgres=# select now();
              now
-------------------------------
 2015-08-09 01:14:01.959645+12
(1 row)

9.4.4:

postgres=# select now();
            now
----------------------------
 2015-08-09 01:15:09.783+12
(1 row)

Regards

David Rowley

--
 David Rowley   http://www.2ndquadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Robert Haas
On Fri, Aug 7, 2015 at 6:54 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 This filtering machinery definitely looks like a GUC to me, something
 like password_forbidden_encryption that PASSWORD VERIFIERS looks at
 and discards the methods listed in there. This definitely needs to be
 separated from password_encryption.

I don't know what a password verifier is and I bet nobody else does
either.  Well, I think I sort of know: I think it's basically an
encrypted password.  Am I right?  Even if I am, I bet the average user
is going to scratch their head and punt.

I don't see that there's any good reason to allow the same password to
be stored in the catalog encrypted more than one way, and I don't
think there's any good reason to introduce the PASSWORD VERIFIER
terminology.  I think we should store (1) your password, either
encrypted or unencrypted; and (2) the method used to encrypt it.  And
that's it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tap tests remove working directories

2015-08-08 Thread Robert Haas
On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net wrote:
 That certainly isn't what happens, and given the way this is done in
 TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir() function,
 it's not clear how we could do that easily.

shot-in-the-dark

Set cleanup to false and manually remove the directory later in the
code, so that stuff runs only if we haven't died sooner?

/shot-in-the-dark
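TestLib.pm does this in Perl with File::Temp, but the control flow being
suggested is easy to sketch. This Python analogue (all names made up) shows
the idea: create the working directory without automatic cleanup, then remove
it explicitly only if the run gets to the end without dying, so a failed run
leaves its directory around for inspection:

```python
import tempfile, shutil, os

# Sketch of the suggestion: no auto-cleanup; remove the directory
# explicitly only when the test script completes without dying.
workdir = tempfile.mkdtemp(prefix="tap_test_")   # created, not auto-removed
try:
    # ... run tests, write log files into workdir ...
    open(os.path.join(workdir, "regress_log"), "w").close()
    succeeded = True
except Exception:
    succeeded = False                            # keep workdir for debugging

if succeeded:
    shutil.rmtree(workdir)                       # clean up only on success

print(os.path.exists(workdir))                   # False after a clean run
```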

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Robert Haas
On Sat, Aug 8, 2015 at 5:31 AM, Shay Rojansky r...@roji.org wrote:
 Hi everyone, I'm seeing some strange behavior and wanted to confirm it.

 When executing a query that selects a long result set, if the code
 processing the results takes its time (e.g. more than statement_timeout), a
 timeout occurs. My expectation was that statement_timeout only affects query
 *processing*, and does not cover the frontend actually processing the
 result.

 First, I wanted to confirm that this is the case (and not some sort of bug
 in my code).

 If this is the case, is this intended? Here are some points:
 * It makes statement_timeout very difficult to use; clients do very diverse
 things with database results, it may be very difficult (or impossible) to
 estimate how much time they should take (e.g. random load factors on the
 client machine or on some other machine receiving results).
 * It makes it impossible to specifically detect problematic *queries* which
 take too long to execute (as opposed to the time taken to process their
 results).

 If you do insist that this behavior is correct, a documentation update for
 statement_timeout might make this clearer.

I think the issue here is that we start returning rows to the client
while the query is still executing.

Suppose each row of output takes 100 ms of CPU time to generate and
there are 1,000,000 rows.  Then it's quite conceivable that we could
be under the statement_timeout when we start returning rows to the
client, but over the statement_timeout by the time we finish -- and
the user would probably want statement_timeout to kick in in that
case, because that's a lotta CPU time.
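A quick back-of-the-envelope check of that scenario (the figures are the
illustrative ones from the paragraph above, not measurements):

```python
# 100 ms of CPU per row, one million rows: rows may start streaming to
# the client well within statement_timeout, but the total work dwarfs
# any plausible setting.
ROWS = 1_000_000
CPU_PER_ROW_S = 0.100        # 100 ms, the hypothetical figure above

total_cpu_s = ROWS * CPU_PER_ROW_S
print(total_cpu_s)           # 100000.0 seconds
print(total_cpu_s / 3600)    # roughly 27.8 hours of CPU time
```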

I suppose we could try to toll statement_timeout while we're blocked
waiting for the client, but nobody wrote the code for that yet.  And
it would mean that you can't use statement_timeout to prevent xmin
from lagging, which could be why you set it in the first place.  There
might also be some usability difficulties: pg_stat_activity shows the
time the query started, so if you know what statement_timeout is you
can tell how close it is to being killed.  If some of the time isn't
counted, then you can't tell any more.

Another approach (which I think might be better) is to have GUCs like
statement_cpu_limit and statement_io_limit that kill a query when it
uses more than the configured amount of that resource.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-08-08 Thread Robert Haas
On Tue, Aug 4, 2015 at 6:13 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 I think it's totally reasonable for the standby to follow the master's
 behavior rather than the config file.  That should be documented, but
 otherwise, no problem.  If it were technologically possible for the
 standby to follow the config file rather than the master in all cases,
 that would be fine, too.  But the current behavior is somewhere in the
 middle, and that doesn't seem like a good plan.

 So I discussed this with Petr.  He points out that if we make the
 standby follow the master, then the problem would be the misbehavior
 that results once the standby is promoted: at that point the standby
 would no longer follow the master and would start with the feature
 turned off, which could be disastrous (depending on what you are using
 the commit timestamps for).

That seems like an imaginary problem.  If it's critical to have commit
timestamps, don't turn them off on the standby.

 To solve that problem, you could suggest that if we see the setting
 turned on in pg_control then we should follow that instead of the config
 file; but then the problem is that there's no way to turn the feature
 off.  And things are real crazy by then.

There's no existing precedent for a feature that lets the standby be
different from the master *in any way*.  So I don't see why we should
start here.  I think the reasonable definition is that the GUC
controls whether the master tries to update the SLRU (and generate
appropriate WAL records, presumably).  The standby should not get a
choice about whether to replay those WAL records.

Note that if you do allow the standby to decide not to replay the WAL
records for this feature, then the data on the standby could be
partially there but not completely there after promotion, because the
DBA might have flipped the switch on and off at different times.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch for ginCombineData

2015-08-08 Thread Robert Haas
On Wed, Aug 5, 2015 at 6:17 AM, Robert Abraham
robert.abraha...@googlemail.com wrote:
 we are using gin indexes on big tables. These tables happen to have several
 billion rows.
 The index creation fails in ginCombineData in src/backend/access/gin/ginbulk.c
 because repalloc is limited to 1 GB.
 This limitation makes no sense in this context (compare the comments in
 src/include/utils/memutils.h).
 To overcome this limitation on tables with lots of rows, repalloc_huge is
 used.
 The test suite still succeeds on my machine.
 Find the patch attached,

Since nobody seems ready to act on this patch immediately, I suggest
adding it here so it doesn't get forgotten:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add column-name hint to log messages generated by inserts when varchars don't fit

2015-08-08 Thread Robert Haas
On Wed, Aug 5, 2015 at 6:39 AM, Stepan Rutz stepan.r...@gmx.de wrote:
 on our production servers I see quite a few errors due to excessively long
 varchar values which application code tries to insert into tables and which
 don't fit.
 The errors look like

   ERROR:  value too long for type character varying(4)

 This is not helping me much. The patch will turn this into

   ERROR:  value too long for type character varying(4) (hint: column-name is
 mycolumn)

 if the column that overflowed was mycolumn.

expression_tree_walker is used in enough different places that you
can't really modify that like this.  It'll have too many side effects
that may not be good.

Generally, the way to stick some useful information into the error
message is with an error context callback, rather than by appending to
the primary error message.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Dependency between bgw_notify_pid and bgw_flags

2015-08-08 Thread Robert Haas
On Wed, Aug 5, 2015 at 3:33 AM, Ashutosh Bapat
ashutosh.ba...@enterprisedb.com wrote:
 This idea looks good.

Thanks.  It needs testing though to see if it really works as
intended.  Can you look into that?

 Looking at the larger picture, we should also enable this feature to be used by
 auxiliary processes. It's very hard to add a new auxiliary process in the
 current code. One has to add code in many places to make sure that the
 auxiliary processes die and are restarted correctly. It is even tougher to add a
 parent auxiliary process which spawns multiple worker processes. That would
 be a whole lot simpler if we could allow the auxiliary processes to use the
 background worker infrastructure (which is what they ultimately are).

That's a separate patch, but, sure, we could do that.  I agree with
Alvaro's comments: the postmaster should start all children.  Other
processes should just request that it do so.  We have two mechanisms
for that right now: the one used by bgworkers, and the one used by the
AV launcher.

 BGWORKER_SHMEM_ACCESS has similar usage, except that it resets the on exit
 callbacks and detaches the shared memory segment from the background worker.
 That avoids a full cluster restart when one of those worker which can not
 corrupt shared memory dies. But I do not see any check to prevent such
 backend from calling PGSharedMemoryReattach()

There isn't, but you shouldn't do that.  :-)

This is C code; you can't protect against actively malicious code.

 So it looks like, it suffices to assume that background worker either needs
 to access shared memory or doesn't. Any background worker having shared
 memory access can also access database and thus becomes part of the backend
 list. Or may be we just avoid these flags and treat every background worker
 as if it passed both these flags. That will simplify a lot of code.

I think it's useful to support workers that don't have shared memory
access at all, because those can crash without causing a system-wide
reset.  But I don't see the point in distinguishing between workers
with shared-memory access and those with a database connection.  I
mean, obviously the worker needs to be able to initialize itself
either way, but there seems to be no reason to force that to be
signalled in bgw_flags.  It can just depend on whether
BackgroundWorkerInitializeConnection gets called.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] subplan variable reference / unassigned NestLoopParams

2015-08-08 Thread Andreas Seltenreich
Tom Lane writes:

 Andreas Seltenreich seltenre...@gmx.de writes:
 Tom Lane writes:
 Well, I certainly think all of these represent bugs:
 3 | ERROR:  plan should not reference subplan's variable
 2 | ERROR:  failed to assign all NestLoopParams to plan nodes

 These appear to be related.  The following query produces the former,
 but if you replace the very last reference of provider with the literal
 'bar', it raises the latter error.

 Fixed that, thanks for the test case!

I haven't seen the former since your commit, but there recently were
some instances of the latter.  The attached queries all throw the error
against master at 8752bbb.

regards,
Andreas

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  subq_19608926.c0 as c0,
  rel_124211605.objname as c1
from
  (select
   rel_124211602.numeric_precision as c0
 from
   information_schema.sequences as rel_124211602
 where EXISTS (
   select
   rel_124211603.unique_constraint_name as c0
 from
   information_schema.referential_constraints as rel_124211603
   where rel_124211603.unique_constraint_schema is not NULL)) as subq_19608926
 left join public.tab1 as rel_124211604
 on (subq_19608926.c0 = rel_124211604.a )
  right join pg_catalog.pg_seclabels as rel_124211605
 right join public.phone as rel_124211606
 on (rel_124211605.objtype = rel_124211606.comment )
  on (rel_124211604.b = rel_124211605.objtype )
inner join public.bt_i4_heap as rel_124211729
  inner join public.bt_i4_heap as rel_124211730
  on (rel_124211729.random = rel_124211730.seqno )
on (subq_19608926.c0 = rel_124211730.seqno )
where rel_124211606.comment = rel_124211604.b;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  subq_53656269.c0 as c0
from
  (select
 rel_339945676.id3c as c0
   from
 public.rule_and_refint_t3 as rel_339945676
   where rel_339945676.data !~~ rel_339945676.data) as subq_53656269
  inner join public.dropcolumn as rel_339945677
  on (subq_53656269.c0 = rel_339945677.b )
inner join public.bt_name_heap as rel_339945678
  left join public.rtest_order2 as rel_339945705
   inner join information_schema.sequences as rel_339945706
   on (rel_339945705.a = rel_339945706.numeric_precision )
 inner join public.num_result as rel_339945707
 on (rel_339945705.b = rel_339945707.id1 )
  on (rel_339945678.random = rel_339945706.numeric_precision )
on (rel_339945677.b = rel_339945706.numeric_precision )
where rel_339945678.seqno ~* rel_339945705.c
fetch first 45 rows only;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  rel_273437910.name as c0,
  rel_273437908.sequence_catalog as c1,
  rel_273437865.b as c2,
  rel_273437910.location as c3,
  rel_273437864.id2c as c4,
  rel_273437908.start_value as c5
from
  public.rule_and_refint_t2 as rel_273437864
  inner join public.ruletest_tbl as rel_273437865
  on (rel_273437864.id2c = rel_273437865.a )
inner join information_schema.sequences as rel_273437908
 inner join public.rules_log as rel_273437909
 on (rel_273437908.numeric_precision_radix = rel_273437909.f1 )
  right join public.emp as rel_273437910
  on (rel_273437909.tag = rel_273437910.name )
on (rel_273437865.a = rel_273437908.numeric_precision )
where rel_273437909.tag !~* rel_273437909.tag
fetch first 110 rows only;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  rel_156464410.minimum_value as c0
from
  public.rule_and_refint_t1 as rel_156464330
  inner join public.rules_src as rel_156464331
  on (rel_156464330.id1b = rel_156464331.f1 )
inner join public.main_table as rel_156464401
  inner join pg_catalog.pg_file_settings as rel_156464402
 left join public.person as rel_156464409
   inner join information_schema.sequences as rel_156464410
   on (rel_156464409.age = rel_156464410.numeric_precision )
 on (rel_156464402.sourceline = rel_156464409.age )
  on (rel_156464401.a = rel_156464402.sourceline )
on (rel_156464331.f1 = rel_156464410.numeric_precision )
where rel_156464402.sourcefile @@ rel_156464409.name
fetch first 155 rows only;



Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Joe Conway

On 08/08/2015 06:27 AM, Robert Haas wrote:
 terminology.  I think we should store (1) your password, either 
 encrypted or unencrypted; and (2) the method used to encrypt it.
 And that's it.

A petty complaint, but it has always bothered me that we say the
password is encrypted when, at least currently, it is a simple hash
(cryptographic hash yes, but not encrypted, and not even an HMAC). I
think we should try to start using accurate terminology.
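For concreteness, this sketch (in Python, with made-up credentials) shows what
the md5 "encrypted" value stored in pg_authid actually is: a hash of the
password salted with the role name, not an encryption of anything:

```python
import hashlib

# Under password_encryption = 'md5', pg_authid.rolpassword contains the
# string "md5" followed by md5(password || username). There is no key and
# nothing is reversible -- it is a plain salted hash.
def pg_md5_verifier(password: str, username: str) -> str:
    digest = hashlib.md5((password + username).encode()).hexdigest()
    return "md5" + digest

# Hypothetical role and password, purely for illustration:
print(pg_md5_verifier("secret", "alice"))
```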

-- 
Joe Conway
Crunchy Data
Enterprise PostgreSQL
http://crunchydata.com




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Fri, Aug 7, 2015 at 6:54 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  This filtering machinery definitely looks like a GUC to me, something
  like password_forbidden_encryption that PASSWORD VERIFIERS looks at
  and discards the methods listed in there. This definitely needs to be
  separated from password_encryption.
 
 I don't know what a password verifier is and I bet nobody else does
 either.  Well, I think I sort of know: I think it's basically an
 encrypted password.  Am I right?  Even if I am, I bet the average user
 is going to scratch their head and punt.

Password verifier is actually a well understood term when it comes to
these protocols and their implementations.  It is not an encrypted
password but rather a value which allows the server to determine if the
client knows the correct password without having to store the password
directly, store a simple hash of the password, or have the cleartext
password sent from the client to the server.
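As an illustration of that distinction, the SCRAM-SHA-1 verifier derivation
from RFC 5802 can be sketched as follows (a simplified sketch, not the
PostgreSQL implementation): the server keeps only the salt, iteration count,
StoredKey and ServerKey, none of which yields the cleartext password:

```python
import hashlib, hmac, os

# Simplified RFC 5802 derivation. The server stores (salt, iterations,
# StoredKey, ServerKey); proving knowledge of the password requires the
# client to reproduce ClientKey, which the server never stores.
def scram_verifier(password: bytes, salt: bytes, iterations: int):
    salted = hashlib.pbkdf2_hmac("sha1", password, salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha1).digest()
    stored_key = hashlib.sha1(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha1).digest()
    return stored_key, server_key

stored, server = scram_verifier(b"secret", os.urandom(16), 4096)
print(stored.hex(), server.hex())
```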

 I don't see that there's any good reason to allow the same password to
 be stored in the catalog encrypted more than one way, and I don't
 think there's any good reason to introduce the PASSWORD VERIFIER
 terminology.  I think we should store (1) your password, either
 encrypted or unencrypted; and (2) the method used to encrypt it.  And
 that's it.

Perhaps we don't want to expose what a password verifier is to users,
but we shouldn't be missing the distinction between hashed passwords,
encrypted passwords, and password verifiers in the code and in the
implementation of SCRAM.  We really shouldn't use an incorrect term for
what we're storing in pg_authid either though, which is what we do
today.

I can't see us ever storing encrypted passwords as that implies we'd
need a key stored somewhere and further that the server would be able to
get back to the user's original password, neither of which are things we
want to deal with.

You do have a good point that there is some risk associated with having
multiple values in pg_authid related to a user's password and that we
really want to help users move from the old value in pg_authid to the
new one.  I don't believe we should force a hard change as it's going to
cause a lot of trouble for users.  We have to also consider that clients
also have to be changed for this.

As discussed previously, in an ideal world, we would handle the old
values and the new ones while introducing password ageing, client
support for detecting that a password needs to be changed, protocol
support for changing passwords which avoids having them get logged in
cleartext to the server log, password complexity, and perhaps used
password history to boot.  The main issue here is that we really don't
provide any help for large installations to get their userbase moved off
of the old style today.  Password ageing (and good support for it in
clients, etc), would help that greatly.

Unfortunately, it's unlikely that we're going to get all of that done in
one release.  As such, I'd suggest our next release support the existing
values in pg_authid, add the password verifier when the password is
changed, and then add a GUC in the following release which disables the
old pg_authid mechanism, defaulting to true, and the release after that
remove support for the old value and the field for it completely.

We should also provide documentation about how to check if there are any
old style values, for users who want to be proactive about moving off of
the old style.

I'm travelling and so I haven't looked over the patch yet (or even read
the entire thread in depth), so apologies if I've got something confused
about what's being proposed.

Thanks!

Stephen




Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Tom Lane
Shay Rojansky r...@roji.org writes:
 Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
 When executing a query that selects a long result set, if the code
processing the results takes its time (e.g. more than statement_timeout),
 a timeout occurs. My expectation was that statement_timeout only affects
 query *processing*, and does not cover the frontend actually processing the
 result.

Are you using a cursor, or something like that?

libpq ordinarily absorbs the result set as fast as possible and then hands
it back to the application as one blob; the time subsequently spent by the
application looking at the blob doesn't count against statement_timeout.

As Robert says, statement_timeout *does* include time spent sending the
result set to the client, and we're not going to change that, because it
would be too hard to disentangle calculation from I/O.  So if the client
isn't prompt about absorbing all the data then you have to factor that
into your setting.  But it would be a slightly unusual usage pattern
AFAIK.

regards, tom lane




Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
Thanks for your responses.

I'm not using cursors or anything fancy. The expected behavior (as far as I
can tell) for a .NET database driver is to read one row at a time from the
database and make it available. There's even a standard API option for
fetching data on a column-by-column basis: this allows the user to not hold
the entire row in memory (imagine rows with megabyte-sized columns). This
makes sense to me; Tom, doesn't the libpq behavior you describe of
absorbing the result set as fast as possible mean that a lot of memory is
wasted on the client side? I'd be interested in your take on this.
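The memory trade-off being discussed can be illustrated with a toy sketch
(pure Python, no driver involved; the generator is a stand-in for a server
result stream, not real libpq behavior):

```python
def rows(n, width):
    # Stand-in for a result stream arriving from the server.
    for i in range(n):
        yield (i, "x" * width)

# libpq-style default: absorb the whole result set up front, then let the
# application take its time over the blob -- all rows resident at once,
# but statement_timeout stops running once the server has sent them.
absorbed = list(rows(10_000, 100))

# Driver-side streaming: only one row is materialized at a time, but the
# query is still "in progress" from the server's point of view while the
# application dawdles between rows.
streamed_first = next(rows(10_000, 100))
```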

I can definitely appreciate the complexity of changing this behavior. I
think that some usage cases (such as mine) would benefit from a timeout on
the time until the first row is sent, this would allow to put an upper cap
on stuff like query complexity, for example.

Shay

On Sat, Aug 8, 2015 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
  When executing a query that selects a long result set, if the code
  processing the results takes its time (e.g. more than
 statement_timeout),
  a timeout occurs. My expectation was that statement_timeout only affects
  query *processing*, and does not cover the frontend actually processing
 the
  result.

 Are you using a cursor, or something like that?

 libpq ordinarily absorbs the result set as fast as possible and then hands
 it back to the application as one blob; the time subsequently spent by the
 application looking at the blob doesn't count against statement_timeout.

 As Robert says, statement_timeout *does* include time spent sending the
 result set to the client, and we're not going to change that, because it
 would be too hard to disentangle calculation from I/O.  So if the client
 isn't prompt about absorbing all the data then you have to factor that
 into your setting.  But it would be a slightly unusual usage pattern
 AFAIK.

 regards, tom lane



Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-08 Thread Shay Rojansky
I'd also recommend adding a sentence about this aspect of statement_timeout
in the docs to prevent confusion...

On Sat, Aug 8, 2015 at 5:30 PM, Shay Rojansky r...@roji.org wrote:

 Thanks for your responses.

 I'm not using cursors or anything fancy. The expected behavior (as far as
 I can tell) for a .NET database driver is to read one row at a time from
 the database and make it available. There's even a standard API option for
 fetching data on a column-by-column basis: this allows the user to not hold
 the entire row in memory (imagine rows with megabyte-sized columns). This
 makes sense to me; Tom, doesn't the libpq behavior you describe of
 absorbing the result set as fast as possible mean that a lot of memory is
 wasted on the client side? I'd be interested in your take on this.

 I can definitely appreciate the complexity of changing this behavior. I
 think that some usage cases (such as mine) would benefit from a timeout on
 the time until the first row is sent, this would allow to put an upper cap
 on stuff like query complexity, for example.

 Shay

 On Sat, Aug 8, 2015 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Shay Rojansky r...@roji.org writes:
  Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
  When executing a query that selects a long result set, if the code
  processing the results takes its time (e.g. more than
 statement_timeout),
  a timeout occurs. My expectation was that statement_timeout only affects
  query *processing*, and does not cover the frontend actually processing
 the
  result.

 Are you using a cursor, or something like that?

 libpq ordinarily absorbs the result set as fast as possible and then hands
 it back to the application as one blob; the time subsequently spent by the
 application looking at the blob doesn't count against statement_timeout.

 As Robert says, statement_timeout *does* include time spent sending the
 result set to the client, and we're not going to change that, because it
 would be too hard to disentangle calculation from I/O.  So if the client
 isn't prompt about absorbing all the data then you have to factor that
 into your setting.  But it would be a slightly unusual usage pattern
 AFAIK.

 regards, tom lane





Re: [HACKERS] tap tests remove working directories

2015-08-08 Thread Andrew Dunstan


On 08/08/2015 09:31 AM, Robert Haas wrote:

On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net wrote:

That certainly isn't what happens, and given the way this is done in
TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir() function,
it's not clear how we could do that easily.

shot-in-the-dark

Set cleanup to false and manually remove the directory later in the
code, so that stuff runs only if we haven't died sooner?

/shot-in-the-dark




Yeah, maybe. I'm thinking of trying to do it more globally, like in 
src/Makefile.global.in. That way we wouldn't have to add new code to 
every test file.


cheers

andrew






[HACKERS] Test code is worth the space

2015-08-08 Thread Noah Misch
We've too often criticized patches for carrying many lines/bytes of test case
additions.  Let's continue to demand debuggable, secure tests that fail only
when something is wrong, but let's stop pushing for test minimalism.  Such
objections may improve the individual patch, but that doesn't make up for the
chilling effect on test contributions.  I remember clearly the first time I
submitted thorough test coverage with a feature.  Multiple committers attacked
it in the name of brevity.  PostgreSQL would be better off with 10x its
current test bulk, even if the average line of test code were considerably
less important than we expect today.

Thanks,
nm




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Heikki Linnakangas

On 08/08/2015 04:27 PM, Robert Haas wrote:

I don't see that there's any good reason to allow the same password to
be stored in the catalog encrypted more than one way,


Sure there is. If you want to be able to authenticate using different 
mechanism, you need the same password encrypted in different ways. 
SCRAM uses verifier that's derived from the password in one way, MD5 
authentication needs an MD5 hash, and yet other protocols have other 
requirements.



and I don't think there's any good reason to introduce the PASSWORD
VERIFIER terminology.  I think we should store (1) your password,
either encrypted or unencrypted; and (2) the method used to encrypt
it.  And that's it.


Like Joe and Stephen, I actually find it highly confusing that we call 
the MD5 hash an encrypted password. The term password verifier is 
fairly common in the specifications of authentication mechanisms. I 
think we should adopt it.


- Heikki





Re: [HACKERS] checkpointer continuous flushing

2015-08-08 Thread Heikki Linnakangas

On 07/26/2015 06:01 PM, Fabien COELHO wrote:

Attached is very minor v5 update which does a rebase and completes the
cleanup of doing a full sort instead of a chunked sort.


Some thoughts on this:

* I think we should drop the flush part of this for now. It's not as 
clearly beneficial as the sorting part, and adds a great deal more code 
complexity. And it's orthogonal to the sorting patch, so we can deal 
with it separately.


* Is it really necessary to parallelize the I/O among tablespaces? I can 
see the point, but I wonder if it makes any difference in practice.


* Is there ever any harm in sorting the buffers? The GUC is useful for 
benchmarking, but could we leave it out of the final patch?


* Do we need to worry about exceeding the 1 GB allocation limit in 
AllocateCheckpointBufferIds? That limit can be reached with 2 TB of shared_buffers. 
That's a lot, but it's not totally crazy these days that someone might 
do that. At the very least, we need to lower the maximum of 
shared_buffers so that you can't hit that limit.
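Back-of-the-envelope arithmetic for that worry (a sketch; the 4-byte
per-entry size assumed here is the bare minimum for an int buffer id --
the actual sort entries in the patch are likely larger, which only makes
the problem worse):

```python
BLCKSZ = 8192               # default PostgreSQL block size
MAX_ALLOC = 1 << 30         # the ~1 GB palloc allocation limit

shared_buffers = 2 * (1 << 40)        # 2 TB
nbuffers = shared_buffers // BLCKSZ   # 268,435,456 buffers

# Even with a single 4-byte int per buffer, the array hits the limit
# exactly; any per-entry overhead pushes it over.
array_bytes = nbuffers * 4
print(array_bytes == MAX_ALLOC)   # True
```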


I ripped out the flushing part, keeping only the sorting. I refactored 
the logic in BufferSync() a bit. There's now a separate function, 
nextCheckpointBuffer(), that returns the next buffer ID from the sorted 
list. The tablespace-parallelization behaviour in encapsulated there, 
keeping the code in BufferSync() much simpler. See attached. Needs some 
minor cleanup and commenting still before committing, and I haven't done 
any testing besides a simple make check.
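The behaviour described above can be sketched roughly like this
(hypothetical names and tuple tags for illustration; the real code works
over buffer ids and per-buffer sort items, not Python tuples):

```python
from collections import deque

def checkpoint_write_order(dirty):
    # dirty: (tablespace, relfilenode, blocknum) tags of dirty buffers.
    # Sort so neighboring blocks of a relation are written together, then
    # round-robin across tablespaces so I/O is spread over all of them
    # instead of finishing one tablespace before starting the next.
    per_ts = {}
    for tag in sorted(dirty):
        per_ts.setdefault(tag[0], deque()).append(tag)
    queues = deque(per_ts.values())
    while queues:
        q = queues.popleft()
        yield q.popleft()       # next buffer from this tablespace
        if q:
            queues.append(q)    # rotate the tablespace to the back

order = list(checkpoint_write_order([
    ("ts2", 10, 1), ("ts1", 5, 2), ("ts1", 5, 1), ("ts2", 10, 0),
]))
# -> [("ts1", 5, 1), ("ts2", 10, 0), ("ts1", 5, 2), ("ts2", 10, 1)]
```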


- Heikki

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e900dcc..1cec243 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2454,6 +2454,28 @@ include_dir 'conf.d'
   /listitem
  /varlistentry
 
+ varlistentry id=guc-checkpoint-sort xreflabel=checkpoint_sort
+  termvarnamecheckpoint_sort/varname (typebool/type)
+  indexterm
+   primaryvarnamecheckpoint_sort/ configuration parameter/primary
+  /indexterm
+  /term
+  listitem
+   para
+Whether to sort buffers before writing them out to disk on checkpoint.
+On HDD storage, this setting allows neighboring pages written to
+disk to be grouped together, improving performance by reducing
+random write activity.
+This sorting should have limited performance effects on SSD backends
+as such storage has good random write performance, but it may
+help with wear-leveling and so be worth keeping anyway.
+The default is literalon/.
+This parameter can only be set in the filenamepostgresql.conf/
+file or on the server command line.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-checkpoint-warning xreflabel=checkpoint_warning
   termvarnamecheckpoint_warning/varname (typeinteger/type)
   indexterm
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index e3941c9..f538698 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -546,6 +546,18 @@
   /para
 
   para
+   When hard-disk drives (HDD) are used for data storage,
+   xref linkend=guc-checkpoint-sort allows pages to be sorted
+   so that neighboring pages on disk will be flushed together by
+   checkpoints, reducing the random write load and improving performance.
+   If solid-state drives (SSD) are used, sorting pages provides little
+   benefit as their random write I/O performance is good: this feature may
+   then be disabled by setting varnamecheckpoint_sort/ to valueoff/.
+   It is possible that sorting may help with SSD wear leveling, so it may
+   be kept on that account.
+  /para
+
+  para
The number of WAL segment files in filenamepg_xlog/ directory depends on
varnamemin_wal_size/, varnamemax_wal_size/ and
the amount of WAL generated in previous checkpoint cycles. When old log
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 68e33eb..bee38ab 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7995,11 +7995,13 @@ LogCheckpointEnd(bool restartpoint)
 sync_secs,
 total_secs,
 longest_secs,
+sort_secs,
 average_secs;
 	int			write_usecs,
 sync_usecs,
 total_usecs,
 longest_usecs,
+sort_usecs,
 average_usecs;
 	uint64		average_sync_time;
 
@@ -8030,6 +8032,10 @@ LogCheckpointEnd(bool restartpoint)
 		CheckpointStats.ckpt_end_t,
 		total_secs, total_usecs);
 
+	TimestampDifference(CheckpointStats.ckpt_sort_t,
+		CheckpointStats.ckpt_sort_end_t,
+		sort_secs, sort_usecs);
+
 	/*
 	 * Timing values returned from CheckpointStats are in microseconds.
 	 * Convert to the second plus microsecond form that TimestampDifference
@@ -8048,8 +8054,8 @@ LogCheckpointEnd(bool restartpoint)
 
 	elog(LOG, %s complete: wrote %d buffers (%.1f%%); 
 		 %d transaction log file(s) added, %d removed, %d recycled; 
-		 write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; 
-		 sync files=%d, 

Re: [HACKERS] 9.5 release notes

2015-08-08 Thread Bruce Momjian
On Fri, Aug  7, 2015 at 09:02:09PM +0200, Andres Freund wrote:
 With regard to the point about the number of buffer mappings: This has
 forced peoples sites down. People have found this out themselves and
 patched postgres. That's an entirely different league.

I was not aware of the magnitude of this issue.  9.5 release note item
attached and applied.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 0786a20..bbaa886
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
*** FIXME: Add Andres
*** 463,468 
--- 463,482 
 /para
/listitem
  
+   listitem
+para
+ !--
+ 2014-10-02 [3acc10c9] Robert..: Increase the number of buffer mapping partitio..
+ --
+ Increase the number of buffer mapping partitions (Amit Kapila,
+ Andres Freund, Robert Haas)
+/para
+ 
+para
+ This improves performance for highly concurrent workloads.
+/para
+   /listitem
+ 
   /itemizedlist
  
  /sect4



Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Greg Stark
On Sat, Aug 8, 2015 at 6:23 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 Like Joe and Stephen, I actually find it highly confusing that we call the
 MD5 hash an encrypted password. The term password verifier is fairly
 common in the specifications of authentication mechanisms. I think we should
 adopt it.

Speaking as someone who hasn't read the specifications I found
password verifier surprising. I would have known what password
hash was but I misread verifier to be something functional like a
PAM plugin. I tend to agree we should just use terminology out of the
specs though even if it's a little opaque, better one opaque piece of
terminology than having to learn and translate between multiple
terminologies.


-- 
greg




Re: [HACKERS] 9.5 release notes

2015-08-08 Thread Bruce Momjian
On Fri, Aug  7, 2015 at 09:02:09PM +0200, Andres Freund wrote:
 On 2015-08-07 14:43:11 -0400, Bruce Momjian wrote:
  Well, we could just throw a Postgres 9.5 is faster release note item
  in there and call it a day.  ;-)
 
 Based on my experience one of the prime reason people move to a new
 version of postgres is performance. And it's not like 'faster!!!' is
 really helpful for them to evaluate the benefits appropriately. A
 scalability improvement isn't going to help somebody concerned with
 single query performance. Somebody concerned about OLTP write
 performance isn't going to be overly excited about the sort performance
 improvements, but somebody doing OLAP style queries very much so.
 
 The consequence of not listing that is that we're essentially asking our
 users to read through all the changes between two releases. Something
 that takes a long while even for people like you and me who are very
 familiar with the project..

Well, considering I have used the same criteria for years, and I am only
now hearing complaints, I am unsure about the statement that our
existing criteria isn't generally meeting people's needs.

 The *by far* biggest complaint about upgrades with postgres isn't binary
 compatibility (even before pg_upgrade), it's not that there's minor
 incompatibilities (at least not since 8.3, and maybe bytea_output). It's
 that previously working queries don't work anymore. It's always just a
 minority, but they're there. And it's very hard to figure out what's
 up. Is it stats? Different settings? Planner changes? If we then don't
 list changes to the planner, they're screwed.

Well, if we do list them, is that going to help people?  You can say,
well it might, but we are then in the same logic we use to decide on
adding GUC entries  --- we have to weigh the value of having the entry
vs the overhead of everyone reading the entry.  Now, in this case, it is
a one-time read vs. something that we will keep for years, but the basic
decision process is the same.

And, again, I will say, we are not writing this for ourselves, but for
the general user.

 In comparison to that just about nobody cares about the rest of the
 update but new SQL level stuff and a few other major things? A new field
 in EXPLAIN, debug_assertions read only,  effective_io_concurrency
 settable without effect, VACUUM logging one more data point, a
 10+ year old obsolete undocumented option being removed, new icons for
 binaries. They just don't matter.

Well, if we have user-visible behavior, and we don't tell them about it,
they are never going to be able to use it, so it is hard to see how we
could avoid mentioning those.

  What I _am_ saying is that you should use the same criteria I am using,
  and just disagree on the place for the line, rather than use a different
  criteria, which will lead to perpetual complaints.  We can change the
  criteria, but that is a different discussion.
 
 We need to change that criteria then.

Then you need to start a new thread on that topic to get community
agreement that I can implement, and we can probably change 9.5 to match.

You might also want to address the fact we don't list all bug fixes in
the release notes either if the bug is a rare, minor event, and/or if
the new error message is sufficient communication to users.

One way of minimizing the downside of any new such entries is to have a
Minor performance improvements or Internal performance improvements
section in the release notes so people will realize they are not of the
same import as other items --- same for possible new bug fix listings.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Peter Geoghegan
On Sat, Aug 8, 2015 at 9:47 AM, Noah Misch n...@leadboat.com wrote:
 We've too often criticized patches for carrying many lines/bytes of test case
 additions.  Let's continue to demand debuggable, secure tests that fail only
 when something is wrong, but let's stop pushing for test minimalism.  Such
 objections may improve the individual patch, but that doesn't make up for the
 chilling effect on test contributions.  I remember clearly the first time I
 submitted thorough test coverage with a feature.  Multiple committers attacked
 it in the name of brevity.  PostgreSQL would be better off with 10x its
 current test bulk, even if the average line of test code were considerably
 less important than we expect today.

I strongly agree. I am fond of the example of external sorting, which
at present has exactly zero test coverage, even though in production
those code paths are exercised all the time.

I think that there needs to be a way of running an extended set of
regression tests. I could definitely respect the desire for minimalism
when it comes to adding tests to the regression tests proper if there
was an extended set of tests that could be run during development less
frequently. I thought about doing the extended set as a satellite
project, but that may not be workable.

-- 
Peter Geoghegan




Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Josh Berkus
On 08/08/2015 12:24 PM, Peter Geoghegan wrote:
 I think that there needs to be a way of running an extended set of
 regression tests. I could definitely respect the desire for minimalism
 when it comes to adding tests to the regression tests proper if there
 was an extended set of tests that could be run during development less
 frequently. I thought about doing the extended set as a satellite
 project, but that may not be workable.

There already is, isn't there?  All of those named sets of regression
tests which aren't run by default.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Andrew Dunstan


On 08/08/2015 05:09 PM, Josh Berkus wrote:

On 08/08/2015 12:24 PM, Peter Geoghegan wrote:

I think that there needs to be a way of running an extended set of
regression tests. I could definitely respect the desire for minimalism
when it comes to adding tests to the regression tests proper if there
was an extended set of tests that could be run during development less
frequently. I thought about doing the extended set as a satellite
project, but that may not be workable.

There already is, isn't there?  All of those named sets of regression
tests which aren't run by default.



Exactly. And there is nothing to stop us expanding those. For example, 
it might be useful to have a suite that provably touched every code 
path, or something close to it.


Incidentally, making the buildfarm client run extra sets of tests in the 
main tree is almost completely trivial. Making it run tests from 
somewhere else, such as a different git repo, is only slightly harder.


cheers

andrew






Re: [HACKERS] 9.5 release notes

2015-08-08 Thread Bruce Momjian
On Sat, Aug  8, 2015 at 02:49:21PM -0400, Bruce Momjian wrote:
   What I _am_ saying is that you should use the same criteria I am using,
   and just disagree on the place for the line, rather than use a different
   criteria, which will lead to perpetual complaints.  We can change the
   criteria, but that is a different discussion.
  
  We need to change that criteria then.
 
 Then you need to start a new thread on that topic to get community
 agreement that I can implement, and we can probably change 9.5 to match.
 
 You might also want to address the fact we don't list all bug fixes in
 the release notes either if the bug is a rare, minor event, and/or if
 the new error message is sufficient communication to users.
 
 One way of minimizing the downside of any new such entries is to have a
 Minor performance improvements or Internal performance improvements
 section in the release notes so people will realize they are not of the
 same import as other items --- same for possible new bug fix listings.

I have updated src/tools/RELEASE_CHANGES to document the criteria I use
to create the major release notes:

o  new features and options
o  major performance improvements
o  bug fixes for serious or common bugs
o  incompatibilities
o  major source code changes

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Test code is worth the space

2015-08-08 Thread Greg Stark
On Sat, Aug 8, 2015 at 8:24 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that there needs to be a way of running an extended set of
 regression tests. I could definitely respect the desire for minimalism

The larger expense in having extensive test suites is the cost to
maintain them. With our current test framework tests have to be fairly
carefully written to produce output that isn't very fragile and
sensitive to minor changes in the code. In practice this is what
really drives the push towards minimal tests. If we tried testing
every code path right now -- which I tried to do once for the TOAST
code -- what you'll find is that what you're really testing is not
that the code is correct but that it does exactly what it does today.
At that point the test failures become entirely noise meaning
"something changed" rather than signal meaning "something's broken".

A better test framework can go a long way towards fixing this. It
would be much less of a problem if we had a unit test framework rather
than only black box integration tests. That would allow us to test
that every function in tuplestore.c meets its contract, not just that
some SQL query produces output that's correct and we think happened to
go through some code path we were interested in. There are many code
paths that will be hard to arrange to reach from SQL and impossible to
have any confidence will continue to be reached in the future when the
behaviour is intentionally changed.

That said, I don't think anyone would object to adding some regression
tests that at least test basic correctness of the sorting code. That's
a pretty embarrassing gap and iirc the only reason for it is that it
would make the regression tests sensitive to collation locale
settings.
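The collation problem in a nutshell: the same sort is "correct" under more
than one locale, so whichever ordering the expected-output file captures
will fail somewhere else (here str.lower stands in for a locale's case
folding; real locale collation is more involved):

```python
words = ["apple", "Banana", "cherry"]

# C locale / byte order: uppercase letters sort before lowercase ones.
c_order = sorted(words)

# A case-folding locale such as en_US orders them alphabetically.
folded_order = sorted(words, key=str.lower)

print(c_order)       # ['Banana', 'apple', 'cherry']
print(folded_order)  # ['apple', 'Banana', 'cherry']
```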

-- 
greg




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Josh Berkus
On 08/08/2015 10:23 AM, Heikki Linnakangas wrote:
 On 08/08/2015 04:27 PM, Robert Haas wrote:
 I don't see that there's any good reason to allow the same password to
 be stored in the catalog encrypted more than one way,
 
 Sure there is. If you want to be able to authenticate using different
 mechanism, you need the same password encrypted in different ways.
 SCRAM uses verifier that's derived from the password in one way, MD5
 authentication needs an MD5 hash, and yet other protocols have other
 requirements.

That's correct.  However, one of the goals of implementing SCRAM
authentication is to allow security-conscious users to get rid of those
reusable md5 hashes, no?

Obviously the backwards-compatibility issues are pretty major ... it'll
be years before all drivers support SCRAM ... but we also want to
provide a path forwards for secure installations in which no md5 hashes
are stored.

This says backwards-compatible GUC to me.  Here's one idea on how to
handle this:

1. we drop the parameter password_encryption

2. we add the parameter password_storage, which takes a list:
   - plain : plain text
   - md5 : current md5 hashes
   - scram : new scram hashed passwords
   This defaults to 'md5, scram' if not specified.
   This list might be extended in the future.

3. All password types in the list are generated.  This means having
multiple columns in pg_shadow, or an array.  An array would support the
addition of future password storage methods.

4. CREATE ROLE / ALTER ROLE syntax is changed to accept a parameter to
ENCRYPTED in order to support md5, scram, and future methods.  If no
parameter is supplied, ENCRYPTED will default to 'md5, scram'.

5. we add the superuser-only function pg_apply_password_policy().  This
applies the policy expressed by password_storage, generating or erasing
passwords for each user.

6. We add a new connection error for authentication method not
supported for user.

7. Two versions from now, we change the defaults.

I thought about the idea of determining password storage based on what's
in pg_hba.conf, but that seems like way too much implied authorization
to me, and liable to be a big foot-gun.
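A hypothetical sketch of how steps 2-3 might fit together (every name
here -- build_verifiers, the fixed demo salt, the "method:value"
encoding -- is invented for illustration and is not part of the
proposal's wording):

```python
import hashlib

def build_verifiers(username, password, password_storage=("md5", "scram")):
    # One stored value per method in the proposed password_storage list,
    # encoded as "method:value" so they can live in a single array column;
    # a pg_apply_password_policy() would regenerate or erase these entries
    # whenever the policy changes.
    verifiers = []
    for method in password_storage:
        if method == "plain":
            verifiers.append("plain:" + password)
        elif method == "md5":
            digest = hashlib.md5((password + username).encode()).hexdigest()
            verifiers.append("md5:" + digest)
        elif method == "scram":
            salted = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                         b"fixed-demo-salt", 4096)
            verifiers.append("scram:" + salted.hex())
    return verifiers
```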

--Josh Berkus

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Michael Paquier
On Sun, Aug 9, 2015 at 6:51 AM, Josh Berkus j...@agliodbs.com wrote:
 Obviously the backwards-compatibility issues are pretty major ... it'll
 be years before all drivers support SCRAM ... but we also want to
 provide a path forwards for secure installations in which no md5 hashes
 are stored.

 This says backwards-compatible GUC to me.  Here's one idea on how to
 handle this:

 1. we drop the parameter password_encryption
 2. we add the parameter password_storage, which takes a list:
- plain : plain text
- md5 : current md5 hashes
- scram : new scram hashed passwords
This defaults to 'md5, scram' if not specified.
This list might be extended in the future.

Perhaps using a different GUC than password_encryption is safer... I
am not that sure. Still that's how I switched password_encryption to
actually handle a list. Default is 'md5' with the first patch, and
'md5,scram' with the scram patch added and it sets the list of
password verifiers created when PASSWORD with ENCRYPTED/UNENCRYPTED is
used.

 3. All password types in the list are generated.  This means having
 multiple columns in pg_shadow, or an array.  An array would support the
 addition of future password storage methods.

Yeah, the patch switches pg_shadow to an array like that, with
method:value as the element format, so you actually get
md5:md5blah,scram:stuff with all the patches applied.

 4. CREATE ROLE / ALTER ROLE syntax is changed to accept a parameter to
 ENCRYPTED in order to support md5, scram, and future methods.  If no
 parameter is supplied, ENCRYPTED will default to 'md5, scram'.

Like password ENCRYPTED (md5,scram) or similar? If no method is
passed, I think that we should default to password_storage instead.
Also, I still think that something like PASSWORD VERIFIERS is needed:
users may want to set the verifier used for each method after
calculating it on the client side. We authorize that for md5 even now,
and that's not something this spec authorizes.

 5. we add the superuser-only function pg_apply_password_policy().  This
 applies the policy expressed by password_storage, generating or erasing
 passwords for each user.

pg_upgrade could make use of that to control password aging, with an
option to do the cleanup or not. Not sure what the default should be,
though. pg_apply_password_policy(roleid) would be useful as well, to do
it on a per-role basis.

 6. We add a new connection error for "authentication method not
 supported for user"

Hm? This would let any user trying to connect know whether a given
method is in use or not. What's wrong with failing as we do now? In
the case of PASSWORD NULL, for example, an attempt to connect always
fails with "incorrect password" or similar.

 7. Two versions from now, we change the defaults.

Or three. We cannot expect users to change immediately, and it is
wiser to let the dust settle on the new feature in case critical bugs
show up after the first GA.

Something that sounds more like a detail in this thread, by reading
other comments: I think that it is important to store password
verifiers in a different catalog than pg_authid, for a few reasons:
- it's more user-friendly: a sysadmin could directly join the new
catalog with pg_authid to get all the verifiers for a single user and
method.
- at password lookup when authorizing a connection, there is no need
to fetch all the password verifiers and parse the array holding all of
them.
- it's more scalable if we have many verifier methods in the future,
though we are not going to have hundreds of them. I am also wondering
about per-method valid-until times and per-method authorization
options.
Regards,
-- 
Michael




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Sehrope Sarkuni
It'd be nice if the new auth mechanism supports multiple passwords in the
same format as well (not just one per format).

That way you could have two different passwords for a user that are active
at the same time. This would simplify rolling database credentials as it
wouldn't have to be done all at once. You could add the new credentials,
update your app servers one by one, then disable the old ones.

A lot of systems that use API keys let you see the last time a particular
set of keys was used. This helps answer the "Is this going to break
something if I disable it?" question. Having a "last used at" timestamp for
each auth mechanism (per user) would be useful.

I'm not sure how updates should work when connecting to a read-only slave
though. It would need some way of letting the master know that user X
connected using credentials Y.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/


Re: [HACKERS] WIP: Make timestamptz_out less slow.

2015-08-08 Thread David Rowley
On 29 July 2015 at 03:25, Andres Freund and...@anarazel.de wrote:

 On 2015-07-29 03:10:41 +1200, David Rowley wrote:
  timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
  timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
  timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000
 
  timestamp_out_old is master's version, timestamp_out_af() is yours,
  and timestamp_out() is mine. Times are in seconds to perform 100
  million calls.

 That looks good.

  So it appears your version is a bit faster than mine, but we're both
  about 20 times faster than the current one.
  Also mine needs fixing up, as the fractional part is not padded the
  same as yours, but I doubt that'll affect the performance by much.

 Worthwhile to finish that bit and try ;)

  My view: It's probably not worth going quite as far as you've gone for a
  handful of nanoseconds per call, but perhaps something along the lines of
  mine can be fixed up.

 Yes, I agree that yours is probably going to be fast enough.


I took a bit of weekend time to finish this one off. Patch attached.

A quick test shows a pretty good performance increase:

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

Master:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 20444.468 ms

Patched:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10947.097 ms

There's probably a bit more to squeeze out of this.
1. EncodeDateTime() could return the length of the string to allow callers
to use pnstrdup() instead of pstrdup(), which would save the strlen() call.
2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char
and leave this up to the calling function.
3. Make something to replace the sprintf(str, " %.*s", MAXTZLEN, tzn); call.

Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error
prone for the small gain we'd get from it.

Also, I was not too sure whether pg_int2str() is too similar to pg_ltoa().
Perhaps pg_ltoa() should just be modified to return the end of the string?

Thoughts?

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


timestamp_out_speedup_2015-08-09.patch
Description: Binary data



Re: [HACKERS] 9.5 release notes

2015-08-08 Thread Bruce Momjian
On Sun, Aug  9, 2015 at 01:24:33AM +1200, David Rowley wrote:
 
 On 7 August 2015 at 14:24, Bruce Momjian br...@momjian.us wrote:
 
 On Tue, Jun 30, 2015 at 09:00:44PM +0200, Andres Freund wrote:
  * 2014-12-08 [519b075] Simon ..: Use GetSystemTimeAsFileTime directly in
 win32
    2014-12-08 [8001fe6] Simon ..: Windows: use
 GetSystemTimePreciseAsFileTime if ..
    Timer resolution isn't an unimportant thing for people using explain?
 
 This all seemed very internals-only, e.g.:
 
     On most Windows systems this change will actually have no
     significant effect on timestamp resolution, as the system timer
     tick is typically between 1ms and 15ms depending on what timer
     resolution currently running applications have requested. You can
     check this with clockres.exe from sysinternals. Despite the
     platform limitation, this change still permits capture of finer
     timestamps where the system is capable of producing them, and it
     gets rid of an unnecessary syscall.
 
 Was I wrong?
 
 
 
 This does have a user-visible change. Timestamps are now likely to have 6
 digits after the decimal point, if they're on a version of Windows which
 supports GetSystemTimePreciseAsFileTime().
 
 Master:
 
 postgres=# select now();
               now
 -------------------------------
  2015-08-09 01:14:01.959645+12
 (1 row)
 
 9.4.4
 postgres=# select now();
             now
 ----------------------------
  2015-08-09 01:15:09.783+12
 (1 row)

Yes, this was already in the release notes:

Allow higher-precision timestamp resolution on <systemitem
class="osname">Windows 8</systemitem> or <systemitem
class="osname">Windows Server 2012</systemitem> and later Windows
systems (Craig Ringer)

I am not sure why people were saying it was missing.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




[HACKERS] Broken src/test/regress/mb/* tests

2015-08-08 Thread Tatsuo Ishii
Commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9043ef390f4f0b4586cfe59cbd22314b9c3e2957
broke src/test/regress/mb/* tests because the message:

WARNING:  hash indexes are not WAL-logged and their use is discouraged

is no longer emitted in the test.  I have committed and pushed the
fix to the master and 9.5 stable branches (tests in the previous stable
branches do not emit the message).

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] WIP: SCRAM authentication

2015-08-08 Thread Josh Berkus
On 08/08/2015 03:21 PM, Michael Paquier wrote:
 On Sun, Aug 9, 2015 at 6:51 AM, Josh Berkus j...@agliodbs.com wrote:
 1. we drop the parameter password_encryption
 2. we add the parameter password_storage, which takes a list:
- plain : plain text
- md5 : current md5 hashes
- scram : new scram hashed passwords
This defaults to 'md5, scram' if not specified.
This list might be extended in the future.
 
 Perhaps using a different GUC than password_encryption is safer... I
 am not that sure. Still that's how I switched password_encryption to
 actually handle a list. Default is 'md5' with the first patch, and
 'md5,scram' with the scram patch added and it sets the list of
 password verifiers created when PASSWORD with ENCRYPTED/UNENCRYPTED is
 used.

Well, generally I feel like if we're going to change the *type* of a GUC
parameter, we ought to change the *name*.  It's far easier for users to
figure out that the contents of a parameter need to change if the name
is also changed.

In other words, I think "invalid parameter 'password_encryption'" is an
easier to understand error message than "invalid password_encryption
type 'on'".  Besides which, password_encryption was always a misnomer.

Unless you're going to still accept "on" and "off" in some kind of weird
backwards-compatibility mode?  If so, how does that work?

 Like password ENCRYPTED (md5,scram) or similar? If no method is
 passed, I think that we should default to password_storage instead.

Make sense.

 Also, I still think that something like PASSWORD VERIFIERS is needed,
 users may want to set the verifier user for each method after
 calculating it on client-side: we authorize that for md5 even now, and
 that's not something this spec authorizes.

I don't follow this.  Mind you, I'm not sure that I need to.

 5. we add the superuser-only function pg_apply_password_policy().  This
 applies the policy expressed by password_storage, generating or erasing
 passwords for each user.
 
 pg_upgrade could make use of that to control password aging with an
 option to do the cleanup or not. Not sure what the default should be
 though. pg_apply_password_policy(roleid) would be useful as well to do
 it on a role base.

No objections to an optional roleid parameter, if you think people will
use it.

 6. We add a new connection error for "authentication method not
 supported for user"
 
 Hm? This would let any user trying to connect know whether a given
 method is in use or not. What's wrong with failing as we do now? In
 the case of PASSWORD NULL, for example, an attempt to connect always
 fails with "incorrect password" or similar.

So, the DBA sets password_storage = 'scram', but doesn't take the md5
lines out of pg_hba.conf.

The app dev tries to connect using a driver which only supports md5.
What error should they get?  A user/DBA who is getting "invalid
password" is going to spend a long time debugging it.  Also, it would be
very useful to have a distinctive error in the log, so that DBAs could
see who is *trying* to connect with the wrong verifier.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Assert in pg_stat_statements

2015-08-08 Thread Satoshi Nagayasu


On 2015/08/08 22:32, Robert Haas wrote:

On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

I just found that pg_stat_statements causes assert when queryId is
set by other module, which is loaded prior to pg_stat_statements in
the shared_preload_libraries parameter.

Theoretically, queryId in the shared memory could be set by other
module by design.

So, IMHO, pg_stat_statements should not cause assert even if queryId
already has non-zero value --- my module has this problem now.

Is my understanding correct? Any comments?


Seems like an ERROR would be more appropriate than an Assert.


Well, it's not that simple, and I'm afraid that it may not fix
the issue.

Here, let's assume that we have two modules (foo and pg_stat_statements)
which calculate, use and store Query->queryId independently.

What we may see when two are enabled is following.

(1) Enable two modules in shared_preload_libraries.

shared_preload_libraries = 'foo,pg_stat_statements'

(2) Once a query comes in, a callback function in module foo
associated with post_parse_analyze_hook calculates and stores
a queryId in Query->queryId.

(3) After that, the callback function (pgss_post_parse_analyze) in
pg_stat_statements associated with post_parse_analyze_hook
detects that Query->queryId has a non-zero value, and the assertion fails.

As a result, we cannot have two or more modules that use queryId
at the same time.

But I think it should be possible because one of the advantages of
PostgreSQL is its extensibility.

So, I think the problem here is that pg_stat_statements treats a
non-zero value in queryId as an error, even when it is exactly the
same value another module computed.

Regards,
--
NAGAYASU Satoshi sn...@uptime.jp

