Re: [HACKERS] Raising our compiler requirements for 9.6
On Sat, Aug 08, 2015 at 02:30:47AM +0200, Andres Freund wrote:
> On 2015-08-07 20:16:20 -0400, Noah Misch wrote:
> > I agree that lock.h offers little to frontend code. Headers that the
> > lockdefs.h patch made usable in the frontend, particularly genam.h and
> > hash.h, are no better.
>
> It's not that simple. Those two, and tuptoaster.h, are actually somewhat
> validly included by frontend code via the rmgr descriptor routines.

genam.h is a dependency of the non-frontend-relevant content of some frontend-relevant headers, _exactly_ as lock.h has been. I count zero things in genam.h that a frontend program could harness. The frontend includes hash.h for two hashdesc.c prototypes, less than the material you moved out of lock.h for frontend benefit. Yes, it is that simple.

> > The lock.h/lockdefs.h unprincipled split would do more harm than
> > letting frontends continue to pull in lock.h.
>
> Why?

Your header comment for lockdefs.h sums up the harm nicely. Additionally, the term "defs" does nothing to explain the split. lock2.h would be no less evocative.

> Consider what happens when lock.h/c gets more complicated and e.g. grows
> some atomics. None of the frontend code should see that, and it's not
> much effort to keep it that way. Allowing client code to see LOCKMODE
> isn't something that's going to cause any harm.

Readying the headers for that day brings some value, but you added a worse mess to achieve it. The overall achievement has negative value.

nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Assert in pg_stat_statements
Hi,

I just found that pg_stat_statements causes an assert failure when queryId is set by another module, which is loaded prior to pg_stat_statements in the shared_preload_libraries parameter.

Theoretically, queryId in shared memory could be set by another module by design. So, IMHO, pg_stat_statements should not cause an assert failure even if queryId already has a non-zero value --- my module has this problem now.

Is my understanding correct? Any comments?

Regards,
--
NAGAYASU Satoshi sn...@uptime.jp

commit b975d7c2fe1b36a3ded1e0960be191466704e0fc
Author: Satoshi Nagayasu sn...@uptime.jp
Date:   Sat Aug 8 08:51:45 2015 +

    Fix pg_stat_statements to avoid assert failure when queryId
    already has non-zero value.

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 59b8a2e..84c5200 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -776,8 +776,9 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
 	if (prev_post_parse_analyze_hook)
 		prev_post_parse_analyze_hook(pstate, query);

-	/* Assert we didn't do this already */
-	Assert(query->queryId == 0);
+	/* Assume that other module has calculated and set queryId */
+	if (query->queryId > 0)
+		return;

 	/* Safety check... */
 	if (!pgss || !pgss_hash)
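The convention the patch relies on -- a later hook deferring to a queryId already set by an earlier module, instead of asserting it is zero -- can be sketched outside the server. This is a toy model only: "Query" is reduced to its queryId field, and both hooks are hypothetical stand-ins, not real PostgreSQL code.

```python
# Toy model of a post_parse_analyze_hook chain (illustrative only).
class Query:
    def __init__(self):
        self.queryId = 0  # zero means "not yet fingerprinted"

def other_module_hook(q):
    # First module listed in shared_preload_libraries: sets its own queryId.
    q.queryId = 42

def pgss_hook(q):
    # pg_stat_statements-style hook with the patch applied:
    # defer to a previously-set queryId instead of Assert(queryId == 0).
    if q.queryId != 0:
        return          # another module got here first
    q.queryId = 7       # pretend to compute a fingerprint

q = Query()
other_module_hook(q)
pgss_hook(q)            # no assert failure; the earlier value survives
print(q.queryId)        # 42
```

Run alone (without the other module loaded), pgss_hook would still fill in its own value, so both load orders behave sensibly.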
[HACKERS] statement_timeout affects query results fetching?
Hi everyone, I'm seeing some strange behavior and wanted to confirm it.

When executing a query that selects a long result set, if the code processing the results takes its time (e.g. more than statement_timeout), a timeout occurs. My expectation was that statement_timeout only affects query *processing*, and does not cover the frontend actually processing the result.

First, I wanted to confirm that this is the case (and not some sort of bug in my code). If this is the case, is this intended? Here are some points:

* It makes statement_timeout very difficult to use; clients do very diverse things with database results, and it may be very difficult (or impossible) to estimate how much time they should take (e.g. random load factors on the client machine or on some other machine receiving results).
* It makes it impossible to specifically detect problematic *queries* which take too long to execute (as opposed to the time taken to process their results).

If you do insist that this behavior is correct, a documentation update for statement_timeout might make this clearer.

Thanks,
Shay
[HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins
Hi,

there's a 1/1e6 chance that a sqlsmith query on the regression db of master (c124cef) fails with

    ERROR: failed to build any {4..8}-way joins

They all appear to work fine on REL9_5_STABLE. Sample query:

select 1
from information_schema.collations as rel_60113935
  inner join information_schema.sql_sizing_profiles as rel_60113936
    on (rel_60113935.pad_attribute = rel_60113936.sizing_name)
  inner join information_schema.foreign_tables as rel_60113937
    on (rel_60113936.profile_id = rel_60113937.foreign_table_catalog)
  right join information_schema.constraint_table_usage as rel_60113938
    on (rel_60113935.collation_schema = rel_60113938.table_catalog)
  left join public.btree_tall_tbl as rel_60113939
    on (rel_60113936.required_value = rel_60113939.id)
where rel_60113938.table_name is not NULL;

regards,
Andreas
Re: [HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins
I wrote:
> They all appear to work fine on REL9_5_STABLE.

oops, that branch was slightly stale. Updating it with the latest planner changes (8dccf03..0853388), the queries fail there as well.

regards,
Andreas
[HACKERS] pg_filedump patch for 9.5
Hi,

I have created a patch for pg_filedump to work with 9.5. Here is a list of changes.

* Fix to rename CRC32 macros to work with 9.5.
* Fix to add missing DBState: DB_SHUTDOWNED_IN_RECOVERY.
* Fix to add missing page flags for Btree and GIN.
* Update copyright date.

Please take a look. Any comments are welcome.

Regards,
--
NAGAYASU Satoshi sn...@uptime.jp

diff --git a/Makefile b/Makefile
index c64dfbf..d80a3a3 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 # View README.pg_filedump first

 # note this must match version macros in pg_filedump.h
-FD_VERSION=9.3.0
+FD_VERSION=9.5.0

 CC=gcc
 CFLAGS=-g -O -Wall -Wmissing-prototypes -Wmissing-declarations

diff --git a/README.pg_filedump b/README.pg_filedump
index 3a04d59..13d51ff 100644
--- a/README.pg_filedump
+++ b/README.pg_filedump
@@ -2,7 +2,7 @@
 pg_filedump - Display formatted contents of a PostgreSQL heap, index,
 or control file.

 Copyright (c) 2002-2010 Red Hat, Inc.
-Copyright (c) 2011-2014, PostgreSQL Global Development Group
+Copyright (c) 2011-2015, PostgreSQL Global Development Group

 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by

diff --git a/pg_filedump.c b/pg_filedump.c
index 6cd02a9..1dfc524 100644
--- a/pg_filedump.c
+++ b/pg_filedump.c
@@ -3,7 +3,7 @@
  *	formatting heap (data), index and control files.
  *
  * Copyright (c) 2002-2010 Red Hat, Inc.
- * Copyright (c) 2011-2014, PostgreSQL Global Development Group
+ * Copyright (c) 2011-2015, PostgreSQL Global Development Group
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -24,7 +24,7 @@
 #include "pg_filedump.h"

-#include "utils/pg_crc_tables.h"
+#include "utils/pg_crc.h"

 /* checksum_impl.h uses Assert, which doesn't work outside the server */
 #undef Assert
@@ -91,7 +91,7 @@ DisplayOptions(unsigned int validOptions)
 	printf
 		("\nVersion %s (for %s)"
 		 "\nCopyright (c) 2002-2010 Red Hat, Inc."
-		 "\nCopyright (c) 2011-2014, PostgreSQL Global Development Group\n",
+		 "\nCopyright (c) 2011-2015, PostgreSQL Global Development Group\n",
		 FD_VERSION, FD_PG_VERSION);

	printf
@@ -1132,6 +1132,8 @@ FormatSpecial()
 			strcat(flagString, "SPLITEND|");
 		if (btreeSection->btpo_flags & BTP_HAS_GARBAGE)
 			strcat(flagString, "HASGARBAGE|");
+		if (btreeSection->btpo_flags & BTP_INCOMPLETE_SPLIT)
+			strcat(flagString, "INCOMPLETESPLIT|");
 		if (strlen(flagString))
 			flagString[strlen(flagString) - 1] = '\0';
@@ -1216,6 +1218,10 @@ FormatSpecial()
 			strcat(flagString, "LIST|");
 		if (ginSection->flags & GIN_LIST_FULLROW)
 			strcat(flagString, "FULLROW|");
+		if (ginSection->flags & GIN_INCOMPLETE_SPLIT)
+			strcat(flagString, "INCOMPLETESPLIT|");
+		if (ginSection->flags & GIN_COMPRESSED)
+			strcat(flagString, "COMPRESSED|");
 		if (strlen(flagString))
 			flagString[strlen(flagString) - 1] = '\0';
 		printf("  GIN Index Section:\n"
@@ -1340,9 +1346,9 @@ FormatControl()
 	char	   *dbState;

 	/* Compute a local copy of the CRC to verify the one on disk */
-	INIT_CRC32(crcLocal);
-	COMP_CRC32(crcLocal, buffer, offsetof(ControlFileData, crc));
-	FIN_CRC32(crcLocal);
+	INIT_CRC32C(crcLocal);
+	COMP_CRC32C(crcLocal, buffer, offsetof(ControlFileData, crc));
+	FIN_CRC32C(crcLocal);

 	/* Grab a readable version of the database state */
 	switch (controlData->state)
@@ -1353,6 +1359,9 @@ FormatControl()
 		case DB_SHUTDOWNED:
 			dbState = "SHUTDOWNED";
 			break;
+		case DB_SHUTDOWNED_IN_RECOVERY:
+			dbState = "SHUTDOWNED_IN_RECOVERY";
+			break;
 		case DB_SHUTDOWNING:
 			dbState = "SHUTDOWNING";
 			break;
@@ -1400,7 +1409,7 @@ FormatControl()
		   "  Maximum Index Keys: %u\n"
		   "  TOAST Chunk Size: %u\n"
		   "  Date and Time Type Storage:
Re: [HACKERS] 9.5 release notes
On Thu, Aug 6, 2015 at 10:35:50PM -0400, Bruce Momjian wrote:
> On Thu, Aug 6, 2015 at 06:42:38PM -0700, Peter Geoghegan wrote:
> > On Thu, Aug 6, 2015 at 6:08 PM, Bruce Momjian br...@momjian.us wrote:
> > > I thought about this, and it is really an issue for FDW authors, not
> > > for end users, so I put this text in the Source Code changes section:
> >
> > I carefully considered where to put it, and chose the compatibility
> > section based on the precedent of this 9.4 item: "Foreign data
> > wrappers that support updating foreign tables must consider the
> > possible presence of AFTER ROW triggers (Noah Misch)"
> >
> > I don't suppose it matters much, though. I can close out that open
> > item now.
>
> Oh, I think that 9.4 is in the wrong section, but good you researched it.

Actually, 9.4 and 9.5 are both in the right sections. Users create triggers, but only people modifying the source code create FDWs, so they do belong in different sections.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] 9.5 release notes
On 7 August 2015 at 14:24, Bruce Momjian br...@momjian.us wrote:
> On Tue, Jun 30, 2015 at 09:00:44PM +0200, Andres Freund wrote:
> > * 2014-12-08 [519b075] Simon ..: Use GetSystemTimeAsFileTime directly in win32
> >   2014-12-08 [8001fe6] Simon ..: Windows: use GetSystemTimePreciseAsFileTime if ..
> >
> > Timer resolution isn't an unimportant thing for people using explain?
>
> This all seemed very internals-only, e.g.: "On most Windows systems this
> change will actually have no significant effect on timestamp resolution
> as the system timer tick is typically between 1ms and 15ms depending on
> what timer resolution currently running applications have requested. You
> can check this with clockres.exe from sysinternals. Despite the platform
> limitation this change still permits capture of finer timestamps where
> the system is capable of producing them and it gets rid of an
> unnecessary syscall."
>
> Was I wrong?

This does have a user visible change. Timestamps are now likely to have 6 digits after the decimal point, if they're on a version of Windows which supports GetSystemTimePreciseAsFileTime().

Master:

postgres=# select now();
              now
-------------------------------
 2015-08-09 01:14:01.959645+12
(1 row)

9.4.4:

postgres=# select now();
            now
----------------------------
 2015-08-09 01:15:09.783+12
(1 row)

Regards

David Rowley

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
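The user-visible difference above can be modeled as timestamp quantization: a coarse system timer effectively rounds timestamps down to the tick interval, while GetSystemTimePreciseAsFileTime keeps microsecond detail. A hedged sketch (tick sizes are assumed, purely illustrative):

```python
def quantize_us(timestamp_us: int, tick_us: int) -> int:
    """Round a microsecond timestamp down to the timer tick, as a
    coarse system clock effectively does."""
    return (timestamp_us // tick_us) * tick_us

precise = 1_959_645                      # the .959645 s fraction, in microseconds
coarse = quantize_us(precise, 1_000)     # assumed 1 ms tick
print(precise, coarse)                   # 1959645 1959000
```

With a 1 ms tick the fractional seconds collapse to three meaningful digits, matching the 9.4.4 output shown above; with a microsecond-capable source all six digits survive.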
Re: [HACKERS] WIP: SCRAM authentication
On Fri, Aug 7, 2015 at 6:54 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> This filtering machinery definitely looks like a GUC to me, something
> like password_forbidden_encryption that PASSWORD VERIFIERS looks at and
> discards the methods listed in there. This definitely needs to be
> separated from password_encryption.

I don't know what a "password verifier" is and I bet nobody else does either. Well, I think I sort of know: I think it's basically an encrypted password. Am I right? Even if I am, I bet the average user is going to scratch their head and punt.

I don't see that there's any good reason to allow the same password to be stored in the catalog encrypted more than one way, and I don't think there's any good reason to introduce the PASSWORD VERIFIER terminology. I think we should store (1) your password, either encrypted or unencrypted; and (2) the method used to encrypt it. And that's it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] tap tests remove working directories
On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net wrote:
> That certainly isn't what happens, and given the way this is done in
> TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir()
> function, it's not clear how we could do that easily.

<shot-in-the-dark> Set cleanup to false and manually remove the directory later in the code, so that stuff runs only if we haven't died sooner? </shot-in-the-dark>

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] statement_timeout affects query results fetching?
On Sat, Aug 8, 2015 at 5:31 AM, Shay Rojansky r...@roji.org wrote:
> Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
> When executing a query that selects a long result set, if the code
> processing the results takes its time (e.g. more than
> statement_timeout), a timeout occurs. My expectation was that
> statement_timeout only affects query *processing*, and does not cover
> the frontend actually processing the result.
>
> First, I wanted to confirm that this is the case (and not some sort of
> bug in my code). If this is the case, is this intended? Here are some
> points:
>
> * It makes statement_timeout very difficult to use; clients do very
>   diverse things with database results, it may be very difficult (or
>   impossible) to estimate how much time they should take (e.g. random
>   load factors on the client machine or on some other machine receiving
>   results).
> * It makes it impossible to specifically detect problematic *queries*
>   which take too long to execute (as opposed to the time taken to
>   process their results).
>
> If you do insist that this behavior is correct, a documentation update
> for statement_timeout might make this clearer.

I think the issue here is that we start returning rows to the client while the query is still executing. Suppose each row of output takes 100 ms of CPU time to generate and there are 1,000,000 rows. Then it's quite conceivable that we could be under the statement_timeout when we start returning rows to the client, but over the statement_timeout by the time we finish -- and the user would probably want statement_timeout to kick in in that case, because that's a lotta CPU time.

I suppose we could try to toll statement_timeout while we're blocked waiting for the client, but nobody wrote the code for that yet. And it would mean that you can't use statement_timeout to prevent xmin from lagging, which could be why you set it in the first place.

There might also be some usability difficulties: pg_stat_activity shows the time the query started, so if you know what statement_timeout is you can tell how close it is to being killed. If some of the time isn't counted, then you can't tell any more.

Another approach (which I think might be better) is to have GUCs like statement_cpu_limit and statement_io_limit that kill a query when it uses more than the configured amount of that resource.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
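The back-of-the-envelope scenario above is worth making concrete: the figures (100 ms of CPU per row, a million rows) are from the message, and they show why excluding time spent after the first row is returned would let a query burn an enormous amount of CPU without ever tripping statement_timeout.

```python
# Figures from the scenario above: rows start streaming to the client
# while the query is still executing, so per-row CPU cost keeps
# accumulating after the first row has been returned.
rows = 1_000_000
cpu_per_row_s = 0.100            # 100 ms of CPU time per generated row

total_cpu_s = rows * cpu_per_row_s
print(total_cpu_s)               # 100000.0 seconds
print(total_cpu_s / 3600)        # ~27.8 hours of CPU
```

Even with statement_timeout set to a few seconds, a query like this would pass the check when the first rows go out and then run for more than a day if post-first-row time were not counted.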
Re: [HACKERS] [DOCS] max_worker_processes on the standby
On Tue, Aug 4, 2015 at 6:13 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> > I think it's totally reasonable for the standby to follow the master's
> > behavior rather than the config file. That should be documented, but
> > otherwise, no problem. If it were technologically possible for the
> > standby to follow the config file rather than the master in all cases,
> > that would be fine, too. But the current behavior is somewhere in the
> > middle, and that doesn't seem like a good plan.
>
> So I discussed this with Petr. He points out that if we make the standby
> follow the master, then the problem would be the misbehavior that
> results once the standby is promoted: at that point the standby would no
> longer follow the master and would start with the feature turned off,
> which could be disastrous (depending on what you are using the commit
> timestamps for).

That seems like an imaginary problem. If it's critical to have commit timestamps, don't turn them off on the standby.

> To solve that problem, you could suggest that if we see the setting
> turned on in pg_control then we should follow that instead of the config
> file; but then the problem is that there's no way to turn the feature
> off. And things are real crazy by then.

There's no existing precedent for a feature that lets the standby be different from the master *in any way*. So I don't see why we should start here. I think the reasonable definition is that the GUC controls whether the master tries to update the SLRU (and generate appropriate WAL records, presumably). The standby should not get a choice about whether to replay those WAL records.

Note that if you do allow the standby to decide not to replay the WAL records for this feature, then the data on the standby could be partially there but not completely there after promotion, because the DBA might have flipped the switch on and off at different times.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Patch for ginCombineData
On Wed, Aug 5, 2015 at 6:17 AM, Robert Abraham robert.abraha...@googlemail.com wrote:
> we are using gin indexes on big tables. these tables happen to have
> several billion rows. the index creation fails in ginCombineData in
> src/backend/access/ginbulk.c because repalloc is limited to 1 gb. this
> limitation makes no sense in this context (compare comments in
> src/include/utils/memutils.h). To overcome this limitation on tables
> with lots of rows repalloc_huge is used. The test suite still succeeds
> on my machine. Find the patch attached,

Since nobody seems ready to act on this patch immediately, I suggest adding it here so it doesn't get forgotten: https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
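The 1 GB ceiling mentioned above is MaxAllocSize from src/include/utils/memutils.h, which palloc/repalloc enforce. A quick calculation shows why "several billion rows" runs into it (the per-entry byte cost is an assumed, illustrative figure, not the actual ginCombineData accounting):

```python
# MaxAllocSize as defined in memutils.h: ordinary palloc/repalloc refuse
# any request larger than this; repalloc_huge does not.
MAX_ALLOC_SIZE = 0x3FFFFFFF          # 1 GiB - 1 byte

entry_bytes = 16                     # assumed cost per accumulated GIN entry
max_entries = MAX_ALLOC_SIZE // entry_bytes
print(max_entries)                   # 67108863, i.e. ~67 million entries
```

At that rate the accumulator tops out around 67 million entries, orders of magnitude short of a multi-billion-row table, which is why switching to repalloc_huge is the natural fix here.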
Re: [HACKERS] Add column-name hint to log messages generated by inserts when varchars don't fit
On Wed, Aug 5, 2015 at 6:39 AM, Stepan Rutz stepan.r...@gmx.de wrote:
> on our production servers I have quite some errors due to excessively
> long varchar-values which application-code tries to insert into tables
> and which don't fit. The errors look like
>
>     ERROR: value too long for type character varying(4)
>
> This is not helping me much. The patch will turn this into
>
>     ERROR: value too long for type character varying(4) (hint: column-name is mycolumn)
>
> if the column that was overflown was mycolumn.

expression_tree_walker is used in enough different places that you can't really modify that like this. It'll have too many side effects that may not be good. Generally, the way to stick some useful information into the error message is with an error context callback, rather than by appending to the primary error message.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Dependency between bgw_notify_pid and bgw_flags
On Wed, Aug 5, 2015 at 3:33 AM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote:
> This idea looks good.

Thanks. It needs testing though to see if it really works as intended. Can you look into that?

> Looking at the larger picture, we should also enable this feature to be
> used by auxiliary processes. It's very hard to add a new auxiliary
> process in current code. One has to go add code at many places to make
> sure that the auxiliary processes die and are re-started correctly. Even
> tougher to add a parent auxiliary process, which spawns multiple worker
> processes. That would be a whole lot simpler if we could allow the
> auxiliary processes to use the background worker infrastructure (which
> is what they ultimately are).

That's a separate patch, but, sure, we could do that. I agree with Alvaro's comments: the postmaster should start all children. Other processes should just request that it do so. We have two mechanisms for that right now: the one used by bgworkers, and the one used by the AV launcher.

> BGWORKER_SHMEM_ACCESS has similar usage, except that it resets the on
> exit callbacks and detaches the shared memory segment from the
> background worker. That avoids a full cluster restart when one of those
> workers which can not corrupt shared memory dies. But I do not see any
> check to prevent such a backend from calling PGSharedMemoryReattach()

There isn't, but you shouldn't do that. :-) This is C code; you can't protect against actively malicious code.

> So it looks like it suffices to assume that a background worker either
> needs to access shared memory or doesn't. Any background worker having
> shared memory access can also access the database and thus becomes part
> of the backend list. Or maybe we just avoid these flags and treat every
> background worker as if it passed both these flags. That will simplify a
> lot of code.

I think it's useful to support workers that don't have shared memory access at all, because those can crash without causing a system-wide reset. But I don't see the point in distinguishing between workers with shared-memory access and those with a database connection. I mean, obviously the worker needs to be able to initialize itself either way, but there seems to be no reason to force that to be signalled in bgw_flags. It can just depend on whether BackgroundWorkerInitializeConnection gets called.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
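The dependency under discussion -- a database connection only makes sense for a worker that also has shared-memory access -- can be written down as a simple validity check on bgw_flags. The flag values below mirror those in PostgreSQL's bgworker.h at the time, but treat them as assumptions; the rule itself is the one stated in the thread, not actual postmaster code.

```python
# Assumed flag values, mirroring bgworker.h.
BGWORKER_SHMEM_ACCESS = 0x0001
BGWORKER_BACKEND_DATABASE_CONNECTION = 0x0002

def flags_valid(bgw_flags: int) -> bool:
    """A database connection implies shared-memory access; any other
    combination (including neither flag) is fine."""
    if (bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION) and \
       not (bgw_flags & BGWORKER_SHMEM_ACCESS):
        return False
    return True

print(flags_valid(0))                                       # True: no shmem at all
print(flags_valid(BGWORKER_BACKEND_DATABASE_CONNECTION))    # False: needs shmem too
```

Under Robert's proposal the second flag could disappear entirely, with the same information implied by whether the worker calls BackgroundWorkerInitializeConnection.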
Re: [HACKERS] [sqlsmith] subplan variable reference / unassigned NestLoopParams
Tom Lane writes:
> Andreas Seltenreich seltenre...@gmx.de writes:
> > Tom Lane writes:
> > > Well, I certainly think all of these represent bugs:
> > > 3 | ERROR: plan should not reference subplan's variable
> > > 2 | ERROR: failed to assign all NestLoopParams to plan nodes
> >
> > These appear to be related. The following query produces the former,
> > but if you replace the very last reference of provider with the
> > literal 'bar', it raises the latter error.
>
> Fixed that, thanks for the test case!

I haven't seen the former since your commit, but there recently were some instances of the latter. The attached queries all throw the error against master at 8752bbb.

regards,
Andreas

-- ERROR: failed to assign all NestLoopParams to plan nodes
select subq_19608926.c0 as c0, rel_124211605.objname as c1
from (select rel_124211602.numeric_precision as c0
      from information_schema.sequences as rel_124211602
      where EXISTS (
        select rel_124211603.unique_constraint_name as c0
        from information_schema.referential_constraints as rel_124211603
        where rel_124211603.unique_constraint_schema is not NULL)) as subq_19608926
  left join public.tab1 as rel_124211604
    on (subq_19608926.c0 = rel_124211604.a)
  right join pg_catalog.pg_seclabels as rel_124211605
    right join public.phone as rel_124211606
      on (rel_124211605.objtype = rel_124211606.comment)
    on (rel_124211604.b = rel_124211605.objtype)
  inner join public.bt_i4_heap as rel_124211729
    inner join public.bt_i4_heap as rel_124211730
      on (rel_124211729.random = rel_124211730.seqno)
    on (subq_19608926.c0 = rel_124211730.seqno)
where rel_124211606.comment = rel_124211604.b;

-- ERROR: failed to assign all NestLoopParams to plan nodes
select subq_53656269.c0 as c0
from (select rel_339945676.id3c as c0
      from public.rule_and_refint_t3 as rel_339945676
      where rel_339945676.data !~~ rel_339945676.data) as subq_53656269
  inner join public.dropcolumn as rel_339945677
    on (subq_53656269.c0 = rel_339945677.b)
  inner join public.bt_name_heap as rel_339945678
    left join public.rtest_order2 as rel_339945705
      inner join information_schema.sequences as rel_339945706
        on (rel_339945705.a = rel_339945706.numeric_precision)
      inner join public.num_result as rel_339945707
        on (rel_339945705.b = rel_339945707.id1)
      on (rel_339945678.random = rel_339945706.numeric_precision)
    on (rel_339945677.b = rel_339945706.numeric_precision)
where rel_339945678.seqno ~* rel_339945705.c
fetch first 45 rows only;

-- ERROR: failed to assign all NestLoopParams to plan nodes
select rel_273437910.name as c0, rel_273437908.sequence_catalog as c1,
       rel_273437865.b as c2, rel_273437910.location as c3,
       rel_273437864.id2c as c4, rel_273437908.start_value as c5
from public.rule_and_refint_t2 as rel_273437864
  inner join public.ruletest_tbl as rel_273437865
    on (rel_273437864.id2c = rel_273437865.a)
  inner join information_schema.sequences as rel_273437908
    inner join public.rules_log as rel_273437909
      on (rel_273437908.numeric_precision_radix = rel_273437909.f1)
    right join public.emp as rel_273437910
      on (rel_273437909.tag = rel_273437910.name)
    on (rel_273437865.a = rel_273437908.numeric_precision)
where rel_273437909.tag !~* rel_273437909.tag
fetch first 110 rows only;

-- ERROR: failed to assign all NestLoopParams to plan nodes
select rel_156464410.minimum_value as c0
from public.rule_and_refint_t1 as rel_156464330
  inner join public.rules_src as rel_156464331
    on (rel_156464330.id1b = rel_156464331.f1)
  inner join public.main_table as rel_156464401
    inner join pg_catalog.pg_file_settings as rel_156464402
      left join public.person as rel_156464409
        inner join information_schema.sequences as rel_156464410
          on (rel_156464409.age = rel_156464410.numeric_precision)
        on (rel_156464402.sourceline = rel_156464409.age)
      on (rel_156464401.a = rel_156464402.sourceline)
    on (rel_156464331.f1 = rel_156464410.numeric_precision)
where rel_156464402.sourcefile @@ rel_156464409.name
fetch first 155 rows only;
Re: [HACKERS] WIP: SCRAM authentication
On 08/08/2015 06:27 AM, Robert Haas wrote:
> terminology. I think we should store (1) your password, either
> encrypted or unencrypted; and (2) the method used to encrypt it. And
> that's it.

A petty complaint, but it has always bothered me that we say the password is "encrypted" when, at least currently, it is a simple hash (cryptographic hash yes, but not encrypted, and not even an HMAC). I think we should try to start using accurate terminology.

--
Joe Conway
Crunchy Data
Enterprise PostgreSQL
http://crunchydata.com
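For concreteness, what pg_authid actually stores for an MD5 "ENCRYPTED" password is a hash, not a ciphertext: the literal prefix 'md5' followed by md5(password || username). A minimal sketch (the example password and role name are of course made up):

```python
import hashlib

def pg_md5_password(password: str, username: str) -> str:
    """Build the value PostgreSQL stores in pg_authid.rolpassword for an
    MD5 "encrypted" password: 'md5' + md5(password concatenated with the
    role name). No key is involved and nothing is reversible."""
    digest = hashlib.md5((password + username).encode()).hexdigest()
    return "md5" + digest

stored = pg_md5_password("secret", "alice")
print(stored)   # 'md5' followed by 32 hex digits, 35 characters total
```

The role name acts as a (weak, predictable) salt, which is exactly why this is a hash rather than encryption: there is no key anywhere that could recover the original password.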
Re: [HACKERS] WIP: SCRAM authentication
* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Aug 7, 2015 at 6:54 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> > This filtering machinery definitely looks like a GUC to me, something
> > like password_forbidden_encryption that PASSWORD VERIFIERS looks at
> > and discards the methods listed in there. This definitely needs to be
> > separated from password_encryption.
>
> I don't know what a "password verifier" is and I bet nobody else does
> either. Well, I think I sort of know: I think it's basically an
> encrypted password. Am I right? Even if I am, I bet the average user is
> going to scratch their head and punt.

"Password verifier" is actually a well understood term when it comes to these protocols and their implementations. It is not an encrypted password but rather a value which allows the server to determine if the client knows the correct password, without having to store the password directly or a simple hash of the password, or have the clear password sent from the client to the server.

> I don't see that there's any good reason to allow the same password to
> be stored in the catalog encrypted more than one way, and I don't think
> there's any good reason to introduce the PASSWORD VERIFIER terminology.
> I think we should store (1) your password, either encrypted or
> unencrypted; and (2) the method used to encrypt it. And that's it.

Perhaps we don't want to expose what a password verifier is to users, but we shouldn't be missing the distinction between hashed passwords, encrypted passwords, and password verifiers in the code and in the implementation of SCRAM. We really shouldn't use an incorrect term for what we're storing in pg_authid either though, which is what we do today. I can't see us ever storing encrypted passwords, as that implies we'd need a key stored somewhere and further that the server would be able to get back to the user's original password, neither of which are things we want to deal with.
You do have a good point that there is some risk associated with having multiple values in pg_authid related to a user's password, and that we really want to help users move from the old value in pg_authid to the new one. I don't believe we should force a hard change, as it's going to cause a lot of trouble for users. We have to also consider that clients have to be changed for this.

As discussed previously, in an ideal world we would handle the old values and the new ones while introducing password ageing, client support for detecting that a password needs to be changed, protocol support for changing passwords which avoids having them get logged in cleartext to the server log, password complexity, and perhaps used-password history to boot. The main issue here is that we really don't provide any help for large installations to get their userbase moved off of the old style today. Password ageing (and good support for it in clients, etc.) would help that greatly.

Unfortunately, it's unlikely that we're going to get all of that done in one release. As such, I'd suggest our next release support the existing values in pg_authid, add the password verifier when the password is changed, then add a GUC in the following release which disables the old pg_authid mechanism, defaulting to true, and the release after that remove support for the old value and the field for it completely. We should also provide documentation about how to check if there are any old-style values, for users who want to be proactive about moving off of the old style.

I'm travelling and so I haven't looked over the patch yet (or even read the entire thread in depth), so apologies if I've got something confused about what's being proposed.

Thanks!

Stephen
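To illustrate Stephen's distinction, here is what an RFC 5802-style SCRAM verifier derivation looks like. This is a sketch of the general scheme only (parameters and salt are made up), not the on-disk format any particular PostgreSQL patch proposes: the server keeps StoredKey and ServerKey, which let it verify proofs from the client without holding the password or even a plain hash of it.

```python
import hashlib
import hmac

def scram_sha256_verifier(password: bytes, salt: bytes, iterations: int):
    """Derive the server-side verifier pieces per RFC 5802, using SHA-256.
    The server never stores `password` itself."""
    salted = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()   # what the server checks proofs against
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()
    return stored_key, server_key

stored, server = scram_sha256_verifier(b"secret", b"0123456789abcdef", 4096)
print(len(stored), len(server))   # two 32-byte values
```

Note that neither value can be replayed as a password by someone who steals the catalog, which is the property that separates a verifier from both an "encrypted" password and a simple hash like the current md5 scheme.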
Re: [HACKERS] statement_timeout affects query results fetching?
Shay Rojansky r...@roji.org writes: Hi everyone, I'm seeing some strange behavior and wanted to confirm it. When executing a query that selects a long result set, if the code processing the results takes its time (i.e. more than statement_timeout), a timeout occurs. My expectation was that statement_timeout only affects query *processing*, and does not cover the frontend actually processing the result. Are you using a cursor, or something like that? libpq ordinarily absorbs the result set as fast as possible and then hands it back to the application as one blob; the time subsequently spent by the application looking at the blob doesn't count against statement_timeout. As Robert says, statement_timeout *does* include time spent sending the result set to the client, and we're not going to change that, because it would be too hard to disentangle calculation from I/O. So if the client isn't prompt about absorbing all the data then you have to factor that into your setting. But it would be a slightly unusual usage pattern AFAIK. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] statement_timeout affects query results fetching?
Thanks for your responses. I'm not using cursors or anything fancy. The expected behavior (as far as I can tell) for a .NET database driver is to read one row at a time from the database and make it available. There's even a standard API option for fetching data on a column-by-column basis: this allows the user to not hold the entire row in memory (imagine rows with megabyte-sized columns). This makes sense to me; Tom, doesn't the libpq behavior you describe of absorbing the result set as fast as possible mean that a lot of memory is wasted on the client side? I'd be interested in your take on this. I can definitely appreciate the complexity of changing this behavior. I think that some use cases (such as mine) would benefit from a timeout on the time until the first row is sent; this would allow putting an upper cap on stuff like query complexity, for example. Shay On Sat, Aug 8, 2015 at 5:13 PM, Tom Lane t...@sss.pgh.pa.us wrote: Shay Rojansky r...@roji.org writes: [...] Are you using a cursor, or something like that? libpq ordinarily absorbs the result set as fast as possible and then hands it back to the application as one blob; the time subsequently spent by the application looking at the blob doesn't count against statement_timeout. As Robert says, statement_timeout *does* include time spent sending the result set to the client, and we're not going to change that, because it would be too hard to disentangle calculation from I/O. So if the client isn't prompt about absorbing all the data then you have to factor that into your setting. But it would be a slightly unusual usage pattern AFAIK.
regards, tom lane
Re: [HACKERS] statement_timeout affects query results fetching?
I'd also recommend adding a sentence about this aspect of statement_timeout in the docs to prevent confusion... On Sat, Aug 8, 2015 at 5:30 PM, Shay Rojansky r...@roji.org wrote: [previous messages quoted in full, trimmed]
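Tom's distinction can be put in a toy cost model (purely illustrative; the function and numbers are invented, and this is not measured libpq behavior): with libpq-style buffering only the transmission cost counts against statement_timeout, while a row-at-a-time reader that pauses between rows keeps the statement "running" during its own processing.

```python
def statement_elapsed(rows, send_cost, client_cost, buffered):
    """Seconds counted against statement_timeout in this toy model.

    buffered=True  ~ libpq default: the client drains the whole result
                     first, so only the per-row send cost accrues to the
                     statement.
    buffered=False ~ row-at-a-time reader: the server waits on the client
                     between rows, so client processing time accrues too.
    """
    if buffered:
        return rows * send_cost
    return rows * (send_cost + client_cost)
```

With 1000 rows, 1 ms to send a row, and 10 ms of client work per row, a 5 s statement_timeout survives the buffered pattern but not the incremental one.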
Re: [HACKERS] tap tests remove working directories
On 08/08/2015 09:31 AM, Robert Haas wrote: On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net wrote: That certainly isn't what happens, and given the way this is done in TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir() function, it's not clear how we could do that easily. <shot-in-the-dark> Set cleanup to false and manually remove the directory later in the code, so that stuff runs only if we haven't died sooner? </shot-in-the-dark> Yeah, maybe. I'm thinking of trying to do it more globally, like in src/Makefile.global.in. That way we wouldn't have to add new code to every test file. cheers andrew
[HACKERS] Test code is worth the space
We've too often criticized patches for carrying many lines/bytes of test case additions. Let's continue to demand debuggable, secure tests that fail only when something is wrong, but let's stop pushing for test minimalism. Such objections may improve the individual patch, but that doesn't make up for the chilling effect on test contributions. I remember clearly the first time I submitted thorough test coverage with a feature. Multiple committers attacked it in the name of brevity. PostgreSQL would be better off with 10x its current test bulk, even if the average line of test code were considerably less important than we expect today. Thanks, nm
Re: [HACKERS] WIP: SCRAM authentication
On 08/08/2015 04:27 PM, Robert Haas wrote: I don't see that there's any good reason to allow the same password to be stored in the catalog encrypted more than one way, Sure there is. If you want to be able to authenticate using different mechanisms, you need the same password encrypted in different ways. SCRAM uses a verifier that's derived from the password in one way, MD5 authentication needs an MD5 hash, and yet other protocols have other requirements. and I don't think there's any good reason to introduce the PASSWORD VERIFIER terminology. I think we should store (1) your password, either encrypted or unencrypted; and (2) the method used to encrypt it. And that's it. Like Joe and Stephen, I actually find it highly confusing that we call the MD5 hash an encrypted password. The term password verifier is fairly common in the specifications of authentication mechanisms. I think we should adopt it. - Heikki
Re: [HACKERS] checkpointer continuous flushing
On 07/26/2015 06:01 PM, Fabien COELHO wrote: Attached is a very minor v5 update which does a rebase and completes the cleanup of doing a full sort instead of a chunked sort. Some thoughts on this:

* I think we should drop the flush part of this for now. It's not as clearly beneficial as the sorting part, and adds a great deal more code complexity. And it's orthogonal to the sorting patch, so we can deal with it separately.

* Is it really necessary to parallelize the I/O among tablespaces? I can see the point, but I wonder if it makes any difference in practice.

* Is there ever any harm in sorting the buffers? The GUC is useful for benchmarking, but could we leave it out of the final patch?

* Do we need to worry about exceeding the 1 GB allocation limit in AllocateCheckpointBufferIds? It's enough to have 2 TB of shared_buffers. That's a lot, but it's not totally crazy these days that someone might do that. At the very least, we need to lower the maximum of shared_buffers so that you can't hit that limit.

I ripped out the flushing part, keeping only the sorting. I refactored the logic in BufferSync() a bit. There's now a separate function, nextCheckpointBuffer(), that returns the next buffer ID from the sorted list. The tablespace-parallelization behaviour is encapsulated there, keeping the code in BufferSync() much simpler. See attached. Needs some minor cleanup and commenting still before committing, and I haven't done any testing besides a simple make check.
- Heikki

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e900dcc..1cec243 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2454,6 +2454,28 @@ include_dir 'conf.d'
      </listitem>
     /varlistentry>
 
+    <varlistentry id="guc-checkpoint-sort" xreflabel="checkpoint_sort">
+     <term><varname>checkpoint_sort</varname> (<type>bool</type>)
+      <indexterm>
+       <primary><varname>checkpoint_sort</varname> configuration parameter</primary>
+      </indexterm>
+     </term>
+     <listitem>
+      <para>
+       Whether to sort buffers before writing them out to disk on checkpoint.
+       For HDD storage, this setting groups neighboring pages together when
+       they are written to disk, improving performance by reducing random
+       write activity.  The sorting should have limited performance effects
+       on SSD backends, as such storage has good random write performance,
+       but it may help with wear-leveling and so be worth keeping anyway.
+       The default is <literal>on</literal>.  This parameter can only be set
+       in the <filename>postgresql.conf</filename> file or on the server
+       command line.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="guc-checkpoint-warning" xreflabel="checkpoint_warning">
      <term><varname>checkpoint_warning</varname> (<type>integer</type>)
       <indexterm>

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index e3941c9..f538698 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -546,6 +546,18 @@
   </para>
 
   <para>
+   When hard-disk drives (HDD) are used for the data storage,
+   <xref linkend="guc-checkpoint-sort"> allows pages to be sorted so that
+   neighboring pages on disk will be flushed together by checkpoints,
+   reducing the random write load and improving performance.  If
+   solid-state drives (SSD) are used, sorting pages brings no benefit as
+   their random write I/O performance is good: the feature can then be
+   disabled by setting <varname>checkpoint_sort</varname> to
+   <literal>off</literal>.  It is possible that sorting helps with SSD
+   wear leveling, so it may be kept on that account.
+  </para>
+
+  <para>
    The number of WAL segment files in <filename>pg_xlog</filename> directory
    depends on <varname>min_wal_size</varname>, <varname>max_wal_size</varname>
    and the amount of WAL generated in previous checkpoint cycles. When old log

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 68e33eb..bee38ab 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7995,11 +7995,13 @@ LogCheckpointEnd(bool restartpoint)
 				sync_secs,
 				total_secs,
 				longest_secs,
+				sort_secs,
 				average_secs;
 	int			write_usecs,
 				sync_usecs,
 				total_usecs,
 				longest_usecs,
+				sort_usecs,
 				average_usecs;
 	uint64		average_sync_time;
@@ -8030,6 +8032,10 @@ LogCheckpointEnd(bool restartpoint)
 						CheckpointStats.ckpt_end_t,
 						&total_secs, &total_usecs);
 
+	TimestampDifference(CheckpointStats.ckpt_sort_t,
+						CheckpointStats.ckpt_sort_end_t,
+						&sort_secs, &sort_usecs);
+
 	/*
 	 * Timing values returned from CheckpointStats are in microseconds.
 	 * Convert to the second plus microsecond form that TimestampDifference
@@ -8048,8 +8054,8 @@ LogCheckpointEnd(bool restartpoint)
 		elog(LOG, "%s complete: wrote %d buffers (%.1f%%); "
 			 "%d transaction log file(s) added, %d removed, %d recycled; "
 			 "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
			 "sync files=%d,
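The shape of what Heikki describes — sort the dirty buffers once, then dole them out with per-tablespace balancing — can be sketched outside of C (a hypothetical Python sketch; the real code lives in BufferSync() and a nextCheckpointBuffer() helper, and the tuple layout here is invented):

```python
def sorted_checkpoint_order(buffers):
    """buffers: iterable of (tablespace, relfilenode, fork, blocknum).

    Sort so neighboring on-disk pages are written together, then
    round-robin across tablespaces so no single tablespace's disks
    absorb all the checkpoint I/O at once.
    """
    per_ts = {}
    for buf in sorted(buffers):               # the full sort
        per_ts.setdefault(buf[0], []).append(buf)
    iters = [iter(lst) for lst in per_ts.values()]
    order = []
    while iters:                              # interleave tablespaces
        alive = []
        for it in iters:
            buf = next(it, None)
            if buf is not None:
                order.append(buf)
                alive.append(it)
        iters = alive
    return order
```

Within one tablespace the writes stay in on-disk order (the HDD win), while consecutive writes alternate across tablespaces (the parallelization Heikki asks about).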
Re: [HACKERS] 9.5 release notes
On Fri, Aug 7, 2015 at 09:02:09PM +0200, Andres Freund wrote: With regard to the point about the number of buffer mappings: This has forced people's sites down. People have found this out themselves and patched postgres. That's an entirely different league. I was not aware of the magnitude of this issue. 9.5 release note item attached and applied. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 0786a20..bbaa886
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
*** FIXME: Add Andres
*** 463,468 ****
--- 463,482 ----
       </para>
      </listitem>
  
+     <listitem>
+      <para>
+       <!--
+       2014-10-02 [3acc10c9] Robert..: Increase the number of buffer mapping partitio..
+       -->
+       Increase the number of buffer mapping partitions (Amit Kapila,
+       Andres Freund, Robert Haas)
+      </para>
+ 
+      <para>
+       This improves performance for highly concurrent workloads.
+      </para>
+     </listitem>
+ 
    </itemizedlist>
  
   </sect4>
Re: [HACKERS] WIP: SCRAM authentication
On Sat, Aug 8, 2015 at 6:23 PM, Heikki Linnakangas hlinn...@iki.fi wrote: Like Joe and Stephen, I actually find it highly confusing that we call the MD5 hash an encrypted password. The term password verifier is fairly common in the specifications of authentication mechanisms. I think we should adopt it. Speaking as someone who hasn't read the specifications, I found password verifier surprising. I would have known what a password hash was, but I misread verifier to be something functional like a PAM plugin. I tend to agree we should just use terminology out of the specs though, even if it's a little opaque; better one opaque piece of terminology than having to learn and translate between multiple terminologies. -- greg
Re: [HACKERS] 9.5 release notes
On Fri, Aug 7, 2015 at 09:02:09PM +0200, Andres Freund wrote: On 2015-08-07 14:43:11 -0400, Bruce Momjian wrote: Well, we could just throw a "Postgres 9.5 is faster" release note item in there and call it a day. ;-) Based on my experience one of the prime reasons people move to a new version of postgres is performance. And it's not like 'faster!!!' is really helpful for them to evaluate the benefits appropriately. A scalability improvement isn't going to help somebody concerned with single query performance. Somebody concerned about OLTP write performance isn't going to be overly excited about the sort performance improvements, but somebody doing OLAP style queries very much so. The consequence of not listing that is that we're essentially asking our users to read through all the changes between two releases. Something that takes a long while even for people like you and me who are very familiar with the project. Well, considering I have used the same criteria for years, and I am only now hearing complaints, I am unsure about the statement that our existing criteria aren't generally meeting people's needs. The *by far* biggest complaint about upgrades with postgres isn't binary compatibility (even before pg_upgrade), it's not that there are minor incompatibilities (at least not since 8.3, and maybe bytea_output). It's that previously working queries don't work anymore. It's always just a minority, but they're there. And it's very hard to figure out what's up. Is it stats? Different settings? Planner changes? If we then don't list changes to the planner, they're screwed. Well, if we do list them, is that going to help people? You can say, well, it might, but we are then in the same logic we use to decide on adding GUC entries --- we have to weigh the value of having the entry against the overhead of everyone reading the entry. Now, in this case, it is a one-time read vs. something that we will keep for years, but the basic decision process is the same.
And, again, I will say, we are not writing this for ourselves, but for the general user. In comparison to that, just about nobody cares about the rest of the update but new SQL-level stuff and a few other major things: a new field in EXPLAIN, debug_assertions becoming read-only, effective_io_concurrency settable without effect, VACUUM logging one more data point, a 10+ year old obsolete undocumented option being removed, new icons for binaries. They just don't matter. Well, if we have user-visible behavior, and we don't tell them about it, they are never going to be able to use it, so it is hard to see how we could avoid mentioning those. What I _am_ saying is that you should use the same criteria I am using, and just disagree on the place for the line, rather than use different criteria, which will lead to perpetual complaints. We can change the criteria, but that is a different discussion. We need to change that criteria then. Then you need to start a new thread on that topic to get community agreement that I can implement, and we can probably change 9.5 to match. You might also want to address the fact we don't list all bug fixes in the release notes either, if the bug is a rare, minor event, and/or if the new error message is sufficient communication to users. One way of minimizing the downside of any new such entries is to have a "Minor performance improvements" or "Internal performance improvements" section in the release notes, so people will realize they are not of the same import as other items --- same for possible new bug fix listings. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Test code is worth the space
On Sat, Aug 8, 2015 at 9:47 AM, Noah Misch n...@leadboat.com wrote: We've too often criticized patches for carrying many lines/bytes of test case additions. Let's continue to demand debuggable, secure tests that fail only when something is wrong, but let's stop pushing for test minimalism. Such objections may improve the individual patch, but that doesn't make up for the chilling effect on test contributions. I remember clearly the first time I submitted thorough test coverage with a feature. Multiple committers attacked it in the name of brevity. PostgreSQL would be better off with 10x its current test bulk, even if the average line of test code were considerably less important than we expect today. I strongly agree. I am fond of the example of external sorting, which at present has exactly zero test coverage, even though in production those code paths are exercised all the time. I think that there needs to be a way of running an extended set of regression tests. I could definitely respect the desire for minimalism when it comes to adding tests to the regression tests proper if there was an extended set of tests that could be run during development less frequently. I thought about doing the extended set as a satellite project, but that may not be workable. -- Peter Geoghegan
Re: [HACKERS] Test code is worth the space
On 08/08/2015 12:24 PM, Peter Geoghegan wrote: I think that there needs to be a way of running an extended set of regression tests. I could definitely respect the desire for minimalism when it comes to adding tests to the regression tests proper if there was an extended set of tests that could be run during development less frequently. I thought about doing the extended set as a satellite project, but that may not be workable. There already is, isn't there? All of those named sets of regression tests which aren't run by default. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Test code is worth the space
On 08/08/2015 05:09 PM, Josh Berkus wrote: On 08/08/2015 12:24 PM, Peter Geoghegan wrote: I think that there needs to be a way of running an extended set of regression tests. I could definitely respect the desire for minimalism when it comes to adding tests to the regression tests proper if there was an extended set of tests that could be run during development less frequently. I thought about doing the extended set as a satellite project, but that may not be workable. There already is, isn't there? All of those named sets of regression tests which aren't run by default. Exactly. And there is nothing to stop us expanding those. For example, it might be useful to have a suite that provably touched every code path, or something close to it. Incidentally, making the buildfarm client run extra sets of tests in the main tree is almost completely trivial. Making it run tests from somewhere else, such as a different git repo, is only slightly harder. cheers andrew
Re: [HACKERS] 9.5 release notes
On Sat, Aug 8, 2015 at 02:49:21PM -0400, Bruce Momjian wrote: What I _am_ saying is that you should use the same criteria I am using, and just disagree on the place for the line, rather than use a different criteria, which will lead to perpetual complaints. We can change the criteria, but that is a different discussion. We need to change that criteria then. Then you need to start a new thread on that topic to get community agreement that I can implement, and we can probably change 9.5 to match. You might also want to address the fact we don't list all bug fixes in the release notes either if the bug is a rare, minor event, and/or if the new error message is sufficient communication to users. One way of minimizing the downside of any new such entries is to have a Minor performance improvements or Internal performance improvements section in the release notes so people will realize they are not of the same import as other items --- same for possible new bug fix listings. I have updated src/tools/RELEASE_CHANGES to document the criteria I use to create the major release notes:

o new features and options
o major performance improvements
o bug fixes for serious or common bugs
o incompatibilities
o major source code changes

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Test code is worth the space
On Sat, Aug 8, 2015 at 8:24 PM, Peter Geoghegan p...@heroku.com wrote: I think that there needs to be a way of running an extended set of regression tests. I could definitely respect the desire for minimalism The larger expense in having extensive test suites is the cost to maintain them. With our current test framework, tests have to be fairly carefully written to produce output that isn't very fragile and sensitive to minor changes in the code. In practice this is what really drives the push towards minimal tests. If we tried testing every code path right now -- which I tried to do once for the TOAST code -- what you'll find is that what you're really testing is not that the code is correct but that it does exactly what it does today. At that point the test failures become entirely noise meaning something changed rather than signal meaning something's broken. A better test framework can go a long way towards fixing this. It would be much less of a problem if we had a unit test framework rather than only black-box integration tests. That would allow us to test that every function in tuplestore.c meets its contract, not just that some SQL query produces output that's correct and that we think happened to go through some code path we were interested in. There are many code paths that will be hard to arrange to reach from SQL and impossible to have any confidence will continue to be reached in the future when the behaviour is intentionally changed. That said, I don't think anyone would object to adding some regression tests that at least test basic correctness of the sorting code. That's a pretty embarrassing gap, and iirc the only reason for it is that it would make the regression tests sensitive to collation locale settings. -- greg
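Greg's contrast between contract-style unit tests and fragile exact-output tests can be illustrated with a hypothetical property check for a sort routine (not an actual PostgreSQL harness; the names are invented): instead of pinning one expected output, assert invariants that any correct implementation must satisfy.

```python
import random


def check_sort_contract(sort_fn, data):
    """Assert properties of the result rather than an exact expected
    output, so the test survives benign implementation changes."""
    result = sort_fn(list(data))
    # contract 1: output is ordered
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    # contract 2: output is a permutation of the input
    assert sorted(result) == sorted(data)
    return True
```

A correct sort passes for any input; an implementation that, say, drops an element trips the contract regardless of which exact input was fed in.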
Re: [HACKERS] WIP: SCRAM authentication
On 08/08/2015 10:23 AM, Heikki Linnakangas wrote: On 08/08/2015 04:27 PM, Robert Haas wrote: I don't see that there's any good reason to allow the same password to be stored in the catalog encrypted more than one way, Sure there is. If you want to be able to authenticate using different mechanism, you need the same password encrypted in different ways. SCRAM uses verifier that's derived from the password in one way, MD5 authentication needs an MD5 hash, and yet other protocols have other requirements. That's correct. However, one of the goals of implementing SCRAM authentication is to allow security-conscious users to get rid of those reusable md5 hashes, no? Obviously the backwards-compatibility issues are pretty major ... it'll be years before all drivers support SCRAM ... but we also want to provide a path forwards for secure installations in which no md5 hashes are stored. This says backwards-compatible GUC to me. Here's one idea on how to handle this:

1. We drop the parameter password_encryption.

2. We add the parameter password_storage, which takes a list:
   - plain : plain text
   - md5 : current md5 hashes
   - scram : new scram hashed passwords
   This defaults to 'md5, scram' if not specified. This list might be extended in the future.

3. All password types in the list are generated. This means having multiple columns in pg_shadow, or an array. An array would support the addition of future password storage methods.

4. CREATE ROLE / ALTER ROLE syntax is changed to accept a parameter to ENCRYPTED in order to support md5, scram, and future methods. If no parameter is supplied, ENCRYPTED will default to 'md5, scram'.

5. We add the superuser-only function pg_apply_password_policy(). This applies the policy expressed by password_storage, generating or erasing passwords for each user.

6. We add a new connection error for authentication __method__ not supported for user.

7. Two versions from now, we change the defaults.
I thought about the idea of determining password storage based on what's in pg_hba.conf, but that seems like way too much implied authorization to me, and liable to be a big foot-gun. --Josh Berkus -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
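A hypothetical sketch of point 3 in Josh's proposal (every method listed in password_storage gets its own stored verifier). The md5 entry mirrors PostgreSQL's historical "md5" + md5(password || username) format; the scram serialization here is invented purely for illustration:

```python
import hashlib
import os


def make_verifiers(username, password, methods=("md5", "scram")):
    """One stored entry per configured method, keyed method -> value."""
    out = {}
    for method in methods:
        if method == "plain":
            out[method] = password                     # stored as-is
        elif method == "md5":
            # historical PostgreSQL format: "md5" + md5(password || username)
            digest = hashlib.md5((password + username).encode()).hexdigest()
            out[method] = "md5" + digest
        elif method == "scram":
            # invented serialization: salt and salted password, hex-encoded
            salt = os.urandom(16)
            salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 4096)
            out[method] = "scram:%s:%s" % (salt.hex(), salted.hex())
    return out
```

A pg_apply_password_policy() along these lines would walk the catalog and regenerate (or erase) each user's entries to match the configured list.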
Re: [HACKERS] WIP: SCRAM authentication
On Sun, Aug 9, 2015 at 6:51 AM, Josh Berkus j...@agliodbs.com wrote: Obviously the backwards-compatibility issues are pretty major ... it'll be years before all drivers support SCRAM ... but we also want to provide a path forwards for secure installations in which no md5 hashes are stored. This says backwards-compatible GUC to me. Here's one idea on how to handle this: 1. we drop the parameter password_encryption 2. we add the parameter password_storage, which takes a list: - plain : plain text - md5 : current md5 hashes - scram : new scram hashed passwords This defaults to 'md5, scram' if not specified. This list might be extended in the future. Perhaps using a different GUC than password_encryption is safer... I am not that sure. Still, that's how I switched password_encryption to actually handle a list. The default is 'md5' with the first patch, and 'md5,scram' with the scram patch added, and it sets the list of password verifiers created when PASSWORD with ENCRYPTED/UNENCRYPTED is used. 3. All password types in the list are generated. This means having multiple columns in pg_shadow, or an array. An array would support the addition of future password storage methods. Yeah, the patch switches pg_shadow to an array like that, with method:value as elements, so with all the patches applied you actually get md5:md5blah,scram:stuff. 4. CREATE ROLE / ALTER ROLE syntax is changed to accept a parameter to ENCRYPTED in order to support md5, scram, and future methods. If no parameter is supplied, ENCRYPTED will default to 'md5, scram'. Like PASSWORD ENCRYPTED (md5,scram) or similar? If no method is passed, I think that we should default to password_storage instead. Also, I still think that something like PASSWORD VERIFIERS is needed; users may want to set the verifier used for each method after calculating it on the client side: we authorize that for md5 even now, and that's not something this spec authorizes. 5. we add the superuser-only function pg_apply_password_policy().
This applies the policy expressed by password_storage, generating or erasing passwords for each user. pg_upgrade could make use of that to control password aging, with an option to do the cleanup or not. Not sure what the default should be though. pg_apply_password_policy(roleid) would be useful as well, to do it on a per-role basis. 6. We add a new connection error for authentication __method__ not supported for user Hm? This would let any user trying to connect with a given method know whether that method is used or not. What's wrong with failing as we do now? In the case of PASSWORD NULL, for example, an attempt to connect fails all the time with incorrect password or similar. 7. Two versions from now, we change the defaults. Or three. We cannot expect users to change immediately, and it is wiser to let the dust settle on the new feature in case critical bugs show up after the first GA. Something that sounds more like a detail in this thread by reading other comments: I think that it is important to store password verifiers in a different catalog than pg_authid, for a few reasons:
- it's more user-friendly: a sysadmin could directly join the new catalog with pg_authid to get all the verifiers for a single user method
- at password lookup when authorizing a connection, there is no need to fetch all the password verifiers and parse the array with all verifiers
- it's more scalable if we have many verifier methods in the future, though we are not going to have hundreds of them
Though I am wondering about per-method valid-until times and per-method authorization options. Regards, -- Michael
Re: [HACKERS] WIP: SCRAM authentication
It'd be nice if the new auth mechanism supported multiple passwords in the same format as well (not just one per format). That way you could have two different passwords for a user that are active at the same time. This would simplify rolling database credentials, as it wouldn't have to be done all at once. You could add the new credentials, update your app servers one by one, then disable the old ones. A lot of systems that use API keys let you see the last time a particular set of keys was used. This helps answer the "Is this going to break something if I disable it?" question. Having a last-used-at timestamp for each auth mechanism (per user) would be useful. I'm not sure how updates should work when connecting to a read-only slave though. It would need some way of letting the master know that user X connected using credentials Y. Regards, -- Sehrope Sarkuni Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
Re: [HACKERS] WIP: Make timestamptz_out less slow.
On 29 July 2015 at 03:25, Andres Freund and...@anarazel.de wrote: On 2015-07-29 03:10:41 +1200, David Rowley wrote: timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000 timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000 timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000 timestamp_out_old() is master's version, timestamp_out_af() is yours, and timestamp_out() is mine. Times are in seconds to perform 100 million calls. That looks good. So it appears your version is a bit faster than mine, but we're both about 20 times faster than the current one. Also mine needs fixing up, as the fractional part is not padded the same as yours, but I doubt that'll affect the performance by much. Worthwhile to finish that bit and try ;) My view: It's probably not worth going quite as far as you've gone for a handful of nanoseconds per call, but perhaps something along the lines of mine can be fixed up. Yes, I agree that yours is probably going to be fast enough. I took a bit of weekend time to finish this one off. Patch attached. A quick test shows a pretty good performance increase: create table ts (ts timestamp not null); insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01 00:00:00', '1 sec'::interval); vacuum freeze ts; Master: david=# copy ts to 'l:/ts.sql'; COPY 31536001 Time: 20444.468 ms Patched: david=# copy ts to 'l:/ts.sql'; COPY 31536001 Time: 10947.097 ms There's probably a bit more to squeeze out of this. 1. EncodeDateTime() could return the length of the string to allow callers to use pnstrdup() instead of pstrdup(), which would save the strlen() call. 2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char and leave this up to the calling function. 3. Make something to replace the sprintf(str, "%.*s", MAXTZLEN, tzn); call. Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error-prone for the small gain we'd get from it. Also I was not too sure whether pg_int2str() is too similar to pg_ltoa().
Perhaps pg_ltoa() should just be modified to return the end of string? Thoughts? Regards David Rowley -- David Rowley http://www.2ndQuadrant.com/ http://www.2ndquadrant.com/ PostgreSQL Development, 24x7 Support, Training Services timestamp_out_speedup_2015-08-09.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
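To make the "return the end of the string" idea concrete, here is a hedged sketch of a zero-padded int-to-string routine that hands back the position just past the last digit, so callers can keep appending without a later strlen(). The name and signature are illustrative; the real pg_int2str_zeropad() in the attached patch may differ:

```c
/*
 * Illustrative sketch (not the patch itself): write an unsigned value
 * zero-padded to a fixed width and return a pointer just past the last
 * digit, NUL-terminated, so the caller can append more text directly.
 */
static char *
int2str_zeropad(char *str, unsigned int value, int width)
{
    /* write digits right-to-left into the fixed-width field */
    for (int i = width - 1; i >= 0; i--)
    {
        str[i] = (char) ('0' + value % 10);
        value /= 10;
    }
    str[width] = '\0';
    return str + width;         /* end of string, ready for appending */
}
```

Modifying pg_ltoa() to return the end of the string in the same way would let EncodeDateTime() build its output with pointer arithmetic alone, which is the source of most of the speedup being measured here.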
Re: [HACKERS] 9.5 release notes
On Sun, Aug 9, 2015 at 01:24:33AM +1200, David Rowley wrote: On 7 August 2015 at 14:24, Bruce Momjian br...@momjian.us wrote: On Tue, Jun 30, 2015 at 09:00:44PM +0200, Andres Freund wrote: * 2014-12-08 [519b075] Simon ..: Use GetSystemTimeAsFileTime directly in win32 2014-12-08 [8001fe6] Simon ..: Windows: use GetSystemTimePreciseAsFileTime if .. Timer resolution isn't an unimportant thing for people using explain? This all seemed very internals-only, e.g.: On most Windows systems this change will actually have no significant effect on timestamp resolution, as the system timer tick is typically between 1ms and 15ms depending on what timer resolution currently running applications have requested. You can check this with clockres.exe from sysinternals. Despite the platform limitation this change still permits capture of finer timestamps where the system is capable of producing them, and it gets rid of an unnecessary syscall. Was I wrong? This does have a user-visible change. Timestamps are now likely to have 6 digits after the decimal point if they're on a version of Windows which supports GetSystemTimePreciseAsFileTime().

Master:
postgres=# select now();
              now
-------------------------------
 2015-08-09 01:14:01.959645+12
(1 row)

9.4.4:
postgres=# select now();
            now
----------------------------
 2015-08-09 01:15:09.783+12
(1 row)

Yes, this was already in the release notes: Allow higher-precision timestamp resolution on <systemitem class="osname">Windows 8</systemitem> or <systemitem class="osname">Windows Server 2012</systemitem> and later Windows systems (Craig Ringer) I am not sure why people were saying it was missing. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Broken src/test/regress/mb/* tests
Commit: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9043ef390f4f0b4586cfe59cbd22314b9c3e2957 broke the src/test/regress/mb/* tests because the message: WARNING: hash indexes are not WAL-logged and their use is discouraged is no longer emitted in the test. I have committed and pushed the fix to the master and 9.5 stable branches (the tests in the earlier stable branches do not emit the message). Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: SCRAM authentication
On 08/08/2015 03:21 PM, Michael Paquier wrote: On Sun, Aug 9, 2015 at 6:51 AM, Josh Berkus j...@agliodbs.com wrote: 1. we drop the parameter password_encryption 2. we add the parameter password_storage, which takes a list: - plain : plain text - md5 : current md5 hashes - scram : new scram hashed passwords This defaults to 'md5, scram' if not specified. This list might be extended in the future. Perhaps using a different GUC than password_encryption is safer... I am not that sure. Still, that's how I switched password_encryption to actually handle a list. Default is 'md5' with the first patch, and 'md5,scram' with the scram patch added, and it sets the list of password verifiers created when PASSWORD with ENCRYPTED/UNENCRYPTED is used. Well, generally I feel like if we're going to change the *type* of a GUC parameter, we ought to change the *name*. It's far easier for users to figure out that the contents of a parameter need to change if the name is also changed. In other words, I think "invalid parameter 'password_encryption'" is an easier-to-understand error message than "invalid password_encryption type 'on'". Besides which, password_encryption was always a misnomer. Unless you're going to still accept on/off in some kind of weird backwards-compatibility mode? If so, how does that work? Like PASSWORD ENCRYPTED (md5,scram) or similar? If no method is passed, I think that we should default to password_storage instead. Makes sense. Also, I still think that something like PASSWORD VERIFIERS is needed: users may want to set the verifier for each method after calculating it client-side. We authorize that for md5 even now, and that's not something this spec covers. I don't follow this. Mind you, I'm not sure that I need to. 5. we add the superuser-only function pg_apply_password_policy(). This applies the policy expressed by password_storage, generating or erasing passwords for each user.
pg_upgrade could make use of that to control password aging, with an option to do the cleanup or not. Not sure what the default should be though. pg_apply_password_policy(roleid) would be useful as well to do it on a per-role basis. No objections to an optional roleid parameter, if you think people will use it. 6. We add a new connection error for authentication __method__ not supported for user Hm? This would let any user trying to connect know whether a given method is used or not. What's wrong with failing as we do now? In the case of PASSWORD NULL for example, a connection attempt fails all the time with incorrect password or similar. So, the DBA sets password_storage = 'scram', but doesn't take the md5 lines out of pg_hba.conf. The app dev tries to connect using a driver which only supports md5. What error should they get? A user/DBA who is getting "invalid password" is going to spend a long time debugging it. Also, it would be very useful to have a distinctive error in the log, so that DBAs could see who is *trying* to connect with the wrong verifier. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assert in pg_stat_statements
On 2015/08/08 22:32, Robert Haas wrote: On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu sn...@uptime.jp wrote: I just found that pg_stat_statements causes an assert failure when queryId is set by another module, which is loaded prior to pg_stat_statements in the shared_preload_libraries parameter. Theoretically, queryId in the shared memory could be set by another module by design. So, IMHO, pg_stat_statements should not trigger the assert even if queryId already has a non-zero value --- my module has this problem now. Is my understanding correct? Any comments? Seems like an ERROR would be more appropriate than an Assert. Well, it's not that simple, and I'm afraid that it may not fix the issue. Here, let's assume that we have two modules (foo and pg_stat_statements) which calculate, use and store Query->queryId independently. What we may see when the two are enabled is the following. (1) Enable two modules in shared_preload_libraries: shared_preload_libraries = 'foo,pg_stat_statements' (2) Once a query comes in, a callback function in module foo associated with post_parse_analyze_hook calculates and stores queryId in Query->queryId. (3) After that, a callback function (pgss_post_parse_analyze) in pg_stat_statements associated with post_parse_analyze_hook detects that Query->queryId has a non-zero value, and trips the assertion. As a result, we cannot have two or more modules that use queryId at the same time. But I think it should be possible, because one of the advantages of PostgreSQL is its extensibility. So, I think the problem here is the fact that pg_stat_statements treats a non-zero queryId as an error, even if it is exactly the same value the other module computed. Regards, -- NAGAYASU Satoshi sn...@uptime.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
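As a sketch of the cooperative behavior being argued for here, a post-parse-analyze hook can simply leave alone a queryId that an earlier module already computed. The types and names below are simplified stand-ins for illustration, not the actual PostgreSQL hook API:

```c
#include <stdint.h>

/* simplified stand-in for the real Query node */
typedef struct Query
{
    uint32_t    queryId;
} Query;

static uint32_t
compute_query_id(const Query *query)
{
    /* stand-in for the real query fingerprinting logic */
    (void) query;
    return 42;
}

/*
 * Cooperative hook: instead of Assert(query->queryId == 0), skip the
 * computation when another module loaded earlier in
 * shared_preload_libraries has already filled in a queryId.
 */
static void
post_parse_analyze_hook(Query *query)
{
    if (query->queryId != 0)
        return;                 /* keep the other module's value */

    query->queryId = compute_query_id(query);
}
```

One caveat with this approach, as the thread notes, is that the modules must agree on the fingerprinting scheme for the shared queryId to be meaningful to both of them.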