Re: [HACKERS] WAL format and API changes (9.5)
On Sun, Jun 15, 2014 at 6:11 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 06/14/2014 09:43 PM, Alvaro Herrera wrote: Heikki Linnakangas wrote: Here's a new version, rebased against master. No other changes. This is missing xlogrecord.h ... Oops, here you are. Looking at this patch, it does not compile when assertions are enabled because of this assertion in heapam.c: Assert(recdata == data + datalen); It seems like a thinko as removing it makes the code compile. Here is a review of this patch: - Compilation without warnings, regression tests pass - That's not a small patch, but the changes mainly done are xlog record insertion in accordance to the new API format. $ git diff master --stat | tail -n1 52 files changed, 3601 insertions(+), 4371 deletions(-) Some comments about the code: 1) In src/backend/access/transam/README: 's/no further action is no required/no further action is required/g' 2) Description of XLogBeginInsert, XLogHasBlockRef, XLogGetBlockRef and XLogGetBlockRefIds are missing in src/backend/access/transam/README. 3) There are a couple of code blocks marked as FIXME and BROKEN. There are as well some NOT_USED blocks. 4) Current patch crashes when running make check in contrib/test_decoding. Looking closer, wal_level = logical is broken: =# show wal_level; wal_level --- logical (1 row) =# create database foo; The connection to the server was lost. Attempting reset: Failed. Looking even closer, log_heap_new_cid is broken: (lldb) bt * thread #1: tid = 0x, 0x7fff8a3c2866 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGSTOP * frame #0: 0x7fff8a3c2866 libsystem_kernel.dylib`__pthread_kill + 10 frame #1: 0x7fff8d37235c libsystem_pthread.dylib`pthread_kill + 92 frame #2: 0x7fff9369db1a libsystem_c.dylib`abort + 125 frame #3: 0x000104428819 postgres`ExceptionalCondition(conditionName=0x0001044a711c, errorType=0x00010448b19f, fileName=0x0001044a2cfd, lineNumber=6879) + 137 at assert.c:54 frame #4: 0x000103f08dbe postgres`log_heap_new_cid(relation=0x7fae4482afb8, tup=0x7fae438062e8) + 126 at heapam.c:6879 frame #5: 0x000103f080cf postgres`heap_insert(relation=0x7fae4482afb8, tup=0x7fae438062e8, cid=0, options=0, bistate=0x) + 383 at heapam.c:2095 frame #6: 0x000103f09b6a postgres`simple_heap_insert(relation=0x7fae4482afb8, tup=0x7fae438062e8) + 74 at heapam.c:2533 frame #7: 0x00010406aacf postgres`createdb(stmt=0x7fae44843690) + 6335 at dbcommands.c:511 frame #8: 0x00010429def8 postgres`standard_ProcessUtility(parsetree=0x7fae44843690, queryString=0x7fae44842c38, context=PROCESS_UTILITY_TOPLEVEL, params=0x, dest=0x7fae44843a18, completionTag=0x7fff5bd4ee30) + 1720 at utility.c:56 5) Yes,there are some WAL records that have only data related to buffers, XLOG_GIN_VACUUM_DATA_LEAF_PAGE being one, with nothing registered with buffer_id = -d, but using a dummy entry seems counter-productive: + rdata = rdata_head; + if (rdata == NULL) + { + /* +* the rest of this function assumes that there is at least some +* rdata entries, so fake an empty rdata entry +*/ + dummy_rdata.data = NULL; + dummy_rdata.len = 0; + dummy_rdata.next = NULL; + rdata = dummy_rdata; + } Could this be removed and replaced by a couple of checks on rdata being NULL in some cases? XLogInsert should be updated in consequence. 6) XLogRegisterData creates a XLogRecData entry and appends it in one of the static XLogRecData entries if buffer_id = 0 or to rdata_head/rdata_tail if buffer_id = -1 (for permanent data related to the WAL record). XLogRegisterXLogRecData does something similar, but with an entry of XLogRecData already built. What do you think about removing XLogRegisterXLogRecData and keeping only XLogRegisterData. This makes the interface simpler, and XLogRecData could be kept private in xlog.c. This would need more work especially on gin side where I am seeing for example constructLeafRecompressWALData directly building a XLogRecData entry. By doing so, we could as remove the while loop at the bottom of XLogRegisterXLogRecData as we would insert in the tail only the latest record created, aka replacing that: while ((*tail)-next) *tail = (*tail)-next; By simply that: *tail = (*tail)-next; 7) New structures at the top of xlog.c should have more comments about their role, particularly rdata_head and rdata_tail that store main WAL data (id of -1) 8) What do you think about adding in the README a description about how to retrieve the block list modified by a WAL record, for a flow similar to that: record = XLogReadRecord(...); nblocks = XLogGetBlockRefIds(record, blockids); for (i = 0; i nblocks; i++) { bkpb = XLogGetBlockRef(record, id, NULL); // stuff }
Re: [HACKERS] review: Built-in binning functions
Hello I recheck this patch 1. applied cleanly and compilation was without warnings and errors 2. all tests was passed ok 3. documentation was rebuild without issues 4. I don't see any issue in code quality - it is well commented, well formatted, with regress tests It is ready for commit Regards Pavel 2014-06-26 22:43 GMT+02:00 Petr Jelinek p...@2ndquadrant.com: Here is v2, with fixed documentation and numeric version of the implementation. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER SYSTEM RESET?
On 06/27/2014 06:22 AM, Amit Kapila wrote: On Thu, Jun 26, 2014 at 8:17 PM, Vik Fearing vik.fear...@dalibo.com mailto:vik.fear...@dalibo.com wrote: On 06/26/2014 05:07 AM, Amit Kapila wrote: I think it will make sense if we support RESET ALL as well similar to Alter Database .. RESET ALL syntax. Do you see any reason why we shouldn't support RESET ALL syntax for Alter System? Yes, that makes sense. I've added that in the attached version 2 of the patch. I think the idea used in patch to implement RESET ALL is sane. In passing by, I noticed that modified lines in .sgml have crossed 80 char boundary which is generally preferred to be maintained.. Refer below modification. !values to the filenamepostgresql.auto.conf/filename file. Setting the parameter to !literalDEFAULT/literal, or using the commandRESET/command variant, removes the configuration entry from I did that on purpose so that it's easy for a reviewer/committer to see what changed. This third patch reformats the documentation in the way I expected it to be committed. -- Vik *** a/doc/src/sgml/ref/alter_system.sgml --- b/doc/src/sgml/ref/alter_system.sgml *** *** 22,27 PostgreSQL documentation --- 22,30 refsynopsisdiv synopsis ALTER SYSTEM SET replaceable class=PARAMETERconfiguration_parameter/replaceable { TO | = } { replaceable class=PARAMETERvalue/replaceable | 'replaceable class=PARAMETERvalue/replaceable' | DEFAULT } + + ALTER SYSTEM RESET replaceable class=PARAMETERconfiguration_parameter/replaceable + ALTER SYSTEM RESET ALL /synopsis /refsynopsisdiv *** *** 30,37 ALTER SYSTEM SET replaceable class=PARAMETERconfiguration_parameter/replace para commandALTER SYSTEM/command writes the configuration parameter !values to the filenamepostgresql.auto.conf/filename file. With !literalDEFAULT/literal, it removes a configuration entry from filenamepostgresql.auto.conf/filename file. The values will be effective after reload of server configuration (SIGHUP) or in next server start based on the type of configuration parameter modified. --- 33,41 para commandALTER SYSTEM/command writes the configuration parameter !values to the filenamepostgresql.auto.conf/filename file. !Setting the parameter to literalDEFAULT/literal, or using the !commandRESET/command variant, removes the configuration entry from filenamepostgresql.auto.conf/filename file. The values will be effective after reload of server configuration (SIGHUP) or in next server start based on the type of configuration parameter modified. *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 401,407 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type istmt insert_rest ! %type vsetstmt generic_set set_rest set_rest_more SetResetClause FunctionSetResetClause %type node TableElement TypedTableElement ConstraintElem TableFuncElement %type node columnDef columnOptions --- 401,407 %type istmt insert_rest ! %type vsetstmt generic_set set_rest set_rest_more generic_reset reset_rest SetResetClause FunctionSetResetClause %type node TableElement TypedTableElement ConstraintElem TableFuncElement %type node columnDef columnOptions *** *** 1567,1605 NonReservedWord_or_Sconst: ; VariableResetStmt: ! RESET var_name { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = $2; ! $$ = (Node *) n; } ! | RESET TIME ZONE { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = timezone; ! $$ = (Node *) n; } ! | RESET TRANSACTION ISOLATION LEVEL { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = transaction_isolation; ! $$ = (Node *) n; } ! | RESET SESSION AUTHORIZATION { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = session_authorization; ! $$ = (Node *) n; } ! | RESET ALL { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET_ALL; ! $$ = (Node *) n; } ; --- 1567,1613 ; VariableResetStmt: ! RESET reset_rest { $$ = (Node *) $2; } ! ; ! ! reset_rest: ! generic_reset { $$ = $1; } ! | TIME ZONE { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = timezone; ! $$ = n; } ! | TRANSACTION ISOLATION LEVEL { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = transaction_isolation; ! $$ = n; } ! | SESSION AUTHORIZATION { VariableSetStmt *n = makeNode(VariableSetStmt); n-kind = VAR_RESET; ! n-name = session_authorization; !
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
Hello thank you Peter, so now only setting for MS Windows is missing? Regards Pavel 2014-06-26 21:57 GMT+02:00 Petr Jelinek p...@2ndquadrant.com: Hello, I went through the patch, seems mostly fine, I adjusted some wording, removed the default .pgpass file info since it's not accurate, and replaced couple of phrases with (hopefully) more informative ones. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER SYSTEM RESET?
On 06/27/2014 08:49 AM, Vik Fearing wrote: This third patch reformats the documentation in the way I expected it to be committed. Amit, I added this to the next commitfest with your name as reviewer. https://commitfest.postgresql.org/action/patch_view?id=1495 Please update the status as you see fit. -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS Design
On 26 June 2014 18:04, Robert Haas robertmh...@gmail.com wrote: ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals; GRANT SELECT ON TABLE t1 TO role1 USING p1; As I see it, the downside of this is that it gets a lot more complex. We have to revise the ACL representation, which is already pretty darn complicated, to keep track not only of the grantee, grantor, and permissions, but also the policies qualifying those permissions. The changes to GRANT will need to propagate into GRANT ON ALL TABLES IN SCHEMA and AFTER DEFAULT PRIVILEGES. No, it can be done without any changes to the permissions code by storing the ACLs on the catalog entries where the RLS quals are held, rather than modifying the ACL items on the table. I.e., instead of thinking of USING polname as a modifier to the grant, think of it as as an additional qualifier on the thing being granted. That means the syntax I proposed earlier is wrong/misleading. Instead of GRANT SELECT ON TABLE tbl TO role USING polname; it should really be GRANT SELECT USING polname ON TABLE tbl TO role; There is administrative complexity as well, because if you want to policy-protect an additional table, you've got to add the table to the policy and then update all the grants as well. I think what will happen in practice is that people will grant to PUBLIC all rights on the policy, and then do all the access control through the GRANT statements. If you assume that most users will only have one policy through which they can access any given table, then there is no more administrative overhead than we have right now. Right now you have to grant each user permissions on each table you define. The only difference is that now you throw in a USING polname. We could also simplify administration by supporting GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role; The important distinction is that this is only granting permissions on tables that exist now, not on tables that might be created later. An interesting question we haven't much considered is: who can set up policies and add then to users? Maybe we should flip this around, and instead of adding users to policies, we should exempt users from policies. CREATE POLICY p1; And then, if they own p1 and t1, they can do: ALTER TABLE t1 SET POLICY p1 TO t1_p1_quals; (or maybe we should associate it to the policy instead of the table: ALTER POLICY p1 SET TABLE t1 TO t1_p1_quals) And then the policy applies to everyone who doesn't have the grantable EXEMPT privilege on the policy. The policy owner and superuser have that privilege by default and it can be handed out to others like this: GRANT EXEMPT ON POLICY p1 TO snowden; Then users who have row_level_security=on will bypass RLS if possible, and otherwise it will be applied. Users who have row_level_security=off will bypass RLS if possible, and otherwise error. And users who have row_level_security=force will apply RLS even if they are entitled to bypass it. That's interesting. I need to think some more about what that means. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans and non-MVCC snapshots
On 2014-06-26 22:47:47 -0600, Ryan Johnson wrote: Hi, As part of a research project, I'm trying to change Read Committed isolation to use HeapTupleSatisfiesNow rather than acquiring a new snapshot at every command [1]. Things appear to have gone reasonably well so far, except certain queries fail with ERROR: non-MVCC snapshots are not supported in index-only scans. You're aware that unless you employ additional locking you can simply miss individual rows or see them several times because of concurrent updates? The reason it has worked ( 9.4) for system catalogs is that updates of rows were only performed while objects were locked access exclusively - that's the reason why some places in the code use inplace updates btw... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_resetxlog to clear backup start/end locations.
On Fri, Jun 27, 2014 at 12:29 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, I have finished the patches for all of 9.x. I dont' touch what '-n' option shows and rewrite documents for the option a bit. And '-n' won't show the changes of backup location. -8.4: does not have backup locations in ControlFileData. 9.0-9.1: resetxlog_backuploc_9_0-9.1.patch: Add clearance of backupStartPoint. 9.2: resetxlog_backuploc_9_2.patch: Add clearance of backupStart/EndPoint and backupEndRequired 9.3: resetxlog_backuploc_9_3.patch: Ditto. (format of XLogRecPtr changed) 9.4-master: resetxlog_backuploc_9_4-master.patch: Add clearance of backupPoints. help message and html doc changed. Thanks for the patch! But when I read the source code of pg_resetxlog, I found that it has already reset the backup locations. Please see RewriteControlFile() which does that. So I wonder if we need nothing at least in HEAD for the purpose which you'd like to achieve. Thought? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_receivexlog add synchronous mode
On Thu, Jun 26, 2014 at 7:01 PM, furu...@pm.nttdata.co.jp wrote: The patch looks somewhat complicated and bugs can be easily introduced because it tries to not only add new feature but also reorganize the main loop in HandleCopyStream at the same time. To keep the patch simple, I'm thinking to firstly apply the attached patch which just refactors the main loop. Then we can apply the main patch, i.e., add new feature. Thought? Thank you for the refactoring patch. I did a review of the patch. As a result, I found the calculation of sleeptime when the --status-intarvall is set to 1 was incorrect. --status-intarvall 1 -- sleeptime 1. !? --status-intarvall 2 -- sleeptime 1. OK --status-intarvall 3 -- sleeptime 2. OK Thanks for the review! +if (secs = 0) +secs = 1;/* Always sleep at least 1 sec */ + +sleeptime = secs * 1000 + usecs / 1000; The above is the code which caused that problem. 'usecs' should have been reset to zero when 'secs' are rounded up to 1 second. But not. Attached is the updated version of the patch. Regards, -- Fujii Masao diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c index d76e605..4aa35da 100644 --- a/src/bin/pg_basebackup/receivelog.c +++ b/src/bin/pg_basebackup/receivelog.c @@ -35,6 +35,8 @@ static PGresult *HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, char *basedir, stream_stop_callback stream_stop, int standby_message_timeout, char *partial_suffix, XLogRecPtr *stoppos); +static int CopyStreamPoll(PGconn *conn, long timeout_ms); +static int CopyStreamReceive(PGconn *conn, long timeout, char **buffer); static bool ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos, uint32 *timeline); @@ -744,12 +746,7 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, int bytes_written; int64 now; int hdr_len; - - if (copybuf != NULL) - { - PQfreemem(copybuf); - copybuf = NULL; - } + long sleeptime; /* * Check if we should continue streaming, or abort at this point. @@ -784,67 +781,38 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, last_status = now; } - r = PQgetCopyData(conn, copybuf, 1); - if (r == 0) + /* + * Compute how long send/receive loops should sleep + */ + if (standby_message_timeout still_sending) { - /* - * No data available. Wait for some to appear, but not longer than - * the specified timeout, so that we can ping the server. - */ - fd_set input_mask; - struct timeval timeout; - struct timeval *timeoutptr; - - FD_ZERO(input_mask); - FD_SET(PQsocket(conn), input_mask); - if (standby_message_timeout still_sending) + int64 targettime; + long secs; + int usecs; + + targettime = last_status + (standby_message_timeout - 1) * ((int64) 1000); + feTimestampDifference(now, + targettime, + secs, + usecs); + /* Always sleep at least 1 sec */ + if (secs = 0) { -int64 targettime; -long secs; -int usecs; - -targettime = last_status + (standby_message_timeout - 1) * ((int64) 1000); -feTimestampDifference(now, - targettime, - secs, - usecs); -if (secs = 0) - timeout.tv_sec = 1; /* Always sleep at least 1 sec */ -else - timeout.tv_sec = secs; -timeout.tv_usec = usecs; -timeoutptr = timeout; +secs = 1; +usecs = 0; } - else -timeoutptr = NULL; - r = select(PQsocket(conn) + 1, input_mask, NULL, NULL, timeoutptr); - if (r == 0 || (r 0 errno == EINTR)) - { -/* - * Got a timeout or signal. Continue the loop and either - * deliver a status packet to the server or just go back into - * blocking. - */ -continue; - } - else if (r 0) - { -fprintf(stderr, _(%s: select() failed: %s\n), - progname, strerror(errno)); -goto error; - } - /* Else there is actually data on the socket */ - if (PQconsumeInput(conn) == 0) - { -fprintf(stderr, - _(%s: could not receive data from WAL stream: %s), - progname, PQerrorMessage(conn)); -goto error; - } - continue; + sleeptime = secs * 1000 + usecs / 1000; } + else + sleeptime = -1; + + r = CopyStreamReceive(conn, sleeptime, copybuf); + if (r == 0) + continue; if (r == -1) + goto error; + if (r == -2) { PGresult *res = PQgetResult(conn); @@ -877,15 +845,10 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, } if (copybuf != NULL) PQfreemem(copybuf); + copybuf = NULL; *stoppos = blockpos; return res; } - if (r == -2) - { - fprintf(stderr, _(%s: could not read COPY data: %s), - progname, PQerrorMessage(conn)); - goto error; - } /* Check the message type. */ if (copybuf[0] == 'k') @@ -1056,3 +1019,115 @@ error: PQfreemem(copybuf); return NULL; } + +/* + * Wait
Re: [HACKERS] pg_receivexlog add synchronous mode
On 2014-06-05 19:13:34 +0900, furu...@pm.nttdata.co.jp wrote: -Original Message- Hi, On 2014-06-05 17:09:44 +0900, furu...@pm.nttdata.co.jp wrote: Synchronous(synchronous_commit = on) mode offers the ability to confirm WAL have been streamed in the same way as synchronous replication. If an output is used as a different disk from the directory where the transaction log should be stored. Prevent the loss of data due to disk failure. the additional parameter(-m) and replicationslot specify, that its synchronous mode. All received WAL write after, flush is executed and reply flush position. What's the usecase for this? I can see some benefit in easier testing of syncrep, but that's basically it? When used with syncrep, standby server crashes, multiplexing of WAL can be collateral. Data loss can be to nearly zero. I don't see how this gets pg_receivexlog much closer to a solution for multiplexing WAL. Since there's no support for streaming data, removing old WAL and such it seems to me you'd need to have something entirely different anyway? I'm not really averse to adding this feature (on the basis of easier syncrep testing alone), but I don't like the arguments for it so far... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On Wed, Jun 25, 2014 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: Why is IIT timeout turned on only when send_ready_for_query is true? I was thinking it should be turned on every time a message is received. Imagine the case where the session is in idle-in-transaction state and a client gets stuck after sending Parse message and before sending Bind message. This case would also cause long transaction problem and should be resolved by IIT timeout. But IIT timeout that this patch adds cannot handle this case because it's enabled only when send_ready_for_query is true. Thought? I think you just moved the goalposts to the next county. The point of this feature, AFAICS, is to detect clients that are failing to issue another query or close their transaction as a result of bad client logic. It's not to protect against network glitches. Hmm, so there's no reason a client, after sending one protocol message, might not pause before sending the next protocol message? That seems like a surprising argument. Someone couldn't Parse and then wait before sending Bind and Execute, or Parse and Bind and then wait before sending Execute? Moreover, there would be no way to implement a timeout like that without adding a gettimeofday() call after every packet receipt, which is overhead we do not need and cannot afford. I don't think this feature should add *any* gettimeofday's beyond the ones that are already there. That's an especially good point if we think that this feature will be enabled commonly or by default - but as Fujii Masao says, it might be tricky to avoid. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-24 10:22:08 -0700, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-24 13:03:37 -0400, Noah Misch wrote: If a change has the potential to make some architectures give wrong answers only at odd times, that's a different kind of problem. For that reason, actively breaking Alpha is a good thing. Not sure what you mean with the 'actively breaking Alpha' statement? That we should drop Alpha? +1. Especially with no buildfarm critter. Would anyone here care to bet even the price of a burger that Alpha isn't broken already? Here's a patch removing alpha/true64/osf/1 support. I think I got most relevant references, not sure if I missed something. Since there seems to be (unanimous?) support for dropping alpha and some patches coming up that need to deal with platform dependent stuff it seems sensible to do this first. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From aba43d095344337fb76fb15b53a6ac4c90900d80 Mon Sep 17 00:00:00 2001 From: Andres Freund and...@anarazel.de Date: Fri, 27 Jun 2014 15:43:46 +0200 Subject: [PATCH] Remove Alpha and Tru64 support. Support for running postgres on Alpha hasn't been tested for a long while. Due to Alpha's lax memory order guarantees it's a hard to develop for platform and thought to be unlikely to currently work correctly. As Alpha is the only supported architecture for Tru64 drop support for it. Tru64's support has ended 2012 and it was in maintenance-only mode for much longer. Also remove stray references to __ksr__ and ultrix defines. --- configure | 1 - configure.in | 1 - doc/src/sgml/dfunc.sgml| 21 doc/src/sgml/installation.sgml | 7 ++-- src/Makefile.shlib | 4 --- src/backend/main/main.c| 29 - src/backend/port/dynloader/osf.c | 7 src/backend/port/dynloader/osf.h | 47 --- src/backend/utils/misc/ps_status.c | 2 +- src/include/port/osf.h | 4 --- src/include/storage/barrier.h | 9 -- src/include/storage/s_lock.h | 66 -- src/makefiles/Makefile.osf | 10 -- src/template/osf | 6 14 files changed, 4 insertions(+), 210 deletions(-) delete mode 100644 src/backend/port/dynloader/osf.c delete mode 100644 src/backend/port/dynloader/osf.h delete mode 100644 src/include/port/osf.h delete mode 100644 src/makefiles/Makefile.osf delete mode 100644 src/template/osf diff --git a/configure b/configure index da89c69..c9388f2 100755 --- a/configure +++ b/configure @@ -2855,7 +2855,6 @@ dragonfly*) template=netbsd ;; mingw*) template=win32 ;; netbsd*) template=netbsd ;; openbsd*) template=openbsd ;; - osf*) template=osf ;; sco*) template=sco ;; solaris*) template=solaris ;; sysv5*) template=unixware ;; diff --git a/configure.in b/configure.in index 7dfbe9f..f8bf466 100644 --- a/configure.in +++ b/configure.in @@ -69,7 +69,6 @@ dragonfly*) template=netbsd ;; mingw*) template=win32 ;; netbsd*) template=netbsd ;; openbsd*) template=openbsd ;; - osf*) template=osf ;; sco*) template=sco ;; solaris*) template=solaris ;; sysv5*) template=unixware ;; diff --git a/doc/src/sgml/dfunc.sgml b/doc/src/sgml/dfunc.sgml index 3d90a36..b78537c 100644 --- a/doc/src/sgml/dfunc.sgml +++ b/doc/src/sgml/dfunc.sgml @@ -207,27 +207,6 @@ gcc -G -o foo.so foo.o varlistentry term - systemitem class=osnameTru64 UNIX/ - indextermprimaryTru64 UNIX/secondaryshared library// - indextermprimaryDigital UNIX/seeTru64 UNIX// -/term -listitem - para - acronymPIC/acronym is the default, so the compilation command - is the usual one. commandld/command with special options is - used to do the linking. -programlisting -cc -c foo.c -ld -shared -expect_unresolved '*' -o foo.so foo.o -/programlisting - The same procedure is used with GCC instead of the system - compiler; no special options are required. - /para -/listitem - /varlistentry - - varlistentry -term systemitem class=osnameUnixWare/ indextermprimaryUnixWare/secondaryshared library// /term diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 7353c61..2e81bbb 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -1669,8 +1669,7 @@ PostgreSQL, contrib and HTML documentation successfully made. Ready to install. systemitem class=osnameHP-UX/, systemitem class=osnameLinux/, systemitem class=osnameNetBSD/, systemitem -class=osnameOpenBSD/, systemitem class=osnameTru64 -UNIX/ (formerly systemitem class=osnameDigital UNIX/), and +class=osnameOpenBSD/, and systemitem class=osnameSolaris/. /para @@ -1981,7
Re: [HACKERS] change alter user to be a true alias for alter role
On 06/19/2014 07:18 PM, Tom Lane wrote: Jov am...@amutu.com writes: the doc say: ALTER USER is now an alias for ALTER ROLEhttp://www.postgresql.org/docs/devel/static/sql-alterrole.html but alter user lack the following format: ... If we're going to have a policy that these commands be exactly equivalent, it seems like this patch is just sticking a finger into the dike. It does nothing to prevent anyone from making the same mistake again in future. What about collapsing both sets of productions into one, along the lines of role_or_user: ROLE | USER; AlterRoleSetStmt: ALTER role_or_user RoleId opt_in_database SetResetClause (and similarly to the latter for every existing ALTER ROLE variant). Because MAPPING is an unreserved keyword, I think that this approach might force us to also change ALTER USER MAPPING to ALTER role_or_user MAPPING, which is not contemplated by the SQL standard. But hey, it would satisfy the principle of least surprise no? Anyway we don't have to document that that would work. After a week of silence from Jov, I decided to do this myself since it didn't seem very hard. Many frustrating hours of trying to understand why I'm getting shift/reduce conflicts by the hundreds later, I've decided to give up for now. -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans and non-MVCC snapshots
Ryan Johnson wrote: On 26/06/2014 11:04 PM, Alvaro Herrera wrote: Ryan Johnson wrote: As part of a research project, I'm trying to change Read Committed isolation to use HeapTupleSatisfiesNow rather than acquiring a new snapshot at every command [1]. Are you aware of this? commit 813fb0315587d32e3b77af1051a0ef517d187763 Author: Robert Haas rh...@postgresql.org Date: Thu Aug 1 10:46:19 2013 -0400 Remove SnapshotNow and HeapTupleSatisfiesNow. That would be wonderful news... if snapshots weren't so darned expensive to create. I take it you aren't aware of this other effort, either: http://archives.postgresql.org/message-id/539ad153.9000...@vmware.com -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL replay bugs
On Fri, Jun 27, 2014 at 2:57 PM, Michael Paquier michael.paqu...@gmail.com wrote: Here are rebased patches, their was a conflict with a recent commit in contrib/pg_upgrade. I am resending patch 2 as it contained a rebase conflict not correctly resolved (Thanks Alvaro). Regards, -- Michael From d4f0289ffcece54a78e51e8b707c41e994d549ee Mon Sep 17 00:00:00 2001 From: Michael Paquier mich...@otacoo.com Date: Fri, 27 Jun 2014 23:35:29 +0900 Subject: [PATCH 2/3] Extract generic bash initialization process from pg_upgrade Such initialization is useful as well for some other utilities and makes test settings consistent. --- contrib/pg_upgrade/test.sh | 47 src/test/shell/init_env.sh | 60 ++ 2 files changed, 64 insertions(+), 43 deletions(-) create mode 100644 src/test/shell/init_env.sh diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh index 7bbd2c7..2e1c61a 100644 --- a/contrib/pg_upgrade/test.sh +++ b/contrib/pg_upgrade/test.sh @@ -9,24 +9,14 @@ # Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group # Portions Copyright (c) 1994, Regents of the University of California -set -e - -: ${MAKE=make} - -# Guard against parallel make issues (see comments in pg_regress.c) -unset MAKEFLAGS -unset MAKELEVEL - -# Establish how the server will listen for connections -testhost=`uname -s` +# Initialize test +. ../../src/test/shell/init_env.sh case $testhost in MINGW*) - LISTEN_ADDRESSES=localhost PGHOST=; unset PGHOST ;; *) - LISTEN_ADDRESSES= # Select a socket directory. The algorithm is from the configure # script; the outcome mimics pg_regress.c:make_temp_sockdir(). PGHOST=$PG_REGRESS_SOCK_DIR @@ -102,37 +92,8 @@ logdir=$PWD/log rm -rf $logdir mkdir $logdir -# Clear out any environment vars that might cause libpq to connect to -# the wrong postmaster (cf pg_regress.c) -# -# Some shells, such as NetBSD's, return non-zero from unset if the variable -# is already unset. Since we are operating under 'set -e', this causes the -# script to fail. To guard against this, set them all to an empty string first. -PGDATABASE=;unset PGDATABASE -PGUSER=;unset PGUSER -PGSERVICE=; unset PGSERVICE -PGSSLMODE=; unset PGSSLMODE -PGREQUIRESSL=; unset PGREQUIRESSL -PGCONNECT_TIMEOUT=; unset PGCONNECT_TIMEOUT -PGHOSTADDR=;unset PGHOSTADDR - -# Select a non-conflicting port number, similarly to pg_regress.c -PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' $newsrc/src/include/pg_config.h | awk '{print $3}'` -PGPORT=`expr $PG_VERSION_NUM % 16384 + 49152` -export PGPORT - -i=0 -while psql -X postgres /dev/null 2/dev/null -do - i=`expr $i + 1` - if [ $i -eq 16 ] - then - echo port $PGPORT apparently in use - exit 1 - fi - PGPORT=`expr $PGPORT + 1` - export PGPORT -done +# Get a port to run the tests +pg_get_test_port $newsrc # buildfarm may try to override port via EXTRA_REGRESS_OPTS ... EXTRA_REGRESS_OPTS=$EXTRA_REGRESS_OPTS --port=$PGPORT diff --git a/src/test/shell/init_env.sh b/src/test/shell/init_env.sh new file mode 100644 index 000..d37eb69 --- /dev/null +++ b/src/test/shell/init_env.sh @@ -0,0 +1,60 @@ +#!/bin/sh + +# src/test/shell/init.sh +# +# Utility initializing environment for tests to be conducted in shell. +# The initialization done here is made to be platform-proof. + +set -e + +: ${MAKE=make} + +# Guard against parallel make issues (see comments in pg_regress.c) +unset MAKEFLAGS +unset MAKELEVEL + +# Set listen_addresses desirably +testhost=`uname -s` +case $testhost in + MINGW*) LISTEN_ADDRESSES=localhost ;; + *) LISTEN_ADDRESSES= ;; +esac + +# Clear out any environment vars that might cause libpq to connect to +# the wrong postmaster (cf pg_regress.c) +# +# Some shells, such as NetBSD's, return nonzero from unset if the variable +# is already unset. Since we are operating under 'set e', this causes the +# script to fail. To guard against this, set them all to an empty string first. +PGDATABASE=;unset PGDATABASE +PGUSER=;unset PGUSER +PGSERVICE=; unset PGSERVICE +PGSSLMODE=; unset PGSSLMODE +PGREQUIRESSL=; unset PGREQUIRESSL +PGCONNECT_TIMEOUT=; unset PGCONNECT_TIMEOUT +PGHOST=;unset PGHOST +PGHOSTADDR=;unset PGHOSTADDR + +# Select a non-conflicting port number, similarly to pg_regress.c, and +# save its value in PGPORT. Caller should either save or use this value +# for the tests. +pg_get_test_port() +{ + PG_ROOT_DIR=$1 + PG_VERSION_NUM=`grep '#define PG_VERSION_NUM' $PG_ROOT_DIR/src/include/pg_config.h | awk '{print $3}'` + PGPORT=`expr $PG_VERSION_NUM % 16384 + 49152` + export PGPORT + + i=0 + while psql -X postgres /dev/null 2/dev/null + do + i=`expr $i + 1` + if [ $i -eq 16 ] + then + echo port $PGPORT apparently in use + exit 1 + fi + PGPORT=`expr $PGPORT + 1` + export PGPORT + done +} -- 2.0.0 -- Sent via pgsql-hackers mailing list
Re: [HACKERS] Index-only scans and non-MVCC snapshots
On 27/06/2014 3:14 AM, Andres Freund wrote: On 2014-06-26 22:47:47 -0600, Ryan Johnson wrote: Hi, As part of a research project, I'm trying to change Read Committed isolation to use HeapTupleSatisfiesNow rather than acquiring a new snapshot at every command [1]. Things appear to have gone reasonably well so far, except certain queries fail with ERROR: non-MVCC snapshots are not supported in index-only scans. You're aware that unless you employ additional locking you can simply miss individual rows or see them several times because of concurrent updates? The reason it has worked ( 9.4) for system catalogs is that updates of rows were only performed while objects were locked access exclusively - that's the reason why some places in the code use inplace updates btw... Yes, I was aware of the need for locking. The documentation just made it sound that locking was already in place for non-MVCC index scans. I was hoping I'd missed some easy way to activate it. Regards, Ryan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans and non-MVCC snapshots
On 2014-06-27 08:39:13 -0600, Ryan Johnson wrote: On 27/06/2014 3:14 AM, Andres Freund wrote: On 2014-06-26 22:47:47 -0600, Ryan Johnson wrote: Hi, As part of a research project, I'm trying to change Read Committed isolation to use HeapTupleSatisfiesNow rather than acquiring a new snapshot at every command [1]. Things appear to have gone reasonably well so far, except certain queries fail with ERROR: non-MVCC snapshots are not supported in index-only scans. You're aware that unless you employ additional locking you can simply miss individual rows or see them several times because of concurrent updates? The reason it has worked ( 9.4) for system catalogs is that updates of rows were only performed while objects were locked access exclusively - that's the reason why some places in the code use inplace updates btw... Yes, I was aware of the need for locking. The documentation just made it sound that locking was already in place for non-MVCC index scans. I was hoping I'd missed some easy way to activate it. Well, it is/was for the places (i.e. DDL) that actually use non-MVCC scans. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgstat_heap() consults freed memory
pgstat_heap() creates a BufferAccessStrategy and attaches it to a HeapScanDesc. It continues to use that strategy after calling heap_endscan(), which frees the strategy. This is only a risk when the table contains empty pages at the end. I get a crash in an assert-enabled build with this test procedure, after disabling autovacuum: -- session 1 create table t (c) as select * from generate_series(1,2); delete from t where c 1; -- session 2 begin; lock table t in access share mode; -- session 1 vacuum t; -- restart PostgreSQL to clear shared buffers -- session 3 select * from pgstattuple('t'); The simplest fix is to move the heap_endscan() call past the last use of the strategy. However, I don't think this function ought to be creating a strategy explicitly. It should use the one that initscan() creates, if any. -- Noah Misch EnterpriseDB http://www.enterprisedb.com diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c index edc603f..1007748 100644 --- a/contrib/pgstattuple/pgstattuple.c +++ b/contrib/pgstattuple/pgstattuple.c @@ -274,7 +274,6 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo) BlockNumber tupblock; Buffer buffer; pgstattuple_type stat = {0}; - BufferAccessStrategy bstrategy; SnapshotData SnapshotDirty; /* Disable syncscan because we assume we scan from block zero upwards */ @@ -283,10 +282,6 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo) nblocks = scan-rs_nblocks; /* # blocks to be scanned */ - /* prepare access strategy for this table */ - bstrategy = GetAccessStrategy(BAS_BULKREAD); - scan-rs_strategy = bstrategy; - /* scan the relation */ while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL) { @@ -320,26 +315,28 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo) { CHECK_FOR_INTERRUPTS(); - buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, RBM_NORMAL, bstrategy); + buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, + RBM_NORMAL, scan-rs_strategy); LockBuffer(buffer, BUFFER_LOCK_SHARE); stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer)); UnlockReleaseBuffer(buffer); block++; } } - heap_endscan(scan); while (block nblocks) { CHECK_FOR_INTERRUPTS(); - buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, RBM_NORMAL, bstrategy); + buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block, + RBM_NORMAL, scan-rs_strategy); LockBuffer(buffer, BUFFER_LOCK_SHARE); stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer)); UnlockReleaseBuffer(buffer); block++; } + heap_endscan(scan); relation_close(rel, AccessShareLock); stat.table_len = (uint64) nblocks *BLCKSZ; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index-only scans and non-MVCC snapshots
On 27/06/2014 8:20 AM, Alvaro Herrera wrote: Ryan Johnson wrote: On 26/06/2014 11:04 PM, Alvaro Herrera wrote: Ryan Johnson wrote: As part of a research project, I'm trying to change Read Committed isolation to use HeapTupleSatisfiesNow rather than acquiring a new snapshot at every command [1]. Are you aware of this? commit 813fb0315587d32e3b77af1051a0ef517d187763 Author: Robert Haas rh...@postgresql.org Date: Thu Aug 1 10:46:19 2013 -0400 Remove SnapshotNow and HeapTupleSatisfiesNow. That would be wonderful news... if snapshots weren't so darned expensive to create. I take it you aren't aware of this other effort, either: http://archives.postgresql.org/message-id/539ad153.9000...@vmware.com That is good news, though from reading the thread it sounds like proc array accesses are being exchanged for accesses to an SLRU, so a lot of lwlock calls will remain. It will definitely help, though. SLRU will get ex-locked a lot less often, so the main source of contention will be for the actual lwlock acquire/release operations. Regards, Ryan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Thu, Jun 26, 2014 at 5:01 PM, Andres Freund and...@2ndquadrant.com wrote: But a) isn't really avoidable because it'll otherwise generate compiler warnings and b) is done that way all over the tree. E.g. lwlock.c has several copies of (note the nonvolatility of proc): volatile LWLock *lock = l; PGPROC *proc = MyProc; ... proc-lwWaiting = true; proc-lwWaitMode = LW_WAIT_UNTIL_FREE; proc-lwWaitLink = NULL; /* waiters are added to the front of the queue */ proc-lwWaitLink = lock-head; if (lock-head == NULL) lock-tail = proc; lock-head = proc; /* Can release the mutex now */ SpinLockRelease(lock-mutex); There's nothing forcing the compiler to not move any of the proc-* assignments past the SpinLockRelease(). And indeed in my case it was actually the store to lwWaitLink that was moved across the lock. I think it's just pure luck that there's no active bug (that we know of) caused by this. I wouldn't be surprised if some dubious behaviour we've seen would be caused by similar issues. Now, we can fix this and similar cases by more gratuitous use of volatile. But for one we're never going to find all cases. For another it won't help *at all* for architectures with looser CPU level memory ordering guarantees. I think we finally need to bite the bullet and make all S_UNLOCKs compiler/write barriers. There are two separate issues here: 1. SpinLockAcquire and SpinLockRelease are not guaranteed to be compiler barriers, so all relevant memory accesses in the critical section need to be done using volatile pointers. Failing to do this is an easy mistake to make, and we've fixed numerous bugs of this type over the years (most recently, to my knowledge, in 4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3). Forcing SpinLockAcquire() and SpinLockRelease() to serve as compiler barriers would let us dispense with a whole lot of volatile calls and make writing future code correctly a lot easier. 2. Some of our implementations of SpinLockAcquire and/or SpinLockRelease, but in particular SpinLockRelease, may not actually provide the memory-ordering semantics which they are required to provide. In particular, ... I'd previously, in http://www.postgresql.org/message-id/20130920151110.ga8...@awork2.anarazel.de, gone through the list of S_UNLOCKs and found several that were lacking. Most prominently the default S_UNLOCK is just #define S_UNLOCK(lock) (*((volatile slock_t *) (lock)) = 0) which allows the compiler to move non volatile access across and does nothing for CPU level cache coherency. ...this default implementation of S_UNLOCK() is pretty sketchy. Even on a platform that enforces reads in program order and writes in program order, this is still unsafe because a read within the critical section might get postponed until after this write. Now, x86 happens to have an additional constraint, which is that it can reorder loads before stores but not stores before loads; so that coding happens to provide release semantics. But that need not be true on every architecture, and the trend seems to be toward weaker memory ordering. As you pointed out to me on chat, the non-intrinsics based ARM implementation brokenly relies on the default S_UNLOCK(), which clearly isn't adequate. Now, in terms of solving these problems: I tend to think that we should try to think about these two problems somewhat separately. As to #1, in the back-branches, I think further volatile-izing the LWLock* routines is probably the only realistic solution. In master, I fully support moving the goalposts such that we require SpinLockAcquire() and SpinLockRelease() are compiler barriers. Once we do this, I think we should go back and rip out all the places where we've used volatile-ized pointers to provide compiler ordering. That way, if we haven't actually managed to provide compiler ordering everywhere, it's more likely that something will fall over and warn us about the problem; plus, that avoids keeping around a coding pattern which isn't actually the one we want people to copy. However, I think your proposed S_UNLOCKED_UNLOCK() hack is plain ugly, probably cripplingly slow, and there's no guarantee that's even correct; see for example the comments about Itanium's tas implementation possibly being only an acquire barrier (blech). On gcc and icc, which account for lines 99 through 700 of spin.h, it should be simple and mechanical to use compiler intrinsics to make sure that every S_UNLOCK implementation includes a compiler barrier. However, lines 710 through 895 support non-gcc, non-icc compilers, and some of those we may not know how to implement a compiler barrier - in particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's compilers. Except for Sun, we have no buildfarm support for those platforms, so we could consider just dropping support entirely, but I'd be inclined to do
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On Fri, Jun 27, 2014 at 9:59 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-24 10:22:08 -0700, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-24 13:03:37 -0400, Noah Misch wrote: If a change has the potential to make some architectures give wrong answers only at odd times, that's a different kind of problem. For that reason, actively breaking Alpha is a good thing. Not sure what you mean with the 'actively breaking Alpha' statement? That we should drop Alpha? +1. Especially with no buildfarm critter. Would anyone here care to bet even the price of a burger that Alpha isn't broken already? Here's a patch removing alpha/true64/osf/1 support. I think I got most relevant references, not sure if I missed something. Since there seems to be (unanimous?) support for dropping alpha and some patches coming up that need to deal with platform dependent stuff it seems sensible to do this first. I have noticed that most PostgreSQL committers seem for format their commit messages so that paragraphs are separated by a blank line, but you seem not to do that. I find that less readable. I don't personally object to dropping Alpha, but when this was discussed back in October, Stefan did: http://www.postgresql.org/message-id/52616373.10...@kaltenbrunner.cc But I think he's rather in the minority anyway. Also, if we added a fallback implementation for spinlocks that uses GCC intrinsics, it would probably work again, as much as it does now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] better atomics - v0.5
On Thu, Jun 26, 2014 at 3:04 PM, Andres Freund and...@2ndquadrant.com wrote: I don't really see usecases where it's not related data that's being touched, but: The point is that the fastpath (not using a spinlock) might touch the atomic variable, even while the slowpath has acquired the spinlock. So the slowpath can't just use non-atomic operations on the atomic variable. Imagine something like releasing a buffer pin while somebody else is doing something that requires holding the buffer header spinlock. If the atomic variable can be manipulated without the spinlock under *any* circumstances, then how is it a good idea to ever manipulate it with the spinlock? That seems hard to reason about, and unnecessary. Critical sections for spinlocks should be short and contain only the instructions that need protection, and clearly the atomic op is not one of those. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On Thu, Jun 26, 2014 at 3:50 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-06-25 10:36:07 -0400, sfr...@snowman.net wrote: To me, that's ridiculous on the face of it. Other databases have had this kind of capability as a matter of course for decades- we are far behind in this area. OK, I've done a bit more thinking, but I'm not convinced that it's so cut-and-dried (as ridiculous on the face of it). Ian looked into the auditing capabilities of various (SQL and otherwise) systems some time ago. Informix handles its auditing configuration entirely outside the database. DB2 uses a mixture of in-database and external configuration. Oracle stores everything in the database. We're certainly not going to invent a mechanism other than GUC settings or catalog tables for auditing information (à la Informix). What I think you're saying is that without storing stuff in the catalog tables, we're not going to be able to offer auditing capabilities worth having. I can appreciate that, but I'm not sure I agree. You're right, it isn't possible to sanely do something like AUDIT SELECT ON TABLE t1 FOR role1 without storing configuration in the catalog tables. You're right, it would be nice to have that level of control. The part that I don't agree about is that using a GUC setting and a reloption won't give us anything useful. We can use that to do global, per-database, per-user, and per-object auditing with just mechanisms that are familiar to Postgres users already (ALTER … SET). Some other messages in the thread have suggested a similar mechanism, which implies that at least some people wouldn't be unhappy with it. No, we couldn't do combinations of TABLE/ROLE, but there's also no terrible conflict with doing that via some new AUDIT foo ON bar syntax later. As you point out, we're decades behind, and I seriously doubt if we're going to jump forward a few decades in time for 9.5. What do others think of the tradeoff? I agree with Stephen's statement that the audit configuration shouldn't be managed in flat files. If somebody wants to do that in an extension, fine, but I don't think we should commit anything into our source tree that works that way. The right way to manage that kind of configuration is with GUCs, or reloptions, or (security) labels, or some other kind of catalog structure. Renaming an object shouldn't be able to break auditing, but if the audit configuration is a list of table names in a file somewhere, it will. I disagree with Stephen's proposal that this should be in core, or that it needs its own dedicated syntax. I think contrib is a great place for extensions like this. That makes it a whole lot easier for people to develop customized vesions that meet particular needs they may have, and it avoids bloating core with a bunch of stuff that is only of interest to a subset of users. We spent years getting sepgsql into a form that could be shipped in contrib without substantially burdening core, and I don't see why this feature should be handed any differently. In fact, considering how much variation there is likely to be between auditing requirements at different sites, I'm not sure this should even be in contrib at this point. It's clearly useful, but there is no requirement that every useful extension must be bundled with the core server, and in fact doing so severely limits our ability to make further changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-27 13:00:34 -0400, Robert Haas wrote: There are two separate issues here: 1. SpinLockAcquire and SpinLockRelease are not guaranteed to be compiler barriers, so all relevant memory accesses in the critical section need to be done using volatile pointers. Failing to do this is an easy mistake to make, and we've fixed numerous bugs of this type over the years (most recently, to my knowledge, in 4bc15a8bfbc7856bc3426dc9ab99567eebbb64d3). Forcing SpinLockAcquire() and SpinLockRelease() to serve as compiler barriers would let us dispense with a whole lot of volatile calls and make writing future code correctly a lot easier. And actually faster in some cases. I'm just playing around with some bigger POWER machine and removing the volatiles in GetSnapshotData() + some access forcing macro for accessing xmin is worth 1% or so. 2. Some of our implementations of SpinLockAcquire and/or SpinLockRelease, but in particular SpinLockRelease, may not actually provide the memory-ordering semantics which they are required to provide. I tend to think that we should try to think about these two problems somewhat separately. As to #1, in the back-branches, I think further volatile-izing the LWLock* routines is probably the only realistic solution. We could also decide not to do anything :/. In master, I fully support moving the goalposts such that we require SpinLockAcquire() and SpinLockRelease() are compiler barriers. Once we do this, I think we should go back and rip out all the places where we've used volatile-ized pointers to provide compiler ordering. That way, if we haven't actually managed to provide compiler ordering everywhere, it's more likely that something will fall over and warn us about the problem; plus, that avoids keeping around a coding pattern which isn't actually the one we want people to copy. +1 However, I think your proposed S_UNLOCKED_UNLOCK() hack is plain ugly, probably cripplingly slow, and there's no guarantee that's even correct; see for example the comments about Itanium's tas implementation possibly being only an acquire barrier (blech). Heh. I don't think it's worse than the current fallback barrier implementation. The S_UNLOCKED_UNLOCK() thing was just to avoid recursion when using a barrier in the spinlock used to implement barriers... On gcc and icc, which account for lines 99 through 700 of spin.h, it should be simple and mechanical to use compiler intrinsics to make sure that every S_UNLOCK implementation includes a compiler barrier. However, lines 710 through 895 support non-gcc, non-icc compilers, and some of those we may not know how to implement a compiler barrier - in particular, Univel CC, the Tru64 Alpha compiler, HPPA, AIX, or Sun's compilers. Except for Sun, we have no buildfarm support for those platforms, so we could consider just dropping support entirely Both sun's and AIX's compilers can relatively easily be handled: * Solaris has atomic.h with membar_enter() et al. Apparently since at least solaris 9. * XLC has __fence and __isync intrinsics. There's been recent talk about AIX, including about AIX animal, so I'd be hesitant to drop it. It's also still developed. I'm obviously in favor of dropping Alpha. And I'm, unsurprisingly, all for removing unixware support (which is what univel CC seems to be used for after you dropped the univel port proper). I think the only person that has used postgres on hppa in the last 5 years is Tom, so I guess he'll have to speak up about it. Tom? void fake_compiler_barrier(void) { } void (*fake_compiler_barrier_hook) = fake_compiler_barrier; #define pg_compiler_barrier() ((*fake_compiler_barrier_hook)()) But we can do that as a fallback. It's what HPPA's example spinlock implementation does after all. Now, this doesn't remove the circular dependency between s_lock.h and barrier.h, because we still don't have a fallback method, other than acquiring and releasing a spinlock, of implementing a barrier that blocks both compiler reordering and CPU reordering. But it is enough to solve problem #1, and doesn't require that we drop support for anything that works now. I think we can move the fallback into a C function. Compared to the cost of a tas/unlock that shouldn't be significant. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-27 13:12:31 -0400, Robert Haas wrote: I have noticed that most PostgreSQL committers seem for format their commit messages so that paragraphs are separated by a blank line, but you seem not to do that. I find that less readable. I'll change that. Since there seems to be (unanimous?) support for dropping alpha and some patches coming up that need to deal with platform dependent stuff it seems sensible to do this first. I don't personally object to dropping Alpha, but when this was discussed back in October, Stefan did: http://www.postgresql.org/message-id/52616373.10...@kaltenbrunner.cc Ah, right. I still am in favor of dropping it because I don't it is likely to work, but, as a compromise, we could remove only the Tru64 variant? Openbsd + gcc is much less of a hassle. But I think he's rather in the minority anyway. Looks like it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-06-27 13:04:02 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h Note that that definition of a write barrier is *not* sufficient for the release of a lock... As I said elsewhere I think all the barrier definitions, except maybe alpha, luckily seem to be strong enough for that anyway. Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
Robert, * Robert Haas (robertmh...@gmail.com) wrote: I agree with Stephen's statement that the audit configuration shouldn't be managed in flat files. If somebody wants to do that in an extension, fine, but I don't think we should commit anything into our source tree that works that way. The right way to manage that kind of configuration is with GUCs, or reloptions, or (security) labels, or some other kind of catalog structure. Renaming an object shouldn't be able to break auditing, but if the audit configuration is a list of table names in a file somewhere, it will. Thanks for your thoughts on this. I disagree with Stephen's proposal that this should be in core, or that it needs its own dedicated syntax. I think contrib is a great place for extensions like this. That makes it a whole lot easier for people to develop customized vesions that meet particular needs they may have, and it avoids bloating core with a bunch of stuff that is only of interest to a subset of users. We spent years getting sepgsql into a form that could be shipped in contrib without substantially burdening core, and I don't see why this feature should be handed any differently. I don't exactly see how an extension or contrib module is going to be able to have a reasonable catalog structure which avoids the risk that renaming an object (role, schema, table, column, policy) will break the configuration without being part of core. Add on to that the desire for audit control to be a grant'able right along the lines of: GRANT AUDIT ON TABLE x TO audit_role; which allows a user who is not the table owner or a superuser to manage the auditing. I'm not against implementing capabilities through contrib modules, provided the goals of the feature can be met that way. In a way, sepgsql is actually a good example for this- the module provides the implementation, but all of the syntax and catalog information is actually in core. An implementation similar to that for auditing might be workable, but I don't see pgaudit as being that. In fact, considering how much variation there is likely to be between auditing requirements at different sites, I'm not sure this should even be in contrib at this point. It's clearly useful, but there is no requirement that every useful extension must be bundled with the core server, and in fact doing so severely limits our ability to make further changes. For my part, I do view auditing as being a requirement we should be addressing through core- and not just for the reasons mentioned above. We might be able to manage all of the above through an extension, however, as we continue to add capabilities and features, new types and methods, ways to extend the system, and more, auditing needs to keep pace and be considered a core feature as much as the GRANT system is. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Atomics hardware support table supported architectures
Andres Freund and...@2ndquadrant.com writes: On 2014-06-27 13:12:31 -0400, Robert Haas wrote: I don't personally object to dropping Alpha, but when this was discussed back in October, Stefan did: http://www.postgresql.org/message-id/52616373.10...@kaltenbrunner.cc As an ex-packager I do not believe the argument that it will matter to packagers if we desupport one of their secondary architectures. There are many, many packages that have never claimed to work on oddball architectures at all. Packagers would be better served by honesty about what we can support. Ah, right. I still am in favor of dropping it because I don't it is likely to work, but, as a compromise, we could remove only the Tru64 variant? Openbsd + gcc is much less of a hassle. But I think he's rather in the minority anyway. Looks like it. There would be value in continuing to support Alpha if we had one in the buildfarm. We don't, and have not had in any recent memory, and I haven't noticed anyone offering to provide one in future. The actual situation is that we're shipping a port that most likely doesn't work, and we have no way to fix it. That's of no benefit to anyone. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] better atomics - v0.5
On 2014-06-27 13:15:25 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 3:04 PM, Andres Freund and...@2ndquadrant.com wrote: I don't really see usecases where it's not related data that's being touched, but: The point is that the fastpath (not using a spinlock) might touch the atomic variable, even while the slowpath has acquired the spinlock. So the slowpath can't just use non-atomic operations on the atomic variable. Imagine something like releasing a buffer pin while somebody else is doing something that requires holding the buffer header spinlock. If the atomic variable can be manipulated without the spinlock under *any* circumstances, then how is it a good idea to ever manipulate it with the spinlock? That seems hard to reason about, and unnecessary. Critical sections for spinlocks should be short and contain only the instructions that need protection, and clearly the atomic op is not one of those. Imagine the situation for the buffer header spinlock which is one of the bigger performance issues atm. We could aim to replace all usages of that with clever and complicated logic, but it's hard. The IMO reasonable (and prototyped) way to do it is to make the common paths lockless, but fall back to the spinlock for the more complicated situations. For the buffer header that means that pin/unpin and buffer lookup are lockless, but IO and changing the identity of a buffer still require the spinlock. My attempts to avoid the latter basically required a buffer header specific reimplementation of spinlocks. To make pin/unpin et al safe without acquiring the spinlock the pincount and the flags had to be moved into one variable so that when incrementing the pincount you automatically get the flagbits back. If it's currently valid all is good, otherwise the spinlock needs to be acquired. But for that to work you need to set flags while the spinlock is held. Possibly you can come up with ways to avoid that, but I am pretty damn sure that the code will be less robust by forbidding atomics inside a critical section, not the contrary. It's a good idea to avoid it, but not at all costs. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] BUG #8673: Could not open file pg_multixact/members/xxxx on slave during hot_standby
Jeff Janes wrote: This problem was initially fairly easy to reproduce, but since I started adding instrumentation specifically to catch it, it has become devilishly hard to reproduce. I think my next step will be to also log each of the values which goes into the complex if (...) expression that decides on the deletion. Could you please to reproduce it after updating to latest? I pushed fixes that should close these issues. Maybe you want to remove the instrumentation you added, to make failures more likely. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On 06/27/2014 08:26 PM, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-27 13:12:31 -0400, Robert Haas wrote: I don't personally object to dropping Alpha, but when this was discussed back in October, Stefan did: http://www.postgresql.org/message-id/52616373.10...@kaltenbrunner.cc As an ex-packager I do not believe the argument that it will matter to packagers if we desupport one of their secondary architectures. There are many, many packages that have never claimed to work on oddball architectures at all. Packagers would be better served by honesty about what we can support. yeah I guess so - I was mostly pointing out that alpha looked like to be a way more active platform than most of what was discussed in that thread. I personally dont think that continuing to support alpha will buy us anything... Ah, right. I still am in favor of dropping it because I don't it is likely to work, but, as a compromise, we could remove only the Tru64 variant? Openbsd + gcc is much less of a hassle. But I think he's rather in the minority anyway. Looks like it. There would be value in continuing to support Alpha if we had one in the buildfarm. We don't, and have not had in any recent memory, and I haven't noticed anyone offering to provide one in future. The actual situation is that we're shipping a port that most likely doesn't work, and we have no way to fix it. That's of no benefit to anyone. yeah I dont have access to any alpha hardware to provide a buildfarm box so I cant help there :( Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On Fri, Jun 27, 2014 at 07:51:32PM +0200, Andres Freund wrote: On 2014-06-27 13:12:31 -0400, Robert Haas wrote: I don't personally object to dropping Alpha, but when this was discussed back in October, Stefan did: http://www.postgresql.org/message-id/52616373.10...@kaltenbrunner.cc Ah, right. I still am in favor of dropping it because I don't it is likely to work, but, as a compromise, we could remove only the Tru64 variant? Openbsd + gcc is much less of a hassle. Retaining Alpha support means placing data dependency barriers (Linux kernel term) where the Alpha memory model requires them. I doubt enough users will stress Alpha builds for us to distinguish a missing barrier from hardware flakiness, so we'd never find out whether we did it right. That's why I favor removing Alpha-specific code completely, and it is an OS-independent and compiler-independent motive. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Sat, Jun 21, 2014 at 11:06:26AM -0700, Kevin Grittner wrote: Kevin Grittner kgri...@ymail.com wrote: Kevin Grittner kgri...@ymail.com wrote: I've already said that I now think we should use the standard CREATE TRIGGER syntax to specify the transition tables, and that if we do that we don't need the reloption that's in the patch. If you ignore the 19 lines of new code to add that reloption, absolutely 100% of the code changes in the patch so far are in trigger.c and trigger.h. Although nobody has actually framed their feedback as a review, I feel that I have enough to work with to throw the patch into Waiting on Author status. Since I started with the assumption that I was going to be using standard syntax and got a ways into that before convincing myself it was a bad idea, I should have a new version of the patch working that way in a couple days. Here is v2. Thanks! I've taken the liberty of making an extension that uses this. Preliminary tests indicate a 10x performance improvement over the user-space hack I did that's similar in functionality. Please find attached the extension, etc., which I've published to https://github.com/davidfetter/postgresql_projects/tree/test_delta_v2 Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/contrib/statement_trigger_row/Makefile b/contrib/statement_trigger_row/Makefile new file mode 100644 index 000..e0cf006 --- /dev/null +++ b/contrib/statement_trigger_row/Makefile @@ -0,0 +1,17 @@ +# contrib/statement_trigger_row/Makefile + +MODULES = statement_trigger_row + +EXTENSION = statement_trigger_row +DATA = statement_trigger_row--1.0.sql + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/statement_trigger_row +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/statement_trigger_row/sql/easy_way.sql b/contrib/statement_trigger_row/sql/easy_way.sql new file mode 100644 index 000..019ae7f --- /dev/null +++ b/contrib/statement_trigger_row/sql/easy_way.sql @@ -0,0 +1,85 @@ +/* + * If these were surfaced to PL/pgsql, this is what it might look like. + */ + +BEGIN; + +CREATE TABLE a( +id SERIAL PRIMARY KEY, +i INT +); + +CREATE FUNCTION summarize_a_inserts() +RETURNS TRIGGER LANGUAGE plpgsql +AS $$ +DECLARE +the_sum BIGINT; +BEGIN +SELECT INTO the_sum sum(NEW.i) +FROM +new_a; +RAISE NOTICE 'Total change: %.', the_sum; +RETURN NULL; +END; +$$; + +CREATE FUNCTION summarize_a_updates() +RETURNS TRIGGER LANGUAGE plpgsql +AS $$ +DECLARE +the_sum BIGINT; +BEGIN +SELECT INTO the_sum sum(COALESCE(NEW.i,0) - COALESCE(OLD.i, 0)) +FROM +old_a +JOIN +new_a +ON(old_a.id = new_a.id); +RAISE NOTICE 'Total change: %.', the_sum; +RETURN NULL; +END; +$$; + +CREATE FUNCTION summarize_a_deletes() +RETURNS TRIGGER LANGUAGE plpgsql +AS $$ +DECLARE +the_sum BIGINT; +BEGIN +SELECT INTO the_sum -1 * sum(OLD.i) +FROM +old_a; +RAISE NOTICE 'Total change: %.', the_sum; +RETURN NULL; +END; +$$; + +CREATE TRIGGER statement_after_insert_a +AFTER INSERT ON a +REFERENCING +NEW TABLE AS new_a +FOR EACH STATEMENT +EXECUTE PROCEDURE summarize_a_inserts(); + +CREATE TRIGGER statement_after_update_a +AFTER UPDATE ON a +REFERENCING +OLD TABLE AS old_a +NEW TABLE AS new_a +FOR EACH STATEMENT +EXECUTE PROCEDURE summarize_a_updates(); + +CREATE TRIGGER statement_after_delete_a +AFTER DELETE ON a +REFERENCING +OLD TABLE AS old_a +FOR EACH STATEMENT +EXECUTE PROCEDURE summarize_a_deletes(); + +INSERT INTO a(i) +SELECT * FROM generate_series(1,1); + +UPDATE a SET i=i+1; + +ROLLBACK; + diff --git a/contrib/statement_trigger_row/sql/hard_way.sql b/contrib/statement_trigger_row/sql/hard_way.sql new file mode 100644 index 000..c6f7c1d --- /dev/null +++ b/contrib/statement_trigger_row/sql/hard_way.sql @@ -0,0 +1,68 @@ +CREATE TABLE IF NOT EXISTS h( +i INTEGER +); + +CREATE FUNCTION set_up_h_rows() +RETURNS TRIGGER LANGUAGE plpgsql +AS $$ +BEGIN +CREATE TEMPORARY TABLE IF NOT EXISTS h_rows(LIKE a) ON COMMIT DROP; +RETURN NULL; +END; +$$; + +CREATE TRIGGER statement_before_writing_h +BEFORE INSERT OR UPDATE OR DELETE ON a +FOR EACH STATEMENT +EXECUTE PROCEDURE set_up_h_rows(); + +CREATE OR REPLACE FUNCTION stash_h_row_deltas() +RETURNS TRIGGER +LANGUAGE plpgsql +AS $$ +BEGIN +INSERT INTO h_rows(i) +VALUES( +CASE TG_OP +WHEN 'INSERT' THEN COALESCE(NEW.i,0) +
[HACKERS] [PATCH 0/3] Tau support
Hi, Please see the following patches for implementing support for and migrating to Tau in PostgreSQL. Happy Tau Day Hacking http://tauday.com/ Make sure to check out this shory story: http://tauday.com/a-parable Asbjørn Sloth Tønnesen (3): Implement tau() function backend: use M_TAU instead of M_PI earthdistance: TWO_PI = M_TAU contrib/earthdistance/earthdistance.c | 12 +++- doc/src/sgml/func.sgml| 13 + src/backend/utils/adt/float.c | 21 + src/backend/utils/adt/geo_ops.c | 8 ++-- src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 6 files changed, 46 insertions(+), 11 deletions(-) -- 2.0.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH 3/3] earthdistance: TWO_PI = M_TAU
Signed-off-by: Asbjørn Sloth Tønnesen asbj...@asbjorn.it --- contrib/earthdistance/earthdistance.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/contrib/earthdistance/earthdistance.c b/contrib/earthdistance/earthdistance.c index 6bbebdf..432309c 100644 --- a/contrib/earthdistance/earthdistance.c +++ b/contrib/earthdistance/earthdistance.c @@ -6,16 +6,18 @@ #include utils/geo_decls.h /* for Point */ -#ifndef M_PI -#define M_PI 3.14159265358979323846 +#ifndef M_TAU +#define M_TAU 6.28318530717958647693 #endif +#ifndef M_PI +#define M_PI (M_TAU / 2.0) +#endif PG_MODULE_MAGIC; /* Earth's radius is in statute miles. */ static const double EARTH_RADIUS = 3958.747716; -static const double TWO_PI = 2.0 * M_PI; /** @@ -30,7 +32,7 @@ static const double TWO_PI = 2.0 * M_PI; static double degtorad(double degrees) { - return (degrees / 360.0) * TWO_PI; + return (degrees / 360.0) * M_TAU; } /** @@ -67,7 +69,7 @@ geo_distance_internal(Point *pt1, Point *pt2) /* compute difference in longitudes - want 180 degrees */ longdiff = fabs(long1 - long2); if (longdiff M_PI) - longdiff = TWO_PI - longdiff; + longdiff = M_TAU - longdiff; sino = sqrt(sin(fabs(lat1 - lat2) / 2.) * sin(fabs(lat1 - lat2) / 2.) + cos(lat1) * cos(lat2) * sin(longdiff / 2.) * sin(longdiff / 2.)); -- 2.0.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH 2/3] backend: use M_TAU instead of M_PI
Signed-off-by: Asbjørn Sloth Tønnesen asbj...@asbjorn.it --- src/backend/utils/adt/float.c | 4 ++-- src/backend/utils/adt/geo_ops.c | 8 ++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c index 1278053..b0211f8 100644 --- a/src/backend/utils/adt/float.c +++ b/src/backend/utils/adt/float.c @@ -1701,7 +1701,7 @@ degrees(PG_FUNCTION_ARGS) float8 arg1 = PG_GETARG_FLOAT8(0); float8 result; - result = arg1 * (180.0 / M_PI); + result = arg1 * (360.0 / M_TAU); CHECKFLOATVAL(result, isinf(arg1), arg1 == 0); PG_RETURN_FLOAT8(result); @@ -1737,7 +1737,7 @@ radians(PG_FUNCTION_ARGS) float8 arg1 = PG_GETARG_FLOAT8(0); float8 result; - result = arg1 * (M_PI / 180.0); + result = arg1 * (M_TAU / 360.0); CHECKFLOATVAL(result, isinf(arg1), arg1 == 0); PG_RETURN_FLOAT8(result); diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c index 54391fd..429de6d 100644 --- a/src/backend/utils/adt/geo_ops.c +++ b/src/backend/utils/adt/geo_ops.c @@ -23,8 +23,12 @@ #include utils/builtins.h #include utils/geo_decls.h +#ifndef M_TAU +#define M_TAU 6.28318530717958647693 +#endif + #ifndef M_PI -#define M_PI 3.14159265358979323846 +#define M_PI (M_TAU / 2.0) #endif @@ -5168,7 +5172,7 @@ circle_poly(PG_FUNCTION_ARGS) SET_VARSIZE(poly, size); poly-npts = npts; - anglestep = (2.0 * M_PI) / npts; + anglestep = M_TAU / npts; for (i = 0; i npts; i++) { -- 2.0.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH 1/3] Implement tau() function
Signed-off-by: Asbjørn Sloth Tønnesen asbj...@asbjorn.it --- doc/src/sgml/func.sgml| 13 + src/backend/utils/adt/float.c | 17 +++-- src/include/catalog/pg_proc.h | 2 ++ src/include/utils/builtins.h | 1 + 4 files changed, 31 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 551576a..2b48123 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -878,6 +878,19 @@ row entry indexterm + primarytau/primary +/indexterm +literalfunctiontau()/function/literal + /entry + entrytypedp/type/entry + entryquotetau;/quote constant/entry + entryliteraltau()/literal/entry + entryliteral6.28318530717959/literal/entry + /row + + row + entry +indexterm primarytrunc/primary /indexterm literalfunctiontrunc(typedp/type or typenumeric/type)/function/literal diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c index 41b3eaa..1278053 100644 --- a/src/backend/utils/adt/float.c +++ b/src/backend/utils/adt/float.c @@ -26,9 +26,12 @@ #include utils/sortsupport.h +#ifndef M_TAU +#define M_TAU 6.28318530717958647693 +#endif + #ifndef M_PI -/* from my RH5.2 gcc math.h file - thomas 2000-04-03 */ -#define M_PI 3.14159265358979323846 +#define M_PI (M_TAU / 2.0) #endif /* Visual C++ etc lacks NAN, and won't accept 0.0/0.0. NAN definition from @@ -1716,6 +1719,16 @@ dpi(PG_FUNCTION_ARGS) /* + * dtau- returns the constant Tau + */ +Datum +dtau(PG_FUNCTION_ARGS) +{ + PG_RETURN_FLOAT8(M_TAU); +} + + +/* * radians - returns radians converted from degrees */ Datum diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 0b6105b..4148bed 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -1825,6 +1825,8 @@ DATA(insert OID = 1609 ( radians PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 701 DESCR(degrees to radians); DATA(insert OID = 1610 ( pi PGNSP PGUID 12 1 0 0 0 f f f f t f i 0 0 701 _null_ _null_ _null_ _null_ dpi _null_ _null_ _null_ )); DESCR(PI); +DATA(insert OID = 1611 ( tau PGNSP PGUID 12 1 0 0 0 f f f f t f i 0 0 701 _null_ _null_ _null_ _null_ dtau _null_ _null_ _null_ )); +DESCR(Tau); DATA(insert OID = 1618 ( interval_mul PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1186 1186 701 _null_ _null_ _null_ _null_ interval_mul _null_ _null_ _null_ )); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index bbb5d39..0258a64 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -408,6 +408,7 @@ extern Datum dsin(PG_FUNCTION_ARGS); extern Datum dtan(PG_FUNCTION_ARGS); extern Datum degrees(PG_FUNCTION_ARGS); extern Datum dpi(PG_FUNCTION_ARGS); +extern Datum dtau(PG_FUNCTION_ARGS); extern Datum radians(PG_FUNCTION_ARGS); extern Datum drandom(PG_FUNCTION_ARGS); extern Datum setseed(PG_FUNCTION_ARGS); -- 2.0.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 0/3] Tau support
=?UTF-8?q?Asbj=C3=B8rn=20Sloth=20T=C3=B8nnesen?= asbj...@asbjorn.it writes: Please see the following patches for implementing support for and migrating to Tau in PostgreSQL. While I don't particularly object to adding a tau() function, the rest of this seems like change for the sake of change. What's the point? And it had better be a darn good point, because cross-version code differences are a constant source of maintenance pain for us. We don't need ones that have no concrete value. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 0/3] Tau support
On Fri, Jun 27, 2014 at 06:03:33PM -0700, Tom Lane wrote: =?UTF-8?q?Asbj=C3=B8rn=20Sloth=20T=C3=B8nnesen?= asbj...@asbjorn.it writes: Please see the following patches for implementing support for and migrating to Tau in PostgreSQL. While I don't particularly object to adding a tau() function, the rest of this seems like change for the sake of change. What's the point? And it had better be a darn good point, because cross-version code differences are a constant source of maintenance pain for us. We don't need ones that have no concrete value. It's Tau day (6.28) in some parts of the world already. Might that be the cause? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 0/3] Tau support
On Fri, Jun 27, 2014 at 9:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: =?UTF-8?q?Asbj=C3=B8rn=20Sloth=20T=C3=B8nnesen?= asbj...@asbjorn.it writes: Please see the following patches for implementing support for and migrating to Tau in PostgreSQL. While I don't particularly object to adding a tau() function, the rest of this seems like change for the sake of change. What's the point? And it had better be a darn good point, because cross-version code differences are a constant source of maintenance pain for us. We don't need ones that have no concrete value. Perhaps we should also have a pau() function, as proposed here: http://xkcd.com/1292/ On a related note, I think we should declare vi the officially favored editor of the PostgreSQL project. (Note: No, I'm not serious.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Spinlocks and compiler/memory barriers
On Fri, Jun 27, 2014 at 2:04 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-27 13:04:02 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 6:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-06-26 14:13:07 -0700, Tom Lane wrote: Surely it had better be a read barrier as well? I don't immediately see why it has to be read barrier? Hoisting a load from after the release into the locked area of code should be safe? No doubt, but delaying a read till after the unlocking write would certainly not be safe. AFAICT, README.barrier completely fails to define what we think the semantics of pg_read_barrier and pg_write_barrier actually are, so if you believe that a write barrier prevents reordering of reads relative to writes, you'd better propose some new text for that file. It certainly doesn't say that today. The relevant text is in barrier.h Note that that definition of a write barrier is *not* sufficient for the release of a lock... As I said elsewhere I think all the barrier definitions, except maybe alpha, luckily seem to be strong enough for that anyway. Do we want to introduce acquire/release barriers? Or do we want to redefine the current barriers to be strong enough for that? Well, unless we're prepared to dump support for an awful lot of platfomrs, trying to support acquire and release barriers on every platform we support is a doomed effort. The definitions of the barriers implemented by barrier.h are the same as the ones that Linux has (minus read-barrier-depends), which I think is probably good evidence that they are generally useful definitions. If we were going to use any of those in s_lock.h, it'd have to be pg_memory_barrier(), but I don't think making s_lock.h dependent on barrier.h is the way to go. I think we should just adjust s_lock.h in a minimal way, using inline assembler or tweaking the existing assembler or whatever. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] better atomics - v0.5
On Fri, Jun 27, 2014 at 2:00 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-27 13:15:25 -0400, Robert Haas wrote: On Thu, Jun 26, 2014 at 3:04 PM, Andres Freund and...@2ndquadrant.com wrote: I don't really see usecases where it's not related data that's being touched, but: The point is that the fastpath (not using a spinlock) might touch the atomic variable, even while the slowpath has acquired the spinlock. So the slowpath can't just use non-atomic operations on the atomic variable. Imagine something like releasing a buffer pin while somebody else is doing something that requires holding the buffer header spinlock. If the atomic variable can be manipulated without the spinlock under *any* circumstances, then how is it a good idea to ever manipulate it with the spinlock? That seems hard to reason about, and unnecessary. Critical sections for spinlocks should be short and contain only the instructions that need protection, and clearly the atomic op is not one of those. Imagine the situation for the buffer header spinlock which is one of the bigger performance issues atm. We could aim to replace all usages of that with clever and complicated logic, but it's hard. The IMO reasonable (and prototyped) way to do it is to make the common paths lockless, but fall back to the spinlock for the more complicated situations. For the buffer header that means that pin/unpin and buffer lookup are lockless, but IO and changing the identity of a buffer still require the spinlock. My attempts to avoid the latter basically required a buffer header specific reimplementation of spinlocks. To make pin/unpin et al safe without acquiring the spinlock the pincount and the flags had to be moved into one variable so that when incrementing the pincount you automatically get the flagbits back. If it's currently valid all is good, otherwise the spinlock needs to be acquired. But for that to work you need to set flags while the spinlock is held. Possibly you can come up with ways to avoid that, but I am pretty damn sure that the code will be less robust by forbidding atomics inside a critical section, not the contrary. It's a good idea to avoid it, but not at all costs. You might be right, but I'm not convinced. Does the lwlock scalability patch work this way, too? I was hoping to look at that RSN, so if there are roughly the same issues there it might shed some light on this point also. What I'm basically afraid of is that this will work fine in many cases but have really ugly failure modes. That's pretty much what happens with spinlocks already - the overhead is insignificant at low levels of contention but as the spinlock starts to become contended the CPUs all fight over the cache line and performance goes to pot. ISTM that making the spinlock critical section significantly longer by putting atomic ops in there could appear to win in some cases while being very bad in others. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER SYSTEM RESET?
On Fri, Jun 27, 2014 at 12:27 PM, Vik Fearing vik.fear...@dalibo.com wrote: On 06/27/2014 08:49 AM, Vik Fearing wrote: This third patch reformats the documentation in the way I expected it to be committed. Amit, I added this to the next commitfest with your name as reviewer. No issues, I will review it in next CF. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com