Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote: On Thu, Mar 25, 2010 at 8:23 AM, Simon Riggs si...@2ndquadrant.com wrote: PANICing won't change the situation, so it just destroys server availability. If we had 1 master and 42 slaves then this behaviour would take down almost the whole server farm at once. Very uncool. You might have reason to prevent the server starting up at that point, when in standby mode, but that is not a reason to PANIC. We don't really want all of the standbys thinking they can be the master all at once either. Better to throw a serious ERROR and have the server still up and available for reads. OK. How about making the startup process emit WARNING, stop WAL replay and wait for the presence of trigger file, when an invalid record is found? Which keeps the server up for readonly queries. And if the trigger file is found, I think that the startup process should emit a FATAL, i.e., the server should exit immediately, to prevent the server from becoming the primary in a half-finished state. Also to allow such a halfway failover, we should provide fast failover mode as pg_standby does? The lack of docs begins to show a lack of coherent high-level design here. By now, I've forgotten what this thread was even about. The major design decision in this that keeps showing up is remove pg_standby, at all costs but no reason has ever been given for that. I do believe there is a better way, but we won't find it by trial and error, even if we had time to do so. Please work on some clear docs for the failure modes in this system. That way we can all read them and understand them, or point out further issues. Moving straight to code is not a solution to this, since what we need now is to all agree on the way forwards. If we ignore this, then there is considerable risk that streaming rep will have a fatal operational flaw. Please just document/diagram how it works now, highlighting the problems that still remain to be solved. We're all behind you and I'm helping wherever I can. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Tom Lane wrote: Fujii Masao masao.fu...@gmail.com writes: OK. How about making the startup process emit WARNING, stop WAL replay and wait for the presence of trigger file, when an invalid record is found? Which keeps the server up for readonly queries. And if the trigger file is found, I think that the startup process should emit a FATAL, i.e., the server should exit immediately, to prevent the server from becoming the primary in a half-finished state. Also to allow such a halfway failover, we should provide fast failover mode as pg_standby does? I find it extremely scary to read this sort of blue-sky design discussion going on now, two months after we were supposedly feature-frozen for 9.0. We need to be looking for the *rock bottom minimum* amount of work to do to get 9.0 out the door in a usable state; not what would be nice to have later on. Agreed, this is getting complicated. I'm already worried about the amount of changes needed to make it work, I don't want to add any new modes. PANIC seems like the appropriate solution for now. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote: On Thu, Mar 25, 2010 at 8:23 AM, Simon Riggs si...@2ndquadrant.com wrote: PANICing won't change the situation, so it just destroys server availability. If we had 1 master and 42 slaves then this behaviour would take down almost the whole server farm at once. Very uncool. You might have reason to prevent the server starting up at that point, when in standby mode, but that is not a reason to PANIC. We don't really want all of the standbys thinking they can be the master all at once either. Better to throw a serious ERROR and have the server still up and available for reads. OK. How about making the startup process emit WARNING, stop WAL replay and wait for the presence of trigger file, when an invalid record is found? Which keeps the server up for readonly queries. Yes. Receiving new WAL records is a completely separate activity from running the rest of the server (in this release...). And if the trigger file is found, I think that the startup process should emit a FATAL, i.e., the server should exit immediately, to prevent the server from becoming the primary in a half-finished state. Please remember that half-finished is your judgment on what has happened in the particular scenario you are considering. In many cases, an invalid WAL record clearly and simply indicates the end of WAL and we should start up normally. State is a good word here. I'd like to see the server have a clear state model with well documented transitions between them. The state should also be externally queriable, so we can work out what its doing and how long we can expect it to keep doing it for. I don't want to be in a position where we are waiting for the server to sort itself out from a complex set of retries. Also to allow such a halfway failover, we should provide fast failover mode as pg_standby does? Yes, we definitely need a JFDI solution for immediate failover. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote: PANIC seems like the appropriate solution for now. It definitely is not. Think some more. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Simon Riggs wrote: On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote: And if the trigger file is found, I think that the startup process should emit a FATAL, i.e., the server should exit immediately, to prevent the server from becoming the primary in a half-finished state. Please remember that half-finished is your judgment on what has happened in the particular scenario you are considering. In many cases, an invalid WAL record clearly and simply indicates the end of WAL and we should start up normally. Not in the archive. An invalid WAL record in a file in pg_xlog is usually an indication of end-of-WAL, but there should be no invalid records in the archived WAL files, or streamed from the master. State is a good word here. I'd like to see the server have a clear state model with well documented transitions between them. The state should also be externally queriable, so we can work out what its doing and how long we can expect it to keep doing it for. Agreed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
(cc'ing docs list) Simon Riggs wrote: The lack of docs begins to show a lack of coherent high-level design here. Yeah, I think you're right. It's becoming hard to keep track of how it's supposed to behave. By now, I've forgotten what this thread was even about. The major design decision in this that keeps showing up is remove pg_standby, at all costs but no reason has ever been given for that. I do believe there is a better way, but we won't find it by trial and error, even if we had time to do so. This has nothing to do with pg_standby. Please work on some clear docs for the failure modes in this system. That way we can all read them and understand them, or point out further issues. Moving straight to code is not a solution to this, since what we need now is to all agree on the way forwards. If we ignore this, then there is considerable risk that streaming rep will have a fatal operational flaw. Please just document/diagram how it works now, highlighting the problems that still remain to be solved. We're all behind you and I'm helping wherever I can. Ok, here's my attempt at the docs. Read it as a replacement for the High Availability, Load Balancing, and Replication chapter, but of course many of the sections will be unchanged, as indicated below. - Chapter 25. High Availability, Load Balancing, and Replication 25.1 Comparison of different solutions no changes 25.2 Log-Shipping Standby servers overview from current File-based Log Shipping section. With small changes so that it applies to the built-in standby mode as well as pg_standby like solutions A standby server can also be used for read-only queries. This is called Hot Standby mode, see chapter XXX 25.2.1 Planning Set up two servers with identical hardware ... two first paragraphs of current File-based log-shipping / Planning section 25.2.3 Standby mode In standby mode, the server continously applies WAL received from the master server. The standby server can receive WAL from a WAL archive (see restore_command) or directly from the master over a TCP connection (streaming replication). The standby server will also attempt to restore any WAL found in the standby's pg_xlog. That typically happens after a server restart, to replay again any WAL streamed from the master before the restart, but you can also manually copy files to pg_xlog at any time to have them replayed. At startup, the standby begins by restoring all WAL available in the archive location, calling restore_command. Once it reaches the end of WAL available there and restore_command fails, it tries to restore any WAL available in the pg_xlog directory (possibly stored there by streaming replication before restart). If that fails, and streaming replication has been configured, the standby tries to connect to the master server and stream WAL from it. If that fails or streaming replication is not configured, or if the connection is disconnected later on, the standby goes back to step 1 and tries to restoring the file from the archive again. This loop of retries from the archive, pg_xlog, and via streaming replication goes on until the server is stopped or failover is triggered by a trigger file. A corrupt or half-finished WAL file in the archive, or streamed from the master, causes a PANIC and immediate shutdown of the standby server. A corrupt WAL file is always a serious event which requires administrator action. If you want to recover a WAL file known to be corrupt as far as it can be, you can copy the file manually into pg_xlog. Standby mode is exited and the server switches to normal operation, when a trigger file is found (trigger_file). Before failover, it will restore any WAL available in the archive or in pg_xlog, but won't try to connect to the master or wait for files to become available in the archive. 25.2.4 Preparing Master for Standby servers Set up continous archiving to a WAL archive on the master, as described in the chapter Continous Archiving and Point-In-Time_recovery. The archive location should be accessible from the standby even when the master is down, ie. it should reside on the standby server itself or another trusted server, not on the master server. If you want to use streaming replication, set up authentication to allow streaming replication connections. Set max_wal_senders. Take a base backup as described in chapter Continous Archiving and Point-In-Time_recovery / Making a Base Backup. 25.2.4.1 Authentication for streaming replication Ensure that listen_addresses allows connections from the standby server. current Streaming Replication / Authentication section, describing pg_hba.conf 25.2.5 Setting up the standby server 1. Take a base backup, and copy it to the standby 2. Create a restore_command to restore files from the WAL archive. 3. Set standby_mode=on 4. If you want to use streaming replicaton, set primary_conninfo You can use restartpoint_command to prune the archive of files no longer
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Simon Riggs wrote: On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote: PANIC seems like the appropriate solution for now. It definitely is not. Think some more. Well, what happens now in previous versions with pg_standby et al is that the standby starts up. That doesn't seem appropriate either. Hmm, it would be trivial to just stay in the standby mode at a corrupt file, continuously retrying to restore it and continue replay. If it's genuinely corrupt, it will never succeed and the standby gets stuck at that point. Maybe that's better; it's close to what Fujii suggested except that you don't need a new mode for it. I'm worried that the administrator won't notice the error promptly because at a quick glance the server is up and running, while it's actually stuck at the error and falling indefinitely behind the master. Maybe if we make it a WARNING, that's enough to alleviate that. It's true that if the standby is actively being used for read-only queries, shutting it down to just get the administrators attention isn't good either. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote: PANIC seems like the appropriate solution for now. It definitely is not. Think some more. Well, what happens now in previous versions with pg_standby et al is that the standby starts up. That doesn't seem appropriate either. Hmm, it would be trivial to just stay in the standby mode at a corrupt file, continuously retrying to restore it and continue replay. If it's genuinely corrupt, it will never succeed and the standby gets stuck at that point. Maybe that's better; it's close to what Fujii suggested except that you don't need a new mode for it. On second thought, that's easy for the built-in standby mode, but not for archive recovery where we're not currently retrying anything. In archive recovery, we could throw a WARNING and start up, which would be like the current behavior in older versions except you get a WARNING instead of LOG, or we could PANIC. I'm leaning towards PANIC, which makes most sense for genuine point-in-time or archive recovery (ie. not a standby server), but I can see the rationale for WARNING too, for pg_standby and for the sake of preserving old behavior. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Fujii Masao wrote: sources = ~failedSources; failedSources |= readSource; The above lines in XLogPageRead() seem not to be required in normal recovery case (i.e., standby_mode = off). So how about the attached patch? *** 9050,9056 next_record_is_invalid: --- 9047,9056 readSource = 0; if (StandbyMode) + { + failedSources |= readSource; goto retry; + } else return false; That doesn't work because readSource is cleared above. But yeah, failedSources is not needed in archive recovery, so that line can be removed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Fujii Masao wrote: On second thought, the following lines seem to be necessary just after calling XLogPageRead() since it reads new WAL file from another source. if (readSource == XLOG_FROM_STREAM || readSource == XLOG_FROM_ARCHIVE) emode = PANIC; else emode = emode_arg; Yep. Here's an updated patch, with these changes since the last patch: * Fix the bug of a spurious PANIC in archive recovery, if the WAL ends in the middle of a WAL record that continues over a WAL segment boundary. * If a corrupt WAL record is found in archive or streamed from master in standby mode, throw WARNING instead of PANIC, and keep trying. In archive recovery (ie. standby_mode=off) it's still a PANIC. We can make it a WARNING too, which gives the pre-9.0 behavior of starting up the server on corruption. I prefer PANIC but the discussion is still going on. * Small code changes to handling of failedSources, inspired by your comment. No change in functionality. This is also available in my git repository at git://git.postgresql.org/git/users/heikki/postgres.git, branch xlogchanges -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index e57f22e..4aa1870 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -450,20 +450,33 @@ static uint32 openLogSeg = 0; static uint32 openLogOff = 0; /* + * Codes indicating where we got a WAL file from during recovery, or where + * to attempt to get one. + */ +#define XLOG_FROM_ARCHIVE (10) /* Restored using restore_command */ +#define XLOG_FROM_PG_XLOG (11) /* Existing file in pg_xlog */ +#define XLOG_FROM_STREAM (12) /* Streamed from master */ + +/* * These variables are used similarly to the ones above, but for reading * the XLOG. Note, however, that readOff generally represents the offset * of the page just read, not the seek position of the FD itself, which * will be just past that page. readLen indicates how much of the current - * page has been read into readBuf. + * page has been read into readBuf, and readSource indicates where we got + * the currently open file from. */ static int readFile = -1; static uint32 readId = 0; static uint32 readSeg = 0; static uint32 readOff = 0; static uint32 readLen = 0; +static int readSource = 0; /* XLOG_FROM_* code */ -/* Is the currently open segment being streamed from primary? */ -static bool readStreamed = false; +/* + * Keeps track of which sources we've tried to read the current WAL + * record from and failed. + */ +static int failedSources = 0; /* Buffer for currently read page (XLOG_BLCKSZ bytes) */ static char *readBuf = NULL; @@ -517,11 +530,12 @@ static bool InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath, bool find_free, int *max_advance, bool use_lock); static int XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli, - bool fromArchive, bool notexistOk); + int source, bool notexistOk); static int XLogFileReadAnyTLI(uint32 log, uint32 seg, int emode, - bool fromArchive); + int sources); static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt, bool randAccess); +static int emode_for_corrupt_record(int endofwalmode); static void XLogFileClose(void); static bool RestoreArchivedFile(char *path, const char *xlogfname, const char *recovername, off_t expectedSize); @@ -2573,7 +2587,7 @@ XLogFileOpen(uint32 log, uint32 seg) */ static int XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli, - bool fromArchive, bool notfoundOk) + int source, bool notfoundOk) { char xlogfname[MAXFNAMELEN]; char activitymsg[MAXFNAMELEN + 16]; @@ -2582,23 +2596,28 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli, XLogFileName(xlogfname, tli, log, seg); - if (fromArchive) + switch (source) { - /* Report recovery progress in PS display */ - snprintf(activitymsg, sizeof(activitymsg), waiting for %s, - xlogfname); - set_ps_display(activitymsg, false); + case XLOG_FROM_ARCHIVE: + /* Report recovery progress in PS display */ + snprintf(activitymsg, sizeof(activitymsg), waiting for %s, + xlogfname); + set_ps_display(activitymsg, false); - restoredFromArchive = RestoreArchivedFile(path, xlogfname, - RECOVERYXLOG, - XLogSegSize); - if (!restoredFromArchive) - return -1; - } - else - { - XLogFilePath(path, tli, log, seg); - restoredFromArchive = false; + restoredFromArchive = RestoreArchivedFile(path, xlogfname, + RECOVERYXLOG, + XLogSegSize); + if (!restoredFromArchive) +return -1; + break; + + case XLOG_FROM_PG_XLOG: + XLogFilePath(path, tli, log, seg); + restoredFromArchive = false; + break; + + default: + elog(ERROR, invalid XLogFileRead source %d, source); } fd = BasicOpenFile(path, O_RDONLY | PG_BINARY, 0); @@
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, Mar 25, 2010 at 8:55 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * If a corrupt WAL record is found in archive or streamed from master in standby mode, throw WARNING instead of PANIC, and keep trying. In archive recovery (ie. standby_mode=off) it's still a PANIC. We can make it a WARNING too, which gives the pre-9.0 behavior of starting up the server on corruption. I prefer PANIC but the discussion is still going on. I don't think we should be changing pre-9.0 behavior more than necessary. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Ask help for putting SP-Gist into postgresql
Hi, I am trying to put the SP-Gist package, a general index framework for space partitioning trees , into Postgresql source code. SP-Gist was developed for postgresql 8.0. However, now it does not work with the new version. So, for the submitted patch, what version of postgresql is required? And, is there anybody can help with that? Please cc to my email, when reply. Thanks -- Pei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walreceiver is uninterruptible on win32
On Tue, Mar 16, 2010 at 10:35 AM, Fujii Masao masao.fu...@gmail.com wrote: Just replacing PQexec() with PQsendQuery() is pretty straightforward, we could put that replacement in a file in port/win32. Replacing PQconnectdb() is more complicated because you need to handle connection timeout. I suggest that we only add the replacement for PQexec(), and live with the situation for PQconnectdb(), that covers 99% of the scenarios anyway. I'll try to replace PQexec() first, and PQconnectdb() second if I have enough time. Sorry for the delay. The attached patch replaces PQexec() used by dblink and libpqwalreceiver with pgwin32_PQexec() which is the win32 version of PQexec(). pgwin32_PQexec() is provided as the library 'libpqbe.dll', which is created only on win32. dblink.dll and libpqwalreceiver.dll refer to libpqbe.dll. Also libpqbe.dll refers to libpq.dll. I'm not sure if my patch is in the right way. If you notice anything, please feel free to comment! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center pgwin32_PQexec_v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER ROLE/DATABASE RESET ALL versus security
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: I have come up with the attached patch. I haven't tested it fully yet, and I need to backport it. The gist of it is: we can't simply remove the pg_db_role_setting tuple, we need to ask GUC to reset the settings array, for which it checks superuser-ness on each setting. I think you still want to have a code path whereby the tuple will be deleted once the array is empty. Failing to check that is inefficient and also exposes clients such as pg_dump to corner case bugs. Yeah, that's there too -- it behaves the same way as ALTER / RESET for a particular setting. I just applied it all the way back to 7.4. It was a bit of a pain to backport it, because every version seemed to have this or that little incompatibility. I attempted a regression test, but it's also painful because there's no nice way to clean up after a newly created user (i.e. drop it): after the last \c - regress_user_guc, there's no way to go back to the original user. And we can't use SET SESSION AUTHORIZATION because it doesn't cause the settings for the role to be loaded. (I think that's a bug too). Suggestions on how to enable this are welcome. -- Test user-specific settings create role regress_user_guc login; alter role regress_user_guc set work_mem to '128MB'; alter role regress_user_guc set lc_messages to 'C'; \c - regress_user_guc select name, setting, source from pg_settings where name in ('work_mem', 'lc_messages') order by name; alter role regress_user_guc reset all; \c - regress_user_guc -- can't display actual value here because it may be installation-dependant select name, setting, source from pg_settings where name in ('work_mem', 'lc_messages') order by name; (I think I should also use a superuser setting other than lc_messages). -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] last_statrequest is in the future
I wrote: ... lookee what we have here: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-24%2004:00:07 Modifying the code to print the actual timestamps resulted in a wealth of information that can be dredged from the postmaster log here: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-25%2004:00:04 Just grep for log messages like [4baafb13.cd2:1] LOG: last_statrequest 2010-03-25 06:58:42.118734+01 is later than collector's time 2010-03-25 06:58:40.817027+01 What this shows is that system timekeeping on jaguar is just utterly broken. gettimeofday() sometimes returns answers that are 5 seconds or more skewed in different processes, and there are episodes where its value fails to move at all in the stats collector while multiple seconds elapse in the opinion of another process. I wonder whether this doesn't also explain jaguar's propensity to show some other unexplainable failures, like these recent examples: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-20%2004:00:03 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-15%2004:00:04 which are not at all unique in that machine's buildfarm history. I had always supposed that these were caused by the extreme overall slowness induced by CLOBBER_CACHE_ALWAYS, but now I think they are more likely explained by system timekeeping issues. In particular the last-cited regression failure can be explained if now() fails to change value across pg_sleep(0.1), and that's isomorphic to what seems to be happening in the stats collector in some of the episodes captured today. In short: either this machine has got broken clock hardware, or there's a very serious bug in the kernel code for reading the clock. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Conditional trigger missing from pg_dump
All, It looks like pg_dump was never modified to include conditional triggers. If you look in pg_dump.c, there's no provision to dump the FOR UPDATE OF part of the trigger declaration. This was discovered by Dave Olszewski while testing conditional triggers. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conditional trigger missing from pg_dump
Josh Berkus j...@agliodbs.com writes: It looks like pg_dump was never modified to include conditional triggers. If you look in pg_dump.c, there's no provision to dump the FOR UPDATE OF part of the trigger declaration. That proves nothing; it could be relying on server-side code to produce the declaration. This was discovered by Dave Olszewski while testing conditional triggers. Please show a specific example of what you're worried about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Conditional trigger missing from pg_dump
This was discovered by Dave Olszewski while testing conditional triggers. Please show a specific example of what you're worried about. I'll get you a reproduceable test case later today. In our testing, the conditional part of the trigger was not being dumped. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, 2010-03-25 at 12:15 +0200, Heikki Linnakangas wrote: (cc'ing docs list) Simon Riggs wrote: The lack of docs begins to show a lack of coherent high-level design here. Yeah, I think you're right. It's becoming hard to keep track of how it's supposed to behave. Thank you for responding to that comment positively, I am relieved. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, 2010-03-25 at 12:26 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote: PANIC seems like the appropriate solution for now. It definitely is not. Think some more. Well, what happens now in previous versions with pg_standby et al is that the standby starts up. That doesn't seem appropriate either. Agreed. I said that also, immediately upthread. Bottom line is I am against anyone being allowed to PANIC the server just because their piece of it ain't working. The whole purpose of all of this is High Availability and we don't get that if everybody keeps stopping for a tea break every time things get tricky. Staying up when problems occur is the only way to avoid a falling domino taking out the whole farm. I'm worried that the administrator won't notice the error promptly because at a quick glance the server is up and running, while it's actually stuck at the error and falling indefinitely behind the master. Maybe if we make it a WARNING, that's enough to alleviate that. It's true that if the standby is actively being used for read-only queries, shutting it down to just get the administrators attention isn't good either. That's what monitoring is for. Let's just make sure this state is accessible, so people will notice. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: Hello @all, I know, i can do: select * from (select ... row_number() over (...) ...) foo where row_number N to limit the rows per group, but the inner select has to retrieve the whole set of records and in the outer select most of them discarded. That sounds like the optimizer's falling down on the job. Would this be difficult to fix? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
On Thu, Mar 25, 2010 at 5:17 PM, David Fetter da...@fetter.org wrote: On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: Hello @all, I know, i can do: select * from (select ... row_number() over (...) ...) foo where row_number N to limit the rows per group, but the inner select has to retrieve the whole set of records and in the outer select most of them discarded. That sounds like the optimizer's falling down on the job. Would this be difficult to fix? I may not be the best person to offer an opinion on this topic, but it sounds tricky to me. I think it would need some kind of extremely specific special-case logic. The planner would have to recognize row_number() n, row_number() = n, and row_number = n as special cases indicating that n-1, n, and n records respectively should be expected to be fetched from the partition. And you might also worry about n row_number(), and n = row_number(). It might be worth doing because I suspect that is actually going to be a fairly common type of query, but some thought needs to be given to how to do it without resorting to abject kludgery. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Remove ROW | ROWS from OFFSET and FETCH
Hi, I was wondering if it might be worth making ROW/ROWS optional for OFFSET and FETCH clauses? And can ONLY be optional too? So: OFFSET start { ROW | ROWS } FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY Would become OFFSET start [ ROW | ROWS ] FETCH { FIRST | NEXT } [ count ] [ ROW | ROWS ] [ONLY] So instead of: SELECT * FROM my_table OFFSET 4 ROWS FETCH FIRST 10 ROWS ONLY; one could write: SELECT * FROM my_table OFFSET 4 FETCH FIRST 10; The only case where ROW would be mandatory would be for FETCH FIRST ROW ONLY. ...although I suspect I'm missing something obvious or reasonable here ;) Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remove ROW | ROWS from OFFSET and FETCH
On 3/26/10 2:01 AM +0200, Thom Brown wrote: I was wondering if it might be worth making ROW/ROWS optional for OFFSET and FETCH clauses? And can ONLY be optional too? AIUI the only time you'd want to use that syntax is when you want to write (theoretically) portable code, and making parts of the syntax optional is just breaking that portability. We already have a non-portable, non-standard syntax. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Remove ROW | ROWS from OFFSET and FETCH
On 26 March 2010 00:14, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: On 3/26/10 2:01 AM +0200, Thom Brown wrote: I was wondering if it might be worth making ROW/ROWS optional for OFFSET and FETCH clauses? And can ONLY be optional too? AIUI the only time you'd want to use that syntax is when you want to write (theoretically) portable code, and making parts of the syntax optional is just breaking that portability. We already have a non-portable, non-standard syntax. I understand. If that's the case I'll concede the point. :) Thanks Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)
2010/3/26 David Fetter da...@fetter.org: On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote: Hello @all, I know, i can do: select * from (select ... row_number() over (...) ...) foo where row_number N to limit the rows per group, but the inner select has to retrieve the whole set of records and in the outer select most of them discarded. That sounds like the optimizer's falling down on the job. Would this be difficult to fix? I believe this isn't the task of window functions. In fact, over( ... LIMIT n) or optimizer hack will fail on multiple window definitions. To take top N items of each group (I agree this is quite common job), I'd suggest syntax that is done by extending DISTINCT ON. SELECT DISTINCT n ON(key1, key2) ... where n means top n items on each key1, key2 group. The current DISTINCT ON() syntax is equivalent to DISTINCT 1 ON() in this way. That'll be fairly easy to implement and you aren't be bothered by this like multiple window definitions. The cons of this is that it can be applied to only row_number logic. You may want to use rank, dense_rank, etc. sometimes. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: access control jails (and introduction as aspiring GSoC student)
I apologize for my silence, as I've been busy reading up more on the internals of PostgreSQL. From what I can tell, a big problem with my jails idea (as well as the variables Robert described) is that there really isn't a way to store context in the backend specifically for the end client (e.g. a PHP script) due to connection pooling. Also, I almost feel that storing such context would be a disadvantage, as it would harm some of the referential transparency that pooling and caching take advantage of, now and in the future. However, I'm not going to give up :) Perhaps we could have some sort of LET statement that allows the client to pass data to the server, then have libpq automatically wrap queries with the LET statement (when necessary). Here's what it would look like to the PHP scripter: // New libpq function pg_set('current_user', 'bob'); $result = pg_query_params( 'SELECT answer FROM secrets WHERE user=current_user AND question=$1', array('Birth place')); What this really does is something like: $result = pg_query_params( 'LET current_user=$1 DO $2 $3', array( 'bob', 'SELECT answer FROM secrets WHERE user=current_user AND question=$1', 'Birth place') )); Here, the hypothetical LET statement executes a query string, binding current_user to our desired value. The client library would wrap all future queries in this fashion. Granted, it would be silly to pass the value itself to the server over and over, so a serious implementation would probably pass a context ID, and these variable assignments would live in the backend instead. Moreover, LET is a terrible keyword choice here, considering most PostgreSQL users won't need to use it explicitly thanks to additional libpq support. Alternatively (this might require changing the client/server protocol), a context ID could be passed back and forth, thus providing a way to tell clients apart. Implementing this idea requires adding to the backend and to libpq. The backend would need at least two new statements. One would set a variable of a session context, creating one if necessary and returning its ID. Another would execute a string as a parameter and bind both immediate arguments and session context to it. libpq would need a function to set a variable, and it would need to wrap queries it sends out with LET statements if necessary. Note that these variables can't be used in pre-defined functions unless they are somehow declared in advance. One idea would be to first add global variable support, then make session-local contexts be able to temporarily reassign those variables. Another would be to provide an explicit declaration statement. Would this make a good proposal for GSoC?: Implement the backend part of my proposal, and create a proof-of-concept wrapper demonstrating it. This way, I add the new statements, but don't mess around with existing functionality too much. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
On Thu, Mar 25, 2010 at 9:55 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * Fix the bug of a spurious PANIC in archive recovery, if the WAL ends in the middle of a WAL record that continues over a WAL segment boundary. * If a corrupt WAL record is found in archive or streamed from master in standby mode, throw WARNING instead of PANIC, and keep trying. In archive recovery (ie. standby_mode=off) it's still a PANIC. We can make it a WARNING too, which gives the pre-9.0 behavior of starting up the server on corruption. I prefer PANIC but the discussion is still going on. Seems reasonable for me. * Small code changes to handling of failedSources, inspired by your comment. No change in functionality. This is also available in my git repository at git://git.postgresql.org/git/users/heikki/postgres.git, branch xlogchanges I looked the patch and was not able to find any big problems until now. The attached small patch fixes the typo. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center typo.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers