Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Simon Riggs
On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote:
 On Thu, Mar 25, 2010 at 8:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
  PANICing won't change the situation, so it just destroys server
  availability. If we had 1 master and 42 slaves then this behaviour would
  take down almost the whole server farm at once. Very uncool.
 
  You might have reason to prevent the server starting up at that point,
  when in standby mode, but that is not a reason to PANIC. We don't really
  want all of the standbys thinking they can be the master all at once
  either. Better to throw a serious ERROR and have the server still up and
  available for reads.
 
 OK. How about making the startup process emit WARNING, stop WAL replay and
 wait for the presence of trigger file, when an invalid record is found?
 Which keeps the server up for readonly queries. And if the trigger file is
 found, I think that the startup process should emit a FATAL, i.e., the
 server should exit immediately, to prevent the server from becoming the
 primary in a half-finished state. Also to allow such a halfway failover,
 we should provide fast failover mode as pg_standby does?

The lack of docs begins to show a lack of coherent high-level design
here. By now, I've forgotten what this thread was even about. The major
design decision that keeps showing up here is "remove pg_standby, at all
costs", but no reason has ever been given for that. I do believe there
is a better way, but we won't find it by trial and error, even if we had
time to do so.

Please work on some clear docs for the failure modes in this system.
That way we can all read them and understand them, or point out further
issues. Moving straight to code is not a solution to this, since what we
need now is to all agree on the way forwards. If we ignore this, then
there is considerable risk that streaming rep will have a fatal
operational flaw.

Please just document/diagram how it works now, highlighting the problems
that still remain to be solved. We're all behind you and I'm helping
wherever I can.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Tom Lane wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 OK. How about making the startup process emit WARNING, stop WAL replay and
 wait for the presence of trigger file, when an invalid record is found?
 Which keeps the server up for readonly queries. And if the trigger file is
 found, I think that the startup process should emit a FATAL, i.e., the
 server should exit immediately, to prevent the server from becoming the
 primary in a half-finished state. Also to allow such a halfway failover,
 we should provide fast failover mode as pg_standby does?
 
 I find it extremely scary to read this sort of blue-sky design
 discussion going on now, two months after we were supposedly
 feature-frozen for 9.0.  We need to be looking for the *rock bottom
 minimum* amount of work to do to get 9.0 out the door in a usable
 state; not what would be nice to have later on.

Agreed, this is getting complicated. I'm already worried about the
amount of changes needed to make it work, I don't want to add any new
modes. PANIC seems like the appropriate solution for now.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Simon Riggs
On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote:
 On Thu, Mar 25, 2010 at 8:23 AM, Simon Riggs si...@2ndquadrant.com wrote:
  PANICing won't change the situation, so it just destroys server
  availability. If we had 1 master and 42 slaves then this behaviour would
  take down almost the whole server farm at once. Very uncool.
 
  You might have reason to prevent the server starting up at that point,
  when in standby mode, but that is not a reason to PANIC. We don't really
  want all of the standbys thinking they can be the master all at once
  either. Better to throw a serious ERROR and have the server still up and
  available for reads.
 
 OK. How about making the startup process emit WARNING, stop WAL replay and
 wait for the presence of trigger file, when an invalid record is found?
 Which keeps the server up for readonly queries. 

Yes. Receiving new WAL records is a completely separate activity from
running the rest of the server (in this release...).

 And if the trigger file is
 found, I think that the startup process should emit a FATAL, i.e., the
 server should exit immediately, to prevent the server from becoming the
 primary in a half-finished state. 

Please remember that "half-finished" is your judgment on what has
happened in the particular scenario you are considering. In many cases,
an invalid WAL record clearly and simply indicates the end of WAL, and we
should start up normally.

"State" is a good word here. I'd like to see the server have a clear
state model with well-documented transitions between states. The state
should also be externally queryable, so we can work out what it's doing
and how long we can expect it to keep doing it for.

I don't want to be in a position where we are waiting for the server to
sort itself out from a complex set of retries. 

 Also to allow such a halfway failover,
 we should provide fast failover mode as pg_standby does?

Yes, we definitely need a JFDI solution for immediate failover.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Simon Riggs
On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote:

 PANIC seems like the appropriate solution for now.

It definitely is not. Think some more.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Thu, 2010-03-25 at 11:08 +0900, Fujii Masao wrote:
 And if the trigger file is
 found, I think that the startup process should emit a FATAL, i.e., the
 server should exit immediately, to prevent the server from becoming the
 primary in a half-finished state. 
 
 Please remember that half-finished is your judgment on what has
 happened in the particular scenario you are considering. In many cases,
 an invalid WAL record clearly and simply indicates the end of WAL and we
 should start up normally.

Not in the archive. An invalid WAL record in a file in pg_xlog is
usually an indication of end-of-WAL, but there should be no invalid
records in the archived WAL files, or streamed from the master.

 State is a good word here. I'd like to see the server have a clear
 state model with well documented transitions between them. The state
 should also be externally queriable, so we can work out what its doing
 and how long we can expect it to keep doing it for.

Agreed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
(cc'ing docs list)

Simon Riggs wrote:
 The lack of docs begins to show a lack of coherent high-level design
 here.

Yeah, I think you're right. It's becoming hard to keep track of how it's
supposed to behave.

 By now, I've forgotten what this thread was even about. The major
 design decision in this that keeps showing up is remove pg_standby, at
 all costs but no reason has ever been given for that. I do believe
 there is a better way, but we won't find it by trial and error, even
 if we had time to do so.

This has nothing to do with pg_standby.

 Please work on some clear docs for the failure modes in this system.
 That way we can all read them and understand them, or point out further
 issues. Moving straight to code is not a solution to this, since what we
 need now is to all agree on the way forwards. If we ignore this, then
 there is considerable risk that streaming rep will have a fatal
 operational flaw.
 
 Please just document/diagram how it works now, highlighting the problems
 that still remain to be solved. We're all behind you and I'm helping
 wherever I can.

Ok, here's my attempt at the docs. Read it as a replacement for the
High Availability, Load Balancing, and Replication chapter, but of
course many of the sections will be unchanged, as indicated below.

-
Chapter 25. High Availability, Load Balancing, and Replication

25.1 Comparison of different solutions

no changes


25.2 Log-Shipping Standby servers

overview from current File-based Log Shipping section. With small
changes so that it applies to the built-in standby mode as well as
pg_standby like solutions

A standby server can also be used for read-only queries. This is called
Hot Standby mode; see chapter XXX.

25.2.1 Planning

Set up two servers with identical hardware ...

two first paragraphs of current File-based log-shipping / Planning section

25.2.3 Standby mode

In standby mode, the server continuously applies WAL received from the
master server. The standby server can receive WAL from a WAL archive
(see restore_command) or directly from the master over a TCP connection
(streaming replication). The standby server will also attempt to restore
any WAL found in the standby's pg_xlog. That typically happens after a
server restart, to replay again any WAL streamed from the master before
the restart, but you can also manually copy files to pg_xlog at any time
to have them replayed.

At startup, the standby begins by restoring all WAL available in the
archive location, calling restore_command. Once it reaches the end of
WAL available there and restore_command fails, it tries to restore any
WAL available in the pg_xlog directory (possibly stored there by
streaming replication before restart). If that fails, and streaming
replication has been configured, the standby tries to connect to the
master server and stream WAL from it. If that fails or streaming
replication is not configured, or if the connection is disconnected
later on, the standby goes back to the beginning and tries to restore the
file from the archive again. This loop of retries from the archive,
pg_xlog, and via streaming replication goes on until the server is
stopped or failover is triggered by a trigger file.
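The retry cycle described above can be sketched as a tiny model (illustrative only -- the source names and the next_wal_record() helper are inventions of this sketch, not the actual xlog.c logic):

```python
ARCHIVE, PG_XLOG, STREAM = "archive", "pg_xlog", "stream"

def next_wal_record(available, streaming_configured):
    """One pass of the standby's retry cycle: archive first, then
    pg_xlog, then streaming (if configured).  `available` maps a source
    name to True when it can supply the next record -- a stand-in for
    restore_command succeeding, a file existing in pg_xlog, or the
    walreceiver having data.  None means: sleep, then start the cycle
    over from the archive."""
    order = [ARCHIVE, PG_XLOG]
    if streaming_configured:
        order.append(STREAM)
    for source in order:
        if available.get(source):
            return source
    return None
```

The loop only ends when the server is stopped or the trigger file appears, which is what keeps the standby resilient to a transient gap in any single source.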

A corrupt or half-finished WAL file in the archive, or streamed from the
master, causes a PANIC and immediate shutdown of the standby server. A
corrupt WAL file is always a serious event that requires administrator
action. If you want to recover as far as possible into a WAL file known
to be corrupt, you can copy the file manually into pg_xlog.

Standby mode is exited and the server switches to normal operation when
a trigger file is found (trigger_file). Before failover, the server will
restore any WAL available in the archive or in pg_xlog, but won't try to
connect to the master or wait for files to become available in the
archive.


25.2.4 Preparing Master for Standby servers

Set up continuous archiving to a WAL archive on the master, as described
in the chapter Continuous Archiving and Point-In-Time Recovery. The
archive location should be accessible from the standby even when the
master is down, i.e. it should reside on the standby server itself or
another trusted server, not on the master server.

If you want to use streaming replication, set up authentication to allow
streaming replication connections. Set max_wal_senders.

Take a base backup as described in chapter Continuous Archiving and
Point-In-Time Recovery / Making a Base Backup.

25.2.4.1 Authentication for streaming replication

Ensure that listen_addresses allows connections from the standby server.

current Streaming Replication / Authentication section, describing
pg_hba.conf


25.2.5 Setting up the standby server

1. Take a base backup, and copy it to the standby

2. Create a restore_command to restore files from the WAL archive.

3. Set standby_mode=on

4. If you want to use streaming replication, set primary_conninfo


You can use restartpoint_command to prune the archive of files no longer
needed.
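Put together, the four steps above might yield a recovery.conf along these lines (every host, user, and path below is an illustrative placeholder, not a recommendation):

```
# recovery.conf on the standby -- all values are illustrative placeholders
standby_mode = 'on'

# Step 2: restore WAL segments from the archive
restore_command = 'cp /mnt/wal_archive/%f "%p"'

# Step 4: stream from the master once the archive runs dry
primary_conninfo = 'host=master.example.com port=5432 user=repluser'

# Failover is requested by creating this file
trigger_file = '/var/lib/postgresql/failover_trigger'
```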

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Simon Riggs wrote:
 On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote:
 
 PANIC seems like the appropriate solution for now.
 
 It definitely is not. Think some more.

Well, what happens now in previous versions with pg_standby et al is
that the standby starts up. That doesn't seem appropriate either.

Hmm, it would be trivial to just stay in the standby mode at a corrupt
file, continuously retrying to restore it and continue replay. If it's
genuinely corrupt, it will never succeed and the standby gets stuck at
that point. Maybe that's better; it's close to what Fujii suggested
except that you don't need a new mode for it.

I'm worried that the administrator won't notice the error promptly,
because at a quick glance the server is up and running while it's
actually stuck at the error and falling indefinitely behind the master.
Maybe making it a WARNING is enough to alleviate that. It's true that if
the standby is actively being used for read-only queries, shutting it
down just to get the administrator's attention isn't good either.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Heikki Linnakangas wrote:
 Simon Riggs wrote:
 On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote:

 PANIC seems like the appropriate solution for now.
 It definitely is not. Think some more.
 
 Well, what happens now in previous versions with pg_standby et al is
 that the standby starts up. That doesn't seem appropriate either.
 
 Hmm, it would be trivial to just stay in the standby mode at a corrupt
 file, continuously retrying to restore it and continue replay. If it's
 genuinely corrupt, it will never succeed and the standby gets stuck at
 that point. Maybe that's better; it's close to what Fujii suggested
 except that you don't need a new mode for it.

On second thought, that's easy for the built-in standby mode, but not
for archive recovery where we're not currently retrying anything. In
archive recovery, we could throw a WARNING and start up, which would be
like the current behavior in older versions except you get a WARNING
instead of LOG, or we could PANIC. I'm leaning towards PANIC, which
makes most sense for genuine point-in-time or archive recovery (i.e. not
a standby server), but I can see the rationale for WARNING too, for
pg_standby and for the sake of preserving old behavior.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Fujii Masao wrote:
 sources = ~failedSources;
 failedSources |= readSource;
 
 The above lines in XLogPageRead() seem not to be required in normal
 recovery case (i.e., standby_mode = off). So how about the attached
 patch?

 *** 9050,9056  next_record_is_invalid:
 --- 9047,9056 
   readSource = 0;
   
   if (StandbyMode)
 + {
 + failedSources |= readSource;
   goto retry;
 + }
   else
   return false;

That doesn't work because readSource is cleared above. But yeah,
failedSources is not needed in archive recovery, so that line can be
removed.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Heikki Linnakangas
Fujii Masao wrote:
 On second thought, the following lines seem to be necessary just after
 calling XLogPageRead() since it reads new WAL file from another source.
 
  if (readSource == XLOG_FROM_STREAM || readSource == XLOG_FROM_ARCHIVE)
  emode = PANIC;
  else
  emode = emode_arg;

Yep.

Here's an updated patch, with these changes since the last patch:

* Fix a bug that caused a spurious PANIC in archive recovery when the WAL
ends in the middle of a WAL record that continues over a WAL segment
boundary.

* If a corrupt WAL record is found in archive or streamed from master in
standby mode, throw WARNING instead of PANIC, and keep trying. In
archive recovery (i.e. standby_mode=off) it's still a PANIC. We can make
it a WARNING too, which gives the pre-9.0 behavior of starting up the
server on corruption. I prefer PANIC but the discussion is still going on.

* Small code changes to handling of failedSources, inspired by your
comment. No change in functionality.
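As a hedged summary, the corrupt-record policy in the second bullet above amounts to something like the following (a sketch only; the real decision is made by emode_for_corrupt_record() in the patch and may differ in detail):

```python
def emode_for_corrupt_record(source, standby_mode):
    """Sketch of the policy described above: a corrupt record restored
    from the archive or streamed from the master is a WARNING in standby
    mode (keep retrying) but a PANIC in plain archive recovery; a record
    read from pg_xlog gets the normal end-of-WAL treatment, modeled here
    as LOG."""
    if source in ("archive", "stream"):
        return "WARNING" if standby_mode else "PANIC"
    return "LOG"
```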

This is also available in my git repository at
git://git.postgresql.org/git/users/heikki/postgres.git, branch xlogchanges

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e57f22e..4aa1870 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -450,20 +450,33 @@ static uint32 openLogSeg = 0;
 static uint32 openLogOff = 0;
 
 /*
+ * Codes indicating where we got a WAL file from during recovery, or where
+ * to attempt to get one.
+ */
+#define XLOG_FROM_ARCHIVE		(10)	/* Restored using restore_command */
+#define XLOG_FROM_PG_XLOG		(11)	/* Existing file in pg_xlog */
+#define XLOG_FROM_STREAM		(12)	/* Streamed from master */
+
+/*
  * These variables are used similarly to the ones above, but for reading
  * the XLOG.  Note, however, that readOff generally represents the offset
  * of the page just read, not the seek position of the FD itself, which
  * will be just past that page. readLen indicates how much of the current
- * page has been read into readBuf.
+ * page has been read into readBuf, and readSource indicates where we got
+ * the currently open file from.
  */
 static int	readFile = -1;
 static uint32 readId = 0;
 static uint32 readSeg = 0;
 static uint32 readOff = 0;
 static uint32 readLen = 0;
+static int readSource = 0;		/* XLOG_FROM_* code */
 
-/* Is the currently open segment being streamed from primary? */
-static bool readStreamed = false;
+/*
+ * Keeps track of which sources we've tried to read the current WAL
+ * record from and failed.
+ */
+static int failedSources = 0;
 
 /* Buffer for currently read page (XLOG_BLCKSZ bytes) */
 static char *readBuf = NULL;
@@ -517,11 +530,12 @@ static bool InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
 	   bool find_free, int *max_advance,
 	   bool use_lock);
 static int XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
-			 bool fromArchive, bool notexistOk);
+			 int source, bool notexistOk);
 static int XLogFileReadAnyTLI(uint32 log, uint32 seg, int emode,
-   bool fromArchive);
+   int sources);
 static bool XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
 			 bool randAccess);
+static int emode_for_corrupt_record(int endofwalmode);
 static void XLogFileClose(void);
 static bool RestoreArchivedFile(char *path, const char *xlogfname,
 	const char *recovername, off_t expectedSize);
@@ -2573,7 +2587,7 @@ XLogFileOpen(uint32 log, uint32 seg)
  */
 static int
 XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
-			 bool fromArchive, bool notfoundOk)
+			 int source, bool notfoundOk)
 {
 	char		xlogfname[MAXFNAMELEN];
 	char		activitymsg[MAXFNAMELEN + 16];
@@ -2582,23 +2596,28 @@ XLogFileRead(uint32 log, uint32 seg, int emode, TimeLineID tli,
 
 	XLogFileName(xlogfname, tli, log, seg);
 
-	if (fromArchive)
+	switch (source)
 	{
-		/* Report recovery progress in PS display */
-		snprintf(activitymsg, sizeof(activitymsg), "waiting for %s",
- xlogfname);
-		set_ps_display(activitymsg, false);
+		case XLOG_FROM_ARCHIVE:
+			/* Report recovery progress in PS display */
+			snprintf(activitymsg, sizeof(activitymsg), "waiting for %s",
+	 xlogfname);
+			set_ps_display(activitymsg, false);
 
-		restoredFromArchive = RestoreArchivedFile(path, xlogfname,
-  "RECOVERYXLOG",
-  XLogSegSize);
-		if (!restoredFromArchive)
-			return -1;
-	}
-	else
-	{
-		XLogFilePath(path, tli, log, seg);
-		restoredFromArchive = false;
+			restoredFromArchive = RestoreArchivedFile(path, xlogfname,
+	  "RECOVERYXLOG",
+	  XLogSegSize);
+			if (!restoredFromArchive)
+return -1;
+			break;
+
+		case XLOG_FROM_PG_XLOG:
+			XLogFilePath(path, tli, log, seg);
+			restoredFromArchive = false;
+			break;
+
+		default:
+			elog(ERROR, "invalid XLogFileRead source %d", source);
 	}
 
 	fd = BasicOpenFile(path, O_RDONLY | PG_BINARY, 0);
@@ 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Robert Haas
On Thu, Mar 25, 2010 at 8:55 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 * If a corrupt WAL record is found in archive or streamed from master in
 standby mode, throw WARNING instead of PANIC, and keep trying. In
 archive recovery (ie. standby_mode=off) it's still a PANIC. We can make
 it a WARNING too, which gives the pre-9.0 behavior of starting up the
 server on corruption. I prefer PANIC but the discussion is still going on.

I don't think we should be changing pre-9.0 behavior more than necessary.

...Robert



[HACKERS] Ask help for putting SP-Gist into postgresql

2010-03-25 Thread Pei He
Hi,
I am trying to put the SP-GiST package, a general index framework for
space-partitioning trees, into the PostgreSQL source code.

SP-GiST was developed for PostgreSQL 8.0, but it does not work with the
new version.

So, which version of PostgreSQL should the submitted patch target? And
is there anybody who can help with that?

Please cc my email when replying.
Thanks
--
Pei



Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-25 Thread Fujii Masao
On Tue, Mar 16, 2010 at 10:35 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Just replacing PQexec() with PQsendQuery() is pretty straightforward, we
 could put that replacement in a file in port/win32. Replacing
 PQconnectdb() is more complicated because you need to handle connection
 timeout. I suggest that we only add the replacement for PQexec(), and
 live with the situation for PQconnectdb(), that covers 99% of the
 scenarios anyway.

 I'll try to replace PQexec() first, and PQconnectdb() second if I have
 enough time.

Sorry for the delay. The attached patch replaces PQexec() used by dblink
and libpqwalreceiver with pgwin32_PQexec() which is the win32 version of
PQexec().

pgwin32_PQexec() is provided as the library 'libpqbe.dll', which is created
only on win32. dblink.dll and libpqwalreceiver.dll refer to libpqbe.dll.
Also libpqbe.dll refers to libpq.dll.

I'm not sure whether my patch takes the right approach. If you notice
anything, please feel free to comment!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pgwin32_PQexec_v1.patch
Description: Binary data



Re: [HACKERS] ALTER ROLE/DATABASE RESET ALL versus security

2010-03-25 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  I have come up with the attached patch.  I haven't tested it fully yet,
  and I need to backport it.  The gist of it is: we can't simply remove
  the pg_db_role_setting tuple, we need to ask GUC to reset the settings
  array, for which it checks superuser-ness on each setting.
 
 I think you still want to have a code path whereby the tuple will be
 deleted once the array is empty.  Failing to check that is inefficient
 and also exposes clients such as pg_dump to corner case bugs.

Yeah, that's there too -- it behaves the same way as ALTER / RESET for a
particular setting.

I just applied it all the way back to 7.4.  It was a bit of a pain to
backport it, because every version seemed to have this or that little
incompatibility.

I attempted a regression test, but it's also painful because there's no
nice way to clean up after a newly created user (i.e. drop it): after
the last \c - regress_user_guc, there's no way to go back to the
original user.  And we can't use SET SESSION AUTHORIZATION because it
doesn't cause the settings for the role to be loaded.  (I think that's a
bug too).  Suggestions on how to enable this are welcome.

-- Test user-specific settings
create role regress_user_guc login;
alter role regress_user_guc set work_mem to '128MB';
alter role regress_user_guc set lc_messages to 'C';
\c - regress_user_guc
select name, setting, source
  from pg_settings
  where name in ('work_mem', 'lc_messages')
order by name;
alter role regress_user_guc reset all;
\c - regress_user_guc
-- can't display actual value here because it may be installation-dependent
select name, setting, source
  from pg_settings
  where name in ('work_mem', 'lc_messages')
order by name;


(I think I should also use a superuser setting other than lc_messages).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] last_statrequest is in the future

2010-03-25 Thread Tom Lane
I wrote:
 ... lookee what we have here:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-24%2004:00:07

Modifying the code to print the actual timestamps resulted in a wealth
of information that can be dredged from the postmaster log here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-25%2004:00:04
Just grep for log messages like
[4baafb13.cd2:1] LOG:  last_statrequest 2010-03-25 06:58:42.118734+01 is later 
than collector's time 2010-03-25 06:58:40.817027+01

What this shows is that system timekeeping on jaguar is just utterly
broken.  gettimeofday() sometimes returns answers that are 5 seconds or
more skewed in different processes, and there are episodes where its
value fails to move at all in the stats collector while multiple seconds
elapse in the opinion of another process.

I wonder whether this doesn't also explain jaguar's propensity to show
some other unexplainable failures, like these recent examples:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-20%2004:00:03
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguardt=2010-03-15%2004:00:04
which are not at all unique in that machine's buildfarm history.

I had always supposed that these were caused by the extreme overall
slowness induced by CLOBBER_CACHE_ALWAYS, but now I think they are
more likely explained by system timekeeping issues.  In particular the
last-cited regression failure can be explained if now() fails to change
value across pg_sleep(0.1), and that's isomorphic to what seems to be
happening in the stats collector in some of the episodes captured today.

In short: either this machine has got broken clock hardware, or there's
a very serious bug in the kernel code for reading the clock.

regards, tom lane



[HACKERS] Conditional trigger missing from pg_dump

2010-03-25 Thread Josh Berkus
All,

It looks like pg_dump was never modified to include conditional
triggers.  If you look in pg_dump.c, there's no provision to dump the
FOR UPDATE OF part of the trigger declaration.

This was discovered by Dave Olszewski while testing conditional triggers.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Conditional trigger missing from pg_dump

2010-03-25 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 It looks like pg_dump was never modified to include conditional
 triggers.  If you look in pg_dump.c, there's no provision to dump the
 FOR UPDATE OF part of the trigger declaration.

That proves nothing; it could be relying on server-side code to produce
the declaration.

 This was discovered by Dave Olszewski while testing conditional triggers.

Please show a specific example of what you're worried about.

regards, tom lane



Re: [HACKERS] Conditional trigger missing from pg_dump

2010-03-25 Thread Josh Berkus

 This was discovered by Dave Olszewski while testing conditional triggers.
 
 Please show a specific example of what you're worried about.

I'll get you a reproducible test case later today.  In our testing, the
conditional part of the trigger was not being dumped.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Simon Riggs
On Thu, 2010-03-25 at 12:15 +0200, Heikki Linnakangas wrote:
 (cc'ing docs list)
 
 Simon Riggs wrote:
  The lack of docs begins to show a lack of coherent high-level design
  here.
 
 Yeah, I think you're right. It's becoming hard to keep track of how it's
 supposed to behave.

Thank you for responding to that comment positively, I am relieved.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Simon Riggs
On Thu, 2010-03-25 at 12:26 +0200, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Thu, 2010-03-25 at 10:11 +0200, Heikki Linnakangas wrote:
  
  PANIC seems like the appropriate solution for now.
  
  It definitely is not. Think some more.
 
 Well, what happens now in previous versions with pg_standby et al is
 that the standby starts up. That doesn't seem appropriate either.

Agreed. I said that also, immediately upthread.

Bottom line is I am against anyone being allowed to PANIC the server
just because their piece of it ain't working. The whole purpose of all
of this is High Availability and we don't get that if everybody keeps
stopping for a tea break every time things get tricky. Staying up when
problems occur is the only way to avoid a falling domino taking out the
whole farm.

 I'm worried that the administrator won't notice the error promptly
 because at a quick glance the server is up and running, while it's
 actually stuck at the error and falling indefinitely behind the master.
 Maybe if we make it a WARNING, that's enough to alleviate that. It's
 true that if the standby is actively being used for read-only queries,
 shutting it down just to get the administrator's attention isn't good either.

That's what monitoring is for. Let's just make sure this state is
accessible, so people will notice.

-- 
 Simon Riggs   www.2ndQuadrant.com



[HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)

2010-03-25 Thread David Fetter
On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote:
 Hello @all,
 
 I know, I can do:
 
 select * from (select ... row_number() over (...) ...) foo where
 row_number < N
 
 to limit the rows per group, but the inner select has to retrieve
 the whole set of records, and in the outer select most of them are
 discarded.

That sounds like the optimizer's falling down on the job.  Would this
be difficult to fix?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)

2010-03-25 Thread Robert Haas
On Thu, Mar 25, 2010 at 5:17 PM, David Fetter da...@fetter.org wrote:
 On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote:
 Hello @all,

 I know, I can do:

 select * from (select ... row_number() over (...) ...) foo where
 row_number < N

 to limit the rows per group, but the inner select has to retrieve
 the whole set of records, and in the outer select most of them are
 discarded.

 That sounds like the optimizer's falling down on the job.  Would this
 be difficult to fix?

I may not be the best person to offer an opinion on this topic, but it
sounds tricky to me.  I think it would need some kind of extremely
specific special-case logic.  The planner would have to recognize
row_number() < n, row_number() <= n, and row_number() = n as special
cases indicating that n-1, n, and n records respectively should be
expected to be fetched from the partition.  And you might also worry
about n > row_number() and n >= row_number().

It might be worth doing because I suspect that is actually going to be
a fairly common type of query, but some thought needs to be given to
how to do it without resorting to abject kludgery.
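To make the pattern concrete, here is the query shape being discussed, run against SQLite (version 3.25 or later, which has window functions) purely so the example is self-contained; the SQL is the same shape a PostgreSQL user would write, and the point stands that without planner support the inner query numbers every row of every partition before the outer WHERE discards all but the top N:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (grp TEXT, score INT);
    INSERT INTO scores VALUES
        ('a', 10), ('a', 30), ('a', 20),
        ('b', 5),  ('b', 15);
""")

# Top-2-per-group via row_number(): the inner SELECT materializes a row
# number for every row; only then does the outer WHERE throw most away.
rows = conn.execute("""
    SELECT grp, score FROM (
        SELECT grp, score,
               row_number() OVER (PARTITION BY grp ORDER BY score DESC) AS rn
        FROM scores
    ) t
    WHERE rn < 3
    ORDER BY grp, score DESC
""").fetchall()
print(rows)  # top 2 scores in each group
```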

...Robert


[HACKERS] Remove ROW | ROWS from OFFSET and FETCH

2010-03-25 Thread Thom Brown
Hi,

I was wondering if it might be worth making ROW/ROWS optional for
OFFSET and FETCH clauses?  And can ONLY be optional too?

So:

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY

Would become

OFFSET start [ ROW | ROWS ]
FETCH { FIRST | NEXT } [ count ] [ ROW | ROWS ] [ONLY]

So instead of:

SELECT *
FROM my_table
OFFSET 4 ROWS
FETCH FIRST 10 ROWS ONLY;

one could write:

SELECT *
FROM my_table
OFFSET 4
FETCH FIRST 10;

The only case where ROW would be mandatory would be for FETCH FIRST ROW ONLY.

...although I suspect I'm missing something obvious or reasonable here ;)

Thom


Re: [HACKERS] Remove ROW | ROWS from OFFSET and FETCH

2010-03-25 Thread Marko Tiikkaja

On 3/26/10 2:01 AM +0200, Thom Brown wrote:

I was wondering if it might be worth making ROW/ROWS optional for
OFFSET and FETCH clauses?  And can ONLY be optional too?


AIUI the only time you'd want to use that syntax is when you want to 
write (theoretically) portable code, and making parts of the syntax 
optional is just breaking that portability.  We already have a 
non-portable, non-standard syntax.



Regards,
Marko Tiikkaja


Re: [HACKERS] Remove ROW | ROWS from OFFSET and FETCH

2010-03-25 Thread Thom Brown
On 26 March 2010 00:14, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote:
 On 3/26/10 2:01 AM +0200, Thom Brown wrote:

 I was wondering if it might be worth making ROW/ROWS optional for
 OFFSET and FETCH clauses?  And can ONLY be optional too?

 AIUI the only time you'd want to use that syntax is when you want to write
 (theoretically) portable code, and making parts of the syntax optional is
 just breaking that portability.  We already have a non-portable,
 non-standard syntax.


I understand.  If that's the case I'll concede the point. :)

Thanks

Thom


Re: [HACKERS] Re: [GENERAL] question (or feature-request): over ( partition by ... order by LIMIT N)

2010-03-25 Thread Hitoshi Harada
2010/3/26 David Fetter da...@fetter.org:
 On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote:
 Hello @all,

 I know, I can do:

 select * from (select ... row_number() over (...) ...) foo where
 row_number < N

 to limit the rows per group, but the inner select has to retrieve
 the whole set of records, and in the outer select most of them are
 discarded.

 That sounds like the optimizer's falling down on the job.  Would this
 be difficult to fix?

I believe this isn't a task for window functions. In fact, over( ...
LIMIT n) or an optimizer hack will fail on multiple window definitions.

To take the top N items of each group (I agree this is quite a common
job), I'd suggest syntax made by extending DISTINCT ON:

SELECT DISTINCT n ON (key1, key2) ...

where n means the top n items in each (key1, key2) group. The current
DISTINCT ON() syntax is equivalent to DISTINCT 1 ON() in this scheme.
That would be fairly easy to implement, and you aren't bothered by
issues like multiple window definitions. The con of this is that it
covers only row_number logic; you may sometimes want rank, dense_rank,
etc.
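The limitation is easy to see with ties. A sketch (again using SQLite's window functions just so it runs standalone): a top-1 filter on row_number() arbitrarily drops one of two tied rows, while the same filter on rank() keeps both, so a DISTINCT n ON (...) form tied to row_number semantics couldn't express the rank() behavior:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (grp TEXT, score INT);
    INSERT INTO t VALUES ('a', 30), ('a', 30), ('a', 10);
""")

# Same top-1 filter, two different window functions.
q = """
    SELECT count(*) FROM (
        SELECT {fn}() OVER (PARTITION BY grp ORDER BY score DESC) AS r
        FROM t
    ) s WHERE r <= 1
"""
top1_by_row_number = conn.execute(q.format(fn="row_number")).fetchone()[0]
top1_by_rank = conn.execute(q.format(fn="rank")).fetchone()[0]
# row_number() keeps exactly 1 row; rank() keeps both tied rows.
print(top1_by_row_number, top1_by_rank)
```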

Regards,

-- 
Hitoshi Harada


Re: [HACKERS] Proposal: access control jails (and introduction as aspiring GSoC student)

2010-03-25 Thread Joseph Adams
I apologize for my silence, as I've been busy reading up more on the
internals of PostgreSQL.

From what I can tell, a big problem with my jails idea (as well as the
variables Robert described) is that there really isn't a way to store
context in the backend specifically for the end client (e.g. a PHP
script) due to connection pooling.  Also, I almost feel that storing
such context would be a disadvantage, as it would harm some of the
referential transparency that pooling and caching take advantage of,
now and in the future.  However, I'm not going to give up :)

Perhaps we could have some sort of LET statement that allows the
client to pass data to the server, then have libpq automatically wrap
queries with the LET statement (when necessary).  Here's what it would
look like to the PHP scripter:

// New libpq function
pg_set('current_user', 'bob');

$result = pg_query_params(
'SELECT answer FROM secrets WHERE user=current_user AND question=$1',
array('Birth place'));


What this really does is something like:

$result = pg_query_params(
'LET current_user=$1 DO $2 $3',
array(
'bob',
'SELECT answer FROM secrets WHERE user=current_user AND 
question=$1',
'Birth place')
));


Here, the hypothetical LET statement executes a query string, binding
current_user to our desired value.  The client library would wrap all
future queries in this fashion.
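A rough client-side sketch of that wrapping, with the caveat that everything here is hypothetical: there is no LET ... DO statement in PostgreSQL, and the placeholder-numbering convention is just the one from the PHP example above (context values first, then the query string, then the query's own parameters, shifted accordingly):

```python
def wrap_with_context(query, params, context):
    """Rewrite (query, params) so the session variables in `context` are
    bound via the proposed LET statement before the query itself runs."""
    if not context:
        return query, list(params)
    names = sorted(context)
    n = len(names)
    # Bind each context variable to a leading parameter: current_user=$1, ...
    lets = ", ".join("%s=$%d" % (name, i + 1) for i, name in enumerate(names))
    # One placeholder for the query string, then one per original parameter.
    tail = " ".join("$%d" % i for i in range(n + 1, n + 2 + len(params)))
    wrapped = "LET %s DO %s" % (lets, tail)
    new_params = [context[name] for name in names] + [query] + list(params)
    return wrapped, new_params

q, p = wrap_with_context(
    "SELECT answer FROM secrets WHERE user=current_user AND question=$1",
    ["Birth place"],
    {"current_user": "bob"},
)
print(q)  # LET current_user=$1 DO $2 $3
```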

Granted, it would be silly to pass the value itself to the server over
and over, so a serious implementation would probably pass a context
ID, and these variable assignments would live in the backend instead.
Moreover, LET is a terrible keyword choice here, considering most
PostgreSQL users won't need to use it explicitly thanks to additional
libpq support.

Alternatively (this might require changing the client/server
protocol), a context ID could be passed back and forth, thus providing
a way to tell clients apart.

Implementing this idea requires adding to the backend and to libpq.
The backend would need at least two new statements.  One would set a
variable of a session context, creating one if necessary and returning
its ID.  Another would execute a string as a parameter and bind both
immediate arguments and session context to it.  libpq would need a
function to set a variable, and it would need to wrap queries it sends
out with LET statements if necessary.

Note that these variables can't be used in pre-defined functions
unless they are somehow declared in advance.  One idea would be to
first add global variable support, then make session-local contexts be
able to temporarily reassign those variables.  Another would be to
provide an explicit declaration statement.

Would this make a good proposal for GSoC?  Implement the backend part
of my proposal, and create a proof-of-concept wrapper demonstrating
it.  This way, I add the new statements, but don't mess around with
existing functionality too much.


Re: [HACKERS] Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

2010-03-25 Thread Fujii Masao
On Thu, Mar 25, 2010 at 9:55 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 * Fix the bug of a spurious PANIC in archive recovery, if the WAL ends
 in the middle of a WAL record that continues over a WAL segment boundary.

 * If a corrupt WAL record is found in archive or streamed from master in
 standby mode, throw WARNING instead of PANIC, and keep trying. In
 archive recovery (ie. standby_mode=off) it's still a PANIC. We can make
 it a WARNING too, which gives the pre-9.0 behavior of starting up the
 server on corruption. I prefer PANIC but the discussion is still going on.

Seems reasonable to me.

 * Small code changes to handling of failedSources, inspired by your
 comment. No change in functionality.

 This is also available in my git repository at
 git://git.postgresql.org/git/users/heikki/postgres.git, branch xlogchanges

I looked through the patch and have not found any big problems so far.
The attached small patch fixes a typo.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


typo.patch
Description: Binary data
