date:20100427

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Josh Berkus wrote:
 Here's a way to trap yourself:
 
 (1) Set up an HS/SR master
 (2) pg_start_backup on the master
 (3) clone the master to 1 or more slaves
 (4) Fast shutdown the master (without pg_stop_backup)
 (5) Restart the master
 (6) Bring up the slaves
 
 Result: the slaves will come up fine in recovery mode.  However, they
 will never switch over to HS mode or start SR.  You will not be able to
 pg_stop_backup() on the master.  At this point, you have no option but
 to shut down the slaves and re-clone.
 
 The only reason why this is somewhat problematic for users is that you
 will not get any messages from the master or the slaves to indicate why
 they won't switch modes.  So I can imagine someone wasting a lot of time
 troubleshooting the wrong problems.
 
 Suggested resolution: I don't think there's and logical fix for this
 case; it should just be added to the docs as a failure/troubleshooting
 condition.

Hmm, we could throw an error in the standby, when we see a shutdown
checkpoint while we're waiting for an end-backup record. If the database
was shut down before pg_stop_backup(), we know that the backup was
cancelled and the end-backup record we're waiting for will never arrive.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
 Robert Haas wrote:
 On Fri, Apr 23, 2010 at 4:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, actually, now that I've looked at the patch I think it's starting
 from a fundamentally wrong position anyway.  Checkpoint records are a
 completely wrong mechanism for transmitting this data to slaves, because
 a checkpoint is emitted *after* we do something, not *before* we do it.
 In particular it's ludicrous to be looking at shutdown checkpoints to
 try to determine whether the subsequent WAL will meet the slave's
 requirements.  There's no connection at all between what the GUC state
 was at shutdown and what it might be after starting again.

 A design that might work is
 (1) store the active value of wal_mode in pg_control (but NOT as part of
 the last-checkpoint-record image).
 (2) invent a new WAL record type that is transmitted when we change
 wal_mode.

 Then, slaves could check whether the master's wal_mode is high enough
 by looking at pg_control when they start plus any wal_mode_change
 records they come across.

 If we did this then we could get rid of those WAL record types that were
 added to signify that information had been omitted from WAL at specific
 times.
 dons project manager hat

 I notice that Heikki's patch doesn't include doing the above.  Should
 we?  If so, who's going to do it?
 
 I'll give it a shot.

Ok, here's a patch that includes the changes to add new wal_mode GUC
(http://archives.postgresql.org/message-id/4bd581a6.60...@enterprisedb.com),
and implements Tom's design to keep a copy of wal_mode and the
max_connections, max_prepared_xacts and max_locks_per_xact settings in
pg_control.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index eb5765a..6c6a504 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -689,8 +689,7 @@ archive_command = 'test ! -f /mnt/server/archivedir/%f amp;amp; cp %p /mnt/ser
/para
 
para
-When varnamearchive_mode/ is literaloff/ and xref
-linkend=guc-max-wal-senders is zero some SQL commands
+When varnamewal_mode/ is literalminimal/ some SQL commands
 are optimized to avoid WAL logging, as described in xref
 linkend=populate-pitr.  If archiving or streaming replication were
 turned on during execution of one of these statements, WAL would not
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c5692ba..63ca749 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1353,6 +1353,43 @@ SET ENABLE_SEQSCAN TO OFF;
  titleSettings/title
  variablelist
  
+ varlistentry id=guc-wal-mode xreflabel=wal_mode
+  termvarnamewal_mode/varname (typeenum/type)/term
+  indexterm
+   primaryvarnamewal_mode/ configuration parameter/primary
+  /indexterm
+  listitem
+   para
+varnamewal_mode/ determines how much information is written
+to the WAL. The default value is literalminimal/, which writes
+only minimal information needed to recover from a crash or immediate
+shutdown. literalarchive/ adds logging required for WAL archiving,
+and literalhot_standby/ further adds extra information about
+running transactions required to run read-only queries on a standby
+server.
+This parameter can only be set at server start.
+   /para
+   para
+In literalminimal/ mode, WAL-logging of some bulk operations, like
+commandCREATE INDEX/, commandCLUSTER/ and commandCOPY/ on
+a table that was created or truncated in the same transaction can be
+safely skipped, which can make those operations much faster, but
+minimal WAL does not contain enough information to reconstruct the
+data from a base backup and the WAL logs, so at least
+literalarchive/ level must be used to enable WAL archiving
+(xref linkend=guc-archive-mode) and streaming replication. See
+also xref linkend=populate-pitr.
+   /para
+   para
+In literalhot_standby/ mode, the same information is logged as
+in literalarchive/ mode, plus information needed to reconstruct
+the status of running transactions from the WAL. To enable read-only
+queries on a standby server, varnamewal_mode/ must be set to
+literalhot_standby/ on the primary.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-fsync xreflabel=fsync
   indexterm
primaryvarnamefsync/ configuration parameter/primary
@@ -1726,7 +1763,9 @@ SET ENABLE_SEQSCAN TO OFF;
 varnamearchive_mode/ and varnamearchive_command/ are
 separate variables so that varnamearchive_command/ can be
 changed without leaving archiving mode.
-This parameter can only be set at server start.
+This parameter can only be set at server start. It is ignored
+unless varnamewal_mode/ is set

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Fujii Masao

On Tue, Apr 27, 2010 at 4:19 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Hmm, we could throw an error in the standby, when we see a shutdown
 checkpoint while we're waiting for an end-backup record. If the database
 was shut down before pg_stop_backup(), we know that the backup was
 cancelled and the end-backup record we're waiting for will never arrive.

Sounds good. This would work fine even if an immediate shutdown is done
instead since the primary ends up generating a shutdown checkpoint record
when restarting.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery_connections cannot start

2010-04-27 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
 Would it be possible to have internal commands there, as for example
 cd is in my shell, or test, or time, or some more ?

 That would allow for providing a portable /usr/bin/true command as far
 as archiving is concerned (say, pg_archive_bypass), and will allow for
 providing a default archiving command in the future, like pg_archive_cp
 /location or something.

 Separating wal_mode and archive_mode, as we recently discussed, might
 eliminate the need for this kludge, if archive_mode can then be made
 changeable without a restart.

I don't see my proposal as anything like a kludge at all. Internal
commands are hugely practical and here would allow for PostgreSQL to
provide basic portable archive and restore commands for simple cases,
providing the necessary guarantees and error management. 

Bypass the archiving is the most obvious flavor and in my mind shouldn't
require an external dependency. Make simple things simple and complex
one possible, as they say. PostgreSQL is one of the best software I've
ever worked with on this point, but the WAL management is still in its
infancy there: whatever you want to setup, it's complex.

Having internal commands will not remove any feature we already
have. Users would still be able to hook-in their own solutions for more
complex or demanding environments.

Please do explain in what sense that proposal is a kludge, I'd like to
be able to understand your viewpoint. Or maybe it's just either bad
wording on your part or bad reading on mine, nonetheless I felt like
having to give some more details here. That's an important point in my
mind. 

Dunno how much it's relevant for 9.0 though, maybe we'll be able to
reach a good enough solution without an internal bypass archive
command, but having (only this) one does not sound so complex that we
should not consider it at all.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Fujii Masao wrote:
 On Tue, Apr 27, 2010 at 4:19 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Hmm, we could throw an error in the standby, when we see a shutdown
 checkpoint while we're waiting for an end-backup record. If the database
 was shut down before pg_stop_backup(), we know that the backup was
 cancelled and the end-backup record we're waiting for will never arrive.
 
 Sounds good. This would work fine even if an immediate shutdown is done
 instead since the primary ends up generating a shutdown checkpoint record
 when restarting.

Yep. I've committed a patch to do that.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_migrator

2010-04-27 Thread Bruce Momjian

Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Mon, Apr 26, 2010 at 9:26 PM, Bruce Momjian br...@momjian.us wrote:
  There was talk of including pg_migrator in Postgres 9.0 in /contrib. ?Do
  we still want to do that?
 
  I think you articulated some pretty good reasons previously for
  keeping it separate and, at any rate, I'm not eager to do it at the
  11th hour without due consideration and adequate engineering time.
 
 I concur; it's about a month too late to propose this.

I am confused why it is late.  We add to /contrib even during beta, and
I didn't bring it up earlier because I didn't want to be pushing my own
software.  Was someone else supposed to suggest it a month ago?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Fujii Masao

On Tue, Apr 27, 2010 at 5:09 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Ok, here's a patch that includes the changes to add new wal_mode GUC
 (http://archives.postgresql.org/message-id/4bd581a6.60...@enterprisedb.com),
 and implements Tom's design to keep a copy of wal_mode and the
 max_connections, max_prepared_xacts and max_locks_per_xact settings in
 pg_control.

I have some comments:

config.sgml
 literalon/literal.  It is thought that there is little
 measurable difference in performance from using this feature, so
 feedback is welcome if any production impacts are noticeable.
 It is likely that this parameter will be removed in later releases.

Is this description still required for recovery_connections?


 if (!XLogArchivingActive())
   ereport(ERROR,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg(WAL archiving is not active),
  errhint(archive_mode must be enabled at server start.)));

You need to change the error messages which refer to archive_mode,
like the above.


+ /*
+  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
+  * and we must have at least as many backend slots as the primary.
+  */
+ if (InArchiveRecovery  XLogRequestRecoveryConnections)
+ {
+   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
+   ereport(ERROR,
+  (errmsg(recovery connections cannot start because
wal_mode was not set to 'hot_standby' on the WAL source server)));

This seems to always prevent the server from doing an archive recovery
since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Heikki Linnakangas

Fujii Masao wrote:
 config.sgml
 literalon/literal.  It is thought that there is little
 measurable difference in performance from using this feature, so
 feedback is welcome if any production impacts are noticeable.
 It is likely that this parameter will be removed in later releases.
 
 Is this description still required for recovery_connections?

Hmm, I guess it was referring to setting recovery_connections in the
master, I don't see us removing that option from the standby in the
future. recovery_connections in the master is being replaced with the
wal_mode setting, so I guess that's not required anymore.

 if (!XLogArchivingActive())
   ereport(ERROR,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg(WAL archiving is not active),
  errhint(archive_mode must be enabled at server start.)));
 
 You need to change the error messages which refer to archive_mode,
 like the above.

Hmm, I think we should change not only the error message, but the logic
too. There's two related checks there:

   if (!XLogArchivingActive())
   ereport(ERROR,
   
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(WAL archiving is not active),
errhint(archive_mode must be enabled at 
 server start.)));
 
   if (!XLogArchiveCommandSet())
   ereport(ERROR,
   
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(WAL archiving is not active),
errhint(archive_command must be defined 
 before 
online backups can be made 
 safely.)));

You can use streaming replication too to transport the WAL generated
during the backup, so I think we should just check that wal_mode='archive'.

 + /*
 +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
 +  * and we must have at least as many backend slots as the primary.
 +  */
 + if (InArchiveRecovery  XLogRequestRecoveryConnections)
 + {
 +   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
 +   ereport(ERROR,
 +  (errmsg(recovery connections cannot start because
 wal_mode was not set to 'hot_standby' on the WAL source server)));
 
 This seems to always prevent the server from doing an archive recovery
 since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.

No, it doesn't prevent archive recovery. It only prevents hot standby if
wal_mode was not 'hot_standby' in the master. I think you missed the 
XLogRequestRecoveryConnections condition above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Fujii Masao

On Tue, Apr 27, 2010 at 6:49 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Fujii Masao wrote:
 config.sgml
 literalon/literal.  It is thought that there is little
 measurable difference in performance from using this feature, so
 feedback is welcome if any production impacts are noticeable.
 It is likely that this parameter will be removed in later releases.

 Is this description still required for recovery_connections?

 Hmm, I guess it was referring to setting recovery_connections in the
 master, I don't see us removing that option from the standby in the
 future. recovery_connections in the master is being replaced with the
 wal_mode setting, so I guess that's not required anymore.

Agreed.

 if (!XLogArchivingActive())
   ereport(ERROR,
     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
      errmsg(WAL archiving is not active),
      errhint(archive_mode must be enabled at server start.)));

 You need to change the error messages which refer to archive_mode,
 like the above.

 Hmm, I think we should change not only the error message, but the logic
 too. There's two related checks there:

       if (!XLogArchivingActive())
               ereport(ERROR,
                               
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                                errmsg(WAL archiving is not active),
                                errhint(archive_mode must be enabled at 
 server start.)));

       if (!XLogArchiveCommandSet())
               ereport(ERROR,
                               
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                                errmsg(WAL archiving is not active),
                                errhint(archive_command must be defined 
 before 
                                                online backups can be made 
 safely.)));

 You can use streaming replication too to transport the WAL generated
 during the backup, so I think we should just check that wal_mode='archive'.

It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
it waits until some WAL files have been archived by the archiver. No?

 + /*
 +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
 +  * and we must have at least as many backend slots as the primary.
 +  */
 + if (InArchiveRecovery  XLogRequestRecoveryConnections)
 + {
 +   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
 +       ereport(ERROR,
 +              (errmsg(recovery connections cannot start because
 wal_mode was not set to 'hot_standby' on the WAL source server)));

 This seems to always prevent the server from doing an archive recovery
 since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.

 No, it doesn't prevent archive recovery. It only prevents hot standby if
 wal_mode was not 'hot_standby' in the master. I think you missed the 
 XLogRequestRecoveryConnections condition above.

Even if we do only archive recovery, XLogRequestRecoveryConnections
might be TRUE. Or we need to ensure that the recovery_connection is
FALSE in the postgresql.conf before starting archive recovery?

And I tried archive recovery, and encountered the following error.

  LOG:  starting archive recovery
  LOG:  restored log file 00010001 from archive
  FATAL:  recovery connections cannot start because wal_mode was not
set to 'hot_standby' on the WAL source server
  LOG:  startup process (PID 32512) exited with exit code 1
  LOG:  aborting startup due to startup process failure

XLOG-related parameters in postgresql.conf
  archive_mode = on
  archive_command = 'cp %p ../data.arh/%f'
  wal_mode = archive
  recovery_connections = on

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] CP949 for EUC-KR?

2010-04-27 Thread Takahiro Itagaki

I heard pg_get_encoding_from_locale() failed in kor locale.

WARNING:  could not determine encoding for locale kor: codeset is CP949

I found the following description in the web:
CP949 is EUC-KR, extended with UHC (Unified Hangul Code).

http://www.opensource.apple.com/source/libiconv/libiconv-13.2/libiconv/lib/cp949.h

but we define CP51949 for EUC-KR in chklocale.c.
{PG_EUC_KR, CP51949}, /* or 20949 ? */

Which is the compatible codeset with our PG_EUC_KR encoding?
949, 51949, or 20949? Should we add (or replace) CP949 for EUC-KR?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Heikki Linnakangas

Fujii Masao wrote:
 On Tue, Apr 27, 2010 at 6:49 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Fujii Masao wrote:
 if (!XLogArchivingActive())
   ereport(ERROR,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg(WAL archiving is not active),
  errhint(archive_mode must be enabled at server start.)));
 You need to change the error messages which refer to archive_mode,
 like the above.
 Hmm, I think we should change not only the error message, but the logic
 too. There's two related checks there:

   if (!XLogArchivingActive())
   ereport(ERROR,
   
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(WAL archiving is not active),
errhint(archive_mode must be enabled at 
 server start.)));

   if (!XLogArchiveCommandSet())
   ereport(ERROR,
   
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(WAL archiving is not active),
errhint(archive_command must be defined 
 before 
online backups can be made 
 safely.)));
 You can use streaming replication too to transport the WAL generated
 during the backup, so I think we should just check that wal_mode='archive'.
 
 It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
 it waits until some WAL files have been archived by the archiver. No?

Good point, that logic would need to be changed too. Should it simply
return immediately if archive_mode=off?

 + /*
 +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
 +  * and we must have at least as many backend slots as the primary.
 +  */
 + if (InArchiveRecovery  XLogRequestRecoveryConnections)
 + {
 +   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
 +   ereport(ERROR,
 +  (errmsg(recovery connections cannot start because
 wal_mode was not set to 'hot_standby' on the WAL source server)));

 This seems to always prevent the server from doing an archive recovery
 since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
 No, it doesn't prevent archive recovery. It only prevents hot standby if
 wal_mode was not 'hot_standby' in the master. I think you missed the 
 XLogRequestRecoveryConnections condition above.
 
 Even if we do only archive recovery, XLogRequestRecoveryConnections
 might be TRUE. Or we need to ensure that the recovery_connection is
 FALSE in the postgresql.conf before starting archive recovery?

Umm, yes, if you have recovery_connnections=on, it means you want hot
standby. And for that you need wal_mode='hot_standby'.

By it doesn't prevent archive recovery I meant you can do traditional
archive recovery without hot standby. It doesn't matter how the WAL is
transported, via the archive or via streaming replication.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CP949 for EUC-KR?

2010-04-27 Thread Heikki Linnakangas

Takahiro Itagaki wrote:
 I heard pg_get_encoding_from_locale() failed in kor locale.
 
 WARNING:  could not determine encoding for locale kor: codeset is 
 CP949
 
 I found the following description in the web:
 CP949 is EUC-KR, extended with UHC (Unified Hangul Code).
 
 http://www.opensource.apple.com/source/libiconv/libiconv-13.2/libiconv/lib/cp949.h
 
 but we define CP51949 for EUC-KR in chklocale.c.
 {PG_EUC_KR, CP51949},   /* or 20949 ? */
 
 Which is the compatible codeset with our PG_EUC_KR encoding?
 949, 51949, or 20949?

A bit of googling suggests that 51949 is indeed the Windows codepage
that's equivalent with EUC-KR.

 Should we add (or replace) CP949 for EUC-KR?

No. CP949 is not plain EUC-KR, but EUC-KR with some extensions (UHC). At
least on CVS HEAD, we recognize CP949 as an alias for the PostgreSQL
PG_UHC encoding. There's a significant difference between the two,
because PG_EUC_KR is supported as a server-encoding while PG_UHC is not.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 5:25 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Yep. I've committed a patch to do that.

Is there no way for the slave to recover from this situation?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery_connections cannot start

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 4:07 AM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, 2010-04-26 at 10:41 +0200, Dimitri Fontaine wrote:
 Would it be possible to have internal commands there, as for example
 cd is in my shell, or test, or time, or some more ?

 That would allow for providing a portable /usr/bin/true command as far
 as archiving is concerned (say, pg_archive_bypass), and will allow for
 providing a default archiving command in the future, like pg_archive_cp
 /location or something.

 Separating wal_mode and archive_mode, as we recently discussed, might
 eliminate the need for this kludge, if archive_mode can then be made
 changeable without a restart.

 I don't see my proposal as anything like a kludge at all. Internal
 commands are hugely practical and here would allow for PostgreSQL to
 provide basic portable archive and restore commands for simple cases,
 providing the necessary guarantees and error management.

Treating the string true as a special case seems like a kludge to
me.  Maybe a robust set of internal commands wouldn't be a kludge, but
that's not what's being proposed here.  I guess it's just a matter of
opinion.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CP949 for EUC-KR?

2010-04-27 Thread Takahiro Itagaki


Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

  Should we add (or replace) CP949 for EUC-KR?
 
 No. CP949 is not plain EUC-KR, but EUC-KR with some extensions (UHC). At
 least on CVS HEAD, we recognize CP949 as an alias for the PostgreSQL
 PG_UHC encoding.

That's it! We should have added an additional alias to chklocale, too.

Index: src/port/chklocale.c
===
--- src/port/chklocale.c(HEAD)
+++ src/port/chklocale.c(fixed)
@@ -172,6 +172,7 @@
{PG_GBK, CP936},
 
{PG_UHC, UHC},
+   {PG_UHC, CP949},
 
{PG_JOHAB, JOHAB},
{PG_JOHAB, CP1361},


Except UHC, we don't have any codepage aliases for the encodings below.
I assume we don't need to add CPxxx because Windows does not have
corresponding codepages for them, right?

{PG_LATIN6, ISO-8859-10},
{PG_LATIN7, ISO-8859-13},
{PG_LATIN8, ISO-8859-14},
{PG_LATIN10, ISO-8859-16},
{PG_SHIFT_JIS_2004, SJIS_2004},

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Fujii Masao

On Tue, Apr 27, 2010 at 7:50 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
 it waits until some WAL files have been archived by the archiver. No?

 Good point, that logic would need to be changed too. Should it simply
 return immediately if archive_mode=off?

What if we wrongly set archive_mode to on and wal_mode to minimal?
I think that checking XLogArchivingActive() in pg_stop_backup() is
adequate.

 + /*
 +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
 +  * and we must have at least as many backend slots as the primary.
 +  */
 + if (InArchiveRecovery  XLogRequestRecoveryConnections)
 + {
 +   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
 +       ereport(ERROR,
 +              (errmsg(recovery connections cannot start because
 wal_mode was not set to 'hot_standby' on the WAL source server)));

 This seems to always prevent the server from doing an archive recovery
 since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
 No, it doesn't prevent archive recovery. It only prevents hot standby if
 wal_mode was not 'hot_standby' in the master. I think you missed the 
 XLogRequestRecoveryConnections condition above.

 Even if we do only archive recovery, XLogRequestRecoveryConnections
 might be TRUE. Or we need to ensure that the recovery_connection is
 FALSE in the postgresql.conf before starting archive recovery?

 Umm, yes, if you have recovery_connnections=on, it means you want hot
 standby. And for that you need wal_mode='hot_standby'.

Since the default value of recovery_connections is TRUE, I think that
the trouble which I encountered would often happen. We should disable
recovery_connections by default? Furthermore should move it from
postgresql.conf to recovery.conf?

On the other hand, I feel that recovery_connections=on in an archive
recovery is valid configuration *until* any read only connections are
requested. How about moving the above check to postmaster or backend?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Fujii Masao

On Tue, Apr 27, 2010 at 8:07 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Apr 27, 2010 at 5:25 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Yep. I've committed a patch to do that.

 Is there no way for the slave to recover from this situation?

Probably Yes. You would need to take a fresh base backup and
restart the slave from it.

On second thought, seeing a shutdown checkpoint during waiting
end-backup means mostly that the database has already reached
the consistent state. We might be able to relax the error check.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_migrator

2010-04-27 Thread Andrew Dunstan




Bruce Momjian wrote:

I concur; it's about a month too late to propose this.



I am confused why it is late.  We add to /contrib even during beta, and
I didn't bring it up earlier because I didn't want to be pushing my own
software.  Was someone else supposed to suggest it a month ago?

  


Bruce,

you're not usually such a shrinking violet. If you don't push your 
project it's less likely others will, IMNSHO.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 7:24 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Apr 27, 2010 at 7:50 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
 it waits until some WAL files have been archived by the archiver. No?

 Good point, that logic would need to be changed too. Should it simply
 return immediately if archive_mode=off?

 What if we wrongly set archive_mode to on and wal_mode to minimal?
 I think that checking XLogArchivingActive() in pg_stop_backup() is
 adequate.

That case should be rejected at primary startup.

 + /*
 +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
 +  * and we must have at least as many backend slots as the primary.
 +  */
 + if (InArchiveRecovery  XLogRequestRecoveryConnections)
 + {
 +   if (ControlFile-wal_mode  WAL_MODE_HOT_STANDBY)
 +       ereport(ERROR,
 +              (errmsg(recovery connections cannot start because
 wal_mode was not set to 'hot_standby' on the WAL source server)));

 This seems to always prevent the server from doing an archive recovery
 since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
 No, it doesn't prevent archive recovery. It only prevents hot standby if
 wal_mode was not 'hot_standby' in the master. I think you missed the 
 XLogRequestRecoveryConnections condition above.

 Even if we do only archive recovery, XLogRequestRecoveryConnections
 might be TRUE. Or we need to ensure that the recovery_connection is
 FALSE in the postgresql.conf before starting archive recovery?

 Umm, yes, if you have recovery_connnections=on, it means you want hot
 standby. And for that you need wal_mode='hot_standby'.

 Since the default value of recovery_connections is TRUE, I think that
 the trouble which I encountered would often happen. We should disable
 recovery_connections by default? Furthermore should move it from
 postgresql.conf to recovery.conf?

 On the other hand, I feel that recovery_connections=on in an archive
 recovery is valid configuration *until* any read only connections are
 requested. How about moving the above check to postmaster or backend?

Or just not starting recovery connections, but still doing archive
recovery?  I think in this case a WARNING might be adequate.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 12:25 +0300, Heikki Linnakangas wrote:
 Fujii Masao wrote:
  On Tue, Apr 27, 2010 at 4:19 PM, Heikki Linnakangas
  heikki.linnakan...@enterprisedb.com wrote:
  Hmm, we could throw an error in the standby, when we see a shutdown
  checkpoint while we're waiting for an end-backup record. If the database
  was shut down before pg_stop_backup(), we know that the backup was
  cancelled and the end-backup record we're waiting for will never arrive.
  
  Sounds good. This would work fine even if an immediate shutdown is done
  instead since the primary ends up generating a shutdown checkpoint record
  when restarting.
 
 Yep. I've committed a patch to do that.

We should be able to do this earlier in the run.

If pg_stop_backup() is run it creates the .backup file in the archive.
In the absence of that file, we should be able to work out that
pg_stop_backup() was not run. Almost, because we support starting
recovery without need to run start/stop backup. If we introduced a
special option for that in recovery.conf it would be much simpler to
fail if the file were unavailable.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery_connections cannot start

2010-04-27 Thread Dimitri Fontaine

Robert Haas robertmh...@gmail.com writes:
 Treating the string true as a special case seems like a kludge to
 me.  Maybe a robust set of internal commands wouldn't be a kludge, but
 that's not what's being proposed here.  I guess it's just a matter of
 opinion.

I don't see how to have internal commands without having special cases
for the setting, and I did propose pg_archive_bypass as the name. I
guess the implementation would be what Simon was talking about, though.

I don't see true as meaningful in the context of an archive_command…

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery_connections cannot start

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 15:10 +0200, Dimitri Fontaine wrote:
 Robert Haas robertmh...@gmail.com writes:
  Treating the string true as a special case seems like a kludge to
  me.  Maybe a robust set of internal commands wouldn't be a kludge, but
  that's not what's being proposed here.  I guess it's just a matter of
  opinion.
 
 I don't see how to have internal commands without having special cases
 for the setting, and I did propose pg_archive_bypass as the name. I
 guess the implementation would be what Simon was talking about, though.
 
 I don't see true as meaningful in the context of an archive_command…

Saying its a kludge doesn't really address the issue and goes nowhere
towards fixing it. If we don't like the proposal, fine, then what is the
alternative solution?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Differential backup

2010-04-27 Thread Simon Riggs


Thinking about allowing a backup to tell which files have changed in the
database since last backup. This would allow an external utility to copy
away only changed files.

Now there's a few ways of doing this and many will say this is already
possible using file access times.

An explicit mechanism where Postgres could authoritatively say which
files have changed would make many feel safer, especially when other
databases also do this.

We keep track of which files require fsync(), so we could also keep
track of changed files using that same information.

Is this route worthwhile? Or in some way unacceptable?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Alvaro Herrera

Simon Riggs wrote:
 
 Thinking about allowing a backup to tell which files have changed in the
 database since last backup. This would allow an external utility to copy
 away only changed files.
 
 Now there's a few ways of doing this and many will say this is already
 possible using file access times.
 
 An explicit mechanism where Postgres could authoritatively say which
 files have changed would make many feel safer, especially when other
 databases also do this.
 
 We keep track of which files require fsync(), so we could also keep
 track of changed files using that same information.

Why file level?  Seems a bit too coarse (particularly if you have large
file support enabled).  Maybe we could keep block-level last change info
in a separate fork.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Hmm, we could throw an error in the standby, when we see a shutdown
 checkpoint while we're waiting for an end-backup record. If the database
 was shut down before pg_stop_backup(), we know that the backup was
 cancelled and the end-backup record we're waiting for will never arrive.

Isn't the above statement complete nonsense?  There's nothing to stop
the DBA from issuing pg_stop_backup() after he restarts the master.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 09:50 -0400, Alvaro Herrera wrote:
 Simon Riggs wrote:
  
  Thinking about allowing a backup to tell which files have changed in the
  database since last backup. This would allow an external utility to copy
  away only changed files.
  
  Now there's a few ways of doing this and many will say this is already
  possible using file access times.
  
  An explicit mechanism where Postgres could authoritatively say which
  files have changed would make many feel safer, especially when other
  databases also do this.
  
  We keep track of which files require fsync(), so we could also keep
  track of changed files using that same information.
 
 Why file level?  Seems a bit too coarse (particularly if you have large
 file support enabled).  Maybe we could keep block-level last change info
 in a separate fork.

Block-level is mostly available by using LSN, you just need to scan the
file. So block level seems not useful enough for the extra overhead.

File-level would be sufficient for most purposes. If you wanted to go
finer grained you can then scan just the files that have changed.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Kevin Grittner

Simon Riggs si...@2ndquadrant.com wrote:
 
 Thinking about allowing a backup to tell which files have changed
 in the database since last backup. This would allow an external
 utility to copy away only changed files.
 
 Now there's a few ways of doing this and many will say this is
 already possible using file access times.
 
Who would say otherwise?  Under what circumstances would PostgreSQL
modify a file without changing the last modified timestamp or the
file size?  If you're concerned about the converse, with daemon-
based rsync you can copy just the modified portions of a file on
which the directory information has changed.  Or is this targeting
platforms which don't have rsync?
 
 An explicit mechanism where Postgres could authoritatively say
 which files have changed would make many feel safer, especially
 when other databases also do this.
 
Why?  I must be missing something, because my feeling is that if you
can't trust your OS to cover something like this, how can you trust
any application *running* under that OS to do it?
 
 Is this route worthwhile?
 
I'm not seeing it, but I could be missing something.  Can you
describe a use case where this would be beneficial?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 08:59 -0500, Kevin Grittner wrote:
  An explicit mechanism where Postgres could authoritatively say
  which files have changed would make many feel safer, especially
  when other databases also do this.
  
 Why?  I must be missing something, because my feeling is that if you
 can't trust your OS to cover something like this, how can you trust
 any application *running* under that OS to do it?

Good questions. I'm exploring a perceived need. 

I don't think people want this because they think the OS is flaky. It's
more about trusting all of the configurations of all of the filesystems
in use. An explicit mechanism would be more verifiably accurate. It
might just be about control and blame.
 
-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Florian Pflug

On Apr 27, 2010, at 15:50 , Alvaro Herrera wrote:
 Simon Riggs wrote:
 Thinking about allowing a backup to tell which files have changed in the
 database since last backup. This would allow an external utility to copy
 away only changed files.
 
 Now there's a few ways of doing this and many will say this is already
 possible using file access times.
 
 An explicit mechanism where Postgres could authoritatively say which
 files have changed would make many feel safer, especially when other
 databases also do this.
 
 We keep track of which files require fsync(), so we could also keep
 track of changed files using that same information.
 
 Why file level?  Seems a bit too coarse (particularly if you have large
 file support enabled).  Maybe we could keep block-level last change info
 in a separate fork.

Hm, but most backup solutions work per-file and not per-block, so file-level 
tracking probably has more use-cases that block-level tracking..

In any case, it seems that this information could easily be extracted from the 
WAL. The archive_command could call a simple tool that parses the WAL and 
tracks the latest LSN per database file or page or whatever granularity is 
required. This, together with the backup label of the last backup should be 
enough to compute the list of changed files I think.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Florian Pflug

On Apr 27, 2010, at 16:08 , Simon Riggs wrote:
 On Tue, 2010-04-27 at 08:59 -0500, Kevin Grittner wrote:
 Why?  I must be missing something, because my feeling is that if you
 can't trust your OS to cover something like this, how can you trust
 any application *running* under that OS to do it?
 
 Good questions. I'm exploring a perceived need. 
 
 I don't think people want this because they think the OS is flaky. It's
 more about trusting all of the configurations of all of the filesystems
 in use. An explicit mechanism would be more verifiably accurate. It
 might just be about control and blame.

I believe a reason for people (including me) to not have 100% faith in file 
modification times are non-monotone system clocks. I've seen more than one 
system where a cron job running ntpdate every night was used as a poor man's 
replacement for ntpd...

So the real advantage of rolling our own solution is the ability to use LSNs 
instead of timestamps I'd say.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Michael Tharp


On 04/27/2010 09:59 AM, Kevin Grittner wrote:

Under what circumstances would PostgreSQL
modify a file without changing the last modified timestamp or the
file size?


Do all OSes have sub-second precision mtimes? Because otherwise I could 
see a scenario such at this:


* File is modified
* Backup inspects and copies the file in the same second
* File is modified again in the same second, so the mtime doesn't change
* Backup is run again some time later and sees that the mtime has not 
changed


Even with microsecond precision this kind of scenario makes me squidgy, 
especially if some OSes decide that skipping frequent mtime updates is 
OK. Florian's point about clock changes is also very relevant. Since 
Postgres has the capability to give a better answer about what is in the 
file, it would be best to use that.


-- m. tharp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] recovery_connections cannot start

2010-04-27 Thread Robert Haas

On Apr 27, 2010, at 9:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-04-27 at 15:10 +0200, Dimitri Fontaine wrote:
 Robert Haas robertmh...@gmail.com writes:
 Treating the string true as a special case seems like a kludge to
 me.  Maybe a robust set of internal commands wouldn't be a kludge,
 but
 that's not what's being proposed here.  I guess it's just a matter
 of
 opinion.

 I don't see how to have internal commands without having special
 cases
 for the setting, and I did propose pg_archive_bypass as the name. I
 guess the implementation would be what Simon was talking about,
 though.

 I don't see true as meaningful in the context of an
 archive_command…

 Saying its a kludge doesn't really address the issue and goes
 nowhere
 towards fixing it. If we don't like the proposal, fine, then what is
 the
 alternative solution?

I proposed one upthread.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Hmm, we could throw an error in the standby, when we see a shutdown
 checkpoint while we're waiting for an end-backup record. If the database
 was shut down before pg_stop_backup(), we know that the backup was
 cancelled and the end-backup record we're waiting for will never arrive.
 
 Isn't the above statement complete nonsense?  There's nothing to stop
 the DBA from issuing pg_stop_backup() after he restarts the master.

pg_stop_backup() can't be called if there's no backup in progress.
Restart cancels it.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Robert Haas wrote:
 On Tue, Apr 27, 2010 at 5:25 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Yep. I've committed a patch to do that.
 
 Is there no way for the slave to recover from this situation?

No, it will never open up for hot standby, and it will error at the end
of recovery anyway. This just makes it happen earlier and with a smarter
error message.

In theory, if the data directory was fully copied by the time of the
shutdown/crash and the only thing that was missing was pg_stop_backup(),
all the data is there, so you could get a consistent database. But we
can't know if it's consistent or not, so we don't allow it.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Merlin Moncure

On Tue, Apr 27, 2010 at 10:32 AM, Michael Tharp
g...@partiallystapled.com wrote:
 On 04/27/2010 09:59 AM, Kevin Grittner wrote:

 Under what circumstances would PostgreSQL
 modify a file without changing the last modified timestamp or the
 file size?

 Do all OSes have sub-second precision mtimes? Because otherwise I could see
 a scenario such at this:

 * File is modified
 * Backup inspects and copies the file in the same second
 * File is modified again in the same second, so the mtime doesn't change
 * Backup is run again some time later and sees that the mtime has not
 changed

 Even with microsecond precision this kind of scenario makes me squidgy,
 especially if some OSes decide that skipping frequent mtime updates is OK.
 Florian's point about clock changes is also very relevant. Since Postgres
 has the capability to give a better answer about what is in the file, it
 would be best to use that.

Why not just force all files to be checked irregardless of mtime?  The
proposal only seems a win to me if a fair percentage of the larger
files don't change, which strikes me as a relatively low level case to
optimize for.  Maybe I'm missing the objective, but it looks like the
payoff is to avoid scanning large files for checksums.  If I was even
infinitesimally insecure about rsync missing files because of
clock/filesystem issues, I'd simply force it.

One cool thing about making postgres 'aware' of last backup time is
that you could warn the user in various places that the database is
not being properly backed up (pg_dump would have to monitor
last_backup_time as well then).  Good stuff, but I bet most people who
aren't backing up the database also aren't checking the log :-).

The block level case seems pretty much covered by the hot standby feature.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Kevin Grittner

Merlin Moncure mmonc...@gmail.com wrote:
 
 The proposal only seems a win to me if a fair percentage of the
 larger files don't change, which strikes me as a relatively low
 level case to optimize for.
 
That's certainly a situation we face, with a relatively slow WAN in
the middle.
 
http://archives.postgresql.org/pgsql-admin/2009-07/msg00071.php
 
I don't know how rare or common that is.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Simon Riggs wrote:
 If pg_stop_backup() is run it creates the .backup file in the archive.
 In the absence of that file, we should be able to work out that
 pg_stop_backup() was not run. 

It's just as likely that the file is there even though the backup didn't
finish, though.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Tom Lane wrote:
 Isn't the above statement complete nonsense?  There's nothing to stop
 the DBA from issuing pg_stop_backup() after he restarts the master.

 pg_stop_backup() can't be called if there's no backup in progress.
 Restart cancels it.

Doh, right (not enough caffeine yet).

Given that, I concur this change is a good idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Ok, here's a patch that includes the changes to add new wal_mode GUC
 (http://archives.postgresql.org/message-id/4bd581a6.60...@enterprisedb.com),

I haven't read this in any detail, but why does it add inclusion of
pg_control.h to xlog.h?  I don't see any reason for that in the actual
changes in xlog.h.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Csaba Nagy

Hi all,

On Tue, 2010-04-27 at 11:07 -0400, Merlin Moncure wrote:
 The block level case seems pretty much covered by the hot standby feature.

One use case we would have is to dump only the changes from the last
backup of a single table. This table takes 30% of the DB disk space, it
is in the order of ~400GB, and it's only inserted, never updated, then
after ~1 year the old entries are archived. There's ~10M new entries
daily in this table. If the backup would be smart enough to only read
the changed blocks (in this case only for newly inserted records), it
would be a fairly big win...

Cheers,
Csaba.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 18:13 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  If pg_stop_backup() is run it creates the .backup file in the archive.
  In the absence of that file, we should be able to work out that
  pg_stop_backup() was not run. 
 
 It's just as likely that the file is there even though the backup didn't
 finish, though.

It's possible, but not likely. It would need to break at a very specific
place for that to be the case. Whereas the test I explained would work
for about 99% of the time between start and stop backup, except for the
caveat I explained also. I'm not sure that pointing out a minor hole
stops it being a worthwhile test? Surely if you care to fix the problem
then a better test can only be a good thing?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Merlin Moncure

On Tue, Apr 27, 2010 at 11:13 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:

 The proposal only seems a win to me if a fair percentage of the
 larger files don't change, which strikes me as a relatively low
 level case to optimize for.

 That's certainly a situation we face, with a relatively slow WAN in
 the middle.

 http://archives.postgresql.org/pgsql-admin/2009-07/msg00071.php

 I don't know how rare or common that is.

hm...interesting read.  pretty clever.  Your archiving requirements are high.

With the new stuff (HS/SR) taken into consideration, would you have
done your DR the same way if you had to do it all over again?

Part of my concern here is that manual filesystem level backups are
going to become an increasingly arcane method of doing things as the
HS/SR train starts leaving the station.

hm, it would be pretty neat to see some of the things you do pushed
into logical (pg_dump) style backups...with some enhancements so that
it can skip tables haven't changed and are exhibited in a previously
supplied dump.  This is more complicated but maybe more useful for a
broader audience?

Side question: is it impractical to backup via pg_dump a hot standby
because of query conflict issues?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Kevin Grittner

Merlin Moncure mmonc...@gmail.com wrote:
 
 Your archiving requirements are high.
 
They are set by a Steering Committee composed of the Directory of
State Courts and various District Court Administrators, Judges,
Clerks of Court, and Registers in Probate who rely on this data and
*really* want to be safe.  I just work here.  ;-)
 
 With the new stuff (HS/SR) taken into consideration, would you
 have done your DR the same way if you had to do it all over again?
 
When SR is available, if I can maintain the flow of WAL files while
doing so, I would feed our warm standby farm with SR connections. 
Otherwise I'd do the same.  It's pretty much mandated that we keep
those copies.  It'd be ideal if SR could reconstruct the WAL file
segments on the receiving end, to avoid sending the data twice.
Dare I dream?  :-)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Simon Riggs wrote:
 On Tue, 2010-04-27 at 18:13 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 If pg_stop_backup() is run it creates the .backup file in the archive.
 In the absence of that file, we should be able to work out that
 pg_stop_backup() was not run. 
 It's just as likely that the file is there even though the backup didn't
 finish, though.
 
 It's possible, but not likely. It would need to break at a very specific
 place for that to be the case. Whereas the test I explained would work
 for about 99% of the time between start and stop backup, except for the
 caveat I explained also.

I don't understand how you arrived at that figure. Roughly speaking,
there's two possibilities: backup_label is backed up before the bulk of
the data in base-directory or tablespaces, in which case it will almost
certainly be included in the backup, or it will be backed up after the
bulk of the data, in which case it will almost certainly not be included
if the backup is stopped prematurely. I don't know which is more common,
but both seem plausible.

 I'm not sure that pointing out a minor hole
 stops it being a worthwhile test? Surely if you care to fix the problem
 then a better test can only be a good thing?

Yeah, it might be worthwhile if it's not a lot of code.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Heikki Linnakangas

Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Ok, here's a patch that includes the changes to add new wal_mode GUC
 (http://archives.postgresql.org/message-id/4bd581a6.60...@enterprisedb.com),
 
 I haven't read this in any detail, but why does it add inclusion of
 pg_control.h to xlog.h?  I don't see any reason for that in the actual
 changes in xlog.h.

I put the enum for wal_mode to pg_control.h, so that it's available to
pg_controlinfo.c without #including xlog.h there. The
XLogArchivingActive() macro in xlog.h needs the enum values:

#define XLogArchivingActive()  (XLogArchiveMode  wal_mode =
WAL_MODE_ARCHIVE

I'm all ears for better suggestions, I didn't like that much either.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Simon Riggs si...@2ndquadrant.com writes:
 v3 attached

This patch changes KnownAssignedXidsRemove() so that failure to find
the target XID is elog(ERROR) (ie, a PANIC, since this is in the
startup process).  However, this comment is still there:

/*
 * We can fail to find an xid if the xid came from a subtransaction that
 * aborts, though the xid hadn't yet been reported and no WAL records 
have
 * been written using the subxid. In that case the abort record will
 * contain that subxid and we haven't seen it before.
 */

WTF?  Either the comment is wrong or this should not be an elog
condition.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 13:52 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  v3 attached
 
 This patch changes KnownAssignedXidsRemove() so that failure to find
 the target XID is elog(ERROR) (ie, a PANIC, since this is in the
 startup process).  

Not in all cases. The code is correct, as far as I am aware from
testing.

 However, this comment is still there:
   /*
* We can fail to find an xid if the xid came from a subtransaction that
* aborts, though the xid hadn't yet been reported and no WAL records 
 have
* been written using the subxid. In that case the abort record will
* contain that subxid and we haven't seen it before.
*/
 
 WTF?  Either the comment is wrong or this should not be an elog
 condition.

That section of code has been rewritten many times. I think it is now
inaccurate and should be removed. I left it there because the
unfortunate history of the project has been the removal of comments and
then later rediscovery of the truth, sometimes more than once. I could
no longer reproduce that error; someone else may know differently.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 20:14 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
  On Tue, 2010-04-27 at 18:13 +0300, Heikki Linnakangas wrote:
  Simon Riggs wrote:
  If pg_stop_backup() is run it creates the .backup file in the archive.
  In the absence of that file, we should be able to work out that
  pg_stop_backup() was not run. 
  It's just as likely that the file is there even though the backup didn't
  finish, though.
  
  It's possible, but not likely. It would need to break at a very specific
  place for that to be the case. Whereas the test I explained would work
  for about 99% of the time between start and stop backup, except for the
  caveat I explained also.
 
 I don't understand how you arrived at that figure. 

You're talking about the backup_label file, I'm talking about
the .backup file in the archive.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Tom Lane wrote:
 I haven't read this in any detail, but why does it add inclusion of
 pg_control.h to xlog.h?  I don't see any reason for that in the actual
 changes in xlog.h.

 I put the enum for wal_mode to pg_control.h, so that it's available to
 pg_controlinfo.c without #including xlog.h there. The
 XLogArchivingActive() macro in xlog.h needs the enum values:

Oh, I see.

 I'm all ears for better suggestions, I didn't like that much either.

How about putting the enum {} declaration in xlog.h, and making the
field in pg_control.h just be declared int?  I'm not sure declaring
it as enum is a great idea anyway, since that makes the on-disk
representation dependent on a compiler's whim as to how wide the
enum will be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 10:08 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-04-27 at 08:59 -0500, Kevin Grittner wrote:
  An explicit mechanism where Postgres could authoritatively say
  which files have changed would make many feel safer, especially
  when other databases also do this.

 Why?  I must be missing something, because my feeling is that if you
 can't trust your OS to cover something like this, how can you trust
 any application *running* under that OS to do it?

 Good questions. I'm exploring a perceived need.

 I don't think people want this because they think the OS is flaky. It's
 more about trusting all of the configurations of all of the filesystems
 in use. An explicit mechanism would be more verifiably accurate. It
 might just be about control and blame.

What I think would be cool, though it's not what you proposed, is an
integrated base backup feature.  Say your SR slave gets too far behind
and can't catch up for some reason (the system administrator
accidentally nuked the archive, or you were living on the edge and not
keeping one).  It would be neat to have a way, either manually or
maybe even automatically, to tell the slave, hey, go make a new base
backup.  And it would connect to the master and do pg_start_backup()
and stream down the whole database contents and do pg_stop_backup().
Of course you can do all of this with scripts, but ISTM an integrated
capability would be much easier to administer and might offer some
interesting opportunities for compression.

With respect to what you actually proposed, like Kevin, I'm not sure
what it's good for.  It might make sense if we know what the use case
is but the value certainly isn't obvious.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Hmm ... there's another point here, which is that the array size creates
a hard maximum on the number of entries, whereas the hash table was a
bit more forgiving.  What is the proof that the array won't overflow?
The fact that the equivalent data structure on the master can't hold
more than this many entries doesn't seem to me to prove that, because
we will add intermediate not-observed XIDs to the array.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Heikki Linnakangas

Tom Lane wrote:
 How about putting the enum {} declaration in xlog.h, and making the
 field in pg_control.h just be declared int?

I tried that at first, but the problem was with pg_controldata.c. In
bin/. I wanted it to print wal_mode in human-readable format, so it
needed the values of the enum from somewhere. I tried to #include
access/xlog.h in pg_controlinfo.c, but got a bunch of errors.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Heikki Linnakangas

Simon Riggs wrote:
 On Tue, 2010-04-27 at 20:14 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 On Tue, 2010-04-27 at 18:13 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 If pg_stop_backup() is run it creates the .backup file in the archive.
 In the absence of that file, we should be able to work out that
 pg_stop_backup() was not run. 
 It's just as likely that the file is there even though the backup didn't
 finish, though.
 It's possible, but not likely. It would need to break at a very specific
 place for that to be the case. Whereas the test I explained would work
 for about 99% of the time between start and stop backup, except for the
 caveat I explained also.
 I don't understand how you arrived at that figure. 
 
 You're talking about the backup_label file, I'm talking about
 the .backup file in the archive.

Oh, the backup history file. We stopped relying on that with the
introduction of the end-of-backup record, to make life easier for
streaming replication, and because it's simpler anyway. I don't think we
should go back to it.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Alvaro Herrera

Heikki Linnakangas escribió:
 Tom Lane wrote:
  How about putting the enum {} declaration in xlog.h, and making the
  field in pg_control.h just be declared int?
 
 I tried that at first, but the problem was with pg_controldata.c. In
 bin/. I wanted it to print wal_mode in human-readable format, so it
 needed the values of the enum from somewhere. I tried to #include
 access/xlog.h in pg_controlinfo.c, but got a bunch of errors.

Hmm, AFAICS the problem with controldata is that it uses postgres_fe.h
instead of postgres.h.  It's a bit of a stretch to use the latter, but
maybe that's a better solution?  After all, it *is* poking into the
backend internals.

I know I had to hack around pg_controldata some time ago (I don't recall
what for) and found that it could be cleaned up like this.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 14:53 -0400, Tom Lane wrote:
 Hmm ... there's another point here, which is that the array size
 creates
 a hard maximum on the number of entries, whereas the hash table was a
 bit more forgiving.  What is the proof that the array won't overflow?
 The fact that the equivalent data structure on the master can't hold
 more than this many entries doesn't seem to me to prove that, because
 we will add intermediate not-observed XIDs to the array.

We know that not-observed xids have actually been allocated on the
primary. We log an assignment record every 64 subtransactions, so that
the peak size of the array is 65 xids per connection.

It's possible for xids to stay in the array for longer, in the event of
a FATAL error that doesn't log an abort record. We clean those up every
checkpoint, if they exist. The potential number of them is unbounded, so
making special allowance for them doesn't remove the theoretical risk.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CP949 for EUC-KR?

2010-04-27 Thread Heikki Linnakangas

Takahiro Itagaki wrote:
 That's it! We should have added an additional alias to chklocale, too.
 
 Index: src/port/chklocale.c
 ===
 --- src/port/chklocale.c  (HEAD)
 +++ src/port/chklocale.c  (fixed)
 @@ -172,6 +172,7 @@
   {PG_GBK, CP936},
  
   {PG_UHC, UHC},
 + {PG_UHC, CP949},
  
   {PG_JOHAB, JOHAB},
   {PG_JOHAB, CP1361},

Yeah, seems correct.

 Except UHC, we don't have any codepage aliases for the encodings below.
 I assume we don't need to add CPxxx because Windows does not have
 corresponding codepages for them, right?
 
 {PG_LATIN6, ISO-8859-10},
 {PG_LATIN7, ISO-8859-13},
 {PG_LATIN8, ISO-8859-14},
 {PG_LATIN10, ISO-8859-16},
 {PG_SHIFT_JIS_2004, SJIS_2004},

Yeah, I guess so. I can't find Windows codepages for these either, by
google.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-04-27 at 13:52 -0400, Tom Lane wrote:
 WTF?  Either the comment is wrong or this should not be an elog
 condition.

 That section of code has been rewritten many times. I think it is now
 inaccurate and should be removed. I left it there because the
 unfortunate history of the project has been the removal of comments and
 then later rediscovery of the truth, sometimes more than once. I could
 no longer reproduce that error; someone else may know differently.

I haven't tested this, but it appears to me that the failure would occur
in overflow situations.  If we have too many subxacts, we'll generate
XLOG_XACT_ASSIGNMENT, which will cause the subxids to be removed from
KnownAssignedXids[].  Then later when the top-level xact commits or
aborts we'll try to remove them again as a consequence of processing
the top-level's commit/abort record.  No?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 16:18 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Tue, 2010-04-27 at 13:52 -0400, Tom Lane wrote:
  WTF?  Either the comment is wrong or this should not be an elog
  condition.
 
  That section of code has been rewritten many times. I think it is now
  inaccurate and should be removed. I left it there because the
  unfortunate history of the project has been the removal of comments and
  then later rediscovery of the truth, sometimes more than once. I could
  no longer reproduce that error; someone else may know differently.
 
 I haven't tested this, but it appears to me that the failure would occur
 in overflow situations.  If we have too many subxacts, we'll generate
 XLOG_XACT_ASSIGNMENT, which will cause the subxids to be removed from
 KnownAssignedXids[].  Then later when the top-level xact commits or
 aborts we'll try to remove them again as a consequence of processing
 the top-level's commit/abort record.  No?

Yes, thank you for clear thinking. Anyway, looks like the comment was
right after all and the new code to throw an error is wrong in some
cases. It was useful for testing, at least. The comment was slightly
misleading, which is a good reason to rewrite it.

It seems like it might be possible to identify which xids could cause an
error and which won't. Much harder than that. We still have the possible
case where we have 64 subtransactions allocated but many of them abort
and we are left with a final commit of 64 subtransactions.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Isn't the snapshotOldestActiveXid filter in
RecordKnownAssignedTransactionIds completely wrong/useless/bogus?

AFAICS, snapshotOldestActiveXid is only set once at the start of
recovery.  This means it will soon be too old to provide any useful
filtering.  But what's far worse is that the XID space will eventually
wrap around, and that test will start filtering *everything*.

I think we should just lose that test, as well as the variable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 17:24 -0400, Tom Lane wrote:

 Isn't the snapshotOldestActiveXid filter in
 RecordKnownAssignedTransactionIds completely wrong/useless/bogus?
 
 AFAICS, snapshotOldestActiveXid is only set once at the start of
 recovery.  This means it will soon be too old to provide any useful
 filtering.  But what's far worse is that the XID space will eventually
 wrap around, and that test will start filtering *everything*.
 
 I think we should just lose that test, as well as the variable.

Yes, though it looks like it is still necessary in creating a valid
initial state because otherwise we may have xids in KnownAssigned array
that are already complete. The comment there talks about wasting memory,
though it appears to be a correctness issue.

So perhaps a similar test is required in ProcArrayApplyRecoveryInfo()
but not in RecordKnownAssignedTransactionIds(). That way it is applied,
but only once at initialisation.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-04-27 at 17:24 -0400, Tom Lane wrote:
 I think we should just lose that test, as well as the variable.

 Yes, though it looks like it is still necessary in creating a valid
 initial state because otherwise we may have xids in KnownAssigned array
 that are already complete.

Huh?  How is a filter as coarse as an oldest-running-XID filter going
to prevent that?  And aren't we initializing from trustworthy data in
ProcArrayApplyRecoveryInfo, anyway?

I still say it's useless.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Schema.Table.Col resolution seems broken in Alpha5

2010-04-27 Thread Josh Berkus

Hackers,

It appears that something broke the ability to refer to columns by full
SQL path names in 9.0.  That is, references to columns as
schema.table.col will produce a completely bogus error which did not
exist on previous versions.

The following works perfectly well in 8.4:

postgres=# create table test1( id serial, val integer );
NOTICE:  CREATE TABLE will create implicit sequence test1_id_seq for
serial column test1.id
CREATE TABLE
postgres=# create table test2( id serial, val integer );
NOTICE:  CREATE TABLE will create implicit sequence test2_id_seq for
serial column test2.id
CREATE TABLE
postgres=# insert into test1(val) select gs.i from generate_series(1,10)
as gs(i);
INSERT 0 10
postgres=# insert into test2(val) select gs.i from generate_series(1,10)
as gs(i);
INSERT 0 10
postgres=# select test1.* from public.test1, public.test2 where
public.test1.id = public.test2.id;
 id | val
+-
  1 |   1
  2 |   2
  3 |   3
  4 |   4
  5 |   5
  6 |   6
  7 |   7
  8 |   8
  9 |   9
 10 |  10
(10 rows)

postgres=# update public.test1 set val=public.test2.val from
public.test2 where public.test1.id = public.test2.id;
UPDATE 10

However, it breaks in 9.0a5:

postgres=# create table test1( id serial, val integer );
NOTICE:  CREATE TABLE will create implicit sequence test1_id_seq for
serial column test1.id
CREATE TABLE
postgres=# create table test2( id serial, val integer );
NOTICE:  CREATE TABLE will create implicit sequence test2_id_seq for
serial column test2.id
CREATE TABLE
postgres=# insert into test1(val) select gs.i from generate_series(1,10)
as gs(i);
INSERT 0 10
postgres=# insert into test2(val) select gs.i from generate_series(1,10)
as gs(i);
INSERT 0 10
postgres=# select test1.* from public.test1, public.test2 where
public.test1.id = public.test2.id;
ERROR:  invalid reference to FROM-clause entry for table test1
LINE 1: ...ect test1.* from public.test1, public.test2 where public.tes...
 ^
HINT:  There is an entry for table test1, but it cannot be referenced
from this part of the query.
postgres=# select public.test1.* from public.test1, public.test2 where
public.test1.id = public.test2.id;
ERROR:  invalid reference to FROM-clause entry for table test1
LINE 1: select public.test1.* from public.test1, public.test2 where ...
   ^
HINT:  There is an entry for table test1, but it cannot be referenced
from this part of the query.
postgres=#



-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Hmm, AFAICS the problem with controldata is that it uses postgres_fe.h
 instead of postgres.h.  It's a bit of a stretch to use the latter, but
 maybe that's a better solution?  After all, it *is* poking into the
 backend internals.

I seem to recall that Solaris had problems with that due to dtrace
support or something?  However, we are doing it in pg_resetxlog,
so I suppose it's ok for pg_controldata as well.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Add column if not exists (CINE)

2010-04-27 Thread Kjell Rune Skaaraas

Hello,

I've been reading the earlier threads at:
http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
and I'm not sure I have anything that substantially new to add but:

1. I can't see there's an unambiguity about what the syntax would do. It is IF 
NOT EXISTS, not IF NOT LIKE. Anyone who shoots themselves in the foot by 
calling a CINE and thinking that a preexisting differently defined column is 
magically converted deserves it. Either it should act exactly like the non-CINE 
command, or do nothing at all as if the statement wasn't there.

2. The use case is pretty clear to me - flexible scripts that'll bring all 
earlier database versions to the latest schema. I've been experimenting in 9.0 
alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with named 
constants for a CINE effect. which as a side effect will correct any updated 
constraints too - and it works great. Unfortunately DROP COLUMN IF EXISTS then 
ADD COLUMN has the side effect of deleting all the data, so that's hardly 
usable.

I saw some indications that this might be a minority opinion, well I would like 
to cast a vote FOR this functionality. The workarounds are ugly, the solution 
simple and while I agree it's possible to misuse it, my opinion is that you 
shouldn't become a surgeon if you can't handle a scalpel. In this case I get 
the feeling I'm reading instructions on how to do surgery with a butter knife 
because we don't dare hand out anything sharper.

Regards,
Kjell Rune Skaaraas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Simon Riggs

On Tue, 2010-04-27 at 18:08 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Tue, 2010-04-27 at 17:24 -0400, Tom Lane wrote:
  I think we should just lose that test, as well as the variable.
 
  Yes, though it looks like it is still necessary in creating a valid
  initial state because otherwise we may have xids in KnownAssigned array
  that are already complete.
 
 Huh?  How is a filter as coarse as an oldest-running-XID filter going
 to prevent that?  And aren't we initializing from trustworthy data in
 ProcArrayApplyRecoveryInfo, anyway?
 
 I still say it's useless.

Quite possibly. Your looking at other code outside of this patch. I'm
happy that you do so, but is it immediately related? I can have another
look when we finish this.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct

2010-04-27 Thread Alvaro Herrera

Tom Lane escribió:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Hmm, AFAICS the problem with controldata is that it uses postgres_fe.h
  instead of postgres.h.  It's a bit of a stretch to use the latter, but
  maybe that's a better solution?  After all, it *is* poking into the
  backend internals.
 
 I seem to recall that Solaris had problems with that due to dtrace
 support or something?

Hmm, I wonder if you're referring to the fact that Zdenek wanted to
restructure the headers for something?  I don't know if this was because
of compiler issues or the binary migration tool he was working on.

 However, we are doing it in pg_resetxlog, so I suppose it's ok for
 pg_controldata as well.

I hadn't noticed that, but yes.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] road.thepath no longer in pg_stats?

2010-04-27 Thread Alvaro Herrera

Hi,

I just noticed that the road.thepath column in the regression database
is no longer in pg_stats for some reason after ANALYZE, so the example
in section 14.2 in the docs (Statistics Used by the Planner) is now
wrong.

What I'm trying to do is replace that exceedingly wide output with
something along these lines instead:

SELECT attname, n_distinct, array_to_string(most_common_vals, E'\n') as 
most_common_vals
FROM pg_stats
WHERE tablename = 'road';

but this is failing because of missing values for that column, as well
as there being two rows for the column that is there (stainherint=f
producing the other one)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] providing tokenized version of parsed SQL script (was: nodeToString format and exporting the SQL parser)

2010-04-27 Thread Jehan-Guillaume (ioguix) de Rorthais

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 25/04/2010 03:02, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Apr 24, 2010 at 8:07 PM, Bruce Momjian br...@momjian.us wrote:
 Sounds useful to me, though as a function like suggested in a later
 email.
 
 If tool-builders think this is useful, I have no problem with making
 it available.  It should be suitably disclaimed: We reserve the right
 to rip out the entire flex/yacc-based lexer and parser at any time and
 replace them with a hand-coded system written in Prolog that emits
 tokenization information only in ASN.1-encoded pig latin.  If massive
 changes in the way this function works - or its complete disappearance
 - are going to make you grumpy, don't call it.
 
 I'm a bit concerned with the vagueness of the goals here.  We started
 with a request to dump out node trees, ie, post-parsing representation;
 but the example use case of syntax highlighting would find that
 representation quite useless.  (Example: foo::bar and CAST(foo AS bar)
 yield the same parse tree.)  

Well, the tokenizer stuff was actually my understanding of the following
quote from Michael Tharp :
« ... making the internal SQL parser available to clients via a
C-language SQL function. ».

I thought Michael was trying to write a tokenizer based on node tree
returned by raw_parser. As it seems Michael is not even sure about what
he's trying to do, I prefer refocus a bit this thread

 A syntax highlighter might get some use
 out of the lexer-output token stream, but I'm afraid from the proposed
 output that people might be expecting more semantic information than
 the lexer can provide.  The lexer doesn't, for example, have any clue
 that some keywords are commands and others aren't; nor any very clear
 understanding about the semantic difference between the tokens '='
 and ';'.

Exact, a proper tokenizer function should be able to give some (simple)
information about the type of each token. That is what I tried to define
in this draft with the type field :

  = SELECT pgtokenize($script$
  SELECT 1;
  UPDATE test SET a=2;
$script$);

 type  | pos |   value  | line
  -+-+--+--
   SQL_COMMAND | 1   | 'SELECT' |   1
   CONSTANT| 8   | '1'  |   1
   DELIMITER   | 9   | ';'  |   1
   SQL_COMMAND | 11  | 'UPDATE' |   2
   IDENTIFIER  | 18  | 'test'   |   2
   SQL_KEYWORD | 23  | 'SET'|   2
   IDENTIFIER  | 27  | 'a'|   2
   OPERATOR| 30  | '='  |   2
   CONSTANT| 31  | '1'  |   2


 
 Also, if all you want is the lexer, it's not that hard to steal psql's
 version and adapt it to your purposes.  The lexer doesn't change very
 fast, and it's not that big either.

Stealing the lexer from psql is possible...for C application.
Don't know yet if we could port it to other languages easily and if a
simple lexer would really answer the use cases here.

 
 Anyway, it certainly wouldn't be hard for an add-on module to provide a
 SRF that calls the lexer (or parser) and returns some sort of tabular
 representation of the results.  I'm just not sure how useful it'll be
 in the real world.

Well, I would prefer not to tell users of pgAdmin or phpPgAdmin that
they depend on a contrib module.
Moreover, PostgreSQL already expose a lot of informations about its
internal mechanisms, configuration, ddl etc. I think having a proper
tokenizer function is just a natural new functionality for core if possible.

Having dropped an eye here and there in the parser code, I am not sure
where I could get required info and mix them to produce something close
to my draft yet.
But I prefer to discussing first before spending too much time and
throwing any potential code after...

 
   regards, tom lane

- -- 
JGuillaume (ioguix) de Rorthais
http://www.dalibo.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvXdxgACgkQxWGfaAgowiJujQCglXpCYpFttwHOkmkCd92zMxnv
r00An1sjmRrR6u61VjCtXputcNBevHsz
=ri3i
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add column if not exists (CINE)

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 6:45 PM, Kjell Rune Skaaraas kjell...@yahoo.no wrote:
 Hello,

 I've been reading the earlier threads at:
 http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
 http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
 and I'm not sure I have anything that substantially new to add but:

 1. I can't see there's an unambiguity about what the syntax would do. It is 
 IF NOT EXISTS, not IF NOT LIKE. Anyone who shoots themselves in the foot by 
 calling a CINE and thinking that a preexisting differently defined column is 
 magically converted deserves it. Either it should act exactly like the 
 non-CINE command, or do nothing at all as if the statement wasn't there.

 2. The use case is pretty clear to me - flexible scripts that'll bring all 
 earlier database versions to the latest schema. I've been experimenting in 
 9.0 alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with 
 named constants for a CINE effect. which as a side effect will correct any 
 updated constraints too - and it works great. Unfortunately DROP COLUMN IF 
 EXISTS then ADD COLUMN has the side effect of deleting all the data, so 
 that's hardly usable.

 I saw some indications that this might be a minority opinion, well I would 
 like to cast a vote FOR this functionality. The workarounds are ugly, the 
 solution simple and while I agree it's possible to misuse it, my opinion is 
 that you shouldn't become a surgeon if you can't handle a scalpel. In this 
 case I get the feeling I'm reading instructions on how to do surgery with a 
 butter knife because we don't dare hand out anything sharper.

I've already said my piece on this, but I couldn't agree more.  Well
said, and your use case is exactly the one I want it for.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: providing tokenized version of parsed SQL script (was: nodeToString format and exporting the SQL parser)

2010-04-27 Thread Robert Haas

On Tue, Apr 27, 2010 at 7:45 PM, Jehan-Guillaume (ioguix) de Rorthais
iog...@free.fr wrote:
 I thought Michael was trying to write a tokenizer based on node tree
 returned by raw_parser. As it seems Michael is not even sure about what
 he's trying to do, I prefer refocus a bit this thread
[...]
 Having dropped an eye here and there in the parser code, I am not sure
 where I could get required info and mix them to produce something close
 to my draft yet.
 But I prefer to discussing first before spending too much time and
 throwing any potential code after...

I can't quite tell, from reading this, what you're trying to do with this...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] road.thepath no longer in pg_stats?

2010-04-27 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 I just noticed that the road.thepath column in the regression database
 is no longer in pg_stats for some reason after ANALYZE, so the example
 in section 14.2 in the docs (Statistics Used by the Planner) is now
 wrong.

It's an old example :-( ... a quick check shows no version since 7.3
generates stats for that column.

 but this is failing because of missing values for that column, as well
 as there being two rows for the column that is there (stainherint=f
 producing the other one)

Two rows are expected now, since that table has children --- there
should be one for inherited = t and one for inherited = f.  Probably
wouldn't hurt to explain that explicitly here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] testing HS/SR - 1 vs 2 performance

2010-04-27 Thread Tom Lane

Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-04-27 at 18:08 -0400, Tom Lane wrote:
 Huh?  How is a filter as coarse as an oldest-running-XID filter going
 to prevent that?  And aren't we initializing from trustworthy data in
 ProcArrayApplyRecoveryInfo, anyway?
 
 I still say it's useless.

 Quite possibly. Your looking at other code outside of this patch. I'm
 happy that you do so, but is it immediately related? I can have another
 look when we finish this.

Well, it's nearby anyway.  I've committed the present patch (with a
number of fixes).  While I was looking at it I came across several
things in the existing code that I think are either wrong or at least
inadequately documented --- the above complaint is just the tip of the
iceberg.  I'm going to make another pass over it to see if I'm just
missing things, and then report back.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Schema.Table.Col resolution seems broken in Alpha5

2010-04-27 Thread Tom Lane

Josh Berkus j...@agliodbs.com writes:
 It appears that something broke the ability to refer to columns by full
 SQL path names in 9.0.  That is, references to columns as
 schema.table.col will produce a completely bogus error which did not
 exist on previous versions.

My fault :-(.  Seems we have no regression tests covering this specific
point :-( :-(

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] including PID or backend ID in relpath of temp rels

2010-04-27 Thread Robert Haas

On Sun, Apr 25, 2010 at 9:07 PM, Robert Haas robertmh...@gmail.com wrote:
 4. We could add an additional 32-bit value to RelFileNode to identify
 the backend (or a sentinel value when not temp) and create a separate
 structure XLogRelFileNode or PermRelFileNode or somesuch for use in
 contexts where no temp rels are allowed.

I experimented with this approach and created LocalRelFileNode and
GlobalRelFileNode and, for use in the buffer headers,
BufferRelFileNode (same as GlobalRelFileNode, but named differently
for clarity).  LocallRelFileNode = GlobalRelFileNode + the ID of the
owning backend for temp rels; or InvalidBackendId if referencing a
non-temporary rel.  These might not be the greatest names, but I think
the concept is good, because it really breaks the things that need to
be adjusted quite thoroughly.  In the course of repairing the damage I
came across a couple of things I wasn't sure about:

[relcache.c] RelationInitPhysicalAddr can't initialize
relation-rd_node.backend properly for a non-local temporary relation,
because that information isn't available.  But I'm not clear on why we
would need to create a relcache entry for a non-local temporary
relation.  If we do need to, then we'll probably need to store the
backend ID in pg_class.  That seems like something that would be best
avoided, all things being equal, especially since I can't see how to
generalize it to global temporary tables.

[smgr.c,inval.c] Do we need to call CacheInvalidSmgr for temporary
relations?  I think the only backend that can have an smgr reference
to a temprel other than the owning backend is bgwriter, and AFAICS
bgwriter will only have such a reference if it's responding to a
request by the owning backend to unlink the associated files, in which
case (I think) the owning backend will have no reference.

[dbsize.c] As with relcache.c, there's a problem if we're asked for
the size of a temporary relation that is not our own: we can't call
relpath() without knowing the ID of the owning backend, and there's no
way to acquire that information for pg_class.  I guess we could just
refuse to answer the question in that case, but that doesn't seem real
cool.  Or we could physically scan the directory for files that match
a suitably constructed wildcard, I suppose.

[storage.c,xact.c,twophase.c] smgrGetPendingDeletes returns via an out
parameter (its second argument) a list of RelFileNodes pending delete,
which we then write to WAL or to the two-phase state file.  Of course,
if the backend ID (or pid, but I picked backend ID somewhat
arbitrarily) is part of the filename, then we need to write that to
WAL, too.  It seems somewhat unfortunate to have to WAL-log temprels
here; as best I can tell, this is the only case where it's necessary.
 But if we implement a more general mechanism for cleaning up temp
files, then might the need to do this go away?  Not sure.

[syncscan.c] It seems we pursue this optimization even for temprels; I
can't think of why that would be useful in practice.  If it's useless
overhead, should we skip it?  This is really independent of this
project; just a side thought.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Error handling for ShmemInitStruct and ShmemInitHash

2010-04-27 Thread Tom Lane

The functions ShmemInitStruct and ShmemInitHash will return NULL on
certain failure conditions, apparently on the grounds that their caller
can print a more useful error message than they can.  A quick survey
shows that about half the callers aren't remembering to check for NULL,
and none of the other half are printing messages that are more useful
than out of shared memory (which isn't even necessarily correct).

I think that this is pretty error-prone, and that considering that
PG hackers are accustomed to not checking palloc() results, it's
inevitable that we'll make the same mistake in future if we leave
this API as it is.  I suggest making these functions throw
their own errors rather than returning NULL on failure, and removing
the redundant error reports from the callers that have 'em.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wierd quirk of HS/SR, probably not fixable

2010-04-27 Thread Fujii Masao

On Wed, Apr 28, 2010 at 4:12 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Simon Riggs wrote:
 On Tue, 2010-04-27 at 20:14 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 On Tue, 2010-04-27 at 18:13 +0300, Heikki Linnakangas wrote:
 Simon Riggs wrote:
 If pg_stop_backup() is run it creates the .backup file in the archive.
 In the absence of that file, we should be able to work out that
 pg_stop_backup() was not run.
 It's just as likely that the file is there even though the backup didn't
 finish, though.
 It's possible, but not likely. It would need to break at a very specific
 place for that to be the case. Whereas the test I explained would work
 for about 99% of the time between start and stop backup, except for the
 caveat I explained also.
 I don't understand how you arrived at that figure.

 You're talking about the backup_label file, I'm talking about
 the .backup file in the archive.

 Oh, the backup history file. We stopped relying on that with the
 introduction of the end-of-backup record, to make life easier for
 streaming replication, and because it's simpler anyway. I don't think we
 should go back to it.

Right.

When restore_command is not given, the backup history file would be
unavailable in the standby. We cannot regard the absence of the file
as non-run of pg_stop_backup().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Differential backup

2010-04-27 Thread Craig Ringer

Simon Riggs wrote:

 Is this route worthwhile? Or in some way unacceptable?

From an admin perspective, I think block-level differentials would be a
lot more useful, especially if index storage could be safely excluded.

IMO Pg really could use an index bad or missing, rebuild on postmaster
start flag so that indexes could simply be omitted from backups and
would be automatically REINDEXed on startup. That'd be *great* for
pg_start_backup() / pg_stop_backup() filesystem level backups,
especially if indexes were configured to live in another tablespace.



Another avenue possibly worth investigating may be using the in-heap
mvcc information to do SQL-level differential backups of individual
tables or of the whole database. think:

pg_dump --incremental --last-backup-id '10296:10296:'

where --last-backup-id is the output of select
txid_current_snapshot() from the last backup, and could possibly be
fished out of a header in the previous dump.

This would be *incredibly* handy for people who have one database in a
that's more important than another and needs long-term history storage,
but for whom PITR is a PITA because it's whole-cluster-or-nothing.

This is trivial to do for individual append-only tables. I was trying to
figure out how to handle updates/deletes but quickly found myself
completely stumped.

I'd be surprised if this hasn't been looked at and put in the
impossible or too hard pile, but thought it was worth mentioning on
the off chance.


-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add column if not exists (CINE)

2010-04-27 Thread Takahiro Itagaki


Kjell Rune Skaaraas kjell...@yahoo.no wrote:

 I've been reading the earlier threads at:
 http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
 http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
 and I'm not sure I have anything that substantially new to add but:
 
 I saw some indications that this might be a minority opinion,
 well I would like to cast a vote FOR this functionality.

+1 for CINE, just because MySQL supports it.

But before developing, we need to decide how to handle an added object
that has the same name but has different definitions. 

Also, developers should consider not only ADD COLUMN but also other
CREATE or ADD commands. The patch will be large, including documentation
adjustments in many places -- it would be hard work.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

78 matches

Mail list logo