Re: [HACKERS] system views for walsender activity

2010-06-22 Thread Guillaume Lelarge
Le 22/06/2010 06:40, Takahiro Itagaki a écrit :
 [...]
 Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm of the opinion that this is a 9.1 problem.  It needs more thought
 than we can put into it now --- one obvious question is what about
 monitoring on the slave side?  Another is who should be able to see the
 data?
 
 Sure. We should research user's demands for monitoring and management
 of replication. I'll report some voices from users as of this moment:
 
 * Managers often ask DBAs How long standby servers are behind the master?
   We should provide such methods for DBAs. We have pg_xlog_location()
   functions, but they should be improved for:
 - The returned values are xxx/yyy texts, but more useful information
   is the difference of two values. Subtraction functions are required.
 - For easier management, the master server should provide not only
   sent/flush locations but also received/replayed locations for each
   standby servers. Users don't want to access both master and slaves.
 
 * Some developers want to pause and restart replication from the master
   server. They're going to use replication for application version
   managements. They'll pause all replications, and test their new features
   at the master, and restart replication to spread the changes to slaves.
 

I agree on these two.

Something I found lacking when I added support for Hot Standby /
Streaming Replication in pgAdmin (that was a really small patch, there
was not a lot to do) was that one cannot get the actual value of each
recovery.conf parameter. Try a SHOW primary_conninfo; and it will
juste reply that primary_conninfo is an unknown parameter. I already
talked about this to Heikki, but didn't get a chance to actually look at
the code.


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-06-22 Thread Simon Riggs
On Mon, 2010-06-21 at 20:53 -0400, Robert Haas wrote:
 On Mon, Jun 21, 2010 at 7:51 PM, gabrielle gor...@gmail.com wrote:
  On Thu, 2010-06-17 at 14:50 -0400, Alvaro Herrera asked:
  How does it play with ON_ERROR_STOP/ROLLBACK?
 
  With ON_ERROR_STOP=ON, psql issues an error when it encounters one,
  stops processing the file that contains the error, and then continues
  to process any remaining files.

That would be undesirable.

  I'm still investigating ON_ERROR_ROLLBACK.  I need to tinker with it
  some more before I say anything concrete.
 
  On Fri, Jun 18, 2010 at 1:48 AM, Simon Riggs si...@2ndquadrant.com wrote:
  Also, how does it play with --single-transaction.
  That was buried in our original report :) BEGIN-COMMIT statements
  within the files cause warnings when the command is wrapped in a
  transaction with the -1 switch (as specified in the patch submission)
 
  To expand upon that a bit:  when psql encounters a file that contains
  a BEGIN statement, you get the expected WARNING: there is already a
  transaction in progress message.  The COMMIT at the end of that file
  (assuming the user doesn't forget it) generates a COMMIT.  Commands
  after that commit, or in any remaining files to be processed, are
  dealt with according to the user's autocommit settings:
  - if autocommit is ON, statements in the remaining files are processed
   committed;  the implicit COMMIT at the end of the whole thing then
  generates a WARNING: there is no transaction in progress message
  - if autocommit is OFF, statements in the remaining files generate
  ERROR:  current transaction is aborted, commands ignored until end of
  transaction block messages.

This is the existing behaviour.

 So none of the above sounds like desired behavior to me...  is that just me?

Single transaction needs some help, but that's not the fault of this
patch.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2010-06-22 Thread Simon Riggs
On Tue, 2010-06-22 at 09:54 +0200, Guillaume Lelarge wrote:
 I added support for Hot Standby /
 Streaming Replication in pgAdmin (that was a really small patch, there
 was not a lot to do)

Well done.

Does this mean that pgAdmin has a read only mode now?

What are the details of that support? I couldn't easily see the commits
in the pgadmin list.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Mon, Feb 15, 2010 at 8:58 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Feb 16, 2010 at 1:18 AM, Robert Haas robertmh...@gmail.com wrote:
 I'm all for this as a 9.1 submission, but let's not commit to trying to
 debug it now.  I would like a green buildfarm for awhile before we wrap
 alpha4, and this sort of untested it can't hurt patch is exactly what
 is likely to make things not green.

 Mmm.  OK, fair enough.

 Okay. I added the patch to the first CF for v9.1.
 Let's discuss about it later.

There is talk of applying this patch, or something like it, for 9.0,
so I guess now is the time for discussion.  The overriding issue is
that we need walreceiver to notice if the master goes away.  Rereading
this thread, the major argument against applying this patch is that it
changes the default behavior: it ALWAYS enables keepalives, and then
additionally provides libpq parameters to change some related
parameters (number of seconds between sending keepalives, number of
seconds after which to retransmit a keepalive, number of lost
keepalives after which a connection is declared dead).  Although the
consensus seems to be that keepalives are a good idea much more often
than not, I am wary of unconditionally turning on a behavior that has,
in previous releases, been unconditionally turned off.  I don't want
to do this in 9.0, and I don't think I want to do it in 9.1, either.

What I think would make sense is to add an option to control whether
keepalives get turned on.   If you say keepalives=1, you get on = 1;
setsockopt(conn-sock, SOL_SOCKET, SO_KEEPALIVE,
(char *) on, sizeof(on); if you say keepalives=0, we do nothing
special.  If you say neither, you get the default behavior, which I'm
inclined to make keepalives=1.  That way, everyone gets the benefit of
this patch (keepalives turned on) by default, but if for some reason
someone is using libpq over the deep-space network or a connection for
which they pay by the byte, they can easily shut it off.  We can note
the behavior change under observe the following incompatibilities.

I am inclined to punt the keepalives_interval, keepalives_idle, and
keepalives_count parameters to 9.1.  If these are needed for
walreciever to work reliably, this whole approach is a dead-end,
because those parameters are not portable.  I will post a patch later
today along these lines.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Magnus Hagander
On Tue, Jun 22, 2010 at 15:20, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Feb 15, 2010 at 8:58 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Feb 16, 2010 at 1:18 AM, Robert Haas robertmh...@gmail.com wrote:
 I'm all for this as a 9.1 submission, but let's not commit to trying to
 debug it now.  I would like a green buildfarm for awhile before we wrap
 alpha4, and this sort of untested it can't hurt patch is exactly what
 is likely to make things not green.

 Mmm.  OK, fair enough.

 Okay. I added the patch to the first CF for v9.1.
 Let's discuss about it later.

 There is talk of applying this patch, or something like it, for 9.0,
 so I guess now is the time for discussion.  The overriding issue is
 that we need walreceiver to notice if the master goes away.  Rereading
 this thread, the major argument against applying this patch is that it
 changes the default behavior: it ALWAYS enables keepalives, and then
 additionally provides libpq parameters to change some related
 parameters (number of seconds between sending keepalives, number of
 seconds after which to retransmit a keepalive, number of lost
 keepalives after which a connection is declared dead).  Although the
 consensus seems to be that keepalives are a good idea much more often
 than not, I am wary of unconditionally turning on a behavior that has,
 in previous releases, been unconditionally turned off.  I don't want
 to do this in 9.0, and I don't think I want to do it in 9.1, either.

 What I think would make sense is to add an option to control whether
 keepalives get turned on.   If you say keepalives=1, you get on = 1;
 setsockopt(conn-sock, SOL_SOCKET, SO_KEEPALIVE,
 (char *) on, sizeof(on); if you say keepalives=0, we do nothing
 special.  If you say neither, you get the default behavior, which I'm
 inclined to make keepalives=1.  That way, everyone gets the benefit of
 this patch (keepalives turned on) by default, but if for some reason
 someone is using libpq over the deep-space network or a connection for
 which they pay by the byte, they can easily shut it off.  We can note
 the behavior change under observe the following incompatibilities.

+1 on enabling it by default, but providing a switch to turn it off.


 I am inclined to punt the keepalives_interval, keepalives_idle, and
 keepalives_count parameters to 9.1.  If these are needed for
 walreciever to work reliably, this whole approach is a dead-end,
 because those parameters are not portable.  I will post a patch later
 today along these lines.

Do we know how unportable? If it still helps the majority, it might be
worth doing. But I agree, if it's not really needed for walreceiver,
then it should be punted to 9.1.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2010-06-22 Thread Guillaume Lelarge
Le 22/06/2010 12:42, Simon Riggs a écrit :
 On Tue, 2010-06-22 at 12:19 +0200, Guillaume Lelarge wrote:
 Shamely simple : I only added some informations on the server's
 properties. See
 http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
 display the fact that the server is (or isn't) in recovery, and the
 result of the two admin functions (receive and replay location). 
 
 If you store the is-in-Recovery result you could set the .enabled
 property of many of the dialog boxes. I think its going to be painful
 for people to attempt to submit a DDL command and get an error.
 

That's what I first thought. But it would be weird that we disabled all
the OK button of the dialog properties only for hotstandby servers, but
not when a user doesn't have the permission. At least, that was the
reasonning I had at the time.

 Too bad the other admin functions aren't there, I could have used them
 (and hope to do so in 9.1). Too bad also we cannot know the primary
 server from a connection to the slave (that's why I would love to get
 the value of
 primary_conninfo, to found the alias/IP of the primary server).
 
 Agreed
 

:)


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2010-06-22 Thread Guillaume Lelarge
Le 22/06/2010 11:41, Simon Riggs a écrit :
 On Tue, 2010-06-22 at 09:54 +0200, Guillaume Lelarge wrote:
 I added support for Hot Standby /
 Streaming Replication in pgAdmin (that was a really small patch, there
 was not a lot to do)
 
 Well done.
 
 Does this mean that pgAdmin has a read only mode now?
 

Nope, it does not really have one. Though I intend to work on having
pgAdmin more aware of the actual rights of the connected user (allowing
him to get to display the create table dialog when we should already
know he cannot is an issue, at least to me).

 What are the details of that support? I couldn't easily see the commits
 in the pgadmin list.
 

Shamely simple : I only added some informations on the server's
properties. See
http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
display the fact that the server is (or isn't) in recovery, and the
result of the two admin functions (receive and replay location). Too bad
the other admin functions aren't there, I could have used them (and hope
to do so in 9.1). Too bad also we cannot know the primary server from a
connection to the slave (that's why I would love to get the value of
primary_conninfo, to found the alias/IP of the primary server).


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2010-06-22 Thread Simon Riggs
On Tue, 2010-06-22 at 12:19 +0200, Guillaume Lelarge wrote:
 Shamely simple : I only added some informations on the server's
 properties. See
 http://www.pgadmin.org/images/visualtour12/visualtour08.jpg. We only
 display the fact that the server is (or isn't) in recovery, and the
 result of the two admin functions (receive and replay location). 

If you store the is-in-Recovery result you could set the .enabled
property of many of the dialog boxes. I think its going to be painful
for people to attempt to submit a DDL command and get an error.

 Too bad the other admin functions aren't there, I could have used them
 (and hope to do so in 9.1). Too bad also we cannot know the primary
 server from a connection to the slave (that's why I would love to get
 the value of
 primary_conninfo, to found the alias/IP of the primary server).

Agreed

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 9:27 AM, Magnus Hagander mag...@hagander.net wrote:
 I am inclined to punt the keepalives_interval, keepalives_idle, and
 keepalives_count parameters to 9.1.  If these are needed for
 walreciever to work reliably, this whole approach is a dead-end,
 because those parameters are not portable.  I will post a patch later
 today along these lines.

 Do we know how unportable? If it still helps the majority, it might be
 worth doing. But I agree, if it's not really needed for walreceiver,
 then it should be punted to 9.1.

This might not be such a good idea as I had thought.  It looks like
the default parameters on Linux (Fedora 12) are:

tcp_keepalive_intvl:75
tcp_keepalive_probes:9
tcp_keepalive_time:7200

[ See also http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html ]

That's clearly better than no keepalives, but I venture to say it's
not going to be anything close to the behavior people want for
walreceiver...  I think we're going to need to either vastly reduce
the keepalive time and interval, or abandon the strategy of using TCP
keepalives completely.

Which brings us to the question of portability.  A quick search around
the Internet suggests that this is supported on recent versions of
Linux, Free/OpenBSD, AIX, and HP/UX, and it appears to work on my Mac
also.  I'm not clear how long it's been implemented on each of these
platforms, though.  With respect to Windows, it looks like there are
registry settings for all of these parameters, but I'm unclear whether
they can be set on a per-connection basis and what's required to make
this happen.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade issues

2010-06-22 Thread Bruce Momjian
depst...@alliedtesting.com wrote:
 PostgreSQL 9.0 beta 2 Windows XP Professional SP2
 
 While migrating the database from 8.4 to 9.0 using pg_upgrade (now part
 of the Postgres project), the following issues came up:
 
 1. When using the --logfile option, pg_upgrade quits with an error like
 this:
 
 The process cannot access the file because it is being used by another
 process.
 
 There were problems executing C:\PostgreSQL\8.4\bin/pg_ctl -l
 pg_upgrade.log  -D D:\PostgreSQL84_matlab1b -o -p 5432 -c
 autovacuum=off -c autovacuum_free ze_max_age=20 start 
 pg_upgrade.log 21

[ Email moved to hackers list.]

Ah, interesting.  I did some research and it turns out there is no way to
send server output and pg_ctl output to the same file on Win32.  I have
updated the pg_ctl docs to reflect this (first attached patch).  I have
also developed a patch (second attached patch) that sends pg_ctl output
to 'nul' on win32 so we can store the server output in the log file.

 2. Although pg_upgrade has a username option, this option is not used
 when pg_dump is called, which tries to use the current logon account
 name as the user name. As a result, pg_upgrade can only be used when
 the command shell is launched under the postgres account. (I know that
 this is mentioned on the doc page, but this doesn't seem right).

Thanks, fixed in second attached patch.

 3. The old database had the pgadmin debugger installed. The module is
 part of 8.4 and 9.0 distributions for Windows. However, pg_upgrade
 reported the following error:
 
 ERROR:  could not load library C:/PostgreSQL/9.0/lib/pldbgapi.dll:
 The specified module could not be found.
 
 I had to uninstall the debugger from the old database before I could
 proceed.

Uh, pg_upgrade has to have the same dll's, so I suggest you install that
into the new server the same way you did in the old server.  If it was
installed by the old installer, the new installer should have done the
same.

--
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: ref/pg_ctl-ref.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_ctl-ref.sgml,v
retrieving revision 1.50
diff -c -r1.50 pg_ctl-ref.sgml
*** ref/pg_ctl-ref.sgml	8 Apr 2010 01:39:37 -	1.50
--- ref/pg_ctl-ref.sgml	22 Jun 2010 16:18:31 -
***
*** 134,149 
  
para
 In optionstart/option mode, a new server is launched.  The
!server is started in the background, and standard input is attached to
!filename/dev/null/filename.  The standard output and standard
!error are either appended to a log file (if the option-l/option
!option is used), or redirected to applicationpg_ctl/application's 
!standard output (not standard error).  If no log file is chosen, the 
!standard output of applicationpg_ctl/application should be redirected 
!to a file or piped to another process such as a log rotating program
!like applicationrotatelogs/; otherwise commandpostgres/command 
!will write its output to the controlling terminal (from the background) 
!and will not leave the shell's process group.
/para
  
para
--- 134,152 
  
para
 In optionstart/option mode, a new server is launched.  The
!server is started in the background, and standard input is attached
!to filename/dev/null/filename (or literalnul/ on Windows).
!On Unix-like systems, by default, the server's standard output and
!standard error are send to applicationpg_ctl/application's
!standard output (not standard error).  The standard output of
!applicationpg_ctl/application should then be redirected to a
!file or piped to another process such as a log rotating program
!like applicationrotatelogs/; otherwise commandpostgres/command
!will write its output to the controlling terminal (from the
!background) and will not leave the shell's process group.  On
!Windows, by default the server's standard output and standard error
!are sent to the terminal.  These default  behaviors can be changed
!by using option-l/option to append server output to a log file.
/para
  
para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Magnus Hagander
On Tue, Jun 22, 2010 at 18:16, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 22, 2010 at 9:27 AM, Magnus Hagander mag...@hagander.net wrote:
 I am inclined to punt the keepalives_interval, keepalives_idle, and
 keepalives_count parameters to 9.1.  If these are needed for
 walreciever to work reliably, this whole approach is a dead-end,
 because those parameters are not portable.  I will post a patch later
 today along these lines.

 Do we know how unportable? If it still helps the majority, it might be
 worth doing. But I agree, if it's not really needed for walreceiver,
 then it should be punted to 9.1.

 This might not be such a good idea as I had thought.  It looks like
 the default parameters on Linux (Fedora 12) are:

 tcp_keepalive_intvl:75
 tcp_keepalive_probes:9
 tcp_keepalive_time:7200

 [ See also http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html ]

 That's clearly better than no keepalives, but I venture to say it's
 not going to be anything close to the behavior people want for
 walreceiver...  I think we're going to need to either vastly reduce
 the keepalive time and interval, or abandon the strategy of using TCP
 keepalives completely.

 Which brings us to the question of portability.  A quick search around
 the Internet suggests that this is supported on recent versions of
 Linux, Free/OpenBSD, AIX, and HP/UX, and it appears to work on my Mac
 also.  I'm not clear how long it's been implemented on each of these
 platforms, though.  With respect to Windows, it looks like there are
 registry settings for all of these parameters, but I'm unclear whether
 they can be set on a per-connection basis and what's required to make
 this happen.

I looked around quickly earlier when we chatted about this, and I
think I found an API call to change them for a socket as well - but a
Windows specific one, not the ones you'd find on Unix...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 12:32 PM, Magnus Hagander mag...@hagander.net wrote:
 Which brings us to the question of portability.  A quick search around
 the Internet suggests that this is supported on recent versions of
 Linux, Free/OpenBSD, AIX, and HP/UX, and it appears to work on my Mac
 also.  I'm not clear how long it's been implemented on each of these
 platforms, though.  With respect to Windows, it looks like there are
 registry settings for all of these parameters, but I'm unclear whether
 they can be set on a per-connection basis and what's required to make
 this happen.

 I looked around quickly earlier when we chatted about this, and I
 think I found an API call to change them for a socket as well - but a
 Windows specific one, not the ones you'd find on Unix...

That, in itself, doesn't bother me, especially if you're willing to
write and test a patch that uses them.

What does bother me is the fact that we are engineering a critical
aspect of our system reliability around vendor-specific implementation
details of the TCP stack, and that if any version of any operating
system that we support (or ever wish to support in the future) fails
to have a reliable implementation of this feature AND configurable
knobs that we can tune to suit our needs, then we're screwed.  Does
anyone want to argue that this is NOT a house of cards?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade issues

2010-06-22 Thread Bruce Momjian
depst...@alliedtesting.com wrote:
 Another issue:
 
 4. The --link option doesn't seem to work on Windows: pg_upgrade still
 copies data from the old cluster to the new. There doesn't appear to
 be a way to upgrade a database on Windows without copying the entire
 uncompressed database, which can be a problem where disk space is
 limited.

[ Email moved to hackers list.]

I am confused why you are seeing this behavior.  I know my Win32 tester
had it working.  We have this code:

#ifdef WIN32
static int
win32_pghardlink(const char *src, const char *dst)
{
/*
 * CreateHardLinkA returns zero for failure
 * http://msdn.microsoft.com/en-us/library/aa363860(VS.85).aspx
 */
if (CreateHardLinkA(dst, src, NULL) == 0)
return -1;
else
return 0;
}
#endif

and we test for failures.  We even have code that tests to make sure
hard links work before we start the migration.  In fact, pg_upgrade
--check will perform the hard link test without performing the upgrade.

I assume you don't have the ability to try a patch that would exit just
before the hard link test removes its test file.

--
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 What does bother me is the fact that we are engineering a critical
 aspect of our system reliability around vendor-specific implementation
 details of the TCP stack, and that if any version of any operating
 system that we support (or ever wish to support in the future) fails
 to have a reliable implementation of this feature AND configurable
 knobs that we can tune to suit our needs, then we're screwed.  Does
 anyone want to argue that this is NOT a house of cards?

By that argument, we need to be programming to bare metal on every disk
access.  Does anyone want to argue that depending on vendor-specific
filesystem functionality is not a house of cards?  (And unfortunately,
that's much too close to the truth ... but yet we're not going there.)

As for the original point: *of course* we are going to have to expose
the keepalive parameters.  The default timeouts are specified by RFC,
and they're of the order of hours.  That's not going to satisfy anyone
for this usage.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 What does bother me is the fact that we are engineering a critical
 aspect of our system reliability around vendor-specific
 implementation details of the TCP stack, and that if any version
 of any operating system that we support (or ever wish to support
 in the future) fails to have a reliable implementation of this
 feature AND configurable knobs that we can tune to suit our needs,
 then we're screwed. Does anyone want to argue that this is NOT a
 house of cards?
 
[/me raises hand]
 
TCP keepalive has been available and a useful part of my reliability
solutions since I had so find a way to clean up zombie database
connections caused by clients powering down their workstations
without closing their apps -- that was in OS/2 circa 1990.  I'm
pretty sure I've also used it on HP-UX, whatever Unix flavor was on
our Sun SPARC servers, several versions of Windows, and several
versions of Linux. As far as I can recall, the default was always
two hours before doing anything, followed by nine small packets sent
over the course of ten minutes before giving up (if none were
answered).
 
I'm not sure whether the timings were controllable through the
applications, because we generally changed the OS defaults.  Even
so, recovery after two hours and ten minutes is way better than
waiting for eternity.
 
As someone else said, we may want to add some sort of keepalive-
style ping to our application's home-grown protocol; but I don't see
that as an argument to suppress a very widely supported standard
protocol.  These address slightly different problem sets, let's
solve the one that came up in testing for the vast majority of
runtime environments by turning on TCP keepalives.
 
No, I don't see it as a house of cards.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 What does bother me is the fact that we are engineering a critical
 aspect of our system reliability around vendor-specific implementation
 details of the TCP stack, and that if any version of any operating
 system that we support (or ever wish to support in the future) fails
 to have a reliable implementation of this feature AND configurable
 knobs that we can tune to suit our needs, then we're screwed.  Does
 anyone want to argue that this is NOT a house of cards?

 By that argument, we need to be programming to bare metal on every disk
 access.  Does anyone want to argue that depending on vendor-specific
 filesystem functionality is not a house of cards?  (And unfortunately,
 that's much too close to the truth ... but yet we're not going there.)

I think you're making my argument for me.  The file system API is far
more portable than the behavior we're proposing to depend on here, and
yet it's only arguably good enough to meet our needs.

 As for the original point: *of course* we are going to have to expose
 the keepalive parameters.  The default timeouts are specified by RFC,
 and they're of the order of hours.  That's not going to satisfy anyone
 for this usage.

So I see.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 By that argument, we need to be programming to bare metal on every disk
 access.  Does anyone want to argue that depending on vendor-specific
 filesystem functionality is not a house of cards?  (And unfortunately,
 that's much too close to the truth ... but yet we're not going there.)

 I think you're making my argument for me.  The file system API is far
 more portable than the behavior we're proposing to depend on here, and
 yet it's only arguably good enough to meet our needs.

Uh, it's not API that's at issue here, and as for not portable I think
you have failed to make that case.  It is true that there are some old
platforms where keepalive isn't adjustable, but I doubt that anything
anyone is likely to be running mission-critical PG 9.0 on will lack it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-22 Thread Mike Fowler



Look at how the POSITION() pseudofunction is defined around gram.y
line 9651.  Essentially any special syntax of this type gets converted
to a regular function call internally.  So in your case I think there
will be some function that gets called something ike this:

xmlexists(xpath_expression, xml_expression)

...but the grammar can be modified to allow a different syntax for
that function call.
  
I've finally managed to get gram.y to parse the syntax correctly. After 
progressing from a segmentation fault that occured when the grammar was 
correct I'm now left with a cryptic error:


xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers' 
PASSING BY REF data);

ERROR:  unrecognized node type: 1852140847

At a guess there is another step that I need to do after modifying 
gram.y. One mailing list posting I found mentioned copyfuncs.c but 
really I'm unsure as to what next. Anyone know what the missing step is?


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing else in postmaster.c?

2010-06-22 Thread Alvaro Herrera
Excerpts from Robert Haas's message of lun jun 21 22:18:46 -0400 2010:
 In pmdie(), we have the following code, which doesn't seem to make
 much sense.  If the state is PM_RECOVERY at the top of this section it
 will get changed to PM_WAIT_BACKENDS and then to PM_WAIT_BACKENDS
 again.  Either the two if statements should be merged (and both bits
 should be handled with the same block of code) or the second one
 should say else if.  Or at least, I think so...

I'd just move the first block below the second one.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Florian Pflug
On Jun 22, 2010, at 18:43 , Robert Haas wrote:
 What does bother me is the fact that we are engineering a critical
 aspect of our system reliability around vendor-specific implementation
 details of the TCP stack, and that if any version of any operating
 system that we support (or ever wish to support in the future) fails
 to have a reliable implementation of this feature AND configurable
 knobs that we can tune to suit our needs, then we're screwed.  Does
 anyone want to argue that this is NOT a house of cards?


We already depend on TCP keepalives to prevent backends orphaned by client 
crashes or network outages from lingering around forever. If such a lingering 
backend is inside a transaction, I'll cause table bloat, prevent clog 
truncations, and keep tables locked forever.

I'd therefore argue that lingering backends are as least as severe a problem as 
hung S/R connections are. Since we've trusted keepalives to prevent the former 
for 10 years now, I think we can risk trusting keepalives to prevent the latter 
too.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 1:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 By that argument, we need to be programming to bare metal on every disk
 access.  Does anyone want to argue that depending on vendor-specific
 filesystem functionality is not a house of cards?  (And unfortunately,
 that's much too close to the truth ... but yet we're not going there.)

 I think you're making my argument for me.  The file system API is far
 more portable than the behavior we're proposing to depend on here, and
 yet it's only arguably good enough to meet our needs.

 Uh, it's not API that's at issue here, and as for not portable I think
 you have failed to make that case. It is true that there are some old
 platforms where keepalive isn't adjustable, but I doubt that anything
 anyone is likely to be running mission-critical PG 9.0 on will lack it.

I don't think the burden of proof is on me to demonstrate that there's
a case where this feature isn't available - we're usually quite
reluctant to take advantage of platform-specific features unless we
have strong evidence that they are fully portable across our entire
set of supported platforms.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 1:17 PM, Mike Fowler m...@mlfowler.com wrote:

 Look at how the POSITION() pseudofunction is defined around gram.y
 line 9651.  Essentially any special syntax of this type gets converted
 to a regular function call internally.  So in your case I think there
 will be some function that gets called something ike this:

 xmlexists(xpath_expression, xml_expression)

 ...but the grammar can be modified to allow a different syntax for
 that function call.


 I've finally managed to get gram.y to parse the syntax correctly. After
 progressing from a segmentation fault that occured when the grammar was
 correct I'm now left with a cryptic error:

 xmltest=# SELECT COUNT(id) FROM xmltest WHERE xmlexists('/menu/beers'
 PASSING BY REF data);
 ERROR:  unrecognized node type: 1852140847

 At a guess there is another step that I need to do after modifying gram.y.
 One mailing list posting I found mentioned copyfuncs.c but really I'm unsure
 as to what next. Anyone know what the missing step is?

I usually troubleshoot things like this by setting a breakpoint in
elog_start or elog_finish.  Then you can see where it's blowing up.
Off the top of my head, I would guess you've added a node type whose
structure definition doesn't begin with NodeTag, or else you've got a
memory clobber.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Josh Berkus
All,

If we *don't* rely on tcp-keepalive for terminating SR connections where
the master is dead, what is the alternative?  That issue, IMHO, is a
blocker for 9.0.

If tcp-keepalives are the only idea we have, then we need to work around
the limitations and implement them.

I'll also point out that keepalives are already a supported feature for
production PostgreSQL on the server side, so I don't see that adding
them for libpq is a big deal.  We might not want to enable them by
default, though.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 We might not want to enable them by default, though.
 
I have a hard time believing that enabled by default is a problem
with the default timings.  That would result in sending and
receiving one small packet every two hours on an open connection
with no application traffic.
 
In what environment do you see that causing a problem (compared to
no keepalive)?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Josh Berkus

 In what environment do you see that causing a problem (compared to
 no keepalive)?

If it were Alpha3 right now, I'd have no issue with it, and if we're
talking about it for 9.1 I'd have no issue with it.  I am, however,
extremely reluctant to introduce a default behavior change for Beta3.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 1:32 PM, Robert Haas robertmh...@gmail.com wrote:
 I don't think the burden of proof is on me to demonstrate that there's
 a case where this feature isn't available - we're usually quite
 reluctant to take advantage of platform-specific features unless we
 have strong evidence that they are fully portable across our entire
 set of supported platforms.

Either I'm doing something wrong, or this doesn't work on Fedora 12.
I can adjust the system-wide settings by writing to the /proc
filesystem, but setsockopt() blows up (setting keepalives is fine, but
changing the subsidiary parameters does not seem to work).

[rh...@f12dev pgsql]$ uname -a
Linux f12dev 2.6.32.11-99.fc12.x86_64 #1 SMP Mon Apr 5 19:59:38 UTC
2010 x86_64 x86_64 x86_64 GNU/Linux
[rh...@f12dev pgsql]$ psql -l 'keepalives_idle=30'
psql: setsockopt(TCP_KEEPIDLE) failed: Operation not supported
[rh...@f12dev pgsql]$ psql -l 'keepalives_interval=10'
psql: setsockopt(TCP_KEEPINTVL) failed: Operation not supported
[rh...@f12dev pgsql]$ psql -l 'keepalives_count=5'
psql: setsockopt(TCP_KEEPCNT) failed: Operation not supported

WIP patch attached, based on a previous version by Fujii Masao.  Note
that the same commands work OK on MacOS X 10.6.3.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


libpq-optional-keepalive.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas robertmh...@gmail.com wrote:
 Either I'm doing something wrong,

I think it's this one.  Stand by.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parallel pg_restore versus old dump files

2010-06-22 Thread Tom Lane
I've dug into the problem reported by Igor Neyman:
http://archives.postgresql.org/pgsql-admin/2010-06/msg00148.php
Unlike previous complainants, Igor was kind enough to supply a pg_dump
archive file that triggers the problem.  What I find is that his dump
file contains no data offsets, ie, dataState == K_OFFSET_POS_NOT_SET
for every TABLE DATA item.  This causes _PrintTocData to take the same
path taken for a non-seekable input file, ie, search forward looking for
the desired item.  In a parallel restore, all threads will start from
the same file location, right after the last serially-restored item.
Therefore, of course every one of them fails, except for the one told
to process the very first parallel-restore item.

The reason the dump file contains no offsets is that pg_dump can't write
them unless it thinks the dump file is seekable *at dump time* ---
otherwise it can't rewind to modify the dump's table of contents.
And guess what: pre-8.4 pg_dump on Windows will NEVER believe that the
output file is seekable, because we didn't bother to define HAVE_FSEEKO
in the Windows port until 8.4.

In short, parallel pg_restore is guaranteed to fail on any input file
made with a pre-8.4 pg_dump on Windows.  It may be that there's some
other mechanism involved in the reports we've gotten of parallel restore
failing only some of the time, but I'm thinking that the heretofore
unrecognized dependency on pg_dump-time seekability could well explain
those too.

I see several action items here:

1. The error message emitted by _PrintTocData is incredibly misleading.
It needs to be fixed to tell people if the problem is lack of data
offsets rather than lack of seek capability.

2. The reason that _PrintTocData thinks it's an error to hit a
restorable data item other than the one it wants is that, lacking seek
capability, there'd be no way to rewind to get at that data item later.
However, this is only an issue in serial restore.  In a parallel restore
worker thread, we're not going to need to seek back on that file pointer
anyway, so we should just allow the code to continue forward.  There
seem to be two plausible ways of implementing that:

* Just skip the error test altogether if in a worker child.

* Modify the error test so that the only data item considered
  wanted is the specific one the current worker wants.

The existing parallel restore logic in pg_backup_archiver.c doesn't
appear to export enough state to allow either of these strategies to be
implemented.  In the Unix implementation I'd be inclined to export the
state by creating a suitable static variable, but that's not going to
work in the thread-based Windows code.  It looks like we'd need some
thread-local storage which the current code hasn't got any of.

Another possibility is to just remove the inside-the-loop error test
altogether: make it just skip till it finds the desired item, and only
throw an error if it hits EOF without finding it.  In the case that
the error test is trying to catch, this would mean significantly more
work done before reporting the error, but do we really care?  I'm
leaning to this solution because it would not require exporting state
from the parallel restore control logic.

3. Perhaps pg_dump ought to emit a warning when it can't seek, instead
of just silently not writing the data offsets.  That behavior was okay
before when lack of data offsets didn't really matter that much, but
lack of data offsets is a serious performance handicap for parallel
restore even after we fix the outright failure condition (because each
worker is going to read through a lot of data to find what it needs).

4. Is there any value in back-porting the Windows FSEEKO support into
8.3 and 8.2?  Arguably, not writing the data offsets is a performance
bug.  However a back-port won't do anything for people who are dumping
with less than the latest minor release of pg_dump, so doing this might
be largely wasted effort.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 3:45 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas robertmh...@gmail.com wrote:
 Either I'm doing something wrong,

 I think it's this one.  Stand by.

OK, here's a new version with several fewer bugs.  This does appear to
work on both Linux and MacOS now, which are the platforms I have
handy, and it does in fact solve the problem with walreceiver given
the following contents for recovery.conf:

primary_conninfo='host=192.168.84.136 keepalives_count=5
keepalives_interval=10 keepalives_idle=30'
standby_mode='on'

In theory, we could apply this as-is and call it good: if you want
master failures to be detected faster than they will be with the
default keepalive settings, do the above (assuming your platform
supports it).  Or we could try to be more clever, though the exact
shape of that cleverness is not obvious to me at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


libpq-optional-keepalive-v2.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-22 Thread Alexander Korotkov
 On Tue, Jun 22, 2010 at 1:58 AM, Robert Haas robertmh...@gmail.com wrote:

 It doesn't?   I didn't think it was making any assumptions about the
 ordering data type beyond the fact that it had a default btree
 opclass.

Actually, the return type of consistent method was replaced by float8.
Negative values are used for unconsistent state. Non-negative values are
used for consistent and ordering.


Re: [HACKERS] Parallel pg_restore versus old dump files

2010-06-22 Thread Greg Stark
On Tue, Jun 22, 2010 at 9:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 3. Perhaps pg_dump ought to emit a warning when it can't seek, instead
 of just silently not writing the data offsets.  That behavior was okay
 before when lack of data offsets didn't really matter that much, but
 lack of data offsets is a serious performance handicap for parallel
 restore even after we fix the outright failure condition (because each
 worker is going to read through a lot of data to find what it needs).


I'm not terribly familiar with the pg_dump format, but... the usual
strategy for storing a TOC on a non-seekable output stream is to store
it at the end of the file. So you just accumulate all the offsets in
memory as you generate the file and then write the TOC at the end. Of
course you need a seekable input stream when you load it then but it
would narrow the slow case to when you have a non-seekable output
stream when dumping *and* a non-seekable input stream on restore.

On the other hand if we didn't notice this dependency when there was
only one variable making it depend on two variables would make it that
much more obscure when the slow case hits and users wonder why the
restore is taking so long.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel pg_restore versus old dump files

2010-06-22 Thread Andrew Dunstan



Tom Lane wrote:

In short, parallel pg_restore is guaranteed to fail on any input file
made with a pre-8.4 pg_dump on Windows.  It may be that there's some
other mechanism involved in the reports we've gotten of parallel restore
failing only some of the time, but I'm thinking that the heretofore
unrecognized dependency on pg_dump-time seekability could well explain
those too.
  



IIRC, you can reproduce this on Unix too by sending the output of 
pg_dump into a pipe. So it's not uniquely a Windows problem.


As Greg suggests, the solution would be to have a second TOC at the end 
of the file with the offsets. But I think that's way beyond what we 
should do on the back branches, and really beyond what we should do for 
9.0. We should document the limitation.



I see several action items here:

1. The error message emitted by _PrintTocData is incredibly misleading.
It needs to be fixed to tell people if the problem is lack of data
offsets rather than lack of seek capability.
  


Agreed.


Another possibility is to just remove the inside-the-loop error test
altogether: make it just skip till it finds the desired item, and only
throw an error if it hits EOF without finding it.  In the case that
the error test is trying to catch, this would mean significantly more
work done before reporting the error, but do we really care?  I'm
leaning to this solution because it would not require exporting state
from the parallel restore control logic.
  


Would exporting a bit of state be so bad? It seems like it would be a 
bit cleaner, and I'll be surprised if it's terribly difficult. It can be 
set at the top of parallel_restore().



3. Perhaps pg_dump ought to emit a warning when it can't seek, instead
of just silently not writing the data offsets.  That behavior was okay
before when lack of data offsets didn't really matter that much, but
lack of data offsets is a serious performance handicap for parallel
restore even after we fix the outright failure condition (because each
worker is going to read through a lot of data to find what it needs).
  


For now, yes. But in 9.1 we should write out a second TOC and teach 
pg_restore to look for it.



4. Is there any value in back-porting the Windows FSEEKO support into
8.3 and 8.2?  Arguably, not writing the data offsets is a performance
bug.  However a back-port won't do anything for people who are dumping
with less than the latest minor release of pg_dump, so doing this might
be largely wasted effort.
  



I doubt it's worth it, but I could be persuaded otherwise.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel pg_restore versus old dump files

2010-06-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 In short, parallel pg_restore is guaranteed to fail on any input file
 made with a pre-8.4 pg_dump on Windows.

 IIRC, you can reproduce this on Unix too by sending the output of 
 pg_dump into a pipe. So it's not uniquely a Windows problem.

Right.  We need to be able to cope, albeit with degraded performance.

 As Greg suggests, the solution would be to have a second TOC at the end 
 of the file with the offsets.

Uh, that doesn't fix anything: if you can't seek, a TOC at the end of
the file is useless.  And the cases where the writer can't seek are
likely to be identically the ones where the reader can't seek, viz
pg_dump piped to pg_restore (perhaps with some other programs between).

 Another possibility is to just remove the inside-the-loop error test
 altogether: make it just skip till it finds the desired item, and only
 throw an error if it hits EOF without finding it.  In the case that
 the error test is trying to catch, this would mean significantly more
 work done before reporting the error, but do we really care?  I'm
 leaning to this solution because it would not require exporting state
 from the parallel restore control logic.

 Would exporting a bit of state be so bad?

The threaded case seems a bit messy, and frankly I don't believe that
we'd be buying anything.  The error case never actually occurs in the real
world, except perhaps on corrupted archive files, so why should we care
about performance for it?

 For now, yes. But in 9.1 we should write out a second TOC and teach 
 pg_restore to look for it.

I don't think this is useful.

 4. Is there any value in back-porting the Windows FSEEKO support into
 8.3 and 8.2?  Arguably, not writing the data offsets is a performance
 bug.  However a back-port won't do anything for people who are dumping
 with less than the latest minor release of pg_dump, so doing this might
 be largely wasted effort.

 I doubt it's worth it, but I could be persuaded otherwise.

I'm leaning in that direction too.  Anybody who's doing a version
upgrade really ought to be using the newer pg_dump version anyway ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel pg_restore versus old dump files

2010-06-22 Thread Andrew Dunstan



Tom Lane wrote:

Another possibility is to just remove the inside-the-loop error test
altogether: make it just skip till it finds the desired item, and only
throw an error if it hits EOF without finding it.  In the case that
the error test is trying to catch, this would mean significantly more
work done before reporting the error, but do we really care?  I'm
leaning to this solution because it would not require exporting state
from the parallel restore control logic.
  



Would exporting a bit of state be so bad?



The threaded case seems a bit messy, and frankly I don't believe that
we'd be buying anything.  The error case never actually occurs in the real
world, except perhaps on corrupted archive files, so why should we care
about performance for it?

  


OK, I can buy  that.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Implementation of Date/Time Input Interpretation

2010-06-22 Thread Francis Markham
Greetings all,

I am currently implementing a script to import data into postgres.  I would
like to apply the algorithm to detect date and time values, outlined at
http://developer.postgresql.org/pgdocs/postgres/datetime-input-rules.html

However, I am unfamiliar (and somewhat intimidated) by the postgres source
tree.  Would any kind person be able to point me to the source file(s) that
implement the above algorithm?

Thanks in advance,

Francis Markham


Re: [HACKERS] Implementation of Date/Time Input Interpretation

2010-06-22 Thread Dann Corbit
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Francis Markham
Sent: Tuesday, June 22, 2010 7:13 PM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Implementation of Date/Time Input Interpretation

Greetings all,

I am currently implementing a script to import data into postgres.  I would 
like to apply the algorithm to detect date and time values, outlined at 
http://developer.postgresql.org/pgdocs/postgres/datetime-input-rules.html

However, I am unfamiliar (and somewhat intimidated) by the postgres source 
tree.  Would any kind person be able to point me to the source file(s) that 
implement the above algorithm?

You will find it under \src\backend\utils\adt\datetime.c
To import data into postgres, I guess that reading the date time routine is 
probably not what you want to do.
If you want to move the data in using a compiled program then use an ODBC 
driver.  PostgreSQL comes with a free one.  OLEDB is another sensible 
alternative.  Or JDBC if you want to use Java.
If you want to bulk load lots of data at high speed, read up on the COPY 
command.
If you just want to insert some rows using SQL, then simply perform an INSERT 
using PSQL  or some other interface of your choice.
What is it exactly that you are trying to accomplish?



Re: [HACKERS] Implementation of Date/Time Input Interpretation

2010-06-22 Thread Francis Markham
Thank you for your prompt reply.

 What is it exactly that you are trying to accomplish?

I want to be able to, from my own script, determine if postgres will
be able to interpret a string as a date or time.  If you can suggest a
better way of accomplishing this beyond reimplementing your algorithm
I would be happy to hear it!

Cheers,

Francis Markham

On 23 June 2010 12:21, Dann Corbit dcor...@connx.com wrote:

 From: pgsql-hackers-ow...@postgresql.org 
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Francis Markham
 Sent: Tuesday, June 22, 2010 7:13 PM
 To: pgsql-hackers@postgresql.org
 Subject: [HACKERS] Implementation of Date/Time Input Interpretation



 Greetings all,

 I am currently implementing a script to import data into postgres.  I would 
 like to apply the algorithm to detect date and time values, outlined at 
 http://developer.postgresql.org/pgdocs/postgres/datetime-input-rules.html

 However, I am unfamiliar (and somewhat intimidated) by the postgres source 
 tree.  Would any kind person be able to point me to the source file(s) that 
 implement the above algorithm?
 

 You will find it under \src\backend\utils\adt\datetime.c

 To import data into postgres, I guess that reading the date time routine is 
 probably not what you want to do.

 If you want to move the data in using a compiled program then use an ODBC 
 driver.  PostgreSQL comes with a free one.  OLEDB is another sensible 
 alternative.  Or JDBC if you want to use Java.

 If you want to bulk load lots of data at high speed, read up on the COPY 
 command.

 If you just want to insert some rows using SQL, then simply perform an INSERT 
 using PSQL  or some other interface of your choice.

 What is it exactly that you are trying to accomplish?

 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] testing plpython3u on 9.0beta2

2010-06-22 Thread Chris
I received two errors (described below) in installing 9.0beta2 on Kubuntu
10.04 , RhodiumToad on IRC recommended I post them here.

I did not have a 2.x or 3.x python dev installed, but I was really only
interested in python3 via plython3u.
So...
sudo apt-get install python3-all-dev

Configure works fine...
./configure --with-pgport=5433 --with-python --with-ossp-uuid --with-libxml
--with-libxslt --with-perl
make fails while trying to build plpython, Cannot find python.h i
believe(I sadly did not save the text of the error and have since
continued onward.  But it was not finding a file while building plpython.  I
believe I would have to uninstall a few things to reproduce.)
So I tried installing the 2.x dev.  After running sudo apt-get install
python-all-dev then make;make check;sudo make install, it all worke fine.

However,  when issuing a createlang plpython3u template1, I get
createlang: language installation failed: ERROR:  could not access file
$libdir/plpython3: No such file or directory.  But if I createlang
plpython2u template1 first, then it will allow createlang plpython3u
template1.

Am I doing something incorrect, or...?

(As an aside, functions created with LANGUAGE plpython3u appear to work.)


-- 
Chris Spotts
rfu...@gmail.com


Re: [HACKERS] Implementation of Date/Time Input Interpretation

2010-06-22 Thread Andrew Dunstan



Francis Markham wrote:

Thank you for your prompt reply.

  

What is it exactly that you are trying to accomplish?



I want to be able to, from my own script, determine if postgres will
be able to interpret a string as a date or time.  If you can suggest a
better way of accomplishing this beyond reimplementing your algorithm
I would be happy to hear it!

  


Call the appropriate input function in plpgsql and trap a data 
exception? These routines are going to be quite hard to mimic, I 
suspect. Getting postgres to do the work for you is probably a better 
way to go if you can.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of Date/Time Input Interpretation

2010-06-22 Thread Dann Corbit
 -Original Message-
 From: Andrew Dunstan [mailto:and...@dunslane.net]
 Sent: Tuesday, June 22, 2010 7:47 PM
 To: Francis Markham
 Cc: Dann Corbit; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Implementation of Date/Time Input Interpretation
 
 
 
 Francis Markham wrote:
  Thank you for your prompt reply.
 
 
  What is it exactly that you are trying to accomplish?
 
 
  I want to be able to, from my own script, determine if postgres will
  be able to interpret a string as a date or time.  If you can suggest
 a
  better way of accomplishing this beyond reimplementing your algorithm
  I would be happy to hear it!
 
 
 
 Call the appropriate input function in plpgsql and trap a data
 exception? These routines are going to be quite hard to mimic, I
 suspect. Getting postgres to do the work for you is probably a better
 way to go if you can.

For the O.P.:

This is the specification of the input format that is needed for date/time 
values:
http://www.postgresql.org/docs/8.4/interactive/datatype-datetime.html
See also:
http://www.postgresql.org/docs/8.4/interactive/datetime-appendix.html

If he wants to be able to simply validate date/time values before insertion, I 
would suggest a package like libmcal and pick out the file datetime.c, in 
particular:
extern bool datevalid(int year,int mon,int mday);
extern bool timevalid(int hour,int min,int sec);

The PostgreSQL database routine has lots of fluff intended for interfacing with 
the database, etc. which makes a simpler approach easier if validation is what 
is wanted.  Of course date/time/calendar libraries are available in just about 
every language.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Fujii Masao
On Wed, Jun 23, 2010 at 5:32 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 22, 2010 at 3:45 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas robertmh...@gmail.com wrote:
 Either I'm doing something wrong,

 I think it's this one.  Stand by.

 OK, here's a new version with several fewer bugs.

Since valid values for keepalives parameter are 0 and 1, its field size should
be 1 rather than 10.

diff --git a/src/interfaces/libpq/fe-connect.c
b/src/interfaces/libpq/fe-connect.c
index 8240404..f0085ab 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -184,7 +184,7 @@ static const PQconninfoOption PQconninfoOptions[] = {
Fallback-Application-Name, , 64},

{keepalives, NULL, NULL, NULL,
-   TCP-Keepalives, , 10}, /* strlen(INT32_MAX) == 10 */
+   TCP-Keepalives, , 1},

{keepalives_idle, NULL, NULL, NULL,
TCP-Keepalives-Idle, , 10}, /* strlen(INT32_MAX) == 10 */

In this case, you can check the value of keepalives parameter by seeing
conn-keepalives[0] instead of using strtol() in useKeepalives().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers