Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-10-06 Thread Fujii Masao
Hi, On Mon, Sep 21, 2009 at 4:51 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I've pushed that to 'replication-orig' branch in my git repository, attached is the same as a diff against your SR_0914.patch. The following changes about crossing a xlogid boundary seem wrong,

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-10-06 Thread Alvaro Herrera
Fujii Masao wrote: On Thu, Sep 17, 2009 at 5:08 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Walreceiver is really a slave to the startup process. The startup process decides when it's launched, and it's the startup process that then waits for it to advance. But the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-10-06 Thread Fujii Masao
Hi, On Tue, Oct 6, 2009 at 10:42 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Hmm.  Without looking at the patch at all, this seems similar to how autovacuum does things: autovac launcher signals postmaster that a worker needs to be started.  Postmaster proceeds to fork a worker.  This

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-30 Thread Fujii Masao
On Thu, Sep 17, 2009 at 5:08 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Walreceiver is really a slave to the startup process. The startup process decides when it's launched, and it's the startup process that then waits for it to advance. But the way it's set up at the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-25 Thread Fujii Masao
On Thu, Sep 24, 2009 at 7:57 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Anyway, I'll change walreceiver to retry connecting to the primary after an error occurs in PQstartXLogStreaming()/PQgetXLogData()/ PQputXLogRecPtr(). Should we set an upper limit of the number of

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-25 Thread Heikki Linnakangas
Fujii Masao wrote: On Thu, Sep 24, 2009 at 7:57 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: - I know I said we should have just asynchronous replication at first, but looking ahead, how would you do synchronous? As the previous patch did, I'm going to make walsender read

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-25 Thread Fujii Masao
Hi, On Fri, Sep 25, 2009 at 7:10 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Fujii Masao wrote: On Thu, Sep 24, 2009 at 7:57 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: - I know I said we should have just asynchronous replication at first, but

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Fujii Masao
Hi, Sorry for the delay. On Mon, Sep 21, 2009 at 4:51 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Having gone through the patch now in more detail, I think it's in pretty good shape. I'm happy with the overall design, except that I haven't been able to make up my mind if

Re: walreceiver settings Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Fujii Masao
Hi, On Mon, Sep 21, 2009 at 1:55 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: The startup process could capture stderr from walreceiver and forward it with elog(LOG). The startup process should obtain also the message level in some way (pipe?), and control the messages

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Fujii Masao
Hi, On Mon, Sep 21, 2009 at 4:51 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I found the paging logic in walsender confusing, and didn't like the idea that walsender needs to set the XLOGSTREAM_END_SEG flag. Surely walreceiver knows how to split the WAL into files without

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Heikki Linnakangas
Fujii Masao wrote: In the 'replication-orig' branch, walreceiver fsyncs the previous XLOG file after receiving new XLOG records before writing them. This would increase the backend's waiting time for replication in synchronous case. The walreceiver should fsync the XLOG file after sending ACK

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Fujii Masao
Hi, On Thu, Sep 24, 2009 at 7:41 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Fujii Masao wrote: In the 'replication-orig' branch, walreceiver fsyncs the previous XLOG file after receiving new XLOG records before writing them. This would increase the backend's waiting

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Heikki Linnakangas
Fujii Masao wrote: On Mon, Sep 21, 2009 at 4:51 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: - Can we replace read/write_conninfo with just a long-enough field in shared mem? Would be simpler. (this is moot if we go with the stand-alone walreceiver program and pass it as

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-24 Thread Heikki Linnakangas
Fujii Masao wrote: On Thu, Sep 24, 2009 at 7:41 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Fujii Masao wrote: In the 'replication-orig' branch, walreceiver fsyncs the previous XLOG file after receiving new XLOG records before writing them. This would increase the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-21 Thread Heikki Linnakangas
Having gone through the patch now in more detail, I think it's in pretty good shape. I'm happy with the overall design, except that I haven't been able to make up my mind if walreceiver should indeed be a stand-alone program as discussed, or a postmaster child process as in the patch you

Re: walreceiver settings Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-20 Thread Heikki Linnakangas
Fujii Masao wrote: On Fri, Sep 18, 2009 at 7:34 PM, Fujii Masao masao.fu...@gmail.com wrote: This approach is OK if the stand-alone walreceiver is treated steadily by the startup process like a child process under postmaster: * Handling of some interrupts: SIGHUP, SIGTERM?, SIGINT, SIGQUIT...

walreceiver settings Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-19 Thread Fujii Masao
Hi, On Fri, Sep 18, 2009 at 7:34 PM, Fujii Masao masao.fu...@gmail.com wrote: This approach is OK if the stand-alone walreceiver is treated steadily by the startup process like a child process under postmaster: * Handling of some interrupts: SIGHUP, SIGTERM?, SIGINT, SIGQUIT...   For

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-18 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Heikki Linnakangas wrote: I'm thinking that walreceiver should be a stand-alone program that the startup process launches, similar to how it invokes restore_command in PITR recovery. Instead of using system(), though, it would use fork+exec, and a pipe to

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-18 Thread Fujii Masao
Hi, On Fri, Sep 18, 2009 at 2:47 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Heikki Linnakangas wrote: I'm thinking that walreceiver should be a stand-alone program that the startup process launches, similar to how it invokes restore_command in PITR recovery. Instead of

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-18 Thread Heikki Linnakangas
Fujii Masao wrote: Hi, On Fri, Sep 18, 2009 at 2:47 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Heikki Linnakangas wrote: I'm thinking that walreceiver should be a stand-alone program that the startup process launches, similar to how it invokes restore_command in

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-18 Thread Fujii Masao
Hi, On Thu, Sep 17, 2009 at 5:08 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I'm thinking that walreceiver should be a stand-alone program that the startup process launches, similar to how it invokes restore_command in PITR recovery. Instead of using system(), though, it

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Heikki Linnakangas
Fujii Masao wrote: On Tue, Sep 15, 2009 at 7:53 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: After playing with this a little bit, I think we need logic in the slave to reconnect to the master if the connection is broken for some reason, or can't be established in the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Magnus Hagander
On Thu, Sep 17, 2009 at 10:08, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Fujii Masao wrote: On Tue, Sep 15, 2009 at 7:53 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: After playing with this a little bit, I think we need logic in the slave to reconnect

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Heikki Linnakangas
Some random comments: I don't think we need the new PM_SHUTDOWN_3 postmaster state. We can treat walsenders the same as the archive process, and kill and wait for both of them to die in PM_SHUTDOWN_2 state. I think there's something wrong with the napping in walsender. When I perform

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Csaba Nagy
On Thu, 2009-09-17 at 10:08 +0200, Heikki Linnakangas wrote: Robert Haas suggested a while ago that walreceiver could be a stand-alone utility, not requiring postmaster at all. That would allow you to set up streaming replication as another way to implement WAL archiving. Looking at how the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-17 Thread Fujii Masao
Hi, On Thu, Sep 17, 2009 at 8:32 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Some random comments: Thanks for the comments. I don't think we need the new PM_SHUTDOWN_3 postmaster state. We can treat walsenders the same as the archive process, and kill and wait for both

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-16 Thread Fujii Masao
Hi, On Wed, Sep 16, 2009 at 11:37 AM, Fujii Masao masao.fu...@gmail.com wrote: I was thinking that the automatic reconnection capability is the TODO item for the later CF. The infrastructure for it has already been introduced in the current patch. Please see the macro MAX_WALRCV_RETRIES

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-15 Thread Fujii Masao
Hi, On Tue, Sep 15, 2009 at 2:54 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: The first thing that caught my eye is that I don't think replication should be a real database. Rather, it should be a keyword in pg_hba.conf, like the existing all, sameuser, samerole keywords

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-15 Thread Heikki Linnakangas
Kevin Grittner wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Kevin Grittner wrote: IMO, it would be best if the status could be sent via NOTIFY. To where? To registered listeners? I guess I should have worded that as it would be best if a change in

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-15 Thread Heikki Linnakangas
After playing with this a little bit, I think we need logic in the slave to reconnect to the master if the connection is broken for some reason, or can't be established in the first place. At the moment, that is considered as the end of recovery, and the slave starts up. You have the trigger file

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-15 Thread Fujii Masao
Hi, On Tue, Sep 15, 2009 at 7:53 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: After playing with this a little bit, I think we need logic in the slave to reconnect to the master if the connection is broken for some reason, or can't be established in the first place. At the

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Heikki Linnakangas
Greg Smith wrote: Putting on my DBA hat for a minute, the first question I see people asking is how do I measure how far behind the slaves are?. Presumably you can get that out of pg_controldata; my first question is whether that's complete enough information? If not, what else should be

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Andrew Dunstan
Greg Smith wrote: This is looking really neat now, making async replication really solid first before even trying to move on to sync is the right way to go here IMHO. I agree with both of those sentiments. One question I have is what is the level of traffic involved between the master and

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Kevin Grittner
Greg Smith gsm...@gregsmith.com wrote: Putting on my DBA hat for a minute, the first question I see people asking is how do I measure how far behind the slaves are?. Presumably you can get that out of pg_controldata; my first question is whether that's complete enough information? If not,

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Greg Smith
This is looking really neat now, making async replication really solid first before even trying to move on to sync is the right way to go here IMHO. I just cleaned up the docs on the Wiki page, when this patch is closer to being committed I officially volunteer to do the same on the internal

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Heikki Linnakangas
Kevin Grittner wrote: Greg Smith gsm...@gregsmith.com wrote: I don't think running that program going to fly for a production quality integrated replication setup though. The UI admins are going to want would allow querying this easily via a standard database query. Most monitoring systems

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Kevin Grittner wrote: IMO, it would be best if the status could be sent via NOTIFY. To where? To registered listeners? I guess I should have worded that as it would be best if a change in replication status could be signaled

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Heikki Linnakangas
Fujii Masao wrote: Here is the latest version of Streaming Replication (SR) patch. The first thing that caught my eye is that I don't think replication should be a real database. Rather, it should be a keyword in pg_hba.conf, like the existing all, sameuser, samerole keywords that you can put

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Simon Riggs
On Mon, 2009-09-14 at 20:24 +0900, Fujii Masao wrote: The latest patch has overcome those problems: Well done. I hope to look at it myself in a few days time. -- Simon Riggs www.2ndQuadrant.com

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Fujii Masao
Hi, On Tue, Sep 15, 2009 at 12:47 AM, Greg Smith gsm...@gregsmith.com wrote: Putting on my DBA hat for a minute, the first question I see people asking is how do I measure how far behind the slaves are?.  Presumably you can get that out of pg_controldata; my first question is whether that's

Re: [HACKERS] Streaming Replication patch for CommitFest 2009-09

2009-09-14 Thread Fujii Masao
Hi, On Tue, Sep 15, 2009 at 1:06 AM, Andrew Dunstan and...@dunslane.net wrote: One question I have is what is the level of traffic involved between the master and the slave. I know numbers of people have found the traffic involved in shipping of log files to be a pain, and thus we get things
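
Several messages in this thread ask how to measure how far behind a slave is. As a rough illustration only, the sketch below compares two WAL locations written in the "xlogid/offset" hexadecimal text form that PostgreSQL prints. The function names parse_lsn and lag_bytes are hypothetical and do not come from the patch. The sketch also assumes a flat 64-bit position (xlogid in the high 32 bits); the on-disk layout of the 2009-era code may not match that assumption exactly, so treat the result as an approximation.

```python
def parse_lsn(text: str) -> int:
    """Convert an 'X/Y' hexadecimal WAL location into a single integer.

    Assumption: the position is a flat 64-bit value, with the part before
    the slash (xlogid) in the high 32 bits and the part after it in the
    low 32 bits.
    """
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lag_bytes(master_lsn: str, slave_lsn: str) -> int:
    """Return how many bytes of WAL the slave still has to reach.

    The result is clamped at zero so that a slave reporting a position
    ahead of a stale master sample does not yield a negative lag.
    """
    return max(0, parse_lsn(master_lsn) - parse_lsn(slave_lsn))


if __name__ == "__main__":
    # The two positions straddle an xlogid boundary; the shift in
    # parse_lsn handles the carry into the high word.
    print(lag_bytes("1/0", "0/FFFFFFF0"))
```

Combining both halves into one integer before subtracting avoids special-casing the xlogid boundary, which is the kind of arithmetic the thread flags as easy to get wrong.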