Re: [HACKERS] Standbys which don't synch to disk?
On Wed, May 11, 2011 at 1:12 PM, Josh Berkus j...@agliodbs.com wrote:
> For 9.1, both master and replica in a sync replication relationship are required to be fsync'ing to disk. I understand why we had to do that for our first cut at synch rep. Do you think, though, that it might become possible to replicate without synch-to-disk for 9.2?
>
> The use case I have is cloud hosting, where I'd rather have two or three synchronous standbys than synch to disk.

It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable.

And Simon has already written a patch to add a "receive" mode to sync rep, which I expect will get committed to 9.2. In that mode, the standby can acknowledge the WAL records as soon as they are received, and write them to disk just after. I think we do need some benchmarking there, to figure out whether any changes to the timing of replies are needed in that case. But the basic principle seems sound.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
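The difference between acknowledging on receipt and acknowledging after the disk write can be illustrated with a toy model. This is only a sketch: `ToyStandby`, `ack_mode`, and the event names are made up for illustration and do not correspond to PostgreSQL's walreceiver code or to the patch mentioned above. It shows nothing more than the ordering of "ack" and "flush" for a single WAL record in each mode.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStandby:
    """Toy standby that records when it acks and when it flushes a record.

    ack_mode is a hypothetical setting for this sketch:
      "fsync"   - flush to disk first, then acknowledge (the 9.1 behaviour
                  described in the thread)
      "receive" - acknowledge on receipt, flush just after (the proposed mode)
    """
    ack_mode: str
    flushed: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def _flush(self, record):
        # Stand-in for writing and fsync'ing the record to disk.
        self.flushed.append(record)
        self.events.append(("flush", record))

    def handle(self, record):
        if self.ack_mode == "receive":
            # The master stops waiting here, before the record is durable.
            self.events.append(("ack", record))
            self._flush(record)
        elif self.ack_mode == "fsync":
            # The master keeps waiting until the record is durable.
            self._flush(record)
            self.events.append(("ack", record))
        else:
            raise ValueError(f"unknown ack_mode: {self.ack_mode!r}")


early = ToyStandby(ack_mode="receive")
early.handle("rec1")

late = ToyStandby(ack_mode="fsync")
late.handle("rec1")

print(early.events)
print(late.events)
```

In the "receive" case there is a window between the ack and the flush in which a standby crash loses a record the master already treats as replicated, which is the trade-off under discussion.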
Re: [HACKERS] Standbys which don't synch to disk?
> It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable.

Yes, generally if there's an OS-level crash on cloud hosting, you've lost the instance anyway.

> And Simon has already written a patch to add a "receive" mode to sync rep, which I expect will get committed to 9.2. In that mode, the standby can acknowledge the WAL records as soon as they are received, and write them to disk just after. I think we do need some benchmarking there, to figure out whether any changes to the timing of replies are needed in that case. But the basic principle seems sound.

Yes, that's what I'm looking for.

The one other thing would be the ability not to fsync the master, which would come out of the whole stream-from-buffers patch which Fujii was working on. Fujii, is that still something you're working on?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Standbys which don't synch to disk?
Josh Berkus j...@agliodbs.com writes:
>> It's already possible to set fsync=off on the standby if you want. If there is an OS-level crash you'll need to rebuild the standby, but in some cases that may be acceptable.
> ...
> The one other thing would be the ability not to fsync the master, which would come out of the whole stream-from-buffers patch which Fujii was working on. Fujii, is that still something you're working on?

Huh? Surely you can just set fsync=off on the master if you feel like it. Data integrity not guaranteed, of course, but if you don't care...

regards, tom lane
Re: [HACKERS] Standbys which don't synch to disk?
Robert,

> That WAL has effectively disappeared from the master, but is still present on the slave. Now the master comes up and starts processing read-write transactions again, and generates a new and different 1kB of WAL. Hilarity ensues, because the two machines are now out of step with each other.

Yeah, you'd need some kind of instant failover and STONITH. That is, any interruption on the master would be a failover situation. While that seems conceivable for crashes, consider that a planned restart of the master might be an issue, and an OOM-kill would certainly be.

> You could possibly fix this by making provision for the master to connect to the slave on start-up and stream WAL backwards from slave to master. That'd be pretty spiffy.

Ouch, now you're making my head hurt.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
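The divergence scenario quoted above can be walked through with a small simulation. It is a sketch under stated assumptions: `ToyNode`, `first_divergence`, and the one-letter "records" are invented for illustration and model nothing about PostgreSQL's actual WAL format or streaming protocol. The master streams record B from its buffers before flushing it, crashes, and then generates a different record C at the same position.

```python
class ToyNode:
    """Toy server: WAL is either durable (on disk) or buffered (memory only)."""

    def __init__(self):
        self.durable = []    # records that survive a crash
        self.buffered = []   # records lost on a crash

    def write(self, record):
        self.buffered.append(record)

    def flush(self):
        # Stand-in for fsync: everything buffered becomes durable.
        self.durable.extend(self.buffered)
        self.buffered.clear()

    def crash(self):
        # A crash discards whatever was never flushed.
        self.buffered.clear()

    def wal(self):
        return self.durable + self.buffered


def first_divergence(a, b):
    """Index of the first position where two WAL lists disagree, else None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


master, standby = ToyNode(), ToyNode()

# Record A: flushed on the master, then streamed and flushed on the standby.
master.write("A"); master.flush()
standby.write("A"); standby.flush()

# Record B: streamed from the master's buffers before the master flushes it.
master.write("B")
standby.write("B"); standby.flush()

# The master crashes and loses B, then restarts and writes a different record.
master.crash()
master.write("C"); master.flush()

print(master.wal())
print(standby.wal())
print(first_divergence(master.wal(), standby.wal()))
```

In this model the two histories agree on A and disagree at the next position, which is why the replies treat any master interruption as a failover (or require the master to fetch the missing WAL back from the standby) rather than a plain restart.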
Re: [HACKERS] Standbys which don't synch to disk?
On Thu, May 12, 2011 at 3:48 AM, Josh Berkus j...@agliodbs.com wrote:
> Robert,
>
>> That WAL has effectively disappeared from the master, but is still present on the slave. Now the master comes up and starts processing read-write transactions again, and generates a new and different 1kB of WAL. Hilarity ensues, because the two machines are now out of step with each other.
>
> Yeah, you'd need some kind of instant failover and STONITH. That is, any interruption on the master would be a failover situation. While that seems conceivable for crashes, consider that a planned restart of the master might be an issue, and an OOM-kill would certainly be.
>
>> You could possibly fix this by making provision for the master to connect to the slave on start-up and stream WAL backwards from slave to master. That'd be pretty spiffy.
>
> Ouch, now you're making my head hurt.

I believe many people who use SR with clusterware would fail over instead of restarting the master when it crashes. So I don't think it's a bad idea to allow them to use the stream-WAL-from-buffers feature at their own risk. It's the same as being able to specify fsync=off or full_page_writes=off.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center