Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-25 Thread Bruce Momjian
On Fri, Jul 13, 2012 at 08:08:59PM -0430, Jose Ildefonso Camargo Tolosa wrote: On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: How you decide what to do with the servers on failures isn't that important

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-17 Thread Heikki Linnakangas
On 16.07.2012 22:01, Robert Haas wrote: On Sat, Jul 14, 2012 at 7:54 PM, Josh Berkus j...@agliodbs.com wrote: So, here's the core issue with degraded mode. I'm not mentioning this to block any patch anyone has, but rather out of a desire to see someone address this core problem with some

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-17 Thread Daniel Farina
On Mon, Jul 16, 2012 at 10:58 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: BTW, one little detail that I don't think has been mentioned in this thread before: Even though the master currently knows whether a standby is connected or not, and you could write a patch to act

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-16 Thread Robert Haas
On Sat, Jul 14, 2012 at 7:54 PM, Josh Berkus j...@agliodbs.com wrote: So, here's the core issue with degraded mode. I'm not mentioning this to block any patch anyone has, but rather out of a desire to see someone address this core problem with some clever idea I've not thought of. The problem

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-14 Thread Jose Ildefonso Camargo Tolosa
On Sat, Jul 14, 2012 at 12:42 AM, Amit kapila amit.kap...@huawei.com wrote: From: Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] Sent: Saturday, July 14, 2012 9:36 AM On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila amit.kap...@huawei.com wrote: From:

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-14 Thread Josh Berkus
So, here's the core issue with degraded mode. I'm not mentioning this to block any patch anyone has, but rather out of a desire to see someone address this core problem with some clever idea I've not thought of. The problem in a nutshell is: indeterminacy. Assume someone implements degraded

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Hampus Wessman
Hi all, Here are some (slightly too long) thoughts about this. Shaun Thomas wrote 2012-07-12 22:40: On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists if we add it as an internal database feature --- how long do we wait to consider the standby dead, how do we inform

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Bruce Momjian
On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: How you decide what to do with the servers on failures isn't that important here, really. You can probably run e.g. Pacemaker on 3+ machines and have it check for quorums to accomplish this. That's a good approach at least. You

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 12:25 AM, Amit Kapila amit.kap...@huawei.com wrote: From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Jose Ildefonso Camargo Tolosa On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk ai...@highrise.ca wrote: On Thu, Jul 12,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
Hi Hampus, On Fri, Jul 13, 2012 at 2:42 AM, Hampus Wessman ham...@hampuswessman.se wrote: Hi all, Here are some (slightly too long) thoughts about this. Nah, not that long. Shaun Thomas wrote 2012-07-12 22:40: On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote: How you decide what to do with the servers on failures isn't that important here, really. You can probably run e.g. Pacemaker on 3+ machines and have it check

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Amit kapila
From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] on behalf of Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] Sent: Saturday, July 14, 2012 6:08 AM On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Jul 13, 2012 at 09:12:56AM

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Jose Ildefonso Camargo Tolosa
On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila amit.kap...@huawei.com wrote: From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] on behalf of Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] Sent: Saturday, July 14, 2012 6:08 AM On Fri, Jul 13, 2012 at

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-13 Thread Amit kapila
From: Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com] Sent: Saturday, July 14, 2012 9:36 AM On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila amit.kap...@huawei.com wrote: From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] on behalf of Jose Ildefonso Camargo

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Jose Ildefonso Camargo Tolosa Please, stop arguing on all of this: I don't think that adding an option will hurt anybody (especially because the work was already done by someone), we are not

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Dimitri Fontaine
Hi, Jose Ildefonso Camargo Tolosa ildefonso.cama...@gmail.com writes: environments. And no, it doesn't make synchronous replication meaningless, because it will work synchronously if it has someone to sync to, and work async (or standalone) if it doesn't: that's perfect for an HA environment.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas
On 07/12/2012 12:31 AM, Daniel Farina wrote: But RAID-1 as nominally seen is a fundamentally different problem, with much tinier differences in latency, bandwidth, and connectivity. Perhaps useful for study, but to suggest the problem is *that* similar I think is wrong. Well, yes and no. One

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com wrote: So far as transaction durability is concerned... we have a continuous background rsync over dark fiber for archived transaction logs, DRBD for block-level sync, filesystem snapshots for our backups, a redundant async

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote: From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Jose Ildefonso Camargo Tolosa Please, stop arguing on all of this: I don't think that adding an option will hurt anybody

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 08:21:08AM -0500, Shaun Thomas wrote: But, putting that aside, why not write a piece of middleware that does precisely this, or whatever you want? It can live on the same machine as Postgres and ack synchronous commit when nobody is home, and notify (e.g. page) you in

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas
On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists if we add it as an internal database feature --- how long do we wait to consider the standby dead, how do we inform administrators, etc. True. Though if there is no secondary connected, either because it's not there yet,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:35 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Hi, Jose Ildefonso Camargo Tolosa ildefonso.cama...@gmail.com writes: environments. And no, it doesn't make synchronous replication meaningless, because it will work synchronously if it has someone to sync to,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk ai...@highrise.ca wrote: On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com wrote: So far as transaction durability is concerned... we have a continuous background rsync over dark fiber for archived transaction logs, DRBD for

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 12:17 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote: From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Jose Ildefonso Camargo Tolosa Please, stop arguing on all

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa wrote: Yeah, you need that with PostgreSQL, but not with DRBD, for example (sorry, but DRBD is one of the flagships of HA things in the Linux world). Also, I'm not convinced about the 2nd standby thing... I mean, just read this on the

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:29 PM, Aidan Van Dyk ai...@highrise.ca wrote: On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa wrote: Yeah, you need that with PostgreSQL, but not with DRBD, for example (sorry, but DRBD is one of the flagships of HA things in the Linux world). Also, I'm not

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 4:10 PM, Shaun Thomas stho...@optionshouse.com wrote: On 07/12/2012 12:02 PM, Bruce Momjian wrote: Well, the problem also exists if we add it as an internal database feature --- how long do we wait to consider the standby dead, how do we inform administrators, etc.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Jose Ildefonso Camargo Tolosa On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk ai...@highrise.ca wrote: On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com wrote: As currently

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Dimitri Fontaine
Daniel Farina dan...@heroku.com writes: Notable caveat: one can't very easily measure or bound the amount of transaction loss in any graceful way as-is. We only have unlimited lag and 2-safe or bust. ¡per-transaction! You can change your mind mid-transaction and ask for 2-safe or bust.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Shaun Thomas
On 07/10/2012 06:02 PM, Daniel Farina wrote: For example, what if DRBD can only complete one page per second for some reason? Does it simply have the primary wait at this glacial pace, or drop synchronous replication and go degraded? Or does it do something more clever than just a timeout?

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Dimitri Fontaine
Shaun Thomas stho...@optionshouse.com writes: Regardless of what DRBD does, I think the problem with the async/sync duality as-is is there is no nice way to manage exposure to transaction loss under various situations and requirements. Yeah. Which would be handy. With synchronous commits,

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Josh Berkus
On 7/11/12 6:41 AM, Shaun Thomas wrote: Which would be handy. With synchronous commits, it's given that the protocol is bi-directional. Then again, PG can detect when clients disconnect the instant they do so, and having such an event implicitly disable synchronous_standby_names until

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Robert Haas
On Tue, Jul 10, 2012 at 12:57 PM, Josh Berkus j...@agliodbs.com wrote: Per your exchange with Heikki, that's not actually how SyncRep works in 9.1. So it's not giving you what you want anyway. This is why we felt that the "sync rep if you can" mode was useless and didn't accept it into 9.1.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Jose Ildefonso Camargo Tolosa
Greetings, On Wed, Jul 11, 2012 at 9:11 AM, Shaun Thomas stho...@optionshouse.com wrote: On 07/10/2012 06:02 PM, Daniel Farina wrote: For example, what if DRBD can only complete one page per second for some reason? Does it simply have the primary wait at this glacial pace, or drop

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Josh Berkus
Please, stop arguing on all of this: I don't think that adding an option will hurt anybody (especially because the work was already done by someone), we are not asking to change how the things work, we just want an option to decide whether we want it to freeze on standby disconnection, or if

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Jose Ildefonso Camargo Tolosa
On Wed, Jul 11, 2012 at 11:48 PM, Josh Berkus j...@agliodbs.com wrote: Please, stop arguing on all of this: I don't think that adding an option will hurt anybody (especially because the work was already done by someone), we are not asking to change how the things work, we just want an option

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Daniel Farina
On Wed, Jul 11, 2012 at 3:03 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Daniel Farina dan...@heroku.com writes: Notable caveat: one can't very easily measure or bound the amount of transaction loss in any graceful way as-is. We only have unlimited lag and 2-safe or bust.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-11 Thread Daniel Farina
On Wed, Jul 11, 2012 at 6:41 AM, Shaun Thomas stho...@optionshouse.com wrote: Regardless of what DRBD does, I think the problem with the async/sync duality as-is is there is no nice way to manage exposure to transaction loss under various situations and requirements. Which would be handy.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Daniel Farina
On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas stho...@optionshouse.com wrote: 1. Slave wants to be synchronous with master. Master wants replication on at least one slave. They have this, and are happy. 2. For whatever reason, slave crashes or becomes unavailable. 3. Master notices no more

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Amit Kapila
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Daniel Farina Sent: Tuesday, July 10, 2012 11:42 AM On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas stho...@optionshouse.com wrote: 1. Slave wants to be synchronous with master. Master wants

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 8:42 AM, Amit Kapila amit.kap...@huawei.com wrote: From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Daniel Farina Sent: Tuesday, July 10, 2012 11:42 AM On Mon, Jul 9, 2012 at 1:30 PM, Shaun Thomas stho...@optionshouse.com

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/10/2012 01:11 AM, Daniel Farina wrote: So if I get this straight, what you are saying is "be asynchronous replication unless someone is around, in which case be synchronous" is the mode you want. Er, no. I think I see where you might have gotten that, but no. This is a pretty tricky

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Aidan Van Dyk
On Tue, Jul 10, 2012 at 9:28 AM, Shaun Thomas stho...@optionshouse.com wrote: Async is simply too slow for our OLTP system except for the disaster recovery node, which isn't expected to carry on within seconds of the primary's failure. I briefly considered sync mode when it appeared as a

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/09/2012 05:15 PM, Josh Berkus wrote: Total-consistency replication is what I think you want, that is, to guarantee that at any given time a read query on the master will return the same results as a read query on the standby. Heck, *most* people would like to have that. You would also

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Heikki Linnakangas
On 10.07.2012 17:31, Shaun Thomas wrote: On 07/09/2012 05:15 PM, Josh Berkus wrote: So I'm unclear on why sync rep would be faster than async rep given that they use exactly the same mechanism. Explain? Too many mental gymnastics. I get that async is faster than sync, but the inconsistent

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Shaun Thomas
On 07/10/2012 09:40 AM, Heikki Linnakangas wrote: You are mistaken. It only guarantees that it's been sync'd to disk in the standby, but if there are open snapshots or the system is simply busy, it might take minutes or more until the effects of that transaction become visible. Well, crap.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Daniel Farina
On Tue, Jul 10, 2012 at 6:28 AM, Shaun Thomas stho...@optionshouse.com wrote: On 07/10/2012 01:11 AM, Daniel Farina wrote: So if I get this straight, what you are saying is "be asynchronous replication unless someone is around, in which case be synchronous" is the mode you want. Er, no. I

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Josh Berkus
Shaun, Too many mental gymnastics. I get that async is faster than sync, but the inconsistent transactional state makes it *look* slower. If a customer makes an order, but just happens to check that order state on the secondary before it can catch up, that's a net loss. Like I said, that's

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Dimitri Fontaine
Shaun Thomas stho...@optionshouse.com writes: When you re-connect a secondary device, it catches up as fast as possible by replaying waiting transactions, and then re-attaching to the cluster. Until it's fully caught-up, it doesn't exist. DRBD acknowledges the secondary is there and attempting

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-10 Thread Daniel Farina
On Tue, Jul 10, 2012 at 2:42 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: What you explain you want reads to me as Async replication + Archiving. Notable caveat: one can't very easily measure or bound the amount of transaction loss in any graceful way as-is. We only have unlimited lag and

[HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Shaun Thomas
Hey everyone, Upon doing some usability tests with PostgreSQL 9.1 recently, I ran across this discussion: http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php And after reading the entire thing, I found it odd that the overriding pushback was because nobody could think of a use

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-09 Thread Josh Berkus
Shaun, PostgreSQL's implementation means the master will block until someone/something notices and tells it to stop waiting, or the slave comes back. For pretty much any high-availability environment, this is not viable. Based on that alone, I can't imagine a scenario where synchronous