Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-24 Thread Simon Riggs
On Wed, 2008-12-24 at 11:39 +0900, Fujii Masao wrote: We might ask why pg_start_backup() needs to perform checkpoint though, since you have remarked that is a problem also. The answer is that it doesn't really need to, we just need to be certain that archiving has been running since

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-24 Thread Fujii Masao
Hi, On Wed, Dec 24, 2008 at 6:57 PM, Simon Riggs si...@2ndquadrant.com wrote: Yes, OK. So I think it would only work when full_page_writes = on, and has been on since last checkpoint. So two changes: * We just need a boolean that starts at true every checkpoint and gets set to false anytime

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-24 Thread Fujii Masao
Hi, On Wed, Dec 24, 2008 at 7:58 PM, Fujii Masao masao.fu...@gmail.com wrote: Hi, On Wed, Dec 24, 2008 at 6:57 PM, Simon Riggs si...@2ndquadrant.com wrote: Yes, OK. So I think it would only work when full_page_writes = on, and has been on since last checkpoint. So two changes: * We just

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-24 Thread Simon Riggs
On Thu, 2008-12-25 at 00:10 +0900, Fujii Masao wrote: Hi, On Wed, Dec 24, 2008 at 7:58 PM, Fujii Masao masao.fu...@gmail.com wrote: Hi, On Wed, Dec 24, 2008 at 6:57 PM, Simon Riggs si...@2ndquadrant.com wrote: Yes, OK. So I think it would only work when full_page_writes = on, and

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-24 Thread Fujii Masao
Hi, I fixed some bugs. On Thu, Dec 25, 2008 at 12:31 AM, Simon Riggs si...@2ndquadrant.com wrote: Can we change to IMMEDIATE when we need the checkpoint? Perhaps yes, though current patch doesn't care about it. I'm not sure if we really need the feature. Yes, as you say, I'd like to also

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Sun, 2008-12-21 at 14:46 +0900, Fujii Masao wrote: XLogFlush() flushes because of an interlock between a dirty buffer write and an outstanding WAL write. Dirty buffer writes are not replicated, so there is no need to have a similar interlock on WAL streaming. So making those call
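A minimal Python sketch of the interlock described above (WAL must reach disk before the data page it describes). The names `Wal` and `write_dirty_buffer` are hypothetical illustrations, not PostgreSQL functions; the sketch only models the ordering rule, not the real `XLogFlush()` implementation.

```python
class Wal:
    """Toy WAL that only remembers how far it has been flushed (hypothetical)."""

    def __init__(self) -> None:
        self.flushed_upto = 0

    def flush(self, upto: int) -> None:
        # Flushing is idempotent: only advance, never move backwards.
        if upto > self.flushed_upto:
            self.flushed_upto = upto


def write_dirty_buffer(wal: Wal, page_lsn: int, disk: list) -> None:
    """Write a dirty page only after the WAL covering it has been flushed."""
    wal.flush(page_lsn)    # the interlock: WAL first
    disk.append(page_lsn)  # then the data page itself


wal = Wal()
disk = []
write_dirty_buffer(wal, page_lsn=120, disk=disk)
```

The point made in the message is that this ordering concerns local dirty buffers, which are not replicated, so the sketch deliberately has no standby in it.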

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Tue, Dec 23, 2008 at 5:22 PM, Simon Riggs si...@2ndquadrant.com wrote: On Sun, 2008-12-21 at 14:46 +0900, Fujii Masao wrote: XLogFlush() flushes because of an interlock between a dirty buffer write and an outstanding WAL write. Dirty buffer writes are not replicated, so there is no

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Tue, 2008-12-23 at 18:00 +0900, Fujii Masao wrote: I don't get this argument. Why would we care what happens on the failed server? It's because, in the future, I'd like to use the data on the failed server when making it catch up with new primary. This desire might be violated by the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Tue, Dec 23, 2008 at 6:28 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2008-12-23 at 18:00 +0900, Fujii Masao wrote: I don't get this argument. Why would we care what happens on the failed server? It's because, in the future, I'd like to use the data on the failed server

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Pavan Deolasee
On Tue, Dec 23, 2008 at 4:23 PM, Fujii Masao masao.fu...@gmail.com wrote: But, since I cannot obtain consensus from hackers including you, I would change my course, and forbid XLogFlush (called from other than RecordTransactionCommit) to replicate xlog synchronously if asynchronous

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Tue, 2008-12-23 at 16:54 +0530, Pavan Deolasee wrote: On Tue, Dec 23, 2008 at 4:23 PM, Fujii Masao masao.fu...@gmail.com wrote: But, since I cannot obtain consensus from hackers including you, I would change my course, and forbid XLogFlush (called from other than

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Pavan Deolasee
On Tue, Dec 23, 2008 at 5:55 PM, Simon Riggs si...@2ndquadrant.com wrote: We stream constantly from primary to standby. That point is not being debated. The issue is whether we should add additional synchronisation points (i.e. additional times we need to wait) into the WAL stream.

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Tue, 2008-12-23 at 18:36 +0530, Pavan Deolasee wrote: Personally, I would like to have a simple setup where I can initially set up primary and standby and they continue to work in a single-failure mode without any additional administrative overhead (such as rsync). But that's just me and I

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Tue, Dec 23, 2008 at 10:41 PM, Simon Riggs si...@2ndquadrant.com wrote: I'm happy if that whole feature is added. If we do add it, it will be a utility like pg_resync. So in admin terms it will be almost identical to using rsync, just a specific version that minimizes effort even more

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Tue, Dec 23, 2008 at 11:31 PM, Fujii Masao masao.fu...@gmail.com wrote: Of course, since I'm not planning to tackle that problem in 8.4, I would not add additional synchronization point. Second thought: For normal shutdown case, we probably should force synchronous replication in

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Tue, 2008-12-23 at 23:31 +0900, Fujii Masao wrote: Hi, On Tue, Dec 23, 2008 at 10:41 PM, Simon Riggs si...@2ndquadrant.com wrote: I'm happy if that whole feature is added. If we do add it, it will be a utility like pg_resync. So in admin terms it will be almost identical to using

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Mark Mielke
Simon Riggs wrote: You scare me that you see failover as sufficiently frequent that you are worried that being without one of the servers for an extra 60 seconds during a failover is a problem. And then say you're not going to add the feature after all. I really don't understand. If its

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Wed, Dec 24, 2008 at 12:38 AM, Simon Riggs si...@2ndquadrant.com wrote: Perhaps, but why do you say that? Since you often pointed out that getting backup is not problem because of incremental backup (e.g. rsync), I just thought so. I've not blocked you from adding anything useful to

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Simon Riggs
On Wed, 2008-12-24 at 02:23 +0900, Fujii Masao wrote: Oh, sorry. I don't want to scare you ;) But, yes, it's important. We should rethink the question? Why does the failed server always need a fresh backup? Though we discussed it previously and concluded that it should be done next time.

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Wed, Dec 24, 2008 at 2:37 AM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2008-12-24 at 02:23 +0900, Fujii Masao wrote: Oh, sorry. I don't want to scare you ;) But, yes, it's important. We should rethink the question? Why does the failed server always need a fresh backup? Though

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-23 Thread Fujii Masao
Hi, On Mon, Dec 22, 2008 at 1:29 PM, Fujii Masao masao.fu...@gmail.com wrote: Not so simple. At least the primary has to additionally maintain the byte position the standby has already fsynced. The main difference from the current patch is whether the standby fsyncs the logfile when it
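As a rough illustration of the bookkeeping mentioned above, the following Python sketch has the primary remember the highest byte position the standby has reported as fsynced. `StandbyProgress` and its methods are hypothetical names for this sketch, not part of the patch under discussion.

```python
class StandbyProgress:
    """Primary-side record of how far the standby has fsynced (hypothetical)."""

    def __init__(self) -> None:
        self.fsynced_pos = 0

    def report_fsync(self, pos: int) -> None:
        # Positions only move forward; ignore stale or reordered reports.
        self.fsynced_pos = max(self.fsynced_pos, pos)

    def commit_is_covered(self, commit_end_pos: int) -> bool:
        # A commit is covered once the standby has fsynced past its end.
        return self.fsynced_pos >= commit_end_pos


progress = StandbyProgress()
progress.report_fsync(4096)
```

With this state, a committing backend would only need to compare its own commit position against `fsynced_pos` rather than request a fresh fsync for every commit.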

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-21 Thread Markus Wanner
Hi, Simon Riggs wrote: The second way can be done by taking a snapshot on the primary, with an associated LSN, then using that snapshot on the standby. That is somewhat complex, but possible. I see the requirement for getting the same answer on multiple nodes as a further extension of

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-21 Thread Fujii Masao
Hi, On Wed, Dec 17, 2008 at 12:07 PM, Fujii Masao masao.fu...@gmail.com wrote: No, we've been through that loop already a few months back: Transaction-controlled robustness. It should be up to the client on the primary to decide how much waiting they would like to perform in order to provide

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Markus Wanner
Hi, Mark Mielke wrote: Where does the expectation come from? I find the seat reservation, bank account or stock trading examples pretty obvious WRT user expectations. Nonetheless, I've compiled some hints from the documentation and sources: Since in Read Committed mode each new command starts

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Mark Mielke
Good answers, Markus. Thanks. I've bought the thinking of several here that the user should have some control over what they expect (and what optimizations they are willing to accept as a good choice), but that commit should still be able to have a capped time limit. I can think of many of

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Markus Wanner
Hi, Mark Mielke wrote: Robert Haas wrote: On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: We won't call it anything, because we never will or can implement that. See the theory of relativity: the notion of exactly simultaneous events OK, fine. I'll be more precise. I

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Markus Wanner
Hi, Josh Berkus wrote: Peter Eisentraut wrote: It's the color of the bikeshed ... Agreed. It's why I've decided to support various modes for Postgres-R. I'm glad to see that the current Sync Rep approach does the same. Hmmm. I thought this was pretty clear. There's three levels of synch

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Markus Wanner
Hi, Mark Mielke wrote: Good answers, Markus. Thanks. You are welcome. So it looks like there is value to both ends of the spectrum, and while I feel the most value would be in providing a very fast system that scales near linear to the number of nodes in the system, even at the expense of

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-20 Thread Fujii Masao
Hi, On Fri, Dec 19, 2008 at 5:50 PM, Simon Riggs si...@2ndquadrant.com wrote: On Fri, 2008-12-19 at 09:43 +0900, Fujii Masao wrote: Yes, please check the call points for ForceSyncCommit. Do I think every xlog flush should be synchronous, no, I don't. That's why we have a user settable

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-19 Thread Simon Riggs
On Fri, 2008-12-19 at 09:43 +0900, Fujii Masao wrote: Yes, please check the call points for ForceSyncCommit. Do I think every xlog flush should be synchronous, no, I don't. That's why we have a user settable parameter for it. Umm.. I focus attention on XLogFlush() called except

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-19 Thread Heikki Linnakangas
Simon Riggs wrote: On a related but different point: We don't need an interlock between dirty buffers and WAL during recovery because the WAL has already been written. Assuming the WAL has also been fsync'd. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-19 Thread Simon Riggs
On Fri, 2008-12-19 at 11:04 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On a related but different point: We don't need an interlock between dirty buffers and WAL during recovery because the WAL has already been written. Assuming the WAL has also been fsync'd. True, so this

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-18 Thread Simon Riggs
On Thu, 2008-12-18 at 12:08 +0900, Fujii Masao wrote: Agreed, I also think that hard code is better. But I'm nervous that off keeps us waiting for replication in cases other than DDL, e.g. flush buffer, truncate clog, checkpoint.. etc. synchronous_replication = off is quite similar to

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-18 Thread Fujii Masao
Hi, On Thu, Dec 18, 2008 at 6:35 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2008-12-18 at 12:08 +0900, Fujii Masao wrote: Agreed, I also think that hard code is better. But I'm nervous that off keeps us waiting for replication in cases other than DDL, e.g. flush buffer,

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-17 Thread Simon Riggs
On Wed, 2008-12-17 at 12:07 +0900, Fujii Masao wrote: OK. I will extend synchronous_replication, make walsender send XLOG with synchronization mode flag and make walreceiver perform according to the flag. Sounds good. My perspective is that synchronous_replication specifies how long to

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-17 Thread Fujii Masao
Hi, Thanks for the helpful comments! On Wed, Dec 17, 2008 at 8:50 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2008-12-17 at 12:07 +0900, Fujii Masao wrote: OK. I will extend synchronous_replication, make walsender send XLOG with synchronization mode flag and make walreceiver

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-17 Thread Simon Riggs
On Thu, 2008-12-18 at 11:03 +0900, Fujii Masao wrote: Hi, Thanks for the helpful comments! On Wed, Dec 17, 2008 at 8:50 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2008-12-17 at 12:07 +0900, Fujii Masao wrote: OK. I will extend synchronous_replication, make walsender send

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-17 Thread Fujii Masao
Hi, On Thu, Dec 18, 2008 at 11:19 AM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, 2008-12-18 at 11:03 +0900, Fujii Masao wrote: Hi, Thanks for the helpful comments! On Wed, Dec 17, 2008 at 8:50 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, 2008-12-17 at 12:07 +0900, Fujii

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-16 Thread Simon Riggs
On Tue, 2008-12-16 at 12:36 +0900, Fujii Masao wrote: So from my previous list 1. We sent the message to standby (A) 2. We received the message on standby 3. We wrote the WAL to the WAL file (B) 4. We fsync'd the WAL file (C) 5. We CRC checked the WAL commit record 6. We applied

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-16 Thread Fujii Masao
Hi, On Tue, Dec 16, 2008 at 7:21 PM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2008-12-16 at 12:36 +0900, Fujii Masao wrote: So from my previous list 1. We sent the message to standby (A) 2. We received the message on standby 3. We wrote the WAL to the WAL file (B) 4. We

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Heikki Linnakangas
Mark Mielke wrote: Where does the expectation come from? I don't recall ever reading it in the documentation, and unless the session processes are contending over the integers (using some sort of synchronization primitive) in memory that represent the latest visible commit on every single

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Aidan Van Dyk
* Robert Haas robertmh...@gmail.com [081215 07:32]: In fact, waiting for reply from standby server before acknowledging a commit to the client is a bit pointless otherwise. It puts you in a strange situation, where you're waiting for the commits in normal operation, but if there's a

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Robert Haas
So you'd want all commits to wait until the transaction is safely replicated in the standby. But if there's a network glitch, or the standby is restarted, you're happy to reply to the client that it's committed if it's only safely committed in the primary. Essentially, you wait for the reply

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Simon Riggs
On Sun, 2008-12-14 at 21:41 -0500, Robert Haas wrote: If this is right, #2, #3, #4, and #6 feel similar except that they're protecting against failures of different (but still all incomplete) subsets of the hardware on the slave, right? Right. Actually, the biggest difference with #6

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Robert Haas
In fact, waiting for reply from standby server before acknowledging a commit to the client is a bit pointless otherwise. It puts you in a strange situation, where you're waiting for the commits in normal operation, but if there's a network glitch or the standby goes down, you're willing to go

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Simon Riggs
Fujii-san, Just repeating this in case you lost this comment: On Mon, 2008-12-15 at 09:40 +, Simon Riggs wrote: Fujii-san, please can we incorporate those two options, rather than just one choice synchronous_replication = on. They look like two commonly requested options. I see the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Simon Riggs
On Sun, 2008-12-14 at 12:57 -0500, Mark Mielke wrote: I'm curious about your suggestion to direct queries that need the latest snapshot to the 'primary'. I might have misunderstood it - but it seems that the expectation from some is that *all* sessions see the latest snapshot, so would

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Peter Eisentraut
Simon Riggs wrote: I am truly lost to understand why the *name* synchronous replication causes so much discussion, yet nobody has discussed what they would actually like the software to *do* It's the color of the bikeshed ... We can make the reply to a commit message when any of the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Heikki Linnakangas
Robert Haas wrote: In fact, waiting for reply from standby server before acknowledging a commit to the client is a bit pointless otherwise. It puts you in a strange situation, where you're waiting for the commits in normal operation, but if there's a network glitch or the standby goes down,

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Greg Stark
It's a real promise. The reason you're getting hand-wavy answers is because it's such a basic requirement that I'm trying to point out just how fundamental a requirement it is. If you want to see the actual code which guarantees this take a look around the code for procarray - in

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Jeff Davis
On Mon, 2008-12-15 at 09:19 -0500, Robert Haas wrote: I understand your point, but I think there's still a use case. The idea is that declaring the secondary dead is a rare event, and there's some mechanism by which you're enabled to page your network staff, and they hightail it into the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Josh Berkus
Peter Eisentraut wrote: Simon Riggs wrote: I am truly lost to understand why the *name* synchronous replication causes so much discussion, yet nobody has discussed what they would actually like the software to *do* It's the color of the bikeshed ... Hmmm. I thought this was pretty clear.

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Ron Mayer
Josh Berkus wrote: Hmmm. I thought this was pretty clear. There's three levels of synch which are useful features: 1) synchronous standby which is really asynchronous, but only has a gap of 100ms. 2) Synchronous standby which guarantees that all committed transactions are on the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Josh Berkus
Isn't the queryable read-only feature totally orthogonal with how synchronous the replication is? Yes. However, it introduces specific difficult issues which an unreadable synchronous slave does not have. --Josh

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Simon Riggs
On Mon, 2008-12-15 at 13:43 -0800, Josh Berkus wrote: Isn't the queryable read-only feature totally orthogonal with how synchronous the replication is? Yes. However, it introduces specific difficult issues which an unreadable synchronous slave does not have. Don't think it's hugely

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Josh Berkus
Simon, I've explained this twice now on different parts of this thread. Could I politely direct your attention to those posts? Chill. I was just explaining that the *goal* of sync standby was not complicated or really something to be argued about. It's pretty clear. --Josh

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Simon Riggs
On Mon, 2008-12-15 at 13:06 -0800, Josh Berkus wrote: Peter Eisentraut wrote: Simon Riggs wrote: I am truly lost to understand why the *name* synchronous replication causes so much discussion, yet nobody has discussed what they would actually like the software to *do* It's the color

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-15 Thread Fujii Masao
Hi, Sorry for this late reply. And, thanks for the hot discussion ;) On Tue, Dec 16, 2008 at 1:24 AM, Simon Riggs si...@2ndquadrant.com wrote: Fujii-san, Just repeating this in case you lost this comment: On Mon, 2008-12-15 at 09:40 +, Simon Riggs wrote: Fujii-san, please can we

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Mark Mielke
Robert Haas wrote: On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: We won't call it anything, because we never will or can implement that. See the theory of relativity: the notion of exactly simultaneous events OK, fine. I'll be more precise. I think we need to

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Emmanuel Cecchet
Robert Haas wrote: The term of art for making sure that transactions committed on the primary are visible on the secondary seems to be one-copy serializability (see, for example, a Google Books search on that term). Not exactly. 1-copy-serializability which is the standard for multi-master

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Simon Riggs
On Sun, 2008-12-14 at 13:31 +0900, Tatsuo Ishii wrote: The point here is that synchronous replication, at least to some people, is going to imply that the user-visible states of the two copies are consistent. To other people, it is going to imply that committed transactions will never be

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Mark Mielke
Simon Riggs wrote: I am truly lost to understand why the *name* synchronous replication causes so much discussion, yet nobody has discussed what they would actually like the software to *do* (this being a software discussion list...). AFAICS we can make the software behave like *any* of the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Robert Haas
We can make the reply to a commit message when any of the following events have occurred 1. We sent the message to standby 2. We received the message on standby 3. We wrote the WAL to the WAL file 4. We fsync'd the WAL file 5. We CRC checked the WAL commit record 6. We applied the WAL
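The list above can be read as an ordered set of acknowledgment points. A minimal Python sketch follows, assuming the events happen in the listed order; `AckPoint` and `can_acknowledge` are hypothetical names, not taken from the patch, and item 6 is named only as far as the quoted text goes.

```python
from enum import IntEnum


class AckPoint(IntEnum):
    """Points at which the primary could reply to a commit (hypothetical names)."""
    SENT_TO_STANDBY = 1      # 1. message sent to standby
    RECEIVED_ON_STANDBY = 2  # 2. message received on standby
    WAL_WRITTEN = 3          # 3. WAL written to the WAL file
    WAL_FSYNCED = 4          # 4. WAL file fsync'd
    COMMIT_CRC_CHECKED = 5   # 5. WAL commit record CRC checked
    WAL_APPLIED = 6          # 6. WAL applied


def can_acknowledge(reached: AckPoint, required: AckPoint) -> bool:
    """True once the standby has progressed at least as far as the required point."""
    return reached >= required
```

For example, a commit configured to wait for `WAL_FSYNCED` would not be acknowledged while the standby has only reached `RECEIVED_ON_STANDBY`.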

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Mark Mielke
Mark Mielke wrote: Forget replication - even for the exact same server - I don't expect that if I commit from one session, I will be able to see the change immediately from my other session or a new session that I just opened. Perhaps this is often stable to rely on this, and it is useful for

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Heikki Linnakangas
Mark Mielke wrote: Mark Mielke wrote: Forget replication - even for the exact same server - I don't expect that if I commit from one session, I will be able to see the change immediately from my other session or a new session that I just opened. Perhaps this is often stable to rely on this,

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Dimitri Fontaine
Hi, On 14 Dec 08, at 16:48, Simon Riggs wrote: I am truly lost to understand why the *name* synchronous replication causes so much discussion, yet nobody has discussed what they would actually like the software to *do* (this being a software

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Ron Mayer
Robert Haas wrote: We can make the reply to a commit message when any of the following events have occurred 1. We sent the message to standby 2. We received the message on standby 3. We wrote the WAL to the WAL file 4. We fsync'd the WAL file 5. We CRC checked the WAL commit record 6. We

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Mark Mielke
Heikki Linnakangas wrote: Mark Mielke wrote: FYI: I haven't been able to prove this. Multiple sessions running on my dual-core CPU seem to be able to see the latest commits before they begin executing. Am I wrong about this? Does PostgreSQL provide an intentional guarantee that a commit from

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Greg Stark
When the database says the data is committed it has to mean the data is really committed. Imagine if you looked at a bank account balance after withdrawing all the money and saw a balance which didn't reflect the withdrawal and allowed you to withdraw more money again... -- Greg On 14

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Robert Haas
If this is right, #2, #3, #4, and #6 feel similar except that they're protecting against failures of different (but still all incomplete) subsets of the hardware on the slave, right? Right. Actually, the biggest difference with #6 has nothing to do with protecting against failures. It has

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Mark Mielke
Greg Stark wrote: When the database says the data is committed it has to mean the data is really committed. Imagine if you looked at a bank account balance after withdrawing all the money and saw a balance which didn't reflect the withdrawal and allowed you to withdraw more money again...

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-14 Thread Heikki Linnakangas
Mark Mielke wrote: When I asked for does PostgreSQL guarantee this? I didn't mean hand waving examples or hand waving expectations. I meant a pointer into the code that has some comment that says we want to guarantee that a commit in one session will be immediately visible to other sessions,

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs
On Sat, 2008-12-13 at 00:00 +0100, Markus Wanner wrote: Hi, Fujii Masao wrote: I'd like to define the meaning of synch rep again. synch rep means: (1) Transaction commit waits for WAL records to be replicated to the standby before the command returns a success indication to the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Simon Riggs wrote: You're right that neither the data transfer nor data availability is entirely synchronous, but data transfer is synchronous at time of *commit*: it is recorded on multiple nodes at the same time. I'm unsure what you mean by a data transfer being synchronous. To what

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Grzegorz Jaskiewicz
On 2008-12-13, at 13:07, Markus Wanner wrote: However, that is a marketing decision [1], which should not be mixed with the technical discussion here. Speaking of a synchronous commit is utterly misleading, because the commit itself is exactly the thing that's *not* synchronous. [1]: Some

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs
On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote: Speaking of a synchronous commit is utterly misleading, because the commit itself is exactly the thing that's *not* synchronous. Not really sure where you're going here. synchronous replication is used exactly as described in the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Simon Riggs wrote: On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote: Speaking of a synchronous commit is utterly misleading, because the commit itself is exactly the thing that's *not* synchronous. Not really sure where you're going here. I'm pointing to a potential

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
I certainly agree to using such terms. Unfortunately, in my experience, synchronous replication is commonly used to mean that transactions are guaranteed to be immediately visible on remote nodes after the client got commit acknowledgment. That's the cause for confusion I'm envisioning. I

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I think we need to reserve the term synchronous replication for a system where transactions that begin at the same time on the primary and standby see the same tuples. Clearly that is more synchronous than what is being proposed here; if we call this

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Aidan Van Dyk
Synchronous replication, sync rep is *not* interested in the slave's visibility of the commit, because PostgreSQL doesn't serve requests when in recovery (wal receiving) mode *now*. This sync rep patch/proposal/discussion is *strictly* (at this point yet, hot standby may eventually or hopefully soon

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Simon Riggs
On Sat, 2008-12-13 at 13:05 -0500, Robert Haas wrote: Hot Standby (although the latter seems to have stalled a bit...) It's just being worked on asynchronously. ;-) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Hannu Krosing
On Sat, 2008-12-13 at 13:05 -0500, Robert Haas wrote: I certainly agree to using such terms. Unfortunately, in my experience, synchronous replication is commonly used to mean that transactions are guaranteed to be immediately visible on remote nodes after the client got commit

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Hannu Krosing
On Sat, 2008-12-13 at 21:35 +0200, Hannu Krosing wrote: We still could call Sync Rep as a feature synchronous replication on basis that WAL Streaming - Synchronous Write is the highest security level achievable using the feature. And maybe have Sync Hot Standby as a feature on top of that

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Tom Lane wrote: We won't call it anything, because we never will or can implement that. See the theory of relativity: the notion of exactly simultaneous events at distinct locations isn't even well-defined That has never been the point of the discussion. It's rather about the question if

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Simon Riggs wrote: Hot Standby (although the latter seems to have stalled a bit...) It's just being worked on asynchronously. ;-) LOL, thanks for bringing humor into this discussion :-) Regards Markus Wanner

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Hannu Krosing wrote: You can have a variant of sync rep + hot standby where the master does not return committed before the slave has both synced the data and replied the transaction so that it is visible on slave but in that case you may have a use case, where it is actually visible on

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Aidan Van Dyk
* Markus Wanner mar...@bluegap.ch [081213 16:33]: Hi, Hannu Krosing wrote: You can have a variant of sync rep + hot standby where the master does not return committed before the slave has both synced the data and replied the transaction so that it is visible on slave but in that case

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Mark Mielke
Markus Wanner wrote: Tom Lane wrote: We won't call it anything, because we never will or can implement that. See the theory of relativity: the notion of exactly simultaneous events at distinct locations isn't even well-defined That has never been the point of the discussion. It's

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Aidan Van Dyk wrote: Well, I think the PG MVCC (which wal-streaming just ships across somewhere else) will save that. So with hot-standby another client could see the result *after* the COMMIT has been requested, but *before* the COMMIT returns... But we have this

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Markus Wanner
Hi, Mark Mielke wrote: Might it not be true that anybody unfamiliar would be confused and that this is a bit of a straw man? Might be. I've neglected the issue myself for a while. I don't think synchronous replication guarantees that it will be immediately visible. Even if it did push the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Mark Mielke
Markus Wanner wrote: I don't think synchronous replication guarantees that it will be immediately visible. Even if it did push the change to the other machine, and the other machine had committed it, that doesn't guarantee that any reader sees it any more than if I commit to the same machine (no

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I think we need to reserve the term synchronous replication for a system where transactions that begin at the same time on the primary and standby see the same tuples. Clearly that is

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Jeff Davis
On Sat, 2008-12-13 at 21:35 -0500, Robert Haas wrote: On Sat, Dec 13, 2008 at 1:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I think we need to reserve the term synchronous replication for a system where transactions that begin at the same time on the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
If it's guaranteed to be visible on the standby after it's committed on the master, and you don't have any way to make it actually simultaneous, then that implies that it's visible on the slave for some brief period of time before it's committed on the master. That situation is still

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
Might it not be true that anybody unfamiliar would be confused and that this is a bit of a straw man? [...] If my application assumes that it can commit to one server, and then read back the commit from another server, and my application breaks as a result, it's because I didn't understand

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Jeff Davis
On Sat, 2008-12-13 at 22:23 -0500, Robert Haas wrote: If it's guaranteed to be visible on the standby after it's committed on the master, and you don't have any way to make it actually simultaneous, then that implies that it's visible on the slave for some brief period of time before it's

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Tatsuo Ishii
The point here is that synchronous replication, at least to some people, is going to imply that the user-visible states of the two copies are consistent. To other people, it is going to imply that committed transactions will never be lost even in the event of a catastrophic loss of the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-13 Thread Robert Haas
The point here is that synchronous replication, at least to some people, is going to imply that the user-visible states of the two copies are consistent. To other people, it is going to imply that committed transactions will never be lost even in the event of a catastrophic loss of the

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-12 Thread Fujii Masao
Hi, On Fri, Dec 12, 2008 at 1:34 PM, Aidan Van Dyk ai...@highrise.ca wrote: * Fujii Masao masao.fu...@gmail.com [081211 23:00]: Hi, Or, should I create the feature for the user to confirm whether it's in synch rep via SQL? I

Re: [HACKERS] Sync Rep: First Thoughts on Code

2008-12-12 Thread Simon Riggs
On Fri, 2008-12-12 at 12:53 +0900, Fujii Masao wrote: Quite possibly a terminology problem. In my case I said sync rep meaning the mode such that the transaction doesn't commit successfully for my PG client until the xlog record has been streamed to the client... and I understand that
