Re: [HACKERS] Hot standby and synchronous replication status
On Tue, Aug 11, 2009 at 5:40 PM, Dimitri Fontaine wrote:
> On 11 Aug 09 at 23:30, Robert Haas wrote:
>
>> On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine wrote:
>>>
>>> We should somehow provide a default archive and restore command
>>> integrated into the main product, so that it's as easy as turning it
>>> 'on' in the configuration for users to have something trustworthy:
>>> PostgreSQL will keep past logs into a pg_xlog/archives subdir or some
>>> other default place, and will know about the setup at startup time
>>> when/if needed.
>>
>> I might be missing something, but isn't this completely silly? If you
>> archive your logs to the same partition where you keep your database
>> cluster, it seems to me that you might as well delete them. Even
>> better, turn off XLogArchiving altogether and save yourself the
>> overhead of not using WAL-bypass.
>
> Nice, the pushback is about the default location, thanks for supporting
> the idea :)
>
> Seriously, the Debian package will install pg_xlog in $PGDATA, which is
> often not what I want. So first thing after install, I stop the cluster,
> move the pg_xlog, set up an ln -s and restart. I figured having to do the
> same for setting up archiving would make my day, when compared to current
> documentation setup. Any better idea for a safe enough default location
> is welcome, of course.

*scratches head*

I don't really know how you COULD pick a safe default location.
Presumably any location that's in the default postgresql.conf file would
be under $PGDATA, which kind of defeats the purpose of the whole thing.
In other words, you're always going to have to move it anyway, so why
bother with a default that is bound to be wrong?

Maybe I'm all wet?

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
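The objection above (an archive on the same partition as the cluster protects against very little) is something a setup script could check for. What follows is only a rough sketch in Python, not anything proposed in the thread: the function names are hypothetical, and it assumes a POSIX-like system where `st_dev` distinguishes filesystems.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both existing paths report the same device id."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def archive_location_is_risky(pgdata: str, archive_dir: str) -> bool:
    """Flag an archive directory that shares its fate with the cluster.

    Hypothetical helper: "risky" here means the archive lives under the
    data directory, or on the same filesystem as the data directory.
    """
    pgdata_real = os.path.realpath(pgdata)
    archive_real = os.path.realpath(archive_dir)
    # commonpath() equals pgdata_real exactly when the archive is inside it.
    inside = os.path.commonpath([pgdata_real, archive_real]) == pgdata_real
    return inside or same_filesystem(pgdata_real, archive_real)
```

A default such as a pg_xlog/archives subdirectory would always be flagged by this check, which is the point Robert is making. Note that a different filesystem is not necessarily a different physical disk, so passing the check proves less than failing it.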
Re: [HACKERS] Hot standby and synchronous replication status
On Tue, 2009-08-11 at 17:30 -0400, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine wrote:
> > We should somehow provide a default archive and restore command
> > integrated into the main product, so that it's as easy as turning it
> > 'on' in the configuration for users to have something trustworthy:
> > PostgreSQL will keep past logs into a pg_xlog/archives subdir or some
> > other default place, and will know about the setup at startup time
> > when/if needed.
>
> I might be missing something, but isn't this completely silly? If you
> archive your logs to the same partition where you keep your database
> cluster, it seems to me that you might as well delete them. Even
> better, turn off XLogArchiving altogether and save yourself the
> overhead of not using WAL-bypass.

Depends on all kinds of factors. For example, PITRTools will keep a copy
local until it knows that the remote has received it.

Joshua D. Drake

> ...Robert

--
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
Re: [HACKERS] Hot standby and synchronous replication status
On 11 Aug 09 at 23:30, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine wrote:
>> We should somehow provide a default archive and restore command
>> integrated into the main product, so that it's as easy as turning it
>> 'on' in the configuration for users to have something trustworthy:
>> PostgreSQL will keep past logs into a pg_xlog/archives subdir or some
>> other default place, and will know about the setup at startup time
>> when/if needed.
>
> I might be missing something, but isn't this completely silly? If you
> archive your logs to the same partition where you keep your database
> cluster, it seems to me that you might as well delete them. Even
> better, turn off XLogArchiving altogether and save yourself the
> overhead of not using WAL-bypass.

Nice, the pushback is about the default location, thanks for supporting
the idea :)

Seriously, the Debian package will install pg_xlog in $PGDATA, which is
often not what I want. So first thing after install, I stop the cluster,
move the pg_xlog, set up an ln -s and restart. I figured having to do the
same for setting up archiving would make my day, when compared to current
documentation setup. Any better idea for a safe enough default location is
welcome, of course.

Oh, and I hope you didn't read that the archive mode be 'on' by default in
my proposal, because that's not what I meant.

Regards,
--
dim
Re: [HACKERS] Hot standby and synchronous replication status
On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine wrote:
> We should somehow provide a default archive and restore command
> integrated into the main product, so that it's as easy as turning it
> 'on' in the configuration for users to have something trustworthy:
> PostgreSQL will keep past logs into a pg_xlog/archives subdir or some
> other default place, and will know about the setup at startup time
> when/if needed.

I might be missing something, but isn't this completely silly? If you
archive your logs to the same partition where you keep your database
cluster, it seems to me that you might as well delete them. Even
better, turn off XLogArchiving altogether and save yourself the
overhead of not using WAL-bypass.

...Robert
Re: [HACKERS] "Hot standby"?
> Incidentally, we billed pg_dump as hot backup at some point.

MySQL calls "mysqlhotcopy" a script that locks and flushes all tables,
then makes a copy of the database directory (all queries being locked out
while this is in progress, of course).
Re: [HACKERS] Hot standby and synchronous replication status
Hi,

On 11 Aug 09 at 07:50, Heikki Linnakangas wrote:
> 2009/8/11 Robert Haas
>> We should probably have a separate discussion about what the least
>> committable unit would be for this patch. I wonder if it might be
>> sufficient to provide a facility for streaming WAL, plus a standalone
>> tool for receiving it and storing it to a file.
>
> That's an interesting idea. That would essentially be another method to
> set up a WAL archive. I'm not sure it's worthwhile on its own, but once
> we have the wal-sender infrastructure in place it should be easy to
> write such a tool.

Well it might be over engineering time *again* but here's what it makes me
think about:

We should somehow provide a default archive and restore command integrated
into the main product, so that it's as easy as turning it 'on' in the
configuration for users to have something trustworthy: PostgreSQL will
keep past logs into a pg_xlog/archives subdir or some other default place,
and will know about the setup at startup time when/if needed.

Now, the archive and restore commands would make an independent subsystem,
the only one (for modularisation's sake) allowed to work with the
archives. So we extend it to support sending and receiving archives
to/from a remote PostgreSQL server, using libpq and the already proposed
protocol in the current patch form.

It could be that for integration purposes we'd need something like the
autovacuum launcher: an archive manager daemon, child of postmaster,
accepting signals in order to spawn specific tasks. The sender part could
be launched more than once at any time, for example.

Of course the included automatic and easy to set up daemon wouldn't care
about setting up a remote archiving policy, but on the other hand a remote
PostgreSQL instance could easily be set up as a WAL receiver from the
master's archive.

The archive retention policy, and how it applies to the known list of
receivers, is still to be discussed :)

As far as the (a)Sync Rep patch is concerned, this could solve the setup
part of it: the step where, starting from a filesystem level backup, you
realize you need archived WALs before being able to apply currently
received entries (LSN granularity).

Regards,
--
dim
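For readers unfamiliar with what an archive and restore command pair has to do, here is a minimal sketch in Python. It is not the built-in default Dimitri proposes and not part of any patch in this thread; the function names are hypothetical. It relies on two documented expectations for archive commands: success is signalled only by a zero exit status, and an already archived file should never be overwritten.

```python
import os
import shutil


def archive_segment(src_path: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir; return 0 on success, 1 otherwise."""
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        return 1  # refuse to overwrite a segment that was already archived
    tmp = dest + ".tmp"
    try:
        shutil.copyfile(src_path, tmp)
        os.replace(tmp, dest)  # rename so a partial copy is never visible
    except OSError:
        return 1
    return 0


def restore_segment(archive_dir: str, segment_name: str, dest_path: str) -> int:
    """Copy a segment back out of the archive; return 0 on success, 1 otherwise."""
    src = os.path.join(archive_dir, segment_name)
    if not os.path.isfile(src):
        return 1  # a missing segment is reported as failure, not an error
    try:
        shutil.copyfile(src, dest_path)
    except OSError:
        return 1
    return 0
```

The sketch deliberately leaves out fsync, retention, and any remote transfer, which are exactly the parts the message above says still need discussion.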
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 4:07 PM, Josh Berkus wrote:
> So really, the "streaming replication" patch should be called "hot
> standby",

No. AIUI, hot standby means that when your primary falls over, the
secondary automatically promotes itself and takes over. It requires
things like heartbeat monitoring and STONITH and is unrelated to anything
we currently have under consideration.

> and the "hot standby" patch should be called "read only slaves"?

Yes.

> And *why* can't we call it log-based replication?

Well, we can call it anything we want. For example, up until now we've
been calling it "hot standby", even though that's clearly wrong. :-)

But on the merits: log-based replication is, I think, what we already
have. Both of these patches do things that make it better. Streaming
replication (fka synch rep) makes it more nearly real-time, and read-only
slaves (fka hot standby) makes the standby server more useful. But
neither is ADDING log-based replication, they're both IMPROVING it, in
different ways.

...Robert
Re: [HACKERS] "Hot standby"?
On Tuesday 11 August 2009 18:16:04 Gianni Ciolli wrote:
> As for "warm/hot", it depends on what you exactly mean with "get
> ready":
>
> (A) If you mean "it is possible to connect to the second node", then
>     Simon's patch is "hot".

Yeah, but by that definition doing a pg_dump/pg_restore every hour is also
"hot". ;-)

Incidentally, we billed pg_dump as hot backup at some point.
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 01:14:56PM -0400, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 1:08 PM, Josh Berkus wrote:
> > I believe we're just copying Oracle's terminology. While that
> > terminology is not consistent, it is understood by the industry. Oracle
> > defined their Hot Standby to have both asynchronous and synchronous
> > modes:
> > http://www.oracle.com/technology/products/rdb/htdocs/dbms/hotstandby.html
>
> I think that you are wrong. Hot Standby is described in that document
> as a way of replicating the database and providing failover in the
> event that the master dies. This feature has to do with allowing
> queries to be executed on the slave, which is not discussed in that
> document.

That web page contains only a brief summary of Oracle's Hot Standby; by
looking at the attached PDF guide you can see that the capability of
allowing read-only queries on the standby node is actually part of their
architecture.

Since we are comparing the features with Oracle's Hot Standby, it is
perhaps useful to mention two of the major differences:

* Oracle's Hot Standby allows only one Standby node, while on PostgreSQL
  we can have as many as we want, since the Primary is totally unaware of
  them. Allowing N clones instead of only one may be a crucial advantage
  in those use cases where read-only queries represent the vast majority
  of the load.

* Oracle's Hot Standby allows cloning hash indexes, which on PostgreSQL
  are not WAL safe (anyway, they are currently discouraged by the
  documentation).

Best regards,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 4:07 PM, Josh Berkus wrote:
> All,
>
> So really, the "streaming replication" patch should be called "hot
> standby", and the "hot standby" patch should be called "read only slaves"?
>
> And *why* can't we call it log-based replication?

+1

*) it _is_ used to replicate a database (replicate means make a copy!)
*) our target market will perceive it that way
*) sounds cool

'synchronous log-based replication'
'asynchronous log-based replication'
or,
'log-based replication', in both synchronous and asynchronous modes

merlin
Re: [HACKERS] "Hot standby"?
All,

So really, the "streaming replication" patch should be called "hot
standby", and the "hot standby" patch should be called "read only slaves"?

And *why* can't we call it log-based replication?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Re: [HACKERS] "Hot standby"?
Mark Mielke wrote:
> This patch does not provide streaming replication?

There's a separate effort to provide asynchronous and synchronous
streaming replication. Different patch.

> "Hot standby" to me means "the slave is as close to up-to-date as
> possible and can potentially take over at any time in a near
> instant." This *implies* some sort of streaming replication (byte
> level rather than file-by-file) rather than waiting for WAL logs to
> become full and shipped.

Most of us would expect that from something called "hot standby".
That's why so many of us have been saying that the name is misleading.

This patch, as I understand it, would allow you to use the warm standby
to run read-only queries -- for reports and such, to take some load off
the primary database. No more; no less. So it's still warm, not hot,
and it's still usable as a warm standby. It just lets you squeeze a
little extra benefit from the copy while it sits there periodically
updating itself.

-Kevin
Re: [HACKERS] "Hot standby"?
Mark Mielke wrote:

> I don't think I was confused before - but I am confused now. :-)
>
> This patch does not provide streaming replication?

No. What it does is allow you to query the slave while it's still
replaying transactions. There's another patch allowing you to do "stream
replication". Hopefully both things could be used simultaneously.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] "Hot standby"?
On 08/11/2009 02:52 PM, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 2:48 PM, Mark Mielke wrote:
>> I remember this debate from 6 months ago. :-)
>>
>> I prefer not to try and fit square pegs into round holes. Streaming
>> replication sounds like the best description. It may not be the
>> keywords that newbies are looking for, but too bad for them. Calling it
>> something different than what it is, just so that people who don't
>> understand why it is wrong will have something that approximates the
>> right understanding, is not a just cause. :-)
>
> Uhm, I think you are confused.
>
> Hot Standby = Allow read-only queries on a PostgreSQL server during
> archive recovery
> Synchronous (or Streaming) Replication = Allow WAL to be streamed on a
> byte level rather than copied file-by-file
>
> Hot Standby is not any sort of replication.

I don't think I was confused before - but I am confused now. :-)

This patch does not provide streaming replication?

Having the standby allow for read-only queries is a "would be nice" for
me, but it's not very useful. I've been monitoring these threads (and
wishing they were at a level I could participate and contribute on),
because I want the ability to have near real time updates such that the
standby can become live. "Hot standby" to me means "the slave is as close
to up-to-date as possible and can potentially take over at any time in a
near instant." This *implies* some sort of streaming replication (byte
level rather than file-by-file) rather than waiting for WAL logs to become
full and shipped.

If this patch doesn't give me near real time replication, then I am
confused about why I would want it at all. pg_standby already gives the
ability to do replication on a per completed WAL log file basis.

Cheers,
mark

--
Mark Mielke
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 2:10 PM, Ron Mayer wrote:
> David Fetter wrote:
>> On Tue, Aug 11, 2009 at 08:56:38AM -0500, Kevin Grittner wrote:
>>> Bruce Momjian wrote:
>>>> OK, so it is "warm slave".
>
> Why isn't it just a "read only slave". Do some systems
> have read-only slave databases that can't serve as a warm
> standby system as well as this one could?

I think that's about right. What we have now via pg_standby or similar
tools is a warm standby. What this tool does is make the warm standby
also serve as a read-only slave. It doesn't make failover any simpler, so
it's not making the standby any hotter; instead it makes the standby able
to do more useful work when no failover has occurred.

The technical description for the commit message is probably something
like this: "Allow read-only queries to be processed during archive
recovery."

The P/R version is probably something like this: "Warm standby servers
now function as read-only slaves."

...Robert
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 2:48 PM, Mark Mielke wrote:
> On 08/11/2009 09:56 AM, Kevin Grittner wrote:
> > Bruce Momjian wrote:
> >
> > > OK, so it is "warm slave".
> >
> > That is technically accurate, given the preceding definitions, but it
> > has disturbing connotations. Enough so, in my view, to merit getting
> > a little more creative in the naming. How about "warm replica"?
> > Other ideas?
> >
> > I agree that the present moniker misleads.
>
> I remember this debate from 6 months ago. :-)
>
> I prefer not to try and fit square pegs into round holes. Streaming
> replication sounds like the best description. It may not be the keywords
> that newbies are looking for, but too bad for them. Calling it something
> different than what it is, just so that people who don't understand why
> it is wrong will have something that approximates the right
> understanding, is not a just cause. :-)

Uhm, I think you are confused.

Hot Standby = Allow read-only queries on a PostgreSQL server during
archive recovery
Synchronous (or Streaming) Replication = Allow WAL to be streamed on a
byte level rather than copied file-by-file

Hot Standby is not any sort of replication.

...Robert
Re: [HACKERS] "Hot standby"?
On 08/11/2009 09:56 AM, Kevin Grittner wrote:
> Bruce Momjian wrote:
>
>> OK, so it is "warm slave".
>
> That is technically accurate, given the preceding definitions, but it
> has disturbing connotations. Enough so, in my view, to merit getting
> a little more creative in the naming. How about "warm replica"?
> Other ideas?
>
> I agree that the present moniker misleads.

I remember this debate from 6 months ago. :-)

I prefer not to try and fit square pegs into round holes. Streaming
replication sounds like the best description. It may not be the keywords
that newbies are looking for, but too bad for them. Calling it something
different than what it is, just so that people who don't understand why it
is wrong will have something that approximates the right understanding, is
not a just cause. :-)

Cheers,
mark

--
Mark Mielke
Re: [HACKERS] "Hot standby"?
David Fetter wrote:
> On Tue, Aug 11, 2009 at 08:56:38AM -0500, Kevin Grittner wrote:
>> Bruce Momjian wrote:
>>
>>> OK, so it is "warm slave".

Why isn't it just a "read only slave"? Do some systems
have read-only slave databases that can't serve as a warm
standby system as well as this one could?

>> That is technically accurate, given the preceding definitions, but
>> it has disturbing connotations. Enough so, in my view, to merit
>> getting a little more creative in the naming. How about "warm
>> replica"? Other ideas?
>
> Warm Read
> Streamed Copy

Master/Slave Replication and Warm Standby systems are common enough
terms that I can google them or look them up in many computer science
books.

While coming up with creative politically correct euphemisms might be
fun, I hope we stick near terms that other DBAs could already be
familiar with.

ISTM the best way to refer to it formally would be a "Read Only Slave /
Warm Standby" system, even if informally we might have discussions of
"just how hot our slaves are" when hot-standby features get added down
the road.
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 1:08 PM, Josh Berkus wrote:
> I believe we're just copying Oracle's terminology. While that
> terminology is not consistent, it is understood by the industry. Oracle
> defined their Hot Standby to have both asynchronous and synchronous
> modes:
> http://www.oracle.com/technology/products/rdb/htdocs/dbms/hotstandby.html

I think that you are wrong. Hot Standby is described in that document
as a way of replicating the database and providing failover in the
event that the master dies. This feature has to do with allowing
queries to be executed on the slave, which is not discussed in that
document.

...Robert
Re: [HACKERS] "Hot standby"?
Peter,

I believe we're just copying Oracle's terminology. While that
terminology is not consistent, it is understood by the industry. Oracle
defined their Hot Standby to have both asynchronous and synchronous modes:
http://www.oracle.com/technology/products/rdb/htdocs/dbms/hotstandby.html

The other industry term which would be understood would be "log-based
replication".

Terms we invent would be less likely to be understood, and users would
not get excited about them.

Otherwise, I say we go with "simmering follower". ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 08:56:38AM -0500, Kevin Grittner wrote:
> Bruce Momjian wrote:
>
> > OK, so it is "warm slave".
>
> That is technically accurate, given the preceding definitions, but
> it has disturbing connotations. Enough so, in my view, to merit
> getting a little more creative in the naming. How about "warm
> replica"? Other ideas?

Warm Read
Streamed Copy

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] "Hot standby"?
On Tue, 2009-08-11 at 08:12 -0700, David E. Wheeler wrote:
> On Aug 11, 2009, at 5:32 AM, Bruce Momjian wrote:
>
> > OK, so it is "warm slave".
>
> I suggest steaming servant. Or tepid assistant.

We can't use those, I think they are on the list for Ubuntu.

Joshua D. Drake

> David

--
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 12:30:58PM +0300, Peter Eisentraut wrote:
> warm - If the first node dies, the replacement node needs to do some
> work to get ready, but it's a lot quicker than "cold".
>
> hot - If the first node dies, the replacement node can take over
> immediately.
>
> For example, I'd say that a DRBD-based solution would be a cold standby.
> Among WAL-based solutions, what we have now with pg_standby (nomen est
> omen), is a warmish standby. From what I understand, Simon's patch set
> does not change the "warm" property of this arrangement at all. It only
> changes the "standby" to a "slave".
>
> Am I off? What other definition of terms justifies the description of
> "hot standby"?

As for "warm/hot", it depends on exactly what you mean by "get ready":

(A) If you mean "it is possible to connect to the second node", then
    Simon's patch is "hot". IIRC this meaning agrees with the terminology
    used by other databases, such as Oracle, Sybase, etc.

(B) If you mean "the contents of the second node are equal to the
    contents of the first node", then Simon's patch is "warm". But then
    we have by definition "hot" == "synchronous" and "either warm or
    cold" == "asynchronous".

IMHO the other case, (A), is preferable, as the word sets
"synchronous/asynchronous" and "cold/warm/hot" are more expressive when
they measure two independent properties.

Best regards,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it
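The argument that two independent word sets are "more expressive" can be made concrete by counting the setups each definition can name. The sketch below is purely illustrative Python; the type and function names are hypothetical and do not correspond to anything in PostgreSQL.

```python
from dataclasses import dataclass
from enum import Enum


class Readiness(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


class Shipping(Enum):
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"


@dataclass(frozen=True)
class StandbySetup:
    readiness: Readiness
    shipping: Shipping


def all_combinations():
    """Definition (A): the two properties vary independently."""
    return [StandbySetup(r, s) for r in Readiness for s in Shipping]


def combinations_under_definition_b():
    """Definition (B): "hot" holds exactly when shipping is synchronous."""
    return [
        setup
        for setup in all_combinations()
        if (setup.readiness is Readiness.HOT)
        == (setup.shipping is Shipping.SYNCHRONOUS)
    ]
```

Under (A) all six combinations are distinct, while under (B) only three remain, because the temperature word already fixes the synchronous or asynchronous word.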
Re: [HACKERS] Hot standby and synchronous replication status
Hi,

On Tue, Aug 11, 2009 at 1:25 PM, Robert Haas wrote:
> But just to kick off the discussion, here is Heikki's review of Synch
> Rep on 7/15:
>
> http://archives.postgresql.org/pgsql-hackers/2009-07/msg00913.php
>
> I think the key phrases in this review are "I believe we have
> consensus on four major changes" and "As a hint, I think you'll find
> it a lot easier if you implement only asynchronous replication at
> first. That reduces the amount of inter-process communication a lot."
> I think this points to a need to try to reduce the scope of this patch
> to something more manageable. Heikki also points out that major
> change #4 was raised back in December, and I actually think #1 and #3
> were as well.

Thanks for clarifying the status.

Following Heikki's advice, I'm working on asynchronous replication
first. I'll submit the patch by the next CommitFest at least.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] "Hot standby"?
On Aug 11, 2009, at 5:32 AM, Bruce Momjian wrote:

> OK, so it is "warm slave".

I suggest steaming servant. Or tepid assistant.

David
Re: [HACKERS] Hot standby and synchronous replication status
Hi,

On Tue, Aug 11, 2009 at 3:33 PM, Magnus Hagander wrote:
>> We should probably have a separate discussion about what the least
>> committable unit would be for this patch. I wonder if it might be
>> sufficient to provide a facility for streaming WAL, plus a standalone
>> tool for receiving it and storing it to a file. This might be designed
>> as an improvement on our existing concept of an archive; the advantage
>> would be that you could have all but perhaps the last few seconds of
>> WAL if the primary kicked the bucket, rather than being behind by up
>> to checkpoint_timeout. Allowing the WAL to be received directly by
>> PostgreSQL could be a future enhancement.
>>
>> That's an interesting idea. That would essentially be another method to
>> set up a WAL archive. I'm not sure it's worthwhile on its own, but once
>> we have the wal-sender infrastructure in place it should be easy to
>> write such a tool.
>
> It most definitely would be useful on its own. I have several
> installations where we'd love such a capability.

Yeah, this is my initial proposal for the WAL receiving side. I think
that it's useful to provide such a tool as a contrib (or pgfoundry)
program.

http://archives.postgresql.org/pgsql-hackers/2008-10/msg01639.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] "Hot standby"?
Bruce Momjian wrote:

> OK, so it is "warm slave".

That is technically accurate, given the preceding definitions, but it
has disturbing connotations. Enough so, in my view, to merit getting
a little more creative in the naming. How about "warm replica"?
Other ideas?

I agree that the present moniker misleads.

-Kevin
Re: [HACKERS] "Hot standby"?
Robert Haas wrote:
> On Tue, Aug 11, 2009 at 5:30 AM, Peter Eisentraut wrote:
> > What is "hot" and "standby" about the proposed "hot standby" feature?

OK, so it is "warm slave".

> Absolutely nothing. It's horribly misnamed.
>
> I have also long argued that "Synchronous Replication" should really
> be called "Streaming Replication". Perhaps it may eventually support
> a synchronous mode of operation, but that will probably never be the
> only one.

Yes, I like "streaming replication" because no matter how we implement
this feature, there are going to be synchronous and asynchronous modes
eventually.

--
  Bruce Momjian          http://momjian.us
  EnterpriseDB           http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] "Hot standby"?
> Am I off? What other definition of terms justifies the description of
> "hot standby"?

I think that "Hot Standby" is associated with the high WAL recovery
capacity. In my opinion, it is a good term to symbolize its superiority
compared with "Warm Standby".

--
Matheus Ricardo Espanhol
---
Dextra Sistemas
http://www.dextra.com.br/postgres/
Re: [HACKERS] "Hot standby"?
On Tue, Aug 11, 2009 at 5:30 AM, Peter Eisentraut wrote:
> What is "hot" and "standby" about the proposed "hot standby" feature?

Absolutely nothing. It's horribly misnamed.

I have also long argued that "Synchronous Replication" should really
be called "Streaming Replication". Perhaps it may eventually support
a synchronous mode of operation, but that will probably never be the
only one.

...Robert
Re: [HACKERS] Hot standby and synchronous replication status
On Tuesday, August 11, 2009, Heikki Linnakangas wrote:
> 2009/8/11 Robert Haas
> > We should probably have a separate discussion about what the least
> > committable unit would be for this patch. I wonder if it might be
> > sufficient to provide a facility for streaming WAL, plus a standalone
> > tool for receiving it and storing it to a file. This might be designed
> > as an improvement on our existing concept of an archive; the advantage
> > would be that you could have all but perhaps the last few seconds of
> > WAL if the primary kicked the bucket, rather than being behind by up
> > to checkpoint_timeout. Allowing the WAL to be received directly by
> > PostgreSQL could be a future enhancement.
>
> That's an interesting idea. That would essentially be another method to
> set up a WAL archive. I'm not sure it's worthwhile on its own, but once
> we have the wal-sender infrastructure in place it should be easy to
> write such a tool.

It most definitely would be useful on its own. I have several
installations where we'd love such a capability.

/Magnus

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Hot standby and synchronous replication status
2009/8/11 Robert Haas > We should probably have a separate discussion about what the least > committable unit would be for this patch. I wonder if it might be > sufficient to provide a facility for streaming WAL, plus a standalone > tool for receiving it and storing it to a file. This might be designed > as an improvement on our existing concept of an archive; the advantage > would be that you could have all but perhaps the last few seconds of > WAL if the primary kicked the bucket, rather than being behind by up > to checkpoint_timeout. Allowing the WAL to be received directly by > PostgreSQL could be a future enhancement. That's an interesting idea. That would essentially be another method to set up a WAL archive. I'm not sure it's worthwhile on its own, but once we have the wal-sender infrastructure in place it should be easy to write such a tool. > I think Hot Standby is in somewhat better shape. Having read the > patch, I can say that it needs a pretty substantial amount of cleanup > work: the code is messy. But Heikki was talking fairly seriously > about committing this for 8.4, and everyone seems to agree that the > architecture is approximately right. It's not clear to me how much > more refactoring is needed or whether there are remaining bugs, but at > least it looks to me like a reviewable version of the patch could be > produced with a fairly modest amount of work. That's my sentiment too. There's a fair amount of cleanup needed; the big changes this spring left behind some damage to readability. I haven't looked at your latest patch in detail, but it seems to go in the right direction, thanks for that. > Heikki stated (in response to a question from me) that he was not > aware of anything that could be severed from Hot Standby and committed > independently, and nothing jumped out at me when I read the patch > either. But if the whole patch can be made committable in time then > it's less critical. 
Yeah, we still have time, but I am worried that if we let this patch sit for another few months, we will be in the same situation when the 8.5 feature freeze arrives that we were in 8.4. When it became clear that the patch won't make it into 8.4, I thought we would continue working on the patch throughout the spring and summer, and have a cleaned up patch ready for review for the first 8.5 commit fest. I would be much more confident committing a big patch like this early in the release cycle, with still plenty of time left to uncover issues. That didn't happen. If we don't have an updated patch for the 2nd commit fest, we're in serious risk of missing the 8.5 release again. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby and synchronous replication status
On Mon, Aug 10, 2009 at 9:51 PM, Bruce Momjian wrote: > What is the status of hot standby and synchronous replication? Is there > a design specification? Who are the lead developers? Who is assisting? > What open item do we have for each feature? Where is the most recent > patch? Can we incrementally start applying patches for these features? > > Would someone create a wiki for each of these features and update it so > we can be sure of their status as we move into September/October? I > would like to have some traction on these in the next few months rather > than waiting for the later commitfests. For what it's worth, there are already some materials on the wiki about these projects: http://wiki.postgresql.org/wiki/Hot_Standby http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects To get a real project status, I think we need input from Heikki, who is the person who will likely be committing whatever of this work gets into 8.5, and who is also the committer who has been following these patches most closely, at least AIUI. Tom may also have some thoughts. But just to kick off the discussion, here is Heikki's review of Synch Rep on 7/15: http://archives.postgresql.org/pgsql-hackers/2009-07/msg00913.php I think the key phrases in this review are "I believe we have consensus on four major changes" and "As a hint, I think you'll find it a lot easier if you implement only asynchronous replication at first. That reduces the amount of inter-process communication a lot." I think this points to a need to try to reduce the scope of this patch to something more manageable. Heikki also points out that major change #4 was raised back in December, and I actually think #1 and #3 were as well. We should probably have a separate discussion about what the least committable unit would be for this patch. I wonder if it might be sufficient to provide a facility for streaming WAL, plus a standalone tool for receiving it and storing it to a file. 
This might be designed as an improvement on our existing concept of an archive; the advantage would be that you could have all but perhaps the last few seconds of WAL if the primary kicked the bucket, rather than being behind by up to checkpoint_timeout. Allowing the WAL to be received directly by PostgreSQL could be a future enhancement. (But take all of this with a grain of salt, because as I say I haven't read the patch and am not familiar with this part of the code either.) I think Hot Standby is in somewhat better shape. Having read the patch, I can say that it needs a pretty substantial amount of cleanup work: the code is messy. But Heikki was talking fairly seriously about committing this for 8.4, and everyone seems to agree that the architecture is approximately right. It's not clear to me how much more refactoring is needed or whether there are remaining bugs, but at least it looks to me like a reviewable version of the patch could be produced with a fairly modest amount of work. Heikki stated (in response to a question from me) that he was not aware of anything that could be severed from Hot Standby and committed independently, and nothing jumped out at me when I read the patch either. But if the whole patch can be made committable in time then it's less critical. Having offered those rather bold opinions, I'm going to repeat the thought I started out with: we need to hear from Heikki. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
Simon Riggs wrote: > > On Mon, 2009-08-10 at 10:20 -0700, Josh Berkus wrote: > > > Simon: you have people offering to help with the patch. Offering to > > help *right now*. Might I suggest that you establish a GIT branch for > > Hot Standby so that more people can collaborate? Working on it until > > you get it "perfect" offsite doesn't work; it's going to require > > adjustment/debugging once it gets to commitfest anyway. Might as well > > start that now, or it'll just delay application. > > Agreed, but there will be some time before that is possible. I'm happy > to commit to Sept 15 *latest* to do the above. I know what has to be > done and that's my timescale for doing it. > > > Everyone Else: Simon is working hard on this, please get off his back. > > Thanks, good plan. > > There is absolutely no danger this patch is going to be delayed and > there is really no call for haste. I near killed myself trying to get it > into 8.4 and I would like to avoid a tension-fest this time around. We > have time and intend to take it at a reasonable pace, and spend time > thinking first, then talking later. Over and out, for now. Simon, I am sure you worked very hard trying to get hot standby into 8.4, and worked under great pressure, I am sure it is hard to get motivated to continue that work. I regret not vocally expressing caution about trying to get hot standby into 8.4. It would have required Herculean effort with everything going perfectly, so it was very unlikely to be possible, but rather than say something, I didn't want to be the bearer of bad news, so said nothing. I want to try to avoid a big push at the end of 8.5 development to get hot standby completed, which is why I am trying to motivate anyone to continue working on it, and I want to get more people to help Simon with it. 
Simon, I am not sure how happy you are about this, but the community needs your help with hot standby _and_ synchronous replication in 8.5, so I would like to get hot standby completed soon (hopefully by getting you help and not killing you), so we leave time for you to help on synchronous replication as well. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Mon, Aug 10, 2009 at 11:15:51PM +0100, Simon Riggs wrote: > On Mon, 2009-08-10 at 10:20 -0700, Josh Berkus wrote: > > > Simon: you have people offering to help with the patch. Offering > > to help *right now*. Might I suggest that you establish a GIT > > branch for Hot Standby so that more people can collaborate? > > Working on it until you get it "perfect" offsite doesn't work; > > it's going to require adjustment/debugging once it gets to > > commitfest anyway. Might as well start that now, or it'll just > > delay application. > > Agreed, but there will be some time before that is possible. I'm > happy to commit to Sept 15 *latest* to do the above. I know what has > to be done and that's my timescale for doing it. With all due respect, Simon, you've missed the point completely. If you have done any work on this whatsoever, *NOW* is the time to share it and going forward, immediately publishing any change, is the way to keep it shared. Working off in splendid isolation, while it may appeal to you, is the wrong move. Guaranteed tears. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Mon, 2009-08-10 at 10:20 -0700, Josh Berkus wrote: > Simon: you have people offering to help with the patch. Offering to > help *right now*. Might I suggest that you establish a GIT branch for > Hot Standby so that more people can collaborate? Working on it until > you get it "perfect" offsite doesn't work; it's going to require > adjustment/debugging once it gets to commitfest anyway. Might as well > start that now, or it'll just delay application. Agreed, but there will be some time before that is possible. I'm happy to commit to Sept 15 *latest* to do the above. I know what has to be done and that's my timescale for doing it. > Everyone Else: Simon is working hard on this, please get off his back. Thanks, good plan. There is absolutely no danger this patch is going to be delayed and there is really no call for haste. I near killed myself trying to get it into 8.4 and I would like to avoid a tension-fest this time around. We have time and intend to take it at a reasonable pace, and spend time thinking first, then talking later. Over and out, for now. -- Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Mon, 2009-08-10 at 10:20 -0700, Josh Berkus wrote: > All, > > Can we stop arguing about a patch everyone wants? > > Simon: you have people offering to help with the patch. Offering to > help *right now*. Might I suggest that you establish a GIT branch for > Hot Standby so that more people can collaborate? Working on it until > you get it "perfect" offsite doesn't work; it's going to require > adjustment/debugging once it gets to commitfest anyway. Might as well > start that now, or it'll just delay application. > > Everyone Else: Simon is working hard on this, please get off his back. I believe that all anyone is asking is that Simon communicate and collaborate. Joshua D. Drake -- PostgreSQL - XMPP: jdr...@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
Re: [HACKERS] hot standby - merged up to CVS HEAD
All, Can we stop arguing about a patch everyone wants? Simon: you have people offering to help with the patch. Offering to help *right now*. Might I suggest that you establish a GIT branch for Hot Standby so that more people can collaborate? Working on it until you get it "perfect" offsite doesn't work; it's going to require adjustment/debugging once it gets to commitfest anyway. Might as well start that now, or it'll just delay application. Everyone Else: Simon is working hard on this, please get off his back. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sun, 2009-08-09 at 22:15 -0400, Robert Haas wrote: > On Sun, Aug 9, 2009 at 2:43 PM, Simon Riggs wrote: > > I've said very clearly that I am working on this and it's fairly > > laughable to suggest that anybody thought I wasn't. What more should I > > do to prove something is "active" if you won't accept my clearly spoken > > word? How did you decide I was idle exactly? > > I think we looked at the fact that you haven't posted an updated > version of this patch in almost 6 months. That pretty much covers it. We practice open development, we always have. Those who don't generally run into problems just like this one. Robert has taken the path of being open about the work that is being performed and thus he is the one that appears to be making progress. Simon, regardless of your "words" you have shown nothing for 6 months. Does that mean you aren't working on it? Of course not but it certainly shows a lack of transparency to the community with the work. You know that doesn't work. The community assumes by default that no patch (or active communication which you also haven't done) means no work. It always has. So instead of all of us bickering, how about we start actively working "together" on the feature again. Sincerely, Joshua D. Drake -- PostgreSQL - XMPP: jdr...@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sun, Aug 9, 2009 at 2:43 PM, Simon Riggs wrote: > I've said very clearly that I am working on this and it's fairly > laughable to suggest that anybody thought I wasn't. What more should I > do to prove something is "active" if you won't accept my clearly spoken > word? How did you decide I was idle exactly? I think we looked at the fact that you haven't posted an updated version of this patch in almost 6 months. ...Robert
Re: [HACKERS] hot standby - merged up to CVS HEAD
Simon Riggs wrote: > On Sat, 2009-08-08 at 13:12 -0400, Bruce Momjian wrote: > > Simon Riggs wrote: > > > > > > I'm not sure why you're stirring this up again. > > > > > > You stated: > > > > - It's going to be very confusing if people submit their own versions of > > - it. So now we have mine, Heikki's and Robert's. I'd like this to stop > > - please, have a little faith and a little patience. Presumably Robert's > > - rebasing patch is best place to start from now for later work. > > > > I assume your last sentence is saying exactly that Robert's version > > should be used as the most current reprsentation of this feature patch. > > That isn't what I meant then and isn't what I think now: that patch is > not verified. I am not sure how to respond to you when I can't even interpret what you say in emails, e.g. "Presumably Robert's rebasing patch is best place to start from now for later work." > As you point out, people can do anything they want with submitted code, > so they may make any judgement they wish about that patch. If anybody > thinks any good will come from ignoring the opinion of the original > author, go right ahead. Right. At some point more people are going to get involved and complete the patch --- historically this is the way complex patches have evolved, and I think many of your patches are in that group. > > The bottom line is that you think you have ownership of the patch and > > the feature --- you do not. > > > > You are right you don't have to justify anything, but neither can you > > claim ownership of the patch/feature and complain that others are > > working on it too. This is a community project --- if you want your > > patches to remain your property, I suggest you no longer post them to > > our community lists. If you are actively working on patches, I assume > > others will not duplicate your work, but if you are idle, others are > > encouraged to keep improving the patch. 
Again, if you don't like > > that, > > then perhaps the community-development process isn't for you. > > I've *never* spoken of code or feature ownership. But this is a > community project and I can request teamwork from other contributors, > which is what I did. > > I've said very clearly that I am working on this and it's fairly > laughable to suggest that anybody thought I wasn't. What more should I > do to prove something is "active" if you won't accept my clearly spoken > word? How did you decide I was idle exactly? Your statement of 15 Jul 2009 stated: - I've said very clearly that I would work on this for 8.5 [at the - developer meeting] and also that it wouldn't be ready for the first - commit fest, when asked. I was told recently that someone heard the - patch was dead; I've never said that, but I would like a summer holiday. I assume that means you were not actively working on it, hence my conclusion, which is probably wrong because I can't manage to interpret your emails. :-( -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sat, 2009-08-08 at 13:12 -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > > > I'm not sure why you're stirring this up again. > > > You stated: > > - It's going to be very confusing if people submit their own versions of > - it. So now we have mine, Heikki's and Robert's. I'd like this to stop > - please, have a little faith and a little patience. Presumably Robert's > - rebasing patch is best place to start from now for later work. > > I assume your last sentence is saying exactly that Robert's version > should be used as the most current reprsentation of this feature patch. That isn't what I meant then and isn't what I think now: that patch is not verified. The reason for my objection was that accepting patches had already caused significant setbacks on this complex patch. I won't be ignoring Robert's work, which would be petty, but I won't be picking it up wholesale either, nor will I be providing a review of it. Nor Heikki's, nor anyone elses. I am moving forward the parts of the patch that I consider worth submitting. I need to be happy with every single line of code before I submit it; it's too easy to make a mistake otherwise. I'm not going to submit something that I can't verify, any more than I would expect any committer to commit code they can't verify either. The current dev team (Simon, Gianni, Gabriele) only has time to spend on testing one patch, not various ones. I do hope to receive comments from reviewers and will include consensus changes into the code. And as I mentioned elsewhere, there are still changes/features to add to the code itself. As you point out, people can do anything they want with submitted code, so they may make any judgement they wish about that patch. If anybody thinks any good will come from ignoring the opinion of the original author, go right ahead. > The bottom line is that you think you have ownership of the patch and > the feature --- you do not. 
> > You are right you don't have to justify anything, but neither can you > claim ownership of the patch/feature and complain that others are > working on it too. This is a community project --- if you want your > patches to remain your property, I suggest you no longer post them to > our community lists. If you are actively working on patches, I assume > others will not duplicate your work, but if you are idle, others are > encouraged to keep improving the patch. Again, if you don't like > that, > then perhaps the community-development process isn't for you. I've *never* spoken of code or feature ownership. But this is a community project and I can request teamwork from other contributors, which is what I did. I've said very clearly that I am working on this and it's fairly laughable to suggest that anybody thought I wasn't. What more should I do to prove something is "active" if you won't accept my clearly spoken word? How did you decide I was idle exactly? I'll make sure to do regular blogs on what I'm working on. I have no problem with Robert. I have no problem with Robert completing my inactive patches - he is doing exactly that with join removal and I haven't uttered a word. If I felt as you think I do, then surely I would have objected to both. Yet I have only objected on the one patch that I've said clearly I'm working on, with specific reasons. If Robert hadn't been present when I said it, I might have reacted differently. To everybody and anybody: please don't submit alternative versions of a patch that other hackers have said they are working on, and don't have conversations about those projects on diverse threads. That's not a claim of code or feature ownership, it's just common sense teamwork on an important development project. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sun, Aug 9, 2009 at 6:11 AM, Simon Riggs wrote: > I'm working on HS; I've said so clearly and say it again now. To my > knowledge, no other Postgres project has committed to a timetable for > delivery, so I'm not clear why you think one should have been given > here, or why the absence of such a timetable implies anything. Dev tree Well, basically because otherwise nobody except you can do anything. In your last email you wrote: > assistance from any and every other hacker is welcome in producing that. What I need is some help figuring out when and how I can provide that assistance and what I can do. At the moment, the version of the patch that I last posted does not apply due to a conflict in sinval.h, I believe due to conflicts introduced by Fujii Masao's signal multiplexing patch. I haven't had time to look at that in any detail yet, but I'd like to do so soon. There are some other things that look like easy cleanups that I think I could knock out as well; see my original email on this thread. Of course, your input on those items would be invaluable. Of course, if you've already done some of this work, that would be great, but then it seems like you ought to have let us know that you were doing it before you did it, and posted the updated patch to -hackers afterwards, just as you asked me to do. Working disconnected from everyone else until September 15th (or November 15th) and then posting the patch will make it very, very difficult for anyone else to do anything useful. When I was working on the explain patches, I posted them regularly to -hackers, and published a git repository on git.postgresql.org, which meant that anyone could follow along at home if they wished. Since Hot Standby is more interesting before breakfast than machine-readable explain output is all day, the same approach seems desirable here. At the very least, I think you should post your progress weekly so that we can read, review, comment, propose changes... 
...Robert
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sat, 2009-08-08 at 22:02 -0400, Robert Haas wrote: > I think it would also be fair to point out that you keep saying that > you're going to deliver this patch for 8.5, but you haven't provided > any real timetable as to when you're going to start working on it or > when it'll be completed. Because this patch IS so important to the > community, people want to know the answers to those questions. That > is exactly why you were asked about your schedule at PGcon; and you > demurred. I'm not sure I have ever demurred on anything, a failing I'm sure. I mentioned that HS would make sense to go in *after* Synch Rep and I stand by that, since we only have so many people that understand that area and we cannot do everything at once. HS is important. That is why I have put so much time and money into being in a position where the end of the tunnel is in sight. Had I not done so, we wouldn't even be discussing it. > I think it's unfair to ask > other people to wait for you to work on something when you haven't > committed to a timetable for working on it I've not asked anybody to wait. I tried very, very hard to get HS into 8.4 and many people were opposed to that. The next release of Postgres isn't released until next year. If it matters as to which month it goes into Postgres, I've not heard anybody explain why the exact month is important. I don't see anything there myself of concern to the community. I'm working on HS; I've said so clearly and say it again now. To my knowledge, no other Postgres project has committed to a timetable for delivery, so I'm not clear why you think one should have been given here, or why the absence of such a timetable implies anything. Dev tree only opened again about a month ago, the dates of which were not published in advance, so no detailed planning was possible for people contributing to the beta and release of 8.4. Want an HS Timetable? 
Well, I will try to complete it for next commit-fest, but there may be issues that mean it comes in the next one after that. So Sept 15, or maybe Nov 15. My understanding is that community wants quality and so that is my #1 priority. I'll make code available on Sept 15, so that we either have a WIP patch or a request-for-commit patch, not sure which it will be. > I understand that your #1 priority needs to be the work for > which you get paid the most money, but I think it's unfair to ask > other people to wait for you to work on something when you haven't > committed to a timetable for working on it. It might be unfair to ask > it even if you had committed to a timetable and that timetable was > well out in the future, but it's certainly unfair when there is no > timetable at all. I'm not embarrassed by discussing money but that doesn't make it my personal priority. I'm sure you didn't mean to imply I was mercenary. I've contributed to the community for years, mostly unpaid. Which means I do at some point have to take work that pays. If I have ever got paid for working on Postgres, it has always been at a much lower rate than I would have otherwise received, so if anything I've lost money by working on Postgres. My choice. I parted with EDB specifically to allow me to spend more time working on software for Postgres, which would otherwise have certainly been denied me. My choice, nobody else's and one that has worked well for me. I've chosen contribution to this community over money many, many times. The current team will continue working on HS; assistance from any and every other hacker is welcome in producing that. Not all effort is productive teamwork, however, and I encourage anyone that wishes to help on a project prior to patch submission to contact the patch author to discuss that first, to coordinate and avoid wasted effort. People interested in review and test need not make contact, since they'll have access to the code in the normal way. 
-- Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sat, Aug 8, 2009 at 1:12 PM, Bruce Momjian wrote: > You are right you don't have to justify anything, but neither can you > claim ownership of the patch/feature and complain that others are > working on it too. This is a community project --- if you want your > patches to remain your property, I suggest you no longer post them to > our community lists. If you are actively working on patches, I assume > others will not duplicate your work, but if you are idle, others are > encouraged to keep improving the patch. Again, if you don't like that, > then perhaps the community-development process isn't for you. Simon, I think it would also be fair to point out that you keep saying that you're going to deliver this patch for 8.5, but you haven't provided any real timetable as to when you're going to start working on it or when it'll be completed. Because this patch IS so important to the community, people want to know the answers to those questions. That is exactly why you were asked about your schedule at PGcon; and you demurred. I understand that your #1 priority needs to be the work for which you get paid the most money, but I think it's unfair to ask other people to wait for you to work on something when you haven't committed to a timetable for working on it. It might be unfair to ask it even if you had committed to a timetable and that timetable was well out in the future, but it's certainly unfair when there is no timetable at all. The most recent discussion of the timing of this patch was that you opined it should go after Streaming Rep. Based on the review of Streaming Rep this CommitFest, I would say that there is an awful lot of work left to be done to make that patch committable. I think we will be lucky if it makes it into 8.5. Call me a pessimist but I think we'll be doing pretty well if it makes it into 8.6. 
I think the chances that we are going to get streaming rep committed and still have enough CommitFests left to get Hot Standby committed too are just about zero, so waiting for Streaming Rep to be committed first does not seem like a very realistic plan to me. Note that Streaming Rep got moved to returned with feedback on *the first day* of this CommitFest; that's how much work it took to see that it was not committable. Note also that the resistance to committing large patches is going to grow and grow as we get closer to the end of this development cycle. I am very much afraid that if we don't have a version of Hot Standby that is reviewable for the next CommitFest we are going to be out of luck for 8.5. I do not think that I have the juice to make Hot Standby happen. It's possible that I don't know my own strength, but I'm not prepared to bet on it. At least, it looks like I do have the juice to dust off the bitrot, and maybe fix some of the more superficial problems with it. I would like to think that is something helpful. ...Robert
Re: [HACKERS] hot standby - merged up to CVS HEAD
Simon Riggs wrote: > > On Sat, 2009-08-08 at 00:02 -0400, Bruce Momjian wrote: > > > > Also, to my knowledge, nobody has really looked through the results to > > > see if they are any good, so the success of the endeavor remains in > > > doubt from my point of view. That's a bit of a shame because I am > > > interested in putting some more time into this, but I don't have the > > > knowledge or experience to "fly solo" here. > > > > Well, Simon stated that your version should now be used as the most > > recent one, so I would call that a success. > > I'm not sure why you're stirring this up again. > > Simon didn't state that the above. You can re-read my words and we can > debate their meaning, but that's just a waste of time. You stated: - It's going to be very confusing if people submit their own versions of - it. So now we have mine, Heikki's and Robert's. I'd like this to stop - please, have a little faith and a little patience. Presumably Robert's - rebasing patch is best place to start from now for later work. I assume your last sentence is saying exactly that Robert's version should be used as the most current representation of this feature patch. > I shouldn't have to publicly justify why I haven't finished working on a > patch, when a) we have time, b) it's summer and c) I've already said I > would finish the patch, very very clearly in a big loud voice. I expect > to finish and commit comfortably in 2009, leaving many months before > next release. > > So, as I said before, I expect to be left in peace to finish my own > work. There wouldn't be anything to finish if it wasn't for me. I > specifically don't want to review other people's versions of work when > I'm trying to do my own, nor do I expect others to encourage multiple > authors on the same piece of work. The bottom line is that you think you have ownership of the patch and the feature --- you do not. 
You are right you don't have to justify anything, but neither can you claim ownership of the patch/feature and complain that others are working on it too. This is a community project --- if you want your patches to remain your property, I suggest you no longer post them to our community lists. If you are actively working on patches, I assume others will not duplicate your work, but if you are idle, others are encouraged to keep improving the patch. Again, if you don't like that, then perhaps the community-development process isn't for you. And your misunderstanding in this area is exactly why I am bringing this up. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sat, 2009-08-08 at 00:02 -0400, Bruce Momjian wrote: > > Also, to my knowledge, nobody has really looked through the results to > > see if they are any good, so the success of the endeavor remains in > > doubt from my point of view. That's a bit of a shame because I am > > interested in putting some more time into this, but I don't have the > > knowledge or experience to "fly solo" here. > > Well, Simon stated that your version should now be used as the most > recent one, so I would call that a success. I'm not sure why you're stirring this up again. Simon didn't state that the above. You can re-read my words and we can debate their meaning, but that's just a waste of time. I shouldn't have to publicly justify why I haven't finished working on a patch, when a) we have time, b) it's summer and c) I've already said I would finish the patch, very very clearly in a big loud voice. I expect to finish and commit comfortably in 2009, leaving many months before next release. So, as I said before, I expect to be left in peace to finish my own work. There wouldn't be anything to finish if it wasn't for me. I specifically don't want to review other people's versions of work when I'm trying to do my own, nor do I expect others to encourage multiple authors on the same piece of work. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Sat, Aug 8, 2009 at 12:02 AM, Bruce Momjian wrote: > Well, Simon stated that your version should now be used as the most > recent one, so I would call that a success. Fair enough, but it still needs more work. I had some review comments I was hoping to get responses to, in the section beginning with "A few other comments based on a preliminary reading of this patch": http://archives.postgresql.org/pgsql-hackers/2009-07/msg00854.php ...Robert
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Fri, Aug 7, 2009 at 11:33 PM, Bruce Momjian wrote: > Third, Robert, you should have communicated to the list that you were > going to work on the patch, so that there would not be duplicate effort > if someone else was also working on it. As I understood it, Heikki was > in control of the patch, but it doesn't hurt to send out a short email > stating you wanted to work on it now. In this case no one was working > on it, but if someone had been, there would have been duplicate effort > and that is disappointing to everyone. Odds are when you started on the > patch you didn't realize you would be overhauling it, but once that > became clear, an email to hackers, ideally CC'ing the original patch > authors, would have been a good idea. Simon asked me about this offlist as well: I'll repeat the gist of what I said to him here. I really wasn't sure how far I was going to be able to get with this, and I didn't put more work into it before sending it than the amount of time I was willing to waste on it. I figured that I was fairly safe because there had been no activity for 5 months, but if I had been wrong, I was prepared to accept that. I thought it would be a little arrogant to say I was going to work on the patch without having any idea whether I was going to be able to do anything useful with it; since I ended up getting something that may be useful done, it now seems like I should've said something, but that was a little less obvious at the time. At any rate, I am sensitive to the issue and will try to handle it better the next time. Also, to my knowledge, nobody has really looked through the results to see if they are any good, so the success of the endeavor remains in doubt from my point of view. That's a bit of a shame because I am interested in putting some more time into this, but I don't have the knowledge or experience to "fly solo" here. 
...Robert
Re: [HACKERS] hot standby - merged up to CVS HEAD
Robert Haas wrote: > On Fri, Aug 7, 2009 at 11:33 PM, Bruce Momjian wrote: > > Third, Robert, you should have communicated to the list that you were > > going to work on the patch, so that there would not be duplicate effort > > if someone else was also working on it. As I understood it, Heikki was > > in control of the patch, but it doesn't hurt to send out a short email > > stating you wanted to work on it now. In this case no one was working > > on it, but if someone had been, there would have been duplicate effort > > and that is disappointing to everyone. Odds are when you started on the > > patch you didn't realize you would be overhauling it, but once that > > became clear, an email to hackers, ideally CC'ing the original patch > > authors, would have been a good idea. > > Simon asked me about this offlist as well: I'll repeat the gist of > what I said to him here. I really wasn't sure how far I was going to > be able to get with this, and I didn't put more work into it before > sending it than the amount of time I was willing to waste on it. I > figured that I was fairly safe because there had been no activity for > 5 months, but if I had been wrong, I was prepared to accept that. I > thought it would be a little arrogant to say I was going to work on > the patch without having any idea whether I was going to be able to do > anything useful with it; since I ended up getting something that may > be useful done, it now seems like I should've said something, but that > was a little less obvious at the time. At any rate, I am sensitive to > the issue and will try to handle it better the next time. Yea, it is a learning experience. > Also, to my knowledge, nobody has really looked through the results to > see if they are any good, so the success of the endeavor remains in > doubt from my point of view. That's a bit of a shame because I am > interested in putting some more time into this, but I don't have the > knowledge or experience to "fly solo" here. 
Well, Simon stated that your version should now be used as the most recent one, so I would call that a success. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] hot standby - merged up to CVS HEAD
Joshua D. Drake wrote: > On Wed, 2009-07-15 at 17:27 +0100, Simon Riggs wrote: > > On Tue, 2009-07-14 at 21:12 -0400, Robert Haas wrote: > > > > > > It's going to be very confusing if people submit their own versions of > > it. So now we have mine, Heikki's and Robert's. I'd like this to stop > > please, have a little faith and a little patience. Presumably Robert's > > rebasing patch is best place to start from now for later work. > > > > Robert, thank you for your work. > > Simon you need to realize that a lot of people really want this patch. I > for one applaud Robert's work (and Heikki's). If you want a summer > holiday, go for it. I certainly haven't been working that hard this > summer. > > However, I certainly don't think that is any reason for people who are > showing initiation and drive should stop. If Robert wants to work on > this patch, more power to him. Perhaps he can solve something you can't. > Perhaps it will be done before you are done with holiday. If not, then > at least we have moved a little further in the process and in theory > taken some workload off of you. Sorry to be chiming in weeks late on this, but there are some procedural issues I want to address. First, I agree with everything Joshua Drake said, and thank you for chiming in on this. Second, Simon, you seem disappointed that Robert Haas helped with the hot standby patch. By your own admission you weren't working on it, so I would think you would be grateful that someone moved it forward. This is not a question of 'faith' and 'patience', but getting the patch completed. The goal is to get the patch completed, not for a single individual to complete it. Third, Robert, you should have communicated to the list that you were going to work on the patch, so that there would not be duplicate effort if someone else was also working on it. As I understood it, Heikki was in control of the patch, but it doesn't hurt to send out a short email stating you wanted to work on it now. 
In this case no one was working on it, but if someone had been, there would have been duplicate effort and that is disappointing to everyone. Odds are when you started on the patch you didn't realize you would be overhauling it, but once that became clear, an email to hackers, ideally CC'ing the original patch authors, would have been a good idea. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Wed, 2009-07-15 at 17:27 +0100, Simon Riggs wrote: > On Tue, 2009-07-14 at 21:12 -0400, Robert Haas wrote: > > > It's going to be very confusing if people submit their own versions of > it. So now we have mine, Heikki's and Robert's. I'd like this to stop > please, have a little faith and a little patience. Presumably Robert's > rebasing patch is best place to start from now for later work. > Robert, thank you for your work. Simon you need to realize that a lot of people really want this patch. I for one applaud Robert's work (and Heikki's). If you want a summer holiday, go for it. I certainly haven't been working that hard this summer. However, I certainly don't think that is any reason for people who are showing initiation and drive should stop. If Robert wants to work on this patch, more power to him. Perhaps he can solve something you can't. Perhaps it will be done before you are done with holiday. If not, then at least we have moved a little further in the process and in theory taken some workload off of you. Joshua D. Drake -- PostgreSQL - XMPP: jdr...@jabber.postgresql.org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby - merged up to CVS HEAD
On Tue, 2009-07-14 at 21:12 -0400, Robert Haas wrote: > > 1. Downloaded norecoveryprocs-1.patch from > http://archives.postgresql.org/message-id/49a64d73.6090...@enterprisedb.com http://archives.postgresql.org/message-id/4a4dbf8f.8040...@enterprisedb.com I have to confess that I had no idea that the above discussion had taken place. (The title wasn't anything to do with Hot Standby, nor did anyone copy me in; I don't read every email). I've said very clearly that I would work on this for 8.5 [at the developer meeting] and also that it wouldn't be ready for the first commit fest, when asked. I was told recently that someone heard the patch was dead; I've never said that, but I would like a summer holiday. It's going to be very confusing if people submit their own versions of it. So now we have mine, Heikki's and Robert's. I'd like this to stop please, have a little faith and a little patience. Presumably Robert's rebasing patch is best place to start from now for later work. Welcome to add notes here http://wiki.postgresql.org/wiki/Hot_Standby On other points: there was a bug related to subtransaction handling in the initial startup info of Hot Standby. It could not have been committed without that being fixed. In my own recent review, I've noted two design flaw bugs: * AccessExclusiveLocks held at startup are not properly initialised * AccessExclusiveLocks held by prepared transactions are not handled correctly at termination of recovery - the lock owner needs to be transferred or we redesign somehow I'm sure it needs much work yet. Not least of which will be re-checking all of the previous bugs to ensure no regressions in translation. Thanks, -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Tuesday 03 March 2009 03:22:30 Simon Riggs wrote: > On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote: > > On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote: > > > On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote: > > > > > You raised that as an annoyance previously because it means that > > > > > connection in hot standby mode may be delayed in cases of heavy, > > > > > repeated use of significant numbers of subtransactions. > > > > > > > > While most users still don't use explicit subtransactions at all, > > > > wouldn't this also affect users who use large numbers of stored > > > > procedures? > > > > > > If they regularly use more than 64 levels of nested EXCEPTION clauses > > > *and* they start their base backups during heavy usage of those stored > > > procedures, then yes. > > > > We have stored procedures that loop over thousands of records, with > > begin...exception blocks in that loop, so I think we do that. AFAICT > > there's no way to tell if you have it wrong until you fire up the standby > > (i.e. you can't tell at the time you make your base backup), right? > > That was supposed to be a simplification for phase one, not a barrier > for all time. > Understood; I only mention it because it's usually good to know how quickly we run into some of these cases that we don't think will be common. > I'm changing that now, though the effect will be that in some cases we > take longer before we accept connections. The initialisation > requirements are that we have full knowledge of transactions in progress > before we allow snapshots to be taken. > That seems pretty reasonable; hopefully people aren't setting up hot standby machines as an emergency scaling technique :-) -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Mon, 2009-03-02 at 21:11 -0500, Robert Treat wrote: > On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote: > > On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote: > > > > You raised that as an annoyance previously because it means that > > > > connection in hot standby mode may be delayed in cases of heavy, > > > > repeated use of significant numbers of subtransactions. > > > > > > While most users still don't use explicit subtransactions at all, > > > wouldn't this also affect users who use large numbers of stored > > > procedures? > > > > If they regularly use more than 64 levels of nested EXCEPTION clauses > > *and* they start their base backups during heavy usage of those stored > > procedures, then yes. > > > > We have stored procedures that loop over thousands of records, with > begin...exception blocks in that loop, so I think we do that. AFAICT there's > no way to tell if you have it wrong until you fire up the standby (i.e. you > can't tell at the time you make your base backup), right? That was supposed to be a simplification for phase one, not a barrier for all time. I'm changing that now, though the effect will be that in some cases we take longer before we accept connections. The initialisation requirements are that we have full knowledge of transactions in progress before we allow snapshots to be taken. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Wednesday 25 February 2009 16:43:54 Simon Riggs wrote: > On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote: > > > You raised that as an annoyance previously because it means that > > > connection in hot standby mode may be delayed in cases of heavy, > > > repeated use of significant numbers of subtransactions. > > > > While most users still don't use explicit subtransactions at all, > > wouldn't this also affect users who use large numbers of stored > > procedures? > > If they regularly use more than 64 levels of nested EXCEPTION clauses > *and* they start their base backups during heavy usage of those stored > procedures, then yes. > We have stored procedures that loop over thousands of records, with begin...exception blocks in that loop, so I think we do that. AFAICT there's no way to tell if you have it wrong until you fire up the standby (i.e. you can't tell at the time you make your base backup), right? -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com
Re: [HACKERS] Hot Standby - >8.5
Heikki Linnakangas wrote: > As discussed at > http://archives.postgresql.org/message-id/603c8f070902251956s16eee4a7l495d75d3ddccc...@mail.gmail.com, > > it's time to stop pushing hot standby into 8.4, take the time to work > out the remaining details, and schedule it for 8.5. It will be a great > feature when it's released, and in 8.5 we should also have the built-in > replication capability, making it even greater. > > I'll start looking at other patches, to help with getting 8.4 out of the > door. Meanwhile, I'll also continue to review and discuss any updates to > Hot Standby, but at a lower priority. I'm expecting it to be ready for > the first or second commit fest of 8.5, meaning we have the full cycle > left to iron out any bugs or usability issues. Disappointing, but sounds good. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Thu, 2009-02-26 at 20:38 +0200, Heikki Linnakangas wrote:
> > I think we should simply remove the signal handler for SIGQUIT from
> > pg_standby.
>
> If you do this, please make it release dependent so pg_standby behaves
> correctly for the release it is being used with.

Hmm, I don't think there's a way for pg_standby to know which version of PostgreSQL is calling it. Assuming there is, how would you want it to behave? If you want no change in behavior in old releases, can't we just leave it unfixed in back-branches? In fact, it seems more useful to not detect the server version, so that if you do want the new behavior, you can use an 8.4 pg_standby against an 8.3 server.

In back-branches, I think we need to decide between fixing this, at the risk of breaking someone's script that is using "killall -QUIT pg_standby" or similar to trigger failover, and leaving it as it is knowing that immediate shutdown doesn't work on a standby server. I'm not sure which is best.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
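For reference, the default disposition of SIGQUIT on POSIX systems is to terminate the process (normally with a core dump), so a restore helper that does not trap the signal is seen by its parent as killed by a signal. The following is only a minimal sketch of that behavior, not pg_standby's actual code; the names use_default_sigquit and run_child_and_quit are hypothetical and exist purely for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure SIGQUIT has its default action
 * (terminate the process) and is not blocked.
 */
static void
use_default_sigquit(void)
{
    sigset_t set;

    signal(SIGQUIT, SIG_DFL);

    sigemptyset(&set);
    sigaddset(&set, SIGQUIT);
    sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/*
 * Hypothetical demonstration: fork a child that uses the default
 * SIGQUIT action, send it SIGQUIT, and return the wait status seen by
 * the parent (or -1 if fork/waitpid failed).
 */
static int
run_child_and_quit(void)
{
    int   status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: disable core files so the demo leaves nothing behind. */
        struct rlimit no_core = {0, 0};

        setrlimit(RLIMIT_CORE, &no_core);
        use_default_sigquit();
        raise(SIGQUIT);
        _exit(0);               /* reached only if SIGQUIT was not delivered */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return status;
}
```

With the default action in place, the parent would see WIFSIGNALED(status) true and WTERMSIG(status) equal to SIGQUIT, which is what lets a caller distinguish "killed by a signal" from an ordinary non-zero exit.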
Re: [HACKERS] Hot standby, recovery infra
On Thu, 2009-02-26 at 20:38 +0200, Heikki Linnakangas wrote: > I think we should simply remove the signal handler for SIGQUIT from > pg_standby. If you do this, please make it release dependent so pg_standby behaves correctly for the release it is being used with. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
Hi, On Fri, Feb 27, 2009 at 3:38 AM, Heikki Linnakangas wrote: > I think the real problem here is that pg_standby traps SIGQUIT. The startup > process doesn't receive the SIGQUIT because it's in system(), and pg_standby > doesn't propagate it to the startup process either because it traps it. Yes, you are right. > I think we should simply remove the signal handler for SIGQUIT from > pg_standby. +1 > I don't see how that helps, as we already have this in there: > >signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125; > >ereport(signaled ? FATAL : DEBUG2, >(errmsg("could not restore file \"%s\" from archive: return > code %d", >xlogfname, rc))); > > which means we already ereport(FATAL) if the restore command dies with > SIGQUIT. SIGQUIT should kill the process immediately, so I think that the startup process as well as other auxiliary process should call exit(2) instead of ereport(FATAL). Right? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, recovery infra
Fujii Masao wrote:
> On Fri, Jan 30, 2009 at 7:47 PM, Simon Riggs wrote:
> > That whole area was something I was leaving until last, since immediate
> > shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed
> > this before Christmas, briefly).
>
> This problem remains in current HEAD. I mean, immediate shutdown may be
> unable to kill the startup process because system() which executes
> restore_command ignores SIGQUIT while waiting. When I tried immediate
> shutdown during recovery, only the startup process survived. This is
> undesirable behavior, I think.

Yeah, we need to fix that.

> The following code should be added into RestoreArchivedFile()?
>
>     if (WTERMSIG(rc) == SIGQUIT)
>         exit(2);

I don't see how that helps, as we already have this in there:

    signaled = WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;

    ereport(signaled ? FATAL : DEBUG2,
            (errmsg("could not restore file \"%s\" from archive: return code %d",
                    xlogfname, rc)));

which means we already ereport(FATAL) if the restore command dies with SIGQUIT.

I think the real problem here is that pg_standby traps SIGQUIT. The startup process doesn't receive the SIGQUIT because it's in system(), and pg_standby doesn't propagate it to the startup process either because it traps it.

I think we should simply remove the signal handler for SIGQUIT from pg_standby. Or will that lead to core dump by default? In that case, we need pg_standby to exit(128) or similar, so that RestoreArchivedFile understands that the command was killed by a signal.

Another approach is to check that the postmaster is still alive, like we do in walwriter and bgwriter:

    /*
     * Emergency bailout if postmaster has died. This is to avoid the
     * necessity for manual cleanup of all postmaster children.
     */
    if (!PostmasterIsAlive(true))
        exit(1);

However, I'm afraid there's a race condition with that. If we do that right after system(), postmaster might've signaled us but not exited yet. We could check that in the main loop, but if we wrongly interpret the exit of the recovery command as a "file not found - go ahead and start up", the damage might be done by the time we notice that the postmaster is gone.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
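To make the status test quoted in this message concrete, here is a small standalone sketch of how a system() return value can be classified. It only mirrors the expression quoted from RestoreArchivedFile; the function names restore_status_is_signaled and restore_killed_by_sigquit are hypothetical, and none of this is the server's actual code. The second helper is a guarded form of the proposed WTERMSIG(rc) == SIGQUIT check, on the assumption that WTERMSIG() should only be consulted when WIFSIGNALED() is true.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Mirrors the quoted test: the command counts as "signaled" if the child
 * was killed by a signal, or if it exited with a code above 125 (shells
 * conventionally use 126, 127 and 128+N for such failures).
 */
static bool
restore_status_is_signaled(int rc)
{
    return WIFSIGNALED(rc) || WEXITSTATUS(rc) > 125;
}

/*
 * Hypothetical helper: true only if the child was terminated by SIGQUIT.
 * WTERMSIG() is meaningful only when WIFSIGNALED() is true, so that is
 * checked first.
 */
static bool
restore_killed_by_sigquit(int rc)
{
    return WIFSIGNALED(rc) && WTERMSIG(rc) == SIGQUIT;
}
```

Under this classification an ordinary failure such as exit code 1 is treated as "file not available", while exit(128) from the helper, as suggested above, would land on the "signaled" side without the helper actually dying from a signal.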
Re: [HACKERS] Hot standby, recovery infra
Hi, On Fri, Jan 30, 2009 at 7:47 PM, Simon Riggs wrote: > That whole area was something I was leaving until last, since immediate > shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed > this before Christmas, briefly). This problem remains in current HEAD. I mean, immediate shutdown may be unable to kill the startup process because system() which executes restore_command ignores SIGQUIT while waiting. When I tried immediate shutdown during recovery, only the startup process survived. This is undesirable behavior, I think. The following code should be added into RestoreArchivedFile()? if (WTERMSIG(rc) == SIGQUIT) exit(2); Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, recovery procs
On Thu, 2009-02-26 at 12:19 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > On Thu, 2009-02-26 at 11:36 +0200, Heikki Linnakangas wrote: > > > >>> You haven't even given a good reason to make these changes. > >> Simplicity. > > > > You used that argument in January to explain why the coupling should be > > reduced and now the same argument to put it back again. > > That was in reference to the slot ids, I'm not suggesting to put that > back. If anything, removing the need for the the xl_topxid field in WAL > record will further reduce the coupling between master and standby. OK, well, if you feel those changes are necessary prior to commit then I would ask you do that in your public repo and we'll test and provide helpful comments on it from there as quickly as we can. Too many cooks spoil the git. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Thu, 2009-02-26 at 11:36 +0200, Heikki Linnakangas wrote:
> > > You haven't even given a good reason to make these changes.
> > Simplicity.
>
> You used that argument in January to explain why the coupling should be
> reduced and now the same argument to put it back again.

That was in reference to the slot ids; I'm not suggesting to put that back. If anything, removing the need for the xl_topxid field in WAL record will further reduce the coupling between master and standby.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
On Thu, 2009-02-26 at 11:36 +0200, Heikki Linnakangas wrote: > > You haven't even given a good reason to make these changes. > > Simplicity. You used that argument in January to explain why the coupling should be reduced and now the same argument to put it back again. > > We don't have time to make this change and then shake out everything > > else that will break as a result. Are you suggesting that you will make > > these changes and then follow up on all other breakages? Forcing this > > request seems like a great way to cancel this patch, since it will be > > marked as "author refused to make change". > > I'm not suggesting anything to be canceled. I simply think these are > changes that should be made. I wish you could make them, because that > means less work for me. But if you're not willing to, I can pick it up > myself. When you review my code, you make many useful suggestions and I am very thankful. Testing can't find out some of those things. My feeling is that you are now concentrating on things that are optional, yet will have a huge potential for negative impact. If I could please draw your review efforts to other parts of the patch, I would be happy to return to these parts later. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Thu, 2009-02-26 at 10:04 +0200, Heikki Linnakangas wrote:
> > we keep track of which xids have already been "reported" in the WAL
> > (similar to what you had in an earlier version of the patch)
>
> You objected to doing exactly that earlier.

I'm talking about the "xidMarkedInWAL" and "hasUnMarkedSubXids" fields you had in TransactionState, at least still in version hs.v7.20090112_1.tar.bz2 of the patch. I objected to adding the corresponding flags in the WAL header, and that made tracking the status in TransactionState obsolete in the patch too, since it wasn't used for anything anymore. There's nothing wrong per se about tracking the "marked" or "reported" status in master.

> You haven't even given a good reason to make these changes.

Simplicity.

> We don't have time to make this change and then shake out everything
> else that will break as a result. Are you suggesting that you will make
> these changes and then follow up on all other breakages? Forcing this
> request seems like a great way to cancel this patch, since it will be
> marked as "author refused to make change".

I'm not suggesting anything to be canceled. I simply think these are changes that should be made. I wish you could make them, because that means less work for me. But if you're not willing to, I can pick it up myself.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
On Thu, 2009-02-26 at 10:04 +0200, Heikki Linnakangas wrote:

> we keep track of which xids
> have already been "reported" in the WAL (similar to what you had in an
> earlier version of the patch)

You objected to doing exactly that earlier. Why is it OK to do it now
that you are proposing it?

You haven't even given a good reason to make these changes.

We don't have time to make this change and then shake out everything
else that will break as a result. Are you suggesting that you will make
these changes and then follow up on all other breakages? Forcing this
request seems like a great way to cancel this patch, since it will be
marked as "author refused to make change".

You have spotted a problem elsewhere and I am working to fix that now.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Tue, 2009-02-24 at 21:59 +0200, Heikki Linnakangas wrote:
> > > I think if I had not made those into procs you would have said that
> > > they are so similar it would aid code readability to have them be
> > > the same.
> >
> > And in fact I suggested earlier that we get rid of the unobserved xids
> > array, and only use recovery procs. Last week, I think.
>
> Why are these tweaks so important?

Heh, actually, I went searching my mail for when I had suggested that,
and found that in fact I proposed this exact same method of using the
unobserved xids array only back in October:

http://archives.postgresql.org/message-id/48f76342.5070...@enterprisedb.com

I had since forgotten all about it, but now came up with the same idea
again during review.

In the first reply in that thread you said that "The main problem is
fatal errors that don't write abort records. By reusing the PROC entries
we can keep those to a manageable limit". We're not worried about that
anymore.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Tue, 2009-02-24 at 21:59 +0200, Heikki Linnakangas wrote:
> > > What benefit would we gain from separating them, especially since we
> > > now have working, tested code?
> >
> > Simplicity. That matters a lot. Removing the distinction between
> > unobserved xids and already-observed running transactions would slash
> > a lot of code.
>
> It might and it might not, but I don't believe all angles have been
> evaluated. But I would say that major changes such as this have resulted
> in weeks of work. More bugs have been introduced since feature freeze
> than were present beforehand.

Here's a rough sketch of how the transaction tracking could work without
recovery procs, relying on the unobserved xids array only. The name
"unobserved xids" is a complete misnomer now, as the array tracks all
master transactions, and there's no distinction between observed and
unobserved ones.

Another big change in this patch is the way xl_xact_assignment records
work. Instead of issuing one such WAL record for each subtransaction
when they're being assigned recursively, we keep track of which xids
have already been "reported" in the WAL (similar to what you had in an
earlier version of the patch). Whenever you hit the limit of 64
unreported subxids, you issue a single WAL record listing all the
unreported subxids of this top-level transaction, and mark them as
reported. The limit of 64 is chosen arbitrarily, but it should match the
number of slots in the unobserved xids array per backend, to avoid
running out of slots. This eliminates the need for the xl_topxid field
in the WAL record header. I think one WAL record per 64 assigned
subtransactions is a small price to pay, considering that a transaction
with that many subtransactions is probably doing some interesting work
anyway, and the volume of those assignment WAL records is lost in the
noise of all the other WAL records the transaction issues.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
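The batching rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the patch: `SubxidReporter`, `assign_subxid()` and `emit_assignment_record()` are hypothetical names, and the "WAL record" is reduced to a counter so the behaviour can be checked in isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Threshold from the mail; it was described there as chosen arbitrarily. */
#define MAX_UNREPORTED_SUBXIDS 64

/* Hypothetical per-top-level-transaction state kept on the master. */
typedef struct SubxidReporter
{
    TransactionId unreported[MAX_UNREPORTED_SUBXIDS];
    int           nunreported;  /* subxids assigned but not yet in WAL */
    int           nrecords;     /* assignment records emitted so far */
    int           nreported;    /* subxids covered by those records */
} SubxidReporter;

void
reporter_init(SubxidReporter *r)
{
    memset(r, 0, sizeof(*r));
}

/*
 * Stand-in for writing one assignment WAL record that lists every
 * currently unreported subxid, then marking them all as reported.
 */
void
emit_assignment_record(SubxidReporter *r)
{
    r->nrecords++;
    r->nreported += r->nunreported;
    r->nunreported = 0;
}

/*
 * Called when a subtransaction is assigned an xid.
 * Returns true if this assignment triggered a WAL record.
 */
bool
assign_subxid(SubxidReporter *r, TransactionId subxid)
{
    assert(r->nunreported < MAX_UNREPORTED_SUBXIDS);
    r->unreported[r->nunreported++] = subxid;

    if (r->nunreported == MAX_UNREPORTED_SUBXIDS)
    {
        emit_assignment_record(r);
        return true;
    }
    return false;
}
```

With this shape, a transaction that assigns 64 subxids produces exactly one assignment record, and one that assigns fewer produces none until commit or abort, whose records list the subtransactions anyway.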
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Wed, 2009-02-25 at 13:33 -0800, Josh Berkus wrote:

> > You raised that as an annoyance previously because it means that
> > connection in hot standby mode may be delayed in cases of heavy,
> > repeated use of significant numbers of subtransactions.
>
> While most users still don't use explicit subtransactions at all,
> wouldn't this also affect users who use large numbers of stored
> procedures?

If they regularly use more than 64 levels of nested EXCEPTION clauses
*and* they start their base backups during heavy usage of those stored
procedures, then yes.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, running xacts, subtransactions
> You raised that as an annoyance previously because it means that
> connection in hot standby mode may be delayed in cases of heavy,
> repeated use of significant numbers of subtransactions.

While most users still don't use explicit subtransactions at all,
wouldn't this also affect users who use large numbers of stored
procedures?

--Josh Berkus
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Wed, 2009-02-25 at 23:08 +0200, Heikki Linnakangas wrote:

> > That is exactly the reason why we don't treat an overflowed snapshot
> > as a valid starting point.
>
> We don't? I don't see anything stopping it.

In GetRunningTransactionData() we explicitly set latestRunningXid to
InvalidTransactionId if the snapshot is overflowed. That prevents the
snapshot from being used to initialise the recovery procs.

I'll document that better.

You raised that as an annoyance previously because it means that
connection in hot standby mode may be delayed in cases of heavy,
repeated use of significant numbers of subtransactions. My answer was
that there is a way to avoid that but it complicates things and I'm
trying my best to avoid complexity in the first release, yet still have
it work (this decade :-))

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, running xacts, subtransactions
Simon Riggs wrote:
> On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:
> > When we take the snapshot of running transactions in the master, in
> > GetRunningTransactionData(), it only includes top-level xids and those
> > subxids that are in the subxid caches. Overflowed subxids are not
> > included. Isn't that a problem? When the standby initializes the
> > recovery procs using the running xacts information, pg_subtrans isn't
> > set for the overflowed xids, because that information is not included
> > in the WAL record. If you're lucky, the information is there already,
> > but we don't generally guarantee pg_subtrans to survive crash or
> > restart.
>
> That is exactly the reason why we don't treat an overflowed snapshot as
> a valid starting point.

We don't? I don't see anything stopping it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
On Tue, 2009-02-24 at 23:41 +, Simon Riggs wrote:
> On Tue, 2009-02-24 at 22:29 +0200, Heikki Linnakangas wrote:
> > overwrites subxids array, and will resurrect any already aborted
> > subtransaction.
> >
> > Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn
> > of the WAL record we're redoing, so there can't be any procs with an
> > LSN higher than that?
>
> I'm wondering whether we need those circumstances at all.
>
> The main role of ProcArrayUpdateRecoveryTransactions() is two-fold
> * initialise snapshot when there isn't one
> * reduce possibility of FATAL errors that don't write abort records
>
> Neither of those needs us to update the subxid cache, so we'd be better
> off avoiding that altogether in the common case. So we should be able to
> ignore the lsn and race conditions altogether.

We still have a race condition for the initial snapshot, so your concern
still holds. Thanks for highlighting it.

I'm in the middle of rewriting ProcArrayUpdateRecoveryTransactions() to
avoid errors caused by these race conditions. The LSN flag was an
attempt to do that, but was insufficient and has now been removed.

I'll discuss it more when I've got it working. Seems like we need
working code now rather than lengthy debates. I see a solution and
almost have it done.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, running xacts, subtransactions
On Wed, 2009-02-25 at 22:39 +0200, Heikki Linnakangas wrote:

> When we take the snapshot of running transactions in the master, in
> GetRunningTransactionData(), it only includes top-level xids and those
> subxids that are in the subxid caches. Overflowed subxids are not
> included. Isn't that a problem? When the standby initializes the
> recovery procs using the running xacts information, pg_subtrans isn't
> set for the overflowed xids, because that information is not
> included in the WAL record. If you're lucky, the information is there
> already, but we don't generally guarantee pg_subtrans to survive crash
> or restart.

That is exactly the reason why we don't treat an overflowed snapshot as
a valid starting point.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery procs
On Tue, 2009-02-24 at 22:29 +0200, Heikki Linnakangas wrote:

> Oh, right... But we have the same problem with the subtransactions,
> don't we? This block:
>
> >	/*
> >	 * If our state information is later for this proc, then
> >	 * overwrite it. It's possible for a commit and possibly
> >	 * a new transaction record to have arrived in WAL in between
> >	 * us doing GetRunningTransactionData() and grabbing the
> >	 * WALInsertLock, so we mustn't assume we always know best.
> >	 */
> >	if (XLByteLT(proc->lsn, lsn))
> >	{
> >		TransactionId *subxip = (TransactionId *)
> >			&(xlrec->xrun[xlrec->xcnt]);
> >
> >		proc->lsn = lsn;
> >		/* proc->pid stays 0 for Recovery Procs */
> >
> >		proc->subxids.nxids = rxact[xid_index].nsubxids;
> >		proc->subxids.overflowed = rxact[xid_index].overflowed;
> >
> >		memcpy(proc->subxids.xids, subxip,
> >			   rxact[xid_index].nsubxids * sizeof(TransactionId));
> >
> >		/* Remove subtransactions from UnobservedXids also */
> >		if (unobserved)
> >		{
> >			for (index = 0; index < rxact[xid_index].nsubxids; index++)
> >				UnobservedTransactionsRemoveXid(
> >					subxip[index + rxact[xid_index].subx_offset], false);
> >		}
> >	}
>
> overwrites subxids array, and will resurrect any already aborted
> subtransaction.
>
> Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn of
> the WAL record we're redoing, so there can't be any procs with an LSN
> higher than that?

I'm wondering whether we need those circumstances at all.

The main role of ProcArrayUpdateRecoveryTransactions() is two-fold
* initialise snapshot when there isn't one
* reduce possibility of FATAL errors that don't write abort records

Neither of those needs us to update the subxid cache, so we'd be better
off avoiding that altogether in the common case. So we should be able to
ignore the lsn and race conditions altogether.

It might even be more helpful to explicitly separate those twin roles so
the code is clearer.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery procs
On Tue, 2009-02-24 at 21:59 +0200, Heikki Linnakangas wrote:

> > I think if I had not made those into procs you would have said that
> > they are so similar it would aid code readability to have them be the
> > same.
>
> And in fact I suggested earlier that we get rid of the unobserved xids
> array, and only use recovery procs. Last week, I think.

Why are these tweaks so important? Checking pg_subtrans for every call
to XidInMVCCSnapshot will destroy performance, as well you know.

> > What benefit would we gain from separating them, especially since we
> > now have working, tested code?
>
> Simplicity. That matters a lot. Removing the distinction between
> unobserved xids and already-observed running transactions would slash a
> lot of code.

It might and it might not, but I don't believe all angles have been
evaluated. But I would say that major changes such as this have resulted
in weeks of work. More bugs have been introduced since feature freeze
than were present beforehand.

If you want this code to fail, then twisting it in lots of directions
every week is exactly the way to do that. Neither of us will understand
how it works and we'll take more weeks for it to settle down to the
point of reviewability again. We don't have weeks any more.

So far I've made every change you've asked, but there is a reasonable
limit.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Tue, 2009-02-24 at 21:59 +0200, Heikki Linnakangas wrote:
> > We only need the lsn attribute because when we take the snapshot of
> > running xids, we don't write it to the WAL immediately, and a new
> > transaction might begin after that. If we close that gap in the
> > master, we don't need the lsn in recovery procs.
> >
> > Actually, I think the patch doesn't get that right as it stands:
> >
> > 0. Transaction 1 is running in master
> > 1. Get list of running transactions
> > 2. Transaction 1 commits.
> > 3. List of running xacts is written to WAL
> >
> > When the standby replays the xl_running_xacts record, it will create a
> > recovery proc and mark the transaction as running again, even though
> > it has already committed.
>
> No, because we check whether TransactionIdDidCommit().

Oh, right... But we have the same problem with the subtransactions,
don't we? This block:

	/*
	 * If our state information is later for this proc, then
	 * overwrite it. It's possible for a commit and possibly
	 * a new transaction record to have arrived in WAL in between
	 * us doing GetRunningTransactionData() and grabbing the
	 * WALInsertLock, so we mustn't assume we always know best.
	 */
	if (XLByteLT(proc->lsn, lsn))
	{
		TransactionId *subxip = (TransactionId *)
			&(xlrec->xrun[xlrec->xcnt]);

		proc->lsn = lsn;
		/* proc->pid stays 0 for Recovery Procs */

		proc->subxids.nxids = rxact[xid_index].nsubxids;
		proc->subxids.overflowed = rxact[xid_index].overflowed;

		memcpy(proc->subxids.xids, subxip,
			   rxact[xid_index].nsubxids * sizeof(TransactionId));

		/* Remove subtransactions from UnobservedXids also */
		if (unobserved)
		{
			for (index = 0; index < rxact[xid_index].nsubxids; index++)
				UnobservedTransactionsRemoveXid(
					subxip[index + rxact[xid_index].subx_offset], false);
		}
	}

overwrites subxids array, and will resurrect any already aborted
subtransaction.

Isn't XLByteLT(proc->lsn, lsn) always true, because 'lsn' is the lsn of
the WAL record we're redoing, so there can't be any procs with an LSN
higher than that?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
On Tue, 2009-02-24 at 21:59 +0200, Heikki Linnakangas wrote:

> We only need the lsn attribute because when we take the snapshot of
> running xids, we don't write it to the WAL immediately, and a new
> transaction might begin after that. If we close that gap in the
> master, we don't need the lsn in recovery procs.
>
> Actually, I think the patch doesn't get that right as it stands:
>
> 0. Transaction 1 is running in master
> 1. Get list of running transactions
> 2. Transaction 1 commits.
> 3. List of running xacts is written to WAL
>
> When the standby replays the xl_running_xacts record, it will create a
> recovery proc and mark the transaction as running again, even though
> it has already committed.

No, because we check whether TransactionIdDidCommit().

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
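The four-step scenario and the TransactionIdDidCommit() answer can be illustrated with a toy model. Everything here is an assumption made for illustration: the commit log is a plain boolean array, and `toy_did_commit()` and `filter_running_xids()` are hypothetical stand-ins, not the patch's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Toy commit log: toy_clog_committed[xid] is true once xid committed. */
#define TOY_CLOG_SIZE 16
bool toy_clog_committed[TOY_CLOG_SIZE];

/* Stand-in for a commit-status lookup on the standby. */
bool
toy_did_commit(TransactionId xid)
{
    assert(xid < TOY_CLOG_SIZE);
    return toy_clog_committed[xid];
}

/*
 * Standby side: when replaying a running-xacts record, keep only the
 * listed xids that have not already been replayed as committed.
 * Writes the survivors to 'out' and returns how many were kept.
 */
size_t
filter_running_xids(const TransactionId *listed, size_t nlisted,
                    TransactionId *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < nlisted; i++)
    {
        if (!toy_did_commit(listed[i]))
            out[kept++] = listed[i];
    }
    return kept;
}
```

In the scenario above, the commit record of transaction 1 is replayed before the running-xacts record that still lists it, so the status check drops it instead of marking it as running again. The follow-up question in the thread is whether subtransactions get the same protection.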
Re: [HACKERS] Hot standby, recovery procs
Simon Riggs wrote:
> On Tue, 2009-02-24 at 10:40 +0200, Heikki Linnakangas wrote:
> > (back to reviewing the main hot standby patch at last)
> >
> > Why do we need recovery procs? AFAICS the only fields that we use are
> > xid and the subxid cache. Now that we also have the unobserved xids
> > array, why don't we use it to track all transactions in the master,
> > not just the unobserved ones.
>
> We need an array of objects defined in shared memory that has a
> top-level xid and a subxid cache.

Not really. The other transactions, taking snapshots, don't need to
distinguish top-level xids and subxids. That's why the unobserved xids
array works to begin with. We only need a list of running
(sub)transaction ids. Which is exactly what unobservedxids array is.

The startup process can track the parent-child relationships in private
memory if it needs to. But I can't immediately see why it would need to:
commit and abort records list all the subtransactions. To keep the
unobserved xids array bounded, when we find out about a parent-child
relationship, via an xact-assignment record or via the xid and top-level
xid fields in other WAL records, we can simply use SubtransSetParent. To
keep it real simple, we can stipulate that you always check subtrans in
XidInMVCCSnapshot while in hot standby mode.

> That object also needs an lsn attribute. We need code that adds these,
> removes them and adds the data onto snapshots in almost identical ways
> to current procarray code.

We only need the lsn attribute because when we take the snapshot of
running xids, we don't write it to the WAL immediately, and a new
transaction might begin after that. If we close that gap in the master,
we don't need the lsn in recovery procs.

Actually, I think the patch doesn't get that right as it stands:

0. Transaction 1 is running in master
1. Get list of running transactions
2. Transaction 1 commits.
3. List of running xacts is written to WAL

When the standby replays the xl_running_xacts record, it will create a
recovery proc and mark the transaction as running again, even though it
has already committed.

PS. This line in the same function (ProcArrayUpdateRecoveryTransactions)
seems wrong as well:

	memcpy(proc->subxids.xids, subxip,
		   rxact[xid_index].nsubxids * sizeof(TransactionId));

I don't think "subxip" is correct for the 2nd argument.

> I think if I had not made those into procs you would have said that they
> are so similar it would aid code readability to have them be the same.

And in fact I suggested earlier that we get rid of the unobserved xids
array, and only use recovery procs.

> What benefit would we gain from separating them, especially since we now
> have working, tested code?

Simplicity. That matters a lot. Removing the distinction between
unobserved xids and already-observed running transactions would slash a
lot of code.

I appreciate your testing, but it's not like it has gone through years
of usage in the field. This is not the case of "if it ain't broken,
don't fix it". The code that's in the patch is not in production yet,
and now is precisely the right time to get it right, before it goes into
the "if it ain't broke, don't fix it" mode.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery procs
On Tue, 2009-02-24 at 10:40 +0200, Heikki Linnakangas wrote:

> (back to reviewing the main hot standby patch at last)
>
> Why do we need recovery procs? AFAICS the only fields that we use are
> xid and the subxid cache. Now that we also have the unobserved xids
> array, why don't we use it to track all transactions in the master, not
> just the unobserved ones.

We need an array of objects defined in shared memory that has a
top-level xid and a subxid cache. That object also needs an lsn
attribute. We need code that adds these, removes them and adds the data
onto snapshots in almost identical ways to current procarray code.

Those objects live and die completely differently to unobservedxids,
which don't need (nor can they have) the more complex data structure.

I think if I had not made those into procs you would have said that they
are so similar it would aid code readability to have them be the same.

What benefit would we gain from separating them, especially since we now
have working, tested code?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
On Wed, 2009-02-18 at 18:01 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:
> >
> > > The outer "if" should ensure that it isn't printed repeatedly on an
> > > idle system.
> >
> > Regrettably not.
>
> Ok, committed.

Cool.

> I fixed that and some comment changes. I also renamed
> IsRecoveryProcessingMode() to RecoveryInProgress(), to avoid confusion
> with the "real" processing modes defined in miscadmin.h. That will
> probably cause you merge conflicts in the hot standby patch, but it
> should be a matter of search-replace to fix.

Yep, good change, agree with reasons.

> The changes need to be documented. At least the removal of
> log_restartpoints is a clear user-visible change.

Yep.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:
> > The outer "if" should ensure that it isn't printed repeatedly on an
> > idle system.
>
> Regrettably not.

Ok, committed. I fixed that and some comment changes. I also renamed
IsRecoveryProcessingMode() to RecoveryInProgress(), to avoid confusion
with the "real" processing modes defined in miscadmin.h. That will
probably cause you merge conflicts in the hot standby patch, but it
should be a matter of search-replace to fix.

The changes need to be documented. At least the removal of
log_restartpoints is a clear user-visible change.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery infra
On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:

> The outer "if" should ensure that it isn't printed repeatedly on an idle
> system.

Regrettably not.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Mon, 2009-02-09 at 17:13 +0200, Heikki Linnakangas wrote:
> > Attached is an updated patch that does that, and I've fixed all the
> > other outstanding issues I listed earlier as well. Now I'm feeling
> > again that this is in pretty good shape.
>
> UpdateMinRecoveryPoint() issues a DEBUG2 message even when we have not
> updated the control file, leading to log filling behaviour on an idle
> system.
>
> DEBUG: updated min recovery point to ...
>
> We should just tuck the message into the "if" section above it.

The outer "if" should ensure that it isn't printed repeatedly on an idle
system. But I agree it belongs inside the inner if section.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery infra
On Mon, 2009-02-09 at 17:13 +0200, Heikki Linnakangas wrote:

> Attached is an updated patch that does that, and I've fixed all the
> other outstanding issues I listed earlier as well. Now I'm feeling
> again that this is in pretty good shape.

UpdateMinRecoveryPoint() issues a DEBUG2 message even when we have not
updated the control file, leading to log filling behaviour on an idle
system.

DEBUG: updated min recovery point to ...

We should just tuck the message into the "if" section above it.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot Standby: subxid cache changes
Simon Riggs wrote:
> I think my proposal still holds water, but I also think it is probably
> time to say OK, let's make this simpler and take the subxid tuning off
> line.

Agreed.

> We would need to increase the max size of the xip array by
> 2*max_connections. So an increase of 80kB on normal running, which I can
> accept.

You only need the bigger xip array while in hot standby mode. Backends
starting after the recovery is done can use just max_connections. And
you were already allocating a bigger subxip array, so the net effect is
nil.

> Is that the only change you are suggesting to resolve this? I hope so.

Yes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby: subxid cache changes
On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote:

> It occurs to me that we don't need this patch for hot standby if we
> abuse the main xid array (SnapshotData.xip) to store the unobserved xids
> instead of the subxid array. That one is always scanned in
> XidInMVCCSnapshot. I think we should do that rather than try to salvage
> this patch.

I think my proposal still holds water, but I also think it is probably
time to say OK, let's make this simpler and take the subxid tuning off
line.

We would need to increase the max size of the xip array by
2*max_connections. So an increase of 80kB on normal running, which I can
accept.

Is that the only change you are suggesting to resolve this? I hope so.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] Hot Standby: subxid cache changes
On Fri, 2009-02-13 at 10:55 +0200, Heikki Linnakangas wrote:

> > > > The logic is: if there is no lock table entry for that xid *and*
> > > > it is not in progress *and* it is not in pg_subtrans, then it must
> > > > have been an aborted subtransaction of a currently active xact or
> > > > it has otherwise completed.
> > >
> > > Right, we got it right that far. But after the subtransaction has
> > > completed, the question is: what's its parent? That's what the patch
> > > got wrong.
> >
> > We can find that out from procarray, since a subcommitted xid will
> > still be present in the subxid cache of its parent (by definition,
> > otherwise it will be marked in pg_subtrans).
>
> Unless the top transaction just committed. Looking at the other callers
> of SubTransGetParent, I think it would introduce a race condition to
> TransactionIdDidAbort and TransactionIdDidCommit.

I don't see a race condition, but we would need to add a clog recheck if
the xid was not found in procarray in the TransactionIdDid...()
functions. SubTransGetParent() is just a way of getting the next xid to
wait on. If the xid has been removed from procarray, we know we can
recheck clog to find an authoritative answer because clog is always
fully marked before we remove from procarray.

We need three changes:

* in DidCommit/Abort

	if (xidstatus == TRANSACTION_STATUS_SUB_COMMITTED)
	{
		...other code...
		parentXid = SubTransGetParent(transactionId);
		if (!TransactionIdIsValid(parentXid))
			return TransactionIdDidCommit(transactionId);
		return TransactionIdDidCommit(parentXid);
	}

* bottom of XactLockTableWait becomes

	/* we may find xid has completed just before we check */
	xid = SubTransGetParent(xid);
	if (!TransactionIdIsValid(xid))
		break;

* SubTransGetParent() has some extra code to check procarray if
  pg_subtrans returns 0.

It would be better to refactor that.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
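The second of the three changes (stop climbing the parent chain when no parent is found) can be sketched with a toy model of the wait loop. The arrays standing in for pg_subtrans and for the in-progress check, and the function `wait_for_xid()`, are hypothetical; the real function acquires and releases a lock on each xid where this sketch only counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId      ((TransactionId) 0)
#define TransactionIdIsValid(xid) ((xid) != InvalidTransactionId)

/* Toy stand-ins for pg_subtrans and the in-progress check. */
#define TOY_XIDS 8
TransactionId toy_parent[TOY_XIDS];   /* 0 means no parent recorded */
bool toy_in_progress[TOY_XIDS];

/*
 * Wait on xid, then on each ancestor in turn while the xid just waited
 * on is still reported as in progress. Returns the number of waits.
 */
int
wait_for_xid(TransactionId xid)
{
    int waits = 0;

    for (;;)
    {
        assert(TransactionIdIsValid(xid) && xid < TOY_XIDS);

        waits++;    /* stand-in for acquiring and releasing the xid lock */

        if (!toy_in_progress[xid])
            break;

        /* we may find xid has completed just before we check */
        xid = toy_parent[xid];
        if (!TransactionIdIsValid(xid))
            break;
    }
    return waits;
}
```

The added validity test is what lets the loop terminate when the parent lookup returns 0. As the rest of the thread points out, returning without waiting on the real parent moves the problem to the caller, which may then retry in a busy loop.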
Re: [HACKERS] Hot Standby: subxid cache changes
Simon Riggs wrote:
> On Thu, 2009-02-12 at 14:23 +0200, Heikki Linnakangas wrote:
> > Simon Riggs wrote:
> > > On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote:
> > > > So far so good, but what about all the other callers of
> > > > SubTransGetParent()? For example, XactLockTableWait will fail an
> > > > assertion if asked to wait on a subtransaction which is then
> > > > released.
> > >
> > > I agree that it could fail the assertion, though it is clear that
> > > the assertion should now be removed.
> >
> > No, then you just get an infinite loop instead, trying to get the
> > parent of 0 over and over again.
>
> There is no infinite loop. Try it, or read TransactionIdIsInProgress().

I did, my CPU was pegged at 100%. Hmm, attaching with a debugger shows
that it's not looping within XactLockTableWait as I assumed. Instead,
XactLockTableWait returns without waiting on the parent, so we get into
a busy loop in _bt_do_insert, trying to wait on the transaction over and
over again.

> > > The logic is: if there is no lock table entry for that xid *and* it
> > > is not in progress *and* it is not in pg_subtrans, then it must have
> > > been an aborted subtransaction of a currently active xact or it has
> > > otherwise completed.
> >
> > Right, we got it right that far. But after the subtransaction has
> > completed, the question is: what's its parent? That's what the patch
> > got wrong.
>
> We can find that out from procarray, since a subcommitted xid will still
> be present in the subxid cache of its parent (by definition, otherwise
> it will be marked in pg_subtrans).

Unless the top transaction just committed. Looking at the other callers
of SubTransGetParent, I think it would introduce a race condition to
TransactionIdDidAbort and TransactionIdDidCommit.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot Standby: subxid cache changes
On Thu, 2009-02-12 at 14:23 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote: > >> So far so good, but what about all the other callers of > >> SubTransGetParent()? For example, XactLockTableWait will fail an > >> assertion if asked to wait on a subtransaction which is then released. > > > > I agree that it could fail the assertion, though it is clear that the > > assertion should now be removed. > > No, then you just get an infinite loop instead, trying to get the parent > of 0 over and over again. There is no infinite loop. Try it, or read TransactionIdIsInProgress(). > > The logic is: if there is no lock table entry for that xid *and* it is > > not in progress *and* it is not in pg_subtrans, then it must have been > > an aborted subtransaction of a currently active xact or it has otherwise > > completed. > > Right, we got it right that far. But after the subtransaction has > completed, the question is: what's its parent? That's what the patch got > wrong. We can find that out from procarray, since a subcommitted xid will still be present in the subxid cache of its parent (by definition, otherwise it will be marked in pg_subtrans). It will be quicker to fix that than to make other changes. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
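For readers following the procarray idea, here is a minimal sketch in C of looking up a subtransaction's parent by scanning per-backend subxid caches. Everything in it is an assumption made for illustration: `ProcEntry`, `MAX_CACHED_SUBXIDS`, `ParentLookup` and `find_parent_in_subxid_cache` are hypothetical names, not PostgreSQL's actual structures or functions. The `overflowed` flag models a cache that is incomplete, in which case the answer has to come from pg_subtrans instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;               /* simplified transaction id */
#define INVALID_XID ((Xid) 0)
#define MAX_CACHED_SUBXIDS 64       /* hypothetical cache size */

/* Hypothetical per-backend entry: top-level xid plus its cached subxids. */
typedef struct ProcEntry
{
    Xid  top_xid;                   /* INVALID_XID if no transaction is running */
    int  nsubxids;                  /* number of valid entries in subxids[] */
    bool overflowed;                /* true if the cache is incomplete */
    Xid  subxids[MAX_CACHED_SUBXIDS];
} ProcEntry;

typedef enum
{
    PARENT_FOUND,       /* *parent holds the top-level xid */
    PARENT_NOT_FOUND,   /* no running transaction lists this subxid */
    PARENT_UNKNOWN      /* some cache overflowed: must consult pg_subtrans */
} ParentLookup;

/* Scan every running backend's subxid cache for the given subxid. */
ParentLookup
find_parent_in_subxid_cache(const ProcEntry *procs, size_t nprocs,
                            Xid subxid, Xid *parent)
{
    bool any_overflow = false;

    for (size_t i = 0; i < nprocs; i++)
    {
        if (procs[i].top_xid == INVALID_XID)
            continue;               /* backend has no transaction */
        if (procs[i].overflowed)
            any_overflow = true;
        for (int j = 0; j < procs[i].nsubxids; j++)
        {
            if (procs[i].subxids[j] == subxid)
            {
                *parent = procs[i].top_xid;
                return PARENT_FOUND;
            }
        }
    }
    *parent = INVALID_XID;
    return any_overflow ? PARENT_UNKNOWN : PARENT_NOT_FOUND;
}
```

Note that a top-level transaction which has already finished has no entry to scan, so in this sketch its subxids come back as PARENT_NOT_FOUND; the caller still has to decide what that means.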
Re: [HACKERS] Hot Standby: subxid cache changes
Simon Riggs wrote: On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote: So far so good, but what about all the other callers of SubTransGetParent()? For example, XactLockTableWait will fail an assertion if asked to wait on a subtransaction which is then released. I agree that it could fail the assertion, though it is clear that the assertion should now be removed. No, then you just get an infinite loop instead, trying to get the parent of 0 over and over again. The logic is: if there is no lock table entry for that xid *and* it is not in progress *and* it is not in pg_subtrans, then it must have been an aborted subtransaction of a currently active xact or it has otherwise completed. Right, we got it right that far. But after the subtransaction has completed, the question is: what's its parent? That's what the patch got wrong. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Standby: subxid cache changes
On Thu, 2009-02-12 at 09:50 +0200, Heikki Linnakangas wrote: > It occurs to me that we don't need this patch for hot standby if we > abuse the main xid array (SnapshotData.xip) to store the unobserved xids > instead of the subxid array. That one is always scanned in > XidInMVCCSnapshot. I think we should do that rather than try to salvage > this patch. At this stage, yes. > So far so good, but what about all the other callers of > SubTransGetParent()? For example, XactLockTableWait will fail an > assertion if asked to wait on a subtransaction which is then released. I agree that it could fail the assertion, though it is clear that the assertion should now be removed. The logic is: if there is no lock table entry for that xid *and* it is not in progress *and* it is not in pg_subtrans, then it must have been an aborted subtransaction of a currently active xact or it has otherwise completed. I think we can rework the other aspects also. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, recovery infra
On Fri, 2009-02-06 at 10:06 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
>>> - If you perform a fast shutdown while startup process is waiting for the restore command, startup process sometimes throws a FATAL error which escalates into an immediate shutdown. That leads to different messages in the logs, and skipping of the shutdown restartpoint that we now otherwise perform.
>> Sometimes?
> I think what happens is that if the restore command receives the SIGTERM and dies before the startup process that's waiting for the restore command receives the SIGTERM, the startup process throws a FATAL error because the restore command died unexpectedly. I put this
>
>     if (shutdown_requested && InRedo)
>     {
>         /* XXX: Is EndRecPtr always the right value here? */
>         UpdateMinRecoveryPoint(EndRecPtr);
>         proc_exit(0);
>     }
>
> right after the "system(xlogRestoreCmd)" call, to exit gracefully if we were requested to shut down while restore command was running, but it seems that that's not enough because of the race condition.

Can we trap the death of the restorecmd and handle it differently from the death of the startup process?

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
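One way to picture "trapping the death of the restore command" is to decode the wait status returned by system() with the POSIX macros from <sys/wait.h>. The sketch below is only that, a sketch: `RestoreOutcome` and `classify_restore_status` are hypothetical names, and it assumes that a child killed by SIGTERM means a shutdown is under way even if the waiting process has not yet seen its own signal, which is the ordering described in the race above.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical classification of a restore command's fate. */
typedef enum
{
    RESTORE_OK,         /* exited with status 0 */
    RESTORE_FAILED,     /* exited non-zero, e.g. file not in the archive */
    RESTORE_SHUTDOWN,   /* killed by SIGTERM: assume a shutdown is in progress */
    RESTORE_CRASHED     /* could not be run, or killed by another signal */
} RestoreOutcome;

/* rc is the value returned by system(). */
RestoreOutcome
classify_restore_status(int rc)
{
    if (rc == -1)
        return RESTORE_CRASHED;     /* system() itself failed */
    if (WIFEXITED(rc))
        return WEXITSTATUS(rc) == 0 ? RESTORE_OK : RESTORE_FAILED;
    if (WIFSIGNALED(rc) && WTERMSIG(rc) == SIGTERM)
        return RESTORE_SHUTDOWN;    /* does not depend on our own flag */
    return RESTORE_CRASHED;
}
```

A caller would switch on the result right after the system() call and exit quietly on RESTORE_SHUTDOWN instead of raising a FATAL error. Because the decision uses only the child's status, it does not matter which process received its SIGTERM first.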
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
>> - If you perform a fast shutdown while startup process is waiting for the restore command, startup process sometimes throws a FATAL error which escalates into an immediate shutdown. That leads to different messages in the logs, and skipping of the shutdown restartpoint that we now otherwise perform.
> Sometimes?

I think what happens is that if the restore command receives the SIGTERM and dies before the startup process that's waiting for the restore command receives the SIGTERM, the startup process throws a FATAL error because the restore command died unexpectedly. I put this

    if (shutdown_requested && InRedo)
    {
        /* XXX: Is EndRecPtr always the right value here? */
        UpdateMinRecoveryPoint(EndRecPtr);
        proc_exit(0);
    }

right after the "system(xlogRestoreCmd)" call, to exit gracefully if we were requested to shut down while restore command was running, but it seems that that's not enough because of the race condition.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Hot standby, recovery infra
On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
> - If bgwriter is performing a restartpoint when recovery ends, the startup checkpoint will be queued up behind the restartpoint. And since it uses the same smoothing logic as checkpoints, it can take quite some time for that to finish. The original patch had some code to hurry up the restartpoint by signaling the bgwriter if LWLockConditionalAcquire(CheckPointLock) fails, but there's a race condition with that if a restartpoint starts right after that check. We could let the bgwriter do the checkpoint too, and wait for it, but bgwriter might not be running yet, and we'd have to allow bgwriter to write WAL while disallowing it for all other processes, which seems quite complex. Seems like we need something like the LWLockConditionalAcquire approach, but built into CreateCheckPoint to eliminate the race condition

Seems straightforward? Hold the lock longer.

> - If you perform a fast shutdown while startup process is waiting for the restore command, startup process sometimes throws a FATAL error which escalates into an immediate shutdown. That leads to different messages in the logs, and skipping of the shutdown restartpoint that we now otherwise perform.

Sometimes?

> - It's not clear to me if the rest of the xlog flushing related functions, XLogBackgroundFlush, XLogNeedsFlush and XLogAsyncCommitFlush, need to work during recovery, and what they should do.

XLogNeedsFlush should always return false when InRecoveryProcessingMode(). The WAL is already in the WAL files, not in wal_buffers anymore.

XLogAsyncCommitFlush should contain Assert(!InRecoveryProcessingMode()) since it is called during a VACUUM FULL only.

XLogBackgroundFlush should never be called during recovery because the WALWriter is never active in recovery. That should just be documented.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
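The proposed behaviour of the flush helpers during recovery can be modelled in a few lines of C. This is a simplified, hypothetical model and not the real xlog.c code: `WalState`, `xlog_needs_flush` and `xlog_async_commit_flush` are made-up stand-ins, and the `in_recovery` field stands in for InRecoveryProcessingMode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;       /* simplified stand-in for a WAL position */

/* Hypothetical, simplified state; not PostgreSQL's actual globals. */
typedef struct WalState
{
    bool in_recovery;       /* stand-in for InRecoveryProcessingMode() */
    Lsn  flushed_upto;      /* WAL known to be on disk */
} WalState;

/* During recovery the WAL is already in the WAL files, so never ask for a flush. */
bool
xlog_needs_flush(const WalState *ws, Lsn record)
{
    if (ws->in_recovery)
        return false;
    return record > ws->flushed_upto;
}

/* Must not be reached during recovery; the assertion documents that. */
void
xlog_async_commit_flush(WalState *ws, Lsn async_commit_lsn)
{
    assert(!ws->in_recovery);
    if (async_commit_lsn > ws->flushed_upto)
        ws->flushed_upto = async_commit_lsn;    /* pretend we flushed */
}
```

The third function, the background flush, has no counterpart here because the suggestion is only to document that it is never called while the WAL writer is inactive.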
Re: [HACKERS] Hot standby, recovery infra
On Thu, 2009-02-05 at 14:18 +0200, Heikki Linnakangas wrote:
> when the control file is updated in XLogFlush, it's typically the bgwriter doing it as it cleans buffers ahead of the clock hand, not the startup process

That is the key point. Let's do it your way.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
>>>> Simon Riggs wrote:
>>>>> So we might end up flushing more often *and* we will be doing it potentially in the code path of other users.
>>>> For example, imagine a database that fits completely in shared buffers. If we update at every XLogFileRead, we have to fsync every 16MB of WAL. If we update in XLogFlush the way I described, you only need to update when we flush a page from the buffer cache, which will only happen at restartpoints. That's far fewer updates.
>>> Oh, did you change the bgwriter so it doesn't do normal page cleaning?
>> No. Ok, that wasn't completely accurate. The page cleaning by bgwriter will perform XLogFlushes, but that should be pretty insignificant. When there's little page replacement going on, bgwriter will do a small trickle of page cleaning, which won't matter much.
> Yes, that case is good, but it wasn't the use case we're trying to speed up by having the bgwriter active during recovery. We're worried about I/O bound recoveries.

Ok, let's do the math:

By updating minRecoveryPoint in XLogFileRead, you're fsyncing the control file once every 16MB of WAL.

By updating in XLogFlush, the frequency depends on the amount of shared_buffers available to buffer the modified pages, the average WAL record size, and the cache hit ratio. Let's determine the worst case:

The smallest WAL record that dirties a page is a heap deletion record. That contains just enough information to locate the tuple. If I'm reading the headers right, that record is 48 bytes long (28 bytes of xlog header + 18 bytes of payload + padding). Assuming that the WAL is full of just those records, that there are no full page images, and that the cache hit ratio is 0%, we will need (16 MB / 48 B) * 8 kB = 2730 MB of shared_buffers to match the rate of one fsync per 16 MB of WAL.

So if you have a shared_buffers setting lower than 2.7 GB, you can have more frequent fsyncs this way in the worst case. If you think of the typical case, you're probably not doing all deletes, and you have a non-zero cache hit ratio, so you achieve the same frequency with a much lower shared_buffers setting. And if you're really that I/O bound, I doubt the few extra fsyncs matter much.

Also note that when the control file is updated in XLogFlush, it's typically the bgwriter doing it as it cleans buffers ahead of the clock hand, not the startup process.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
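The worst-case arithmetic can be checked with a few lines of C. The figures come from the message (16 MB segments, 8 kB pages, a 28-byte header plus 18-byte payload); the 8-byte alignment and all function and macro names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define WAL_SEGMENT_BYTES (UINT64_C(16) * 1024 * 1024)  /* 16 MB of WAL */
#define PAGE_BYTES        UINT64_C(8192)                /* 8 kB buffer page */
#define ALIGNMENT         UINT64_C(8)                   /* assumed record alignment */

/* Round a record length up to the next multiple of ALIGNMENT. */
uint64_t
align_up(uint64_t len)
{
    return (len + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
}

/* Smallest page-dirtying record from the message: 28-byte header + 18-byte payload. */
uint64_t
worst_case_record_bytes(void)
{
    return align_up(28 + 18);
}

/*
 * Bytes of shared_buffers needed so that one 16 MB segment of such records,
 * each dirtying a different page (0% cache hit ratio), fits before any page
 * has to be flushed.
 */
uint64_t
breakeven_shared_buffers_bytes(uint64_t record_bytes)
{
    uint64_t records_per_segment = WAL_SEGMENT_BYTES / record_bytes;

    return records_per_segment * PAGE_BYTES;
}
```

Calling breakeven_shared_buffers_bytes(worst_case_record_bytes()) and dividing by 1024 * 1024 reproduces the roughly 2730 MB figure; larger records or a non-zero hit ratio lower it.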
Re: [HACKERS] Hot standby, recovery infra
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
>>> Simon Riggs wrote:
>>>> So we might end up flushing more often *and* we will be doing it potentially in the code path of other users.
>>> For example, imagine a database that fits completely in shared buffers. If we update at every XLogFileRead, we have to fsync every 16MB of WAL. If we update in XLogFlush the way I described, you only need to update when we flush a page from the buffer cache, which will only happen at restartpoints. That's far fewer updates.
>> Oh, did you change the bgwriter so it doesn't do normal page cleaning?
> No. Ok, that wasn't completely accurate. The page cleaning by bgwriter will perform XLogFlushes, but that should be pretty insignificant. When there's little page replacement going on, bgwriter will do a small trickle of page cleaning, which won't matter much.

Yes, that case is good, but it wasn't the use case we're trying to speed up by having the bgwriter active during recovery. We're worried about I/O bound recoveries.

> If there's more page replacement going on, bgwriter is cleaning up pages that will soon be replaced, so it's just offsetting work from other backends (or the startup process in this case).

Which only needs to be done if we follow this route, so is not an argument in favour. Using more I/O in I/O bound cases makes the worst case even worse.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Re: [HACKERS] Hot standby, recovery infra
Simon Riggs wrote:
> On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> So we might end up flushing more often *and* we will be doing it potentially in the code path of other users.
>> For example, imagine a database that fits completely in shared buffers. If we update at every XLogFileRead, we have to fsync every 16MB of WAL. If we update in XLogFlush the way I described, you only need to update when we flush a page from the buffer cache, which will only happen at restartpoints. That's far fewer updates.
> Oh, did you change the bgwriter so it doesn't do normal page cleaning?

No. Ok, that wasn't completely accurate. The page cleaning by bgwriter will perform XLogFlushes, but that should be pretty insignificant. When there's little page replacement going on, bgwriter will do a small trickle of page cleaning, which won't matter much. If there's more page replacement going on, bgwriter is cleaning up pages that will soon be replaced, so it's just offsetting work from other backends (or the startup process in this case).

>> Expanding that example to a database that doesn't fit in cache, you're still replacing pages from the buffer cache that have been untouched for longest. Such pages will have an old LSN, too, so we shouldn't need to update very often.
> They will tend to be written in ascending LSN order which will mean we continually update the control file. Anything out of order does skip a write. The better the cache is at finding LRU blocks, the more writes we will make.

When minRecoveryPoint is updated, it's not updated to just the LSN that's being flushed. It's updated to the recptr of the most recently read WAL record. That's an important point to avoid that behavior. Just like XLogFlush normally always flushes all of the outstanding WAL, not just up to the requested LSN.

>> I'd like to have the extra protection that this approach gives. If we let safeStartPoint to be ahead of the actual WAL we've replayed, we have to just assume we're fine if we reach end of WAL before reaching that point. That assumption falls down if e.g. recovery is stopped, and you go and remove the last few WAL segments from the archive before restarting it, or signal pg_standby to trigger failover too early. Tracking the real safe starting point and enforcing it always protects you from that.
> Doing it this way will require you to remove existing specific error messages about ending before end time of backup, to be replaced by more general ones that say "consistency not reached" which is harder to figure out what to do about it.

Yeah. If that's an important distinction, we could still save the original backup stop location somewhere, just so that we can give the old error message when we've not passed that location. But perhaps a message like "WAL ends before reaching a consistent state" with a hint "Make sure you archive all the WAL created during backup" or something would suffice.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
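A small model may help show why advancing minRecoveryPoint to the most recently read record keeps control-file writes rare, and how the end-of-WAL check would use it. This is an illustrative sketch under stated assumptions, not the patch's code: `RecoveryState`, `update_min_recovery_point` and `reached_consistency` are hypothetical, and the counter merely stands in for a control-file fsync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

/* Hypothetical recovery bookkeeping. */
typedef struct RecoveryState
{
    Lsn min_recovery_point;     /* persisted: replay must reach at least this */
    Lsn last_read_rec_end;      /* end of the most recently read WAL record */
    int control_file_writes;    /* counts simulated control-file fsyncs */
} RecoveryState;

/* Called before a page with the given LSN is written out during recovery. */
void
update_min_recovery_point(RecoveryState *rs, Lsn page_lsn)
{
    Lsn target;

    if (page_lsn <= rs->min_recovery_point)
        return;                 /* already covered: no control-file write */

    /* Jump ahead to everything read so far, not just this page's LSN. */
    target = rs->last_read_rec_end > page_lsn ? rs->last_read_rec_end : page_lsn;
    rs->min_recovery_point = target;
    rs->control_file_writes++;
}

/* At end of WAL: false means "WAL ends before reaching a consistent state". */
bool
reached_consistency(const RecoveryState *rs, Lsn end_of_wal)
{
    return end_of_wal >= rs->min_recovery_point;
}
```

In this model, pages flushed in ascending LSN order cost one write for the first page and none for later pages whose LSN is at or below the point already recorded, which is the behaviour the reply relies on.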