On Thursday, June 7, 2012, Fujii Masao wrote:

> On Thu, Jun 7, 2012 at 6:25 PM, Magnus Hagander <mag...@hagander.net> wrote:
> > On Thursday, June 7, 2012, Fujii Masao wrote:
> >>
> >> On Thu, Jun 7, 2012 at 5:05 AM, Magnus Hagander <mag...@hagander.net> wrote:
> >> > On Wed, Jun 6, 2012 at 8:26 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> >> >> On Tue, Jun 5, 2012 at 11:44 PM, Magnus Hagander <mag...@hagander.net> wrote:
> >> >>> On Tue, Jun 5, 2012 at 4:42 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> >> >>>> On Tue, Jun 5, 2012 at 9:53 PM, Magnus Hagander <mag...@hagander.net> wrote:
> >> >>>>> Right now, pg_receivexlog sets:
> >> >>>>> replymsg->write = InvalidXLogRecPtr;
> >> >>>>> replymsg->flush = InvalidXLogRecPtr;
> >> >>>>> replymsg->apply = InvalidXLogRecPtr;
> >> >>>>>
> >> >>>>> when it sends its status updates.
> >> >>>>>
> >> >>>>> I'm thinking it should set replymsg->write = blockpos instead.
> >> >>>>>
> >> >>>>> Why? That way you can see in pg_stat_replication what has actually
> >> >>>>> been received by pg_receivexlog - not just what we last sent. This can
> >> >>>>> be useful in combination with an archive_command that can block WAL
> >> >>>>> recycling until it has been saved to the standby. And it would be
> >> >>>>> useful as a general monitoring thing as well.
> >> >>>>>
> >> >>>>> I think the original reason was that it shouldn't interfere with
> >> >>>>> synchronous replication - but it does take away a fairly useful
> >> >>>>> usecase...
> >> >>>>
> >> >>>> I think that not only replymsg->write but also ->flush should be set to
> >> >>>> blockpos in pg_receivexlog. Which allows pg_receivexlog to behave
> >> >>>> as synchronous standby, so we can write WAL to both local and remote
> >> >>>> synchronously. I believe there are some use cases for synchronous
> >> >>>> pg_receivexlog.
> >> >>>
> >> >>> pg_receivexlog doesn't currently fsync() after every write. It only
> >> >>> fsync():s complete files. So we'd need to set ->flush only at the end
> >> >>> of a segment, right?
> >> >>
> >> >> Yes.
> >> >>
> >> >> Currently the status update is sent for each status interval. In sync
> >> >> replication, transaction has to wait for a while even after pg_receivexlog
> >> >> has written or flushed the WAL data.
> >> >>
> >> >> So we should add new option which specifies whether pg_receivexlog
> >> >> sends the status packet back as soon as it writes or flushes the WAL
> >> >> data, like the walreceiver does?
> >> >
> >> > That might be useful, but I think that's 9.3 material at this point.
> >>
> >> Fair enough. That's new feature rather than a bugfix.
> >>
> >> > But I think we can get the "set the write location" in as a bugfix.
> >>
> >> Also "set the flush location"? Sending the flush location back seems
> >> helpful when using pg_receivexlog for WAL archiving purpose. By
> >> seeing the flush location we can ensure that WAL file has been archived
> >> durably (IOW, WAL file has been flushed in remote archive area).
> >>
> >
> > You can do that with the write location as well, as long as you round it
>
> You mean to prevent pg_receivexlog from sending back the end of WAL file
> as the write location *before* it completes the WAL file? If so, yes. But
> why do you want to keep the flush location invalid?
>
No. pg_receivexlog sends back the correct write location. Whoever does the
check (through pg_stat_replication) rounds down, so it only counts it once
pg_receivexlog has acknowledged receiving the whole file.

I'm not against doing the flush location as well, I'm just worried about
feature-creep :-) But let's see how big a change that would turn out to be...

//Magnus

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
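
For illustration only, the "round down" check described above could be
sketched roughly as follows. This is not code from pg_receivexlog or from any
patch in this thread. The helper names (parse_lsn, round_down_to_segment) are
hypothetical, the write location is assumed to arrive as the "hi/lo" hex text
shown in pg_stat_replication, and the segment size is assumed to be the
default 16 MB.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the server uses the default 16 MB WAL segment size. */
#define WAL_SEGMENT_SIZE (UINT64_C(16) * 1024 * 1024)

/*
 * Hypothetical helper: parse a location written as "hi/lo" in hex
 * (for example "0/3A7F2C10") into a flat 64-bit byte position.
 * Returns 0 on success, -1 if the text does not match that shape.
 */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;

    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/*
 * Hypothetical helper: round a position down to the start of the
 * segment that contains it. Everything before the returned position
 * lies in segments the receiver has reported as completely written.
 */
static uint64_t
round_down_to_segment(uint64_t lsn, uint64_t seg_size)
{
    assert(seg_size > 0);
    return lsn - (lsn % seg_size);
}
```

With a made-up reported write location of 0/3A7F2C10, parse_lsn followed by
round_down_to_segment(lsn, WAL_SEGMENT_SIZE) gives 0/3A000000, so the checker
would only treat WAL before that boundary as safely received.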