On Thu, Jan 16, 2014 at 9:37 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2014-01-16 09:25:51 -0800, Jeff Janes wrote:
> > On Thu, Nov 21, 2013 at 2:43 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> >
> > > On 2013-11-21 14:40:36 -0800, Jeff Janes wrote:
> > > > But if the transaction would not have otherwise generated WAL (i.e. a
> > > > select that did not have to do any HOT pruning, or an update with zero
> > > > rows matching the where condition), doesn't it now have to flush and
> > > > wait when it would otherwise not?
> > >
> > > We short circuit that if there's no xid assigned. Check
> > > RecordTransactionCommit().
> >
> > It looks like that only short-circuits the flush if both there is no xid
> > assigned, and !wrote_xlog. (line 1054 of xact.c)
>
> Hm. Indeed. Why don't we just always use the async commit behaviour for
> that? I don't really see any significant dangers from doing so?

I think the argument is that drawing the next value from a sequence can
generate xlog that needs to be flushed, but doesn't assign an xid.

I would think the sequence should flush that record before it hands out the
value, not before the commit, but...

> It's also rather odd to use the sync rep mechanisms in such
> scenarios... The if() really should test markXidCommitted instead of
> wrote_xlog.
>
> > I do see stalls on fdatasync on flush from select statements which had no
> > xid, but did generate xlog due to HOT pruning, I don't see why WAL logging
> > hint bits would be different.
>
> Are the stalls at commit or while the select is running? If wal_buffers
> is filled too fast, which can easily happen if loads of pages are hinted
> and wal logged, that will happen independently from
> RecordTransactionCommit().

In the real world, I'm not sure what the distribution is.

But in my present test case, they are coming almost exclusively from
RecordTransactionCommit.
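To make the commit-time decision under discussion concrete, here is a rough, simplified model of it. This is a sketch and not the code in xact.c: the enum and both function names are hypothetical, has_xid stands in for markXidCommitted, and other conditions that can force a flush are deliberately left out. The first function follows the behaviour described in this thread (the early exit needs both no xid and !wrote_xlog); the second follows the suggestion to test markXidCommitted instead of wrote_xlog.

```c
#include <stdbool.h>

/* What the commit path does about WAL already written (hypothetical enum). */
typedef enum {
    COMMIT_SKIP,        /* nothing to flush, return right away */
    COMMIT_ASYNC,       /* leave the flush to happen later, do not wait */
    COMMIT_SYNC_FLUSH   /* flush WAL and wait before returning */
} CommitAction;

/*
 * Behaviour as described in the thread: the short circuit applies only when
 * there is no xid AND no xlog was written; otherwise wrote_xlog together
 * with synchronous_commit decides whether the backend waits for the flush.
 */
CommitAction
commit_action_current(bool has_xid, bool wrote_xlog, bool sync_commit)
{
    if (!has_xid && !wrote_xlog)
        return COMMIT_SKIP;
    if (wrote_xlog && sync_commit)
        return COMMIT_SYNC_FLUSH;
    return COMMIT_ASYNC;
}

/*
 * Suggested variant: test has_xid (markXidCommitted) instead of wrote_xlog,
 * so a transaction without an xid never waits for a flush at commit.
 */
CommitAction
commit_action_proposed(bool has_xid, bool wrote_xlog, bool sync_commit)
{
    if (!has_xid && !wrote_xlog)
        return COMMIT_SKIP;
    if (has_xid && sync_commit)
        return COMMIT_SYNC_FLUSH;
    return COMMIT_ASYNC;
}
```

In this model a select that did HOT pruning (no xid, wrote xlog, synchronous_commit on) waits for a flush under the first function and does not under the second. The sequence case mentioned above has the same inputs, which is why it would need its own flush before the value is handed out.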
I use "pgbench -T10" in a loop to generate dirty data and checkpoints (with
synchronous_commit on but with a BBU), and then to probe the consequences I
use:

    pgbench -T10 -S -n --startup='set synchronous_commit='$f

(where --startup is an extension to pgbench proposed a few months ago)

Running the select-only query with synchronous_commit off almost completely
isolates it from the checkpoint drama that otherwise has a massive effect on
it. With synchronous_commit=on, it goes from 6000 tps normally to 30 tps
during the checkpoint sync; with synchronous_commit=off it might dip to 4000
or so during the worst of it.

(To be clear, this is about the pruning, not the logging of the hint bits)

Cheers,

Jeff