On Mon, 2008-01-21 at 09:21 +0000, [EMAIL PROTECTED] wrote:
> From: Steven Whitehouse <[EMAIL PROTECTED]>
>
> This patch introduces two new log thresholds:
>
> o thresh1 is the point at which we wake up gfs2_logd due to the pinned
>   block count. It is initialised at mount time to 2/5ths of the size
>   of the journal. Currently it does not change during the course of
>   the mount, but the intention is to adjust it automatically based
>   upon various conditions. This automatic adjustment will be the subject
>   of later patches.
>
> o thresh2 is the point at which we wake up gfs2_logd due to the total
>   of pinned blocks and AIL blocks. It is initialised at mount time
>   to 4/5ths of the size of the journal. The reason for not making it
>   equal to 100% of journal size is to give gfs2_logd time to start up
>   and do something before the processes filling the journal end up
>   stalling and waiting on journal flushes.
>
> At the same time, the incore_log_blocks tunable is removed since it
> was wrong (just a basic fixed threshold set to a number plucked out
> of the air) and it was being compared against the wrong thing (the
> amount of metadata in the journal) rather than the total number of
> blocks.
>
> Also, since the free blocks count is now an atomic variable, a
> number of these comparisons now do not need locking, so
> the log lock has been removed around some operations.
>
> This patch also ensures that there are no races when gfs2_logd is
> woken up. It also changes the behaviour of the periodic sync
> so that instead of occurring every 60 secs, it will now
> occur every 30 secs (which can still be set via /sysfs) if
> there have been no other log flushes in the meantime.
>
> When we reserve blocks at the start of a transaction, we now
> use a waitqueue too. This means we can remove the old mutex
> and the fast path through that code is just a couple of atomic
> operations now. Also we no longer do log flushing at this point
> in the code. Instead we wake up gfs2_logd to do it for us (this
> shouldn't happen if the log is large enough and if gfs2_logd is
> properly tuned) and do an exclusive wait.
>
> As a result of these changes, postmark on my test machine runs about
> 20% faster, mainly due to increased efficiency in flushing the
> journal.
>

Steve,

I still think this one is a bad idea. Postmark might be 20% faster, but
every other benchmark we have run shows that it causes significant I/O
delays and poor performance. It also changes the interface between
user space utilities and the filesystem.

Kevin
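
For anyone following the thread, the two thresholds described in the
quoted patch can be sketched roughly as below. This is only an
illustration of the description, not code from the patch: the struct,
the function names, and the use of >= for the comparisons are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the journal state; not the real gfs2 structures. */
struct log_state {
    unsigned int journal_blocks; /* total size of the journal, in blocks */
    unsigned int thresh1;        /* wake threshold for pinned blocks */
    unsigned int thresh2;        /* wake threshold for pinned + AIL blocks */
};

/* Set both thresholds at mount time: 2/5 and 4/5 of the journal size. */
static void log_state_init(struct log_state *ls, unsigned int journal_blocks)
{
    assert(journal_blocks > 0);
    ls->journal_blocks = journal_blocks;
    /* Widen before multiplying so very large journals cannot overflow. */
    ls->thresh1 = (unsigned int)((unsigned long long)journal_blocks * 2 / 5);
    ls->thresh2 = (unsigned int)((unsigned long long)journal_blocks * 4 / 5);
}

/*
 * Decide whether the flushing daemon should be woken up.
 * Assumption: reaching a threshold (>=) is enough to trigger the wakeup.
 */
static bool logd_needs_wakeup(const struct log_state *ls,
                              unsigned int pinned, unsigned int ail)
{
    unsigned long long total = (unsigned long long)pinned + ail;

    return pinned >= ls->thresh1 || total >= ls->thresh2;
}
```

With a 1000-block journal this gives thresh1 = 400 and thresh2 = 800, so
the daemon is woken once 400 blocks are pinned, or once pinned plus AIL
blocks reach 800, leaving the last fifth of the journal as headroom.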