Hi,
On Thu, 8 Oct 1998 18:53:08 +0300 (EEST), Matti Aarnio
<[EMAIL PROTECTED]> said:
>> > Do several parallel writes, and then start 2 or 3 parallel
>> > (f)syncs. If you do one, it completes rather rapidly, but
>> > two in parallel is bad medicine, and three is, well ...
>>
>> Is this sync problem known to the kernel developers? I would expect that
>> it shouldn't be much work to guard the sync with semaphores so that
>> subsequent sync calls are blocked while one sync is running.
> Yes it is known, but it won't be touched before 2.3.
> (Wasn't that the plan, Stephen ?)
Here's a quick-and-dirty 2.1.125 patch which simply makes sync()s
single-threaded. There's one mutex for sync(), and one lock per device
for fsync(). Because those locks are separate, it won't help for
sync()s running in parallel with fsync()s, but I could fix that if it
matters.
Does it help? Does it matter? We're in feature freeze, so I can't take
this to Linus for 2.2 unless we've got good justification to do so: I
need real-world applications where this is hurting. Having said that,
it's pretty easy to serialise, so the patch isn't particularly dangerous
from a correctness point of view, just from a performance one.
Tell me if this works for you. There are more intelligent ways we can
fix the problem: in particular, with a bit more work we could actually
merge concurrent sync()s into a single operation rather than doing the
same work multiple times (a sketch of that idea follows below), but I'd
expect the dirty queue to be small on subsequent syncs, so it's probably
OK. I don't want to make the patch any more complex than necessary for
2.2. Beyond that, fsync() is going to be completely rewritten anyway.
Comments?
--Stephen
----------------------------------------------------------------
--- drivers/block/ll_rw_blk.c.~1~	Tue Aug 18 11:17:32 1998
+++ drivers/block/ll_rw_blk.c	Mon Oct 12 13:23:04 1998
@@ -108,6 +108,14 @@
  */
 int * max_sectors[MAX_BLKDEV] = { NULL, NULL, };
 
+/*
+ * MUTEX locking to prevent concurrent fsync()s to the block devices.
+ * (Concurrent syncs thrash the disk enormously and result in much worse
+ * performance than serial syncs.)
+ */
+char blk_synclock[MAX_BLKDEV] = {0};
+
+
 static inline int get_max_sectors(kdev_t dev)
 {
 	if (!max_sectors[MAJOR(dev)])
--- fs/buffer.c.~1~	Thu Aug 27 11:55:20 1998
+++ fs/buffer.c	Mon Oct 12 13:23:04 1998
@@ -170,6 +170,17 @@
 	int i, retry, pass = 0, err = 0;
 	struct buffer_head * bh, *next;
 
+	static struct semaphore sem = MUTEX;
+	static struct wait_queue * sync_wait;
+
+	if (dev == 0)
+		down(&sem);
+	else {
+		while (blk_synclock[MAJOR(dev)] != 0)
+			sleep_on(&sync_wait);
+		blk_synclock[MAJOR(dev)] = 1;
+	}
+
 	/* One pass for no-wait, three for wait:
 	 * 0) write out all dirty, unlocked buffers;
 	 * 1) write out all dirty buffers, waiting if locked;
@@ -266,6 +277,14 @@
 		 * more buffers on the second pass).
 		 */
 	} while (wait && retry && ++pass<=2);
+
+	if (dev == 0)
+		up(&sem);
+	else {
+		blk_synclock[MAJOR(dev)] = 0;
+		wake_up(&sync_wait);
+	}
+
 	return err;
 }
 
--- include/linux/blkdev.h.~1~	Mon Sep 28 12:14:53 1998
+++ include/linux/blkdev.h	Mon Oct 12 13:23:04 1998
@@ -75,6 +75,8 @@
 extern int * max_sectors[MAX_BLKDEV];
 
+extern char blk_synclock[MAX_BLKDEV];
+
 #define MAX_SECTORS 244		/* 254 ? */
 
 #define PageAlignSize(size)	(((size) + PAGE_SIZE -1) & PAGE_MASK)
 