There seems to be a race condition that I hit regularly when running
postmark on a dual 3.06 GHz Xeon blade.
Basically, I run a heavy-duty postmark configuration:
set size 512 10240
set number 20000
set transactions 200000
set subdirectories 200
set read 4096
set write 4096
set buffering false
and add a new branch every 60 seconds (--add new ; --mode old ro).
This is using the latest snapshot, unionfs-20060117-2031.
A kernel task dump (echo t > /proc/sysrq-trigger) gives:
unionctl D C030B520 0 3237 3235 (NOTLB)
ea885ea8 00000086 c26e5a20 c030b520 f8e17d20 f8e0c7a5 f8e0e060 00000297
dfc33000 f71e2300 f782e520 00085578 be7b2551 000000ec c26e5b70 f7616818
ea885ebc c26e5a20 c26e5a20 c024c73d f8e11906 0000015c f761681c f74cfe18
Call Trace:
[<c024c73d>] rwsem_down_write_failed+0x8d/0x170
[<f8ddfbd4>] .text.lock.branchman+0x12a/0x1a6 [unionfs]
[<f8e08334>] unionfs_ioctl+0x2a4/0x3f0 [unionfs]
[<c015ba6e>] do_ioctl+0x6e/0x80
[<c015bc55>] vfs_ioctl+0x65/0x1e0
[<c015be37>] sys_ioctl+0x67/0x90
[<c0102493>] syscall_call+0x7/0xb
i.e. unionctl is waiting to take the read/write semaphore for write so
that it can do the ioctl, implying that something is blocked while
holding it, either for read or for write.
but
postmark D C030B520 0 3210 2796 (NOTLB)
f74cfe04 00000086 f782e520 c030b520 ea8a7ea4 c2655480 0000003a 000021d8
00000000 00000000 00000000 0000025a be7b35ca 000000ec f782e670 f7616818
f74cfe18 f782e520 0000000c c024c5cd f74cfe28 c010f017 f761681c f761681c
Call Trace:
[<c024c5cd>] rwsem_down_read_failed+0x8d/0x170
[<c010f017>] __wake_up_locked+0x27/0x30
[<f8dc303a>] .text.lock.dentry+0x1f/0x1c5 [unionfs]
[<c024c5cd>] rwsem_down_read_failed+0x8d/0x170
[<f8e03282>] unionfs_file_revalidate+0x122/0x2ba0 [unionfs]
[<f8e09d54>] fist_dprint_internal+0x14/0x80 [unionfs]
[<f8e084f3>] unionfs_flush+0x73/0xd2f [unionfs]
[<f8dc3798>] unionfs_write+0x1b8/0x1f0 [unionfs]
[<c014a2bf>] vfs_write+0xff/0x160
[<c0149806>] filp_close+0x76/0x90
[<c0149870>] sys_close+0x50/0x60
[<c0102493>] syscall_call+0x7/0xb
can't get it for read, implying that something is blocked holding it
for write.
However, unless something is blowing up in a way that doesn't make the
kernel complain, I don't see how any write lock can be taken without
being released (write locks are only taken in branchman.c, and the code
there seems pretty simple).
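
One scenario that might be consistent with both traces without any write
lock being held: if the semaphore queues waiters in arrival order, so that
a reader arriving after a waiting writer has to wait behind it, then a
task that already holds the lock for read and asks for it again can get
stuck. The toy model below is only a sketch of that idea. The ToyRwsem
class and the queueing rule are assumptions for illustration, not the
kernel's rwsem code, and nothing in the traces confirms that postmark
already holds the lock for read when it blocks.

```python
from collections import deque


class ToyRwsem:
    """Toy FIFO reader/writer semaphore (illustrative, not kernel code)."""

    def __init__(self):
        self.readers = []       # tasks currently holding the lock for read
        self.writer = None      # task currently holding the lock for write
        self.waiters = deque()  # (task, mode) pairs, in arrival order

    def down_read(self, task):
        # Assumed rule: a reader must queue if a writer holds the lock
        # or if anyone is already waiting.
        if self.writer is None and not self.waiters:
            self.readers.append(task)
            return True
        self.waiters.append((task, "read"))
        return False

    def down_write(self, task):
        # A writer needs the lock completely free.
        if self.writer is None and not self.readers and not self.waiters:
            self.writer = task
            return True
        self.waiters.append((task, "write"))
        return False

    def blocked(self):
        return [f"{task} waiting for {mode}" for task, mode in self.waiters]


def replay():
    sem = ToyRwsem()
    sem.down_read("postmark")   # hypothetical: read lock taken earlier on
    sem.down_write("unionctl")  # branch ioctl queues behind that reader
    sem.down_read("postmark")   # hypothetical nested read queues behind the writer
    return sem


sem = replay()
print(sem.blocked())
print("write holder:", sem.writer)
```

In this model both tasks end up waiting while the write holder is still
None, which would match a blocked down_write in unionctl and a blocked
down_read in postmark. Whether unionfs really takes the read lock twice
on one code path would need to be checked against the source.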