On 04/16/2009 08:25 AM, Felix Frank wrote:
-    if (!avc->states & CPageWrite)

I see a bug there - this line probably wants to be:
    if (!(avc->states & CPageWrite))

So the recursion was avoided by never actually doing anything in StoreAllSegments, since CPageWrite never got set and the condition was always false.
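
To make the precedence issue concrete, here is a minimal sketch of the two forms. EX_CPAGEWRITE is a hypothetical stand-in value, not the real CPageWrite definition from the OpenAFS headers, and the helper names are made up for illustration. The sketch assumes the flag is a single bit other than bit 0, which is what makes the original test always false.

```c
/* Hypothetical flag value for illustration only; the real CPageWrite
 * is defined in the OpenAFS headers. Assumed NOT to be bit 0. */
#define EX_CPAGEWRITE 0x200u

/* Original test. Unary ! binds tighter than &, so the compiler reads
 * "!states & FLAG" as "(!states) & FLAG". (!states) is 0 or 1, and
 * neither value has the 0x200 bit set, so the result is always 0. */
static int buggy_check(unsigned int states)
{
    return ((!states) & EX_CPAGEWRITE) != 0;
}

/* Intended test: true when the flag bit is clear in states. */
static int fixed_check(unsigned int states)
{
    return !(states & EX_CPAGEWRITE);
}
```

With these definitions, buggy_check() returns 0 whether or not the flag is set, while fixed_check() returns 1 only when the flag is clear.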

With the fix above, my larger mmap test quickly runs into a deadlock again. Looks like write_cache_pages is trying to lock the page that is currently being written:

(this is pdflush):
[<ffffffffa0b91d14>] ? crfree+0x38/0x3c [libafs]
[<ffffffff81077f85>] ? getnstimeofday+0x5a/0xae
[<ffffffff810b2b0a>] ? sync_page+0x0/0x45
[<ffffffff8144c905>] schedule+0x9/0x1d
[<ffffffff8144c94c>] io_schedule+0x33/0x44
[<ffffffff810b2b4b>] sync_page+0x41/0x45
[<ffffffff8144cd0e>] __wait_on_bit_lock+0x41/0x8a
[<ffffffff810b2acf>] __lock_page+0x61/0x68
[<ffffffff8107144d>] ? wake_bit_function+0x0/0x2e
[<ffffffff810b863c>] write_cache_pages+0x1dc/0x3b3
[<ffffffff810b804a>] ? __writepage+0x0/0x2f
[<ffffffff810b8832>] generic_writepages+0x1f/0x21
[<ffffffff810b8863>] do_writepages+0x2f/0x37
[<ffffffff810b35e3>] __filemap_fdatawrite_range+0x4b/0x4d
[<ffffffff810b3d90>] filemap_fdatawrite+0x1a/0x1c
[<ffffffffa0b9485c>] osi_VM_StoreAllSegments+0xd7/0x17c [libafs]
[<ffffffffa0b5e000>] afs_StoreAllSegments+0xcb/0x17c7 [libafs]
[<ffffffff810dbc69>] ? __fput+0x17b/0x18a
[<ffffffff81077f85>] ? getnstimeofday+0x5a/0xae
[<ffffffff81077fee>] ? do_gettimeofday+0x15/0x38
[<ffffffffa0b99fdf>] ? afs_icl_Event4+0xfe/0x162 [libafs]
[<ffffffffa0b751ba>] afs_DoPartialWrite+0x55/0x5a [libafs]
[<ffffffffa0b97655>] afs_linux_writepage_sync+0x30f/0x3fc [libafs]
[<ffffffff8122156b>] ? prio_tree_next+0x1c3/0x224
[<ffffffffa0b97838>] afs_linux_writepage+0x8c/0xba [libafs]
[<ffffffff810b805c>] __writepage+0x12/0x2f
[<ffffffff810b8696>] write_cache_pages+0x236/0x3b3
[<ffffffff810b804a>] ? __writepage+0x0/0x2f
[<ffffffff810b8832>] generic_writepages+0x1f/0x21
[<ffffffff810b8863>] do_writepages+0x2f/0x37
[<ffffffff810f403a>] __writeback_single_inode+0x1a1/0x3b9
[<ffffffff81052516>] ? __dequeue_entity+0x2e/0x33
[<ffffffff810f468a>] generic_sync_sb_inodes+0x2a7/0x438

> What I don't get is why setting CPageWrite prevents
> afs_linux_writepage_sync from being called (?), as CPageWrite is checked
> inside it, and only after the afs_Trace4(). Iupdatepage with code 99999
> should therefore even show up with working antirecursion, as far as I
> can understand it.

You probably didn't wait long enough for the other Iupdatepage to show up. munmap() doesn't cause a flush to happen immediately - the dirty pages eventually get written by pdflush, but that can be several seconds later. Without the anti-recursion code, close() causes osi_VM_StoreAllSegments to write out the modified mmapped pages right away.

Marc
