Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-19 Thread Erez Zadok
In message [EMAIL PROTECTED], Hugh Dickins writes: On Tue, 13 Nov 2007, Erez Zadok wrote: [...] I'm glad to report that this unionfs, not the one in 2.6.24-rc2-mm1 but the one including those 9 patches you posted, now gets through my testing with tmpfs without a problem. I do still get

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-17 Thread Hugh Dickins
On Tue, 13 Nov 2007, Erez Zadok wrote: I posted all of these patches just now. You're CC'ed. Hopefully Andrew can pull from my unionfs.git branch soon. You also reported in your previous emails some hangs/oopses while doing make -j 20 in unionfs on top of a single tmpfs, using -mm.

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-13 Thread Erez Zadok
In message [EMAIL PROTECTED], Hugh Dickins writes: On Fri, 9 Nov 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Hugh Dickins writes: Three, I believe you need to add a flush_dcache_page(lower_page) after the copy_highpage(lower_page): some architectures will need that to see

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-12 Thread Hugh Dickins
On Fri, 9 Nov 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Hugh Dickins writes: Three, I believe you need to add a flush_dcache_page(lower_page) after the copy_highpage(lower_page): some architectures will need that to see the new data if they have lower_page mapped (though I

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-11 Thread Hugh Dickins
On Fri, 9 Nov 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Hugh Dickins writes: One, I think you would be safer to do a set_page_dirty(lower_page) before your clear_page_dirty_for_io(lower_page). I know that sounds silly, but see Linus' Yes, Virginia comment in

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-08 Thread Erez Zadok
In message [EMAIL PROTECTED], Dave Hansen writes: On Mon, 2007-11-05 at 15:40 +, Hugh Dickins wrote: [...] I have a decent guess what the bug is, too. In the unionfs code: int init_lower_nd(struct nameidata *nd, unsigned int flags) { ... #ifdef ALLOC_LOWER_ND_FILE

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-08 Thread Erez Zadok
In message [EMAIL PROTECTED], Hugh Dickins writes: [Dave, I've Cc'ed you re handle_write_count_underflow, see below.] On Wed, 31 Oct 2007, Erez Zadok wrote: Hi Hugh, I've addressed all of your concerns and am happy to report that the newly revised unionfs_writepage works even better,

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-05 Thread Hugh Dickins
[Dave, I've Cc'ed you re handle_write_count_underflow, see below.] On Wed, 31 Oct 2007, Erez Zadok wrote: Hi Hugh, I've addressed all of your concerns and am happy to report that the newly revised unionfs_writepage works even better, including under my memory-pressure conditions. To

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-05 Thread Dave Hansen
On Mon, 2007-11-05 at 15:40 +, Hugh Dickins wrote: The second problem was a hang: all cpus in handle_write_count_underflow doing lock_and_coalesce_cpu_mnt_writer_counts: new -mm stuff from Dave Hansen. At first I thought that was a locking problem in Dave's code, but I now suspect it's

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-05 Thread Hugh Dickins
On Mon, 5 Nov 2007, Dave Hansen wrote: Actually, I think your s/while/if/ change is probably a decent fix. Any resemblance to a decent fix is purely coincidental. Barring any other races, that loop should always have made progress on mnt-__mnt_writers the way it is written. If we get to:

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-31 Thread Erez Zadok
Hi Hugh, I've addressed all of your concerns and am happy to report that the newly revised unionfs_writepage works even better, including under my memory-pressure conditions. To summarize my changes since the last time: - I'm only masking __GFP_FS, not __GFP_IO - using find_or_create_page to

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-29 Thread Hugh Dickins
On Sun, 28 Oct 2007, Erez Zadok wrote: I took your advise regarding ~(__GFP_FS|__GFP_IO), AOP_WRITEPAGE_ACTIVATE, and such. I revised my unionfs_writepage and unionfs_sync_page, and tested it under memory pressure: I have a couple of live CDs that use tmpfs and can deterministically

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-28 Thread Erez Zadok
Huge, I took your advise regarding ~(__GFP_FS|__GFP_IO), AOP_WRITEPAGE_ACTIVATE, and such. I revised my unionfs_writepage and unionfs_sync_page, and tested it under memory pressure: I have a couple of live CDs that use tmpfs and can deterministically reproduce the conditions resulting in A_W_A.

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-26 Thread Pekka Enberg
Hi Hugh, On 10/25/07, Hugh Dickins [EMAIL PROTECTED] wrote: @@ -349,10 +349,6 @@ static pageout_t pageout(struct page *pa res = mapping-a_ops-writepage(page, wbc); if (res 0) handle_write_error(mapping, page, res); - if

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-26 Thread Pekka Enberg
Hi, On 10/26/07, Neil Brown [EMAIL PROTECTED] wrote: It seems that the new requirement is that if the address_space chooses not to write out the page, it should now call SetPageActive(). If that is the case, I think it should be explicit in the documentation - please? Agreed. - To

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-26 Thread Hugh Dickins
On Fri, 26 Oct 2007, Neil Brown wrote: On Thursday October 25, [EMAIL PROTECTED] wrote: The patch looks like it makes perfect sense to me. Great, thanks a lot for looking at it, Neil and Pekka. Before the change, -writepage could return AOP_WRITEPAGE_ACTIVATE without unlocking the page,

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins
On Mon, 22 Oct 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Hugh Dickins writes: Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE. Both of those set BDI_CAP_NO_WRITEBACK. ramdisk never returned it if !wbc-for_reclaim. I contend that shmem shouldn't either:

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Erez Zadok
That's a nice historical review, Huge, of how got into these mess we're in now -- it all starts with good intentions. :-) On a related note, I would just love to get rid of calling the lower -writepage in unionfs b/c I can't even tell if I have a lower page to use all the time. I'd prefer to

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Hugh Dickins
On Mon, 22 Oct 2007, Erez Zadok wrote: What's the precise semantics of AOP_WRITEPAGE_ACTIVATE? Sigh - not at you, at it! It's a secret that couldn't be kept secret, a hack for tmpfs reclaim, let's just look forward to it going away. Is it considered an error or not? No, it's definitely not

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-25 Thread Neil Brown
On Thursday October 25, [EMAIL PROTECTED] wrote: On Mon, 22 Oct 2007, Pekka Enberg wrote: On 10/22/07, Hugh Dickins [EMAIL PROTECTED] wrote: Only ramdisk and shmem have been returning AOP_WRITEPAGE_ACTIVATE. Both of those set BDI_CAP_NO_WRITEBACK. ramdisk never returned it if

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Hugh Dickins
Sorry for my delay, here are a few replies. On Sun, 14 Oct 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Pekka Enberg writes: However, I don't think the mapping_cap_writeback_dirty() check in __filemap_fdatawrite_range() works as expected when tmpfs is a lower mount for an

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Hugh Dickins
On Sun, 14 Oct 2007, Erez Zadok wrote: In message [EMAIL PROTECTED], Pekka J Enberg writes: Look at mm/filemap.c:__filemap_fdatawrite_range(). You shouldn't be calling unionfs_writepage() _at all_ if the lower mapping has BDI_CAP_NO_WRITEBACK capability set. Perhaps something like the

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Hugh Dickins
On Mon, 15 Oct 2007, Pekka Enberg wrote: I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that -writepage() will never return AOP_WRITEPAGE_ACTIVATE for !wbc-for_reclaim case which would explain why we haven't hit this bug before. Hugh, Andrew? Only ramdisk and shmem have been

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Pekka Enberg
Hi Hugh, On 10/22/07, Hugh Dickins [EMAIL PROTECTED] wrote: I don't disagree with your unionfs_writepages patch, Pekka, but I think it should be viewed as an optimization (don't waste time trying to write a group of pages when we know that nothing will be done) rather than as essential. Ok,

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Pekka Enberg
Hi Hugh, On Mon, 15 Oct 2007, Pekka Enberg wrote: I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that -writepage() will never return AOP_WRITEPAGE_ACTIVATE for !wbc-for_reclaim case which would explain why we haven't hit this bug before. Hugh, Andrew? On 10/22/07, Hugh

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Erez Zadok
In message [EMAIL PROTECTED], Hugh Dickins writes: On Mon, 15 Oct 2007, Pekka Enberg wrote: I wonder whether _not setting_ BDI_CAP_NO_WRITEBACK implies that -writepage() will never return AOP_WRITEPAGE_ACTIVATE for !wbc-for_reclaim case which would explain why we haven't hit this bug

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-22 Thread Erez Zadok
In message [EMAIL PROTECTED], Hugh Dickins writes: Sorry for my delay, here are a few replies. In unionfs_writepage() I tried to emulate as best possible what the lower f/s will have returned to the VFS. Since tmpfs's -writepage can return AOP_WRITEPAGE_ACTIVATE and re-mark its page as

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-16 Thread Erez Zadok
In message [EMAIL PROTECTED], Pekka Enberg writes: Hi, On 10/15/07, Erez Zadok [EMAIL PROTECTED] wrote: Pekka, with a small change to your patch (to handle time-based cache coherency), your patch worked well and passed all my tests. Thanks. So now I wonder if we still need the patch

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-15 Thread Pekka Enberg
Hi, On 10/15/07, Erez Zadok [EMAIL PROTECTED] wrote: Pekka, with a small change to your patch (to handle time-based cache coherency), your patch worked well and passed all my tests. Thanks. So now I wonder if we still need the patch to prevent AOP_WRITEPAGE_ACTIVATE from being returned to

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Hugh Dickins
On Sat, 13 Oct 2007, Pekka Enberg wrote: On 10/12/07, Hugh Dickins [EMAIL PROTECTED] wrote: But I keep suspecting that the answer might be the patch below (which rather follows what drivers/block/rd.c is doing). I'm especially worried that, rather than just AOP_WRITEPAGE_ACTIVATE being

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Pekka Enberg
Hi Hugh, On Sat, 13 Oct 2007, Pekka Enberg wrote: Doesn't msync(2) get to it via mm/page-writeback.c:write_cache_pages() without unionfs even? On 10/14/07, Hugh Dickins [EMAIL PROTECTED] wrote: I believe not. Please do double-check my assertions, I've always found the _writepages paths

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Erez Zadok
In message [EMAIL PROTECTED], Pekka Enberg writes: Hi Hugh, On Sat, 13 Oct 2007, Pekka Enberg wrote: Doesn't msync(2) get to it via mm/page-writeback.c:write_cache_pages() without unionfs even? On 10/14/07, Hugh Dickins [EMAIL PROTECTED] wrote: I believe not. Please do double-check

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Pekka J Enberg
Hi Erez, On Sun, 14 Oct 2007, Erez Zadok wrote: In unionfs_writepage() I tried to emulate as best possible what the lower f/s will have returned to the VFS. Since tmpfs's -writepage can return AOP_WRITEPAGE_ACTIVATE and re-mark its page as dirty, I did the same in unionfs: mark again my page

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-14 Thread Erez Zadok
In message [EMAIL PROTECTED], Pekka J Enberg writes: Hi Erez, On Sun, 14 Oct 2007, Erez Zadok wrote: In unionfs_writepage() I tried to emulate as best possible what the lower f/s will have returned to the VFS. Since tmpfs's -writepage can return AOP_WRITEPAGE_ACTIVATE and re-mark its

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-12 Thread Pekka Enberg
Hi Hugh, On 10/12/07, Hugh Dickins [EMAIL PROTECTED] wrote: But I keep suspecting that the answer might be the patch below (which rather follows what drivers/block/rd.c is doing). I'm especially worried that, rather than just AOP_WRITEPAGE_ACTIVATE being returned to userspace, bad enough in

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-11 Thread Andrew Morton
On Sun, 7 Oct 2007 15:20:19 -0400 Erez Zadok [EMAIL PROTECTED] wrote: According to vfs.txt, -writepage() may return AOP_WRITEPAGE_ACTIVATE back to the VFS/VM. Indeed some filesystems such as tmpfs can return AOP_WRITEPAGE_ACTIVATE; and stackable file systems (e.g., Unionfs) also return

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-11 Thread Ryan Finnie
On 10/11/07, Andrew Morton [EMAIL PROTECTED] wrote: shit. That's a nasty bug. Really userspace should be testing for -1, but the msync() library function should only ever return 0 or -1. Does this fix it? --- a/mm/page-writeback.c~a +++ a/mm/page-writeback.c @@ -850,8 +850,10 @@ retry:

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-11 Thread Hugh Dickins
On Thu, 11 Oct 2007, Ryan Finnie wrote: On 10/11/07, Andrew Morton [EMAIL PROTECTED] wrote: shit. That's a nasty bug. Really userspace should be testing for -1, but the msync() library function should only ever return 0 or -1. Does this fix it? --- a/mm/page-writeback.c~a +++

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-08 Thread Pekka Enberg
Hi Ryan, On 10/8/07, Ryan Finnie [EMAIL PROTECTED] wrote: Doesn't appear to be enough. I can't figure out why (since it appears write_cache_pages bubbles up directly to sys_msync), but with that patch applied, in my test case[1], msync returns -1 EIO. However, with the exact same kernel

msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-07 Thread Erez Zadok
According to vfs.txt, -writepage() may return AOP_WRITEPAGE_ACTIVATE back to the VFS/VM. Indeed some filesystems such as tmpfs can return AOP_WRITEPAGE_ACTIVATE; and stackable file systems (e.g., Unionfs) also return AOP_WRITEPAGE_ACTIVATE if the lower f/s returned it. Anyway, some Ubuntu users

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-07 Thread Pekka J Enberg
Hi Erez, On 10/7/07, Erez Zadok [EMAIL PROTECTED] wrote: Anyway, some Ubuntu users of Unionfs reported that msync(2) sometimes returns AOP_WRITEPAGE_ACTIVATE (decimal 524288) back to userland. Therefore, some user programs fail, esp. if they're written such as this: [snip] On 10/7/07, Erez

Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-10-07 Thread Ryan Finnie
On 10/7/07, Pekka J Enberg [EMAIL PROTECTED] wrote: On 10/7/07, Erez Zadok [EMAIL PROTECTED] wrote: Anyway, some Ubuntu users of Unionfs reported that msync(2) sometimes returns AOP_WRITEPAGE_ACTIVATE (decimal 524288) back to userland. Therefore, some user programs fail, esp. if they're