On Wed, 15 Apr 2009, Derrick Brashear wrote:

On Wed, Apr 15, 2009 at 5:44 AM, Felix Frank wrote:
On a hunch, I applied this to 1.4.8:

--- src/afs/LINUX/osi_vm.c.orig 2009-04-15 11:37:49.000000000 +0200
+++ src/afs/LINUX/osi_vm.c      2009-04-15 11:38:56.000000000 +0200
@@ -102,11 +102,6 @@ osi_VM_StoreAllSegments(struct vcache *a
 {
    struct inode *ip = AFSTOV(avc);

-    if (!avc->states & CPageWrite)
-       avc->states |= CPageWrite;
-    else
-       return; /* someone already writing */
-
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,4,5)
    /* filemap_fdatasync() only exported in 2.4.5 and above */
    ReleaseWriteLock(&avc->lock);
@@ -120,7 +115,6 @@ osi_VM_StoreAllSegments(struct vcache *a
    AFS_GLOCK();
    ObtainWriteLock(&avc->lock, 121);
 #endif
-    avc->states &= ~CPageWrite;
 }

 /* Purge VM for a file when its callback is revoked.


This apparently solved the problem for 1.4.8 w/ disk cache. Will try 1.4.10
as well. BCC'ing openafs-bugs now.

The problem without that is a deadlock as described in RT 120491,
which means either this or that needs to be solved in another way.

Looking through my local pile of things to deal with, I see Chaskiel
commented thus:
"What's there seems like it will prevent recursion, but in a silly
way. The whole point of calling filemap_fdatawrite
is for the kernel to call writepage() on all the dirty pages. But
since osi_VM_StoreAllSegments always sets CPageWrite and CPageWrite
means writepage always returns WRITEPAGE_ACTIVATE, there's no point.
Wouldn't it be better
for a DoPartialWrite-driven StoreAllSegments to not call
osi_VM_StoreAllSegments (and restore the latter to usefulness)?"
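
To make the point of that comment concrete, here is a minimal user-space sketch of the guard pattern it describes. This is not OpenAFS code: demo_vcache, DEMO_PAGEWRITE, demo_writepage and demo_store_all are hypothetical stand-ins, and the flag value is made up. It only assumes what the comment says, namely that the flush path sets the flag and that writepage declines to write while the flag is set.

```c
/* Hypothetical model of the anti-recursion guard; none of these
 * names or values come from the OpenAFS sources. */
enum { DEMO_PAGEWRITE = 0x1 };

struct demo_vcache {
    unsigned int states;   /* flag word, stands in for avc->states */
    int dirty_pages;       /* pages still waiting to be written */
    int trace_count;       /* how often writepage was entered */
    int pages_written;     /* how often writepage actually wrote */
};

/* Stand-in for writepage: the trace fires first, then the flag is
 * checked. Returns 1 if the page was declined, 0 if it was written. */
static int demo_writepage(struct demo_vcache *vc)
{
    vc->trace_count++;                 /* stands in for the trace call */
    if (vc->states & DEMO_PAGEWRITE)
        return 1;                      /* flush in progress: decline */
    vc->pages_written++;
    vc->dirty_pages--;
    return 0;
}

/* Stand-in for the flush: set the flag, ask for every dirty page to
 * be written, clear the flag again. */
static void demo_store_all(struct demo_vcache *vc)
{
    int i, n = vc->dirty_pages;

    vc->states |= DEMO_PAGEWRITE;
    for (i = 0; i < n; i++)
        demo_writepage(vc);            /* always declined here */
    vc->states &= ~DEMO_PAGEWRITE;
}
```

In this model a flush over three dirty pages enters demo_writepage three times and writes nothing, which is the "no point" case from the comment. A direct call to demo_writepage outside the flush does write a page.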

If you wish to/can look, please do, otherwise I will as soon as I can.

I'm stuck. More tests showed that linux-mmap-antirecursion-20081020 leads to faulty mmap behaviour *in every case* on Linux (disk cache, don't even get me started on memcache), not only when changes are made after the file is closed. As to why, this has me puzzled:

I fstraced a run of the attached program with and without antirecursion, and the traces differ like this:

...
Open
Iwrite
Iupdatepage ... code 99999
Write
Iupdatepage ... code 2
Iwrite
Gn_map
+ Iupdatepage ... code 99999
+ Write
+ Iupdatepage ... code 2
StoreAll
...

The lines marked "+" show up only in the run where antirecursion is short-circuited. They are the only lines associated with a pid that shows up as [pdflush] in ps.

What I don't get is why setting CPageWrite prevents afs_linux_writepage_sync from being called (?), since CPageWrite is checked inside that function, and only after the afs_Trace4() call. As far as I understand it, Iupdatepage with code 99999 should therefore show up even with antirecursion working.
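
One thing that may be worth checking here is the test in the hunk quoted above: in C, unary `!` binds tighter than binary `&`, so `!avc->states & CPageWrite` parses as `(!avc->states) & CPageWrite`. The sketch below shows the difference in user space. The flag value 0x200 is an assumption made for illustration (the real CPageWrite definition should be checked in the source), and both function names are hypothetical.

```c
/* Assumed flag value, for illustration only; check the real
 * CPageWrite definition before drawing conclusions. */
#define DEMO_CPAGEWRITE 0x200u

/* The condition as written in the hunk: (!states) & flag.
 * !states is 0 or 1, so the result is nonzero only if the flag
 * happens to be bit 0 and states is 0. */
static int guard_as_written(unsigned int states)
{
    return (!states & DEMO_CPAGEWRITE) != 0;
}

/* The condition as presumably intended: the flag is not yet set. */
static int guard_as_intended(unsigned int states)
{
    return !(states & DEMO_CPAGEWRITE);
}
```

With any flag other than bit 0, guard_as_written() is false for every value of states. If the real code behaves the same way, the else branch would be taken and the function would return before reaching the flush, whether or not anyone is already writing. Whether that explains the missing trace lines is only a guess.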

Sorry if I'm being gratuitous in a request for offline advice. Any pointers, though?

Sincerely
Felix
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    char *file = "mapped-file.bin";
    char *map = NULL;
    int fd;

    if ( argc > 1 )
        file = argv[1];

    printf("Using file %s...\n", file);

    /* O_CREAT requires a mode argument; without it the mode is undefined */
    fd = open(file, O_RDWR | O_CREAT, 0644);
    if ( fd == -1 ) {
        perror(file);
        return 1;
    }

    write(fd, "1\n", 2);

    /* mmap() reports failure with MAP_FAILED, not NULL */
    if ( (map = (char*)mmap(NULL, 1, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0))
                        == (char*) MAP_FAILED ) {
        perror("mmap");
        return 1;
    }

    map[0]++;

    printf("Changed first byte to %u, unmapping and closing...\n", (unsigned)map[0]);

    munmap((void*)map, 1);

    close(fd);

    return 0;
}
