On Wed, 15 Apr 2009, Derrick Brashear wrote:
On Wed, Apr 15, 2009 at 5:44 AM, Felix Frank <[email protected]> wrote:
On a hunch, I applied this to 1.4.8:
--- src/afs/LINUX/osi_vm.c.orig 2009-04-15 11:37:49.000000000 +0200
+++ src/afs/LINUX/osi_vm.c 2009-04-15 11:38:56.000000000 +0200
@@ -102,11 +102,6 @@ osi_VM_StoreAllSegments(struct vcache *a
{
struct inode *ip = AFSTOV(avc);
- if (!avc->states & CPageWrite)
- avc->states |= CPageWrite;
- else
- return; /* someone already writing */
-
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,4,5)
/* filemap_fdatasync() only exported in 2.4.5 and above */
ReleaseWriteLock(&avc->lock);
@@ -120,7 +115,6 @@ osi_VM_StoreAllSegments(struct vcache *a
AFS_GLOCK();
ObtainWriteLock(&avc->lock, 121);
#endif
- avc->states &= ~CPageWrite;
}
/* Purge VM for a file when its callback is revoked.
This apparently solved the problem for 1.4.8 w/ disk cache. Will try 1.4.10
as well. BCC'ing openafs-bugs now.
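A side note on the guard removed by this patch: in C, unary `!` binds tighter than binary `&`, so `!avc->states & CPageWrite` parses as `(!avc->states) & CPageWrite`. The sketch below shows the effect in user space. It is only an illustration: DEMO_PAGE_WRITE and both helper names are made up here, and the flag value is an assumption (any value without bit 0 set behaves the same way). Whether this has any bearing on the behaviour discussed in this thread is not established.

```c
#include <stdio.h>

/* Hypothetical flag value, for illustration only. The real CPageWrite
 * definition lives in the OpenAFS headers and is not shown in this thread. */
#define DEMO_PAGE_WRITE 0x00200000u

/* The guard as written in the removed lines, with the implicit
 * parenthesisation made explicit: '!' is applied to states first,
 * giving 0 or 1, and only then is the result masked with the flag. */
static int guard_as_written(unsigned int states)
{
    return ((!states) & DEMO_PAGE_WRITE) != 0;
}

/* The guard as it was presumably meant (an assumption):
 * true when the flag is not yet set in states. */
static int guard_as_intended(unsigned int states)
{
    return !(states & DEMO_PAGE_WRITE);
}

/* Print both results for a given states value. */
static void show_guard(unsigned int states)
{
    printf("states=0x%08x as_written=%d as_intended=%d\n",
           states, guard_as_written(states), guard_as_intended(states));
}
```

Under these assumptions the as-written form is false for every value of states, because the result of `!` never has the flag's bit set, while the parenthesised form is true exactly when the flag is clear. Calling show_guard(0) and show_guard(DEMO_PAGE_WRITE) from a small main() makes the difference visible.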
The problem without that is a deadlock as described in RT 120491,
which means either this or that needs to be solved in another way.
Looking through my local pile of things to deal with, I see Chaskiel
commented thus:
"What's there seems like it will prevent recursion, but in a silly
way. The whole point of calling filemap_fdatawrite
is for the kernel to call writepage() on all the dirty pages. But
since osi_VM_StoreAllSegments always sets CPageWrite and CPageWrite
means writepage always returns WRITEPAGE_ACTIVATE, there's no point.
Wouldn't it be better
for a DoPartialWrite-driven StoreAllSegments to not call
osi_VM_StoreAllSegments (and restore the latter to usefulness)?"
If you wish to/can look, please do, otherwise I will as soon as I can.
I'm stuck. More tests showed that linux-mmap-antirecursion-20081020 leads
to faulty mmap behaviour *in any case* on Linux (disk cache; don't even get
me started on memcache), not only when making changes after closing the
file. As to why, I am puzzled.
I fstraced the run of the attached program with and without antirecursion,
and the traces differ like this:
...
Open
Iwrite
Iupdatepage ... code 99999
Write
Iupdatepage ... code 2
Iwrite
Gn_map
+ Iupdatepage ... code 99999
+ Write
+ Iupdatepage ... code 2
StoreAll
...
The lines marked "+" show up only in the run where antirecursion is
short-circuited. They are the only lines associated with a pid that shows up
as [pdflush] in ps.
What I don't get is why setting CPageWrite prevents
afs_linux_writepage_sync from being called (?), as CPageWrite is checked
inside it, and only after the afs_Trace4(). Iupdatepage with code 99999
should therefore show up even with antirecursion working, as far as I
understand it.
Sorry if I'm being gratuitous in a request for offline advice. Any
pointers, though?
Sincerely
Felix
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    const char *file = "mapped-file.bin";
    char *map = NULL;
    int fd;

    if (argc > 1)
        file = argv[1];
    printf("Using file %s...\n", file);

    /* O_CREAT requires a mode argument; without it the permissions
     * of a newly created file are undefined. */
    fd = open(file, O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror(file);
        return 1;
    }

    /* Make sure the file is non-empty before mapping its first byte. */
    if (write(fd, "1\n", 2) != 2) {
        perror("write");
        close(fd);
        return 1;
    }

    map = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return 1;
    }

    /* Modify the file through the shared mapping only. */
    map[0]++;
    printf("Changed first byte to %d, unmapping and closing...\n", map[0]);

    munmap(map, 1);
    close(fd);
    return 0;
}
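To check from user space whether the change made through the mapping actually reached the file, the first byte can be read back with read() once the program above has exited. A minimal sketch follows; the helper name first_byte is hypothetical and not part of the attached program. Since the program writes "1\n" at offset 0 and then increments the mapped byte, a run whose mapped write reached the file should leave '2' as the first byte, and '1' otherwise.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: return the first byte of the file at path as
 * seen through read(2), or -1 if the file cannot be opened or is empty. */
static int first_byte(const char *path)
{
    unsigned char c;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    n = read(fd, &c, 1);
    close(fd);
    return n == 1 ? (int)c : -1;
}
```

For example, comparing first_byte("mapped-file.bin") against '2' after a run gives a quick pass/fail check. Note that on a cache manager this only shows what the local client sees; checking from a second client would be needed to confirm the data was stored on the server.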