On Wed, 6 May 2009, Simon Wilkinson wrote:

> On 5 May 2009, at 13:43, Felix Frank wrote:
>
> > The patches in RT are just variations on the theme of
> > linux-mmap-antirecursion-20081020. They prevent deadlock at the risk of
> > data loss. The fixes in RT solve a cache inconsistency, but data
> > corruption is still possible.
>
> Just trying to clarify where we're at with this problem, as I know that
> there are people who get worried whenever they hear the words "data loss"
> (and I'm one of them!)
>
> My understanding is that one class of problems is solved by fixing
> linux-mmap-antirecursion-20081020 with the latest patch in RT. This solves
> the deadlock, and removes one set of write corruption issues. So far this
> corruption has only been observed with applications that mmap a file,
> close it, and then write to the mmap'd chunk. Does this match with your
> testing?

Yes, that's what the fixes in RT solve.

> Secondly, we have another issue that occurs with mmap when the size of
> the file being mmap'd is larger than the cache size. This has also only
> been observed where an application does mmap, close, write. This problem
> is currently unfixed, but has only been observed with Linux kernels that
> don't have the BDI starvation fixes. Is that a valid summary?

Almost exactly: for this to happen, the file need not even be closed
prior to writing.
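
For reference, the access pattern under discussion can be sketched roughly as below. This is only a minimal illustration, not the actual test program: the function name mmap_write and its close_before_write flag are hypothetical, and on a local filesystem it will simply succeed without showing any corruption.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the real test program): map `len` bytes
 * of `path`, optionally close the descriptor first, then write through
 * the mapping. Returns 0 on success, -1 on any failure. */
int mmap_write(const char *path, size_t len, int close_before_write)
{
    assert(path != NULL);
    if (len == 0)
        return -1;              /* mmap() rejects zero-length mappings */

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by it. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* The mapping stays valid after close(); this gives the
     * "mmap, close, write" ordering. */
    if (close_before_write)
        close(fd);

    memset(map, 'x', len);      /* dirty every page through the mapping */

    int rc = msync(map, len, MS_SYNC);  /* flush dirty pages to the file */
    if (munmap(map, len) != 0)
        rc = -1;

    if (!close_before_write)
        close(fd);
    return rc;
}
```

To exercise the second case, one would presumably point the path at a file in AFS and pick a length larger than the cache size; passing 0 for close_before_write gives the variant where the file stays open during the write.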
This is somewhat embarrassing, but I just had a sudden idea to try another
variation on the test program. I will speak up again after doing more tests,
but this problem seems to be even more of an edge case than was originally
assumed.
Regards
- Felix