Re: [OpenAFS-devel] flock Input/output error

Simon Wilkinson Wed, 11 Aug 2010 15:38:12 -0700

On 11 Aug 2010, at 17:21, Simon Wilkinson wrote:
> 
> Once you've applied this, I would be interested to know what error your test 
> now returns ...


I'm still interested in the error code you're seeing, but on further analysis, 
I think I've identified two problems. They're both related to race conditions 
in the way that we enrol AFS locks with the kernel's local lock management 
system (we do this so that the kernel can handle byte-range locks on the local 
machine for us).

The first is that locks and unlocks can race against each other. On a lock we 
do SetAFSLock, SetKernelLock. On unlock we do ReleaseAFSLock, 
ReleaseKernelLock. However, we don't hold any locks on the file whilst we do 
so. Multiple calls to set a lock are safe, as the SetAFSLock serialises them. 
However, a lock and an unlock may race each other. In this case we have:

Process A                 Process B
SetAFSLock
SetKernelLock
....
ReleaseAFSLock
                          SetAFSLock
                          SetKernelLock
ReleaseKernelLock

Process B can't get the kernel lock, despite the fact that it has the AFS lock, 
because process A hasn't released its kernel lock yet. So you get an error 
message.
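
The interleaving above can be replayed as a small single-threaded model. This is only an illustrative sketch in Python, not OpenAFS code: ToyLock and replay_lock_unlock_race are hypothetical names, and the step labels simply reuse the informal names from the diagram (SetAFSLock, SetKernelLock and so on) rather than real function names.

```python
class ToyLock:
    """A toy exclusive lock: at most one holder at a time (hypothetical model)."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, who):
        # Succeeds only if nobody currently holds the lock.
        if self.holder is None:
            self.holder = who
            return True
        return False

    def release(self, who):
        # Only the current holder can release.
        if self.holder == who:
            self.holder = None


def replay_lock_unlock_race():
    """Replay the diagram's interleaving; return (process, step, succeeded) tuples."""
    afs = ToyLock()     # stands in for the AFS lock
    kernel = ToyLock()  # stands in for the kernel's local lock record
    events = []

    # Process A takes both locks.
    events.append(("A", "SetAFSLock", afs.try_acquire("A")))
    events.append(("A", "SetKernelLock", kernel.try_acquire("A")))

    # Process A starts unlocking: the AFS lock goes first.
    afs.release("A")
    events.append(("A", "ReleaseAFSLock", True))

    # Process B runs before A has finished unlocking.
    events.append(("B", "SetAFSLock", afs.try_acquire("B")))
    events.append(("B", "SetKernelLock", kernel.try_acquire("B")))

    # Process A finally drops the kernel lock, too late for B.
    kernel.release("A")
    events.append(("A", "ReleaseKernelLock", True))
    return events


if __name__ == "__main__":
    for process, step, ok in replay_lock_unlock_race():
        print(process, step, "ok" if ok else "FAILED")
```

In this model B's SetAFSLock step succeeds and its SetKernelLock step fails, which is the inconsistent state described above.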

The second problem is a similar race, but related to what happens when we close 
a file handle. We don't actually clean up any of the kernel file locks 
ourselves - instead, we let the kernel do so when it disposes of the file 
descriptor. However, we do release any fileserver locks that we might have. 
Between us releasing the fileserver locks, and the kernel freeing its locks, 
there's an opportunity for another process to gain a fileserver lock, but not a 
local one, and you'll get an error back there.
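
The window on close can be modelled the same way. Again this is a hedged sketch rather than the real code path: replay_close and its b_arrives_in_window parameter are made up for illustration, and the two dictionary entries merely stand in for the fileserver lock and the kernel's local lock.

```python
def replay_close(b_arrives_in_window):
    """Replay process A closing a locked file while process B tries to lock it.

    Returns (b_got_fileserver_lock, b_got_local_lock). Hypothetical model only.
    """
    # Process A starts out holding both locks on the file.
    locks = {"fileserver": "A", "local": "A"}

    def b_tries_to_lock():
        # B asks for the fileserver lock first, then the local one.
        got_server = locks["fileserver"] is None
        if got_server:
            locks["fileserver"] = "B"
        got_local = got_server and locks["local"] is None
        if got_local:
            locks["local"] = "B"
        return got_server, got_local

    # Step 1: on close, A's fileserver lock is released explicitly.
    locks["fileserver"] = None

    if b_arrives_in_window:
        # B runs before the kernel has disposed of A's file descriptor.
        result = b_tries_to_lock()
        locks["local"] = None  # Step 2 happens afterwards, too late for B.
        return result

    # Step 2: the kernel frees A's local locks, and only then does B run.
    locks["local"] = None
    return b_tries_to_lock()


if __name__ == "__main__":
    print("B arrives inside the window:", replay_close(True))
    print("B arrives after cleanup:    ", replay_close(False))
```

When B arrives inside the window it ends up with the fileserver lock but no local lock; when it arrives after the kernel's cleanup it gets both.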

I think that it's the second problem that your test is hitting. Sadly this 
problem is the harder one to fix, as it requires refactoring the way that we 
interface with the Linux lock management code.

Cheers,

Simon.

_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
