Kris Davis <[EMAIL PROTECTED]> writes:
> I am struggling with locking a file in AFS.  Maybe there is something

> Not knowing any better, I suspect that the end of file is being
> determined at "open" time.  This would cause the closure on another
> machine to write data which would subsequently be overwritten by a write
> on the current machine.  Because the close on the other machine could
> occur in the window between the open and the lock, I would never know
> about it.

Precisely correct.

> The reason I suspect the open-lock window is the problem, is because I used to:
> 
>     * open the file,
>     * wait for a successful lock,
>     * append the file,
>     * close the file.
> 
> With this algorithm I ran into the problem more frequently.

Yes, big races here.  You might have been better off without the lock.

> The code below implements the following algorithm
> 
>     * while not locked
>         * open the file,
>         * attempt lock,
>         * if lock failure then close file
>     * append the file,
>     * close the file.
> 
> This seemed to resolve the problem, but on stressing it further the
> problem still occurs.

There is still a race between open and lock, but it's not as big.
Somebody else could get a lock, append, and close in there on you.
I think you have to do something more like:
fd1 = open (READ)       -- can't be RDWR because of AFS consistency semantics
flock(fd1, EXCLUSIVE)   -- so nobody else can write it 
   fd2 = open (RDWR)
   append(fd2, some stuff)
   close(fd2);
flock(fd1, UNLOCK)
close(fd1);

Awful, ain't it?
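
For what it's worth, here is a rough C sketch of that two-descriptor
sequence.  It is only an illustration, not tested on AFS: the helper name
locked_append is made up, it assumes flock() and the LOCK_EX/LOCK_UN
constants are available from <sys/file.h> on your system, it assumes the
file already exists, and the use of O_APPEND on the second open is my
own choice for the "append" step.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append len bytes from buf to an existing file,
 * holding an exclusive flock on a separate read-only descriptor.
 * Returns 0 on success, -1 on any failure. */
int locked_append(const char *path, const void *buf, size_t len)
{
    int fd1, fd2;
    int rc = -1;

    /* Descriptor used only to hold the lock; opened read-only. */
    fd1 = open(path, O_RDONLY);
    if (fd1 < 0)
        return -1;

    /* Blocks until the exclusive lock is granted. */
    if (flock(fd1, LOCK_EX) < 0) {
        close(fd1);
        return -1;
    }

    /* Open for writing only after the lock is held, so this open
     * cannot race with another locker's append and close.
     * O_APPEND is an assumption, not part of the original outline. */
    fd2 = open(path, O_RDWR | O_APPEND);
    if (fd2 >= 0) {
        ssize_t n = write(fd2, buf, len);
        int wrote_all = (n >= 0 && (size_t)n == len);

        /* Check close() too, and do it before unlocking, so the
         * append is finished while the lock is still held. */
        if (close(fd2) == 0 && wrote_all)
            rc = 0;
    }

    flock(fd1, LOCK_UN);
    close(fd1);
    return rc;
}
```

A caller would use it as locked_append("some.log", line, strlen(line))
and treat -1 as "try again or give up".  The point of the ordering is
that the writing descriptor is opened and closed entirely inside the
locked region.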

> 1) what is the proper way to perform file locking?  I would have thought
> that opening and locking a file needed to be an atomic operation.  Why
> is an atomic operation not needed?

It should be atomic.  It's not necessary when the lockers are both on
the same machine, however.

> 2) Is there something unique I need to do in AFS to get a file lock?

Other than this, not really.

> 3) When do the AIX functions I am compiling into my code get intercepted
> so that AFS can handle them in their unique way?  Am I using the right
> functions?  I have looked for the "flock" function, but it does not seem
> to be in any of the AIX headers.

flock is a system call and a vnode operation.  AFS inserts its own
code under the VFS switch, and it gets invoked by the flock system
call.

Lyle            Transarc                707 Grant Street
412 338 4474    The Gulf Tower          Pittsburgh 15219
