On Thu, 2008-06-26 at 12:12 +0200, Manuel Teira wrote:
> Hello.
> I did some further investigation and testing related to the change in 
> r671604, which dropped the file locking strategy in favour of a flock on 
> the data dir.
> 
> Trying to write similar code, but using lockf, I hit the issue that 
> the file must be opened using O_RDWR or O_WRONLY, and that's not allowed 
> for a directory.
> The same happens with an fcntl call and, unexpectedly, with flock. From 
> the Solaris manual page:
> 
> <snip>
>      Read permission is required on a file  to  obtain  a  shared
>      lock,   and  write  permission  is  required  to  obtain  an
>      exclusive lock.
> </snip>
> 
> But the Linux man page claims:
> 
> <snip>
> A shared or exclusive lock can be placed on a file regardless of the 
> mode in which the file was opened.
> </snip>
> 
> I've searched the web for some BSD system pages, but they don't say 
> anything about the file mode.
> 
> 
> On the other hand, the POSIX fcntl specification says, regarding the 
> causes of failure:
> 
> [EBADF]
>     The /fildes/ argument is not a valid open file descriptor, or the
>     argument /cmd/ is F_SETLK or F_SETLKW, the type of lock, *l_type*,
>     is a shared lock (F_RDLCK), and /fildes/ is not a valid file
>     descriptor open for reading, or the type of lock *l_type*, is an
>     exclusive lock (F_WRLCK), and /fildes/ is not a valid file
>     descriptor open for writing. 
> 
> The POSIX spec also requires write permission for lockf:
> http://www.opengroup.org/onlinepubs/007908799/xsh/lockf.html
> 
> 
> 
> I'm afraid this means Solaris cannot place a lock directly on a 
> directory. Any ideas?


Yes, we can create a lock file in the directory (if it doesn't already
exist) and then use lockf to lock it. There's already code in
Daemon.cpp that does exactly this for the PID file. I switched to flock
because crashed or killed brokers were sometimes leaving the lock file
behind them, whereas a flock (or lockf) lock is automatically released
when the process exits.
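
To illustrate the automatic release, here is a minimal sketch (not the
Daemon.cpp code). It forks a child that takes a lockf lock and is then
killed, and checks the lock from the parent before and after. The names
lockedByOther, observeCrash and Observation are hypothetical and exist
only for this example.

```cpp
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// True if another process currently holds a lockf lock on path.
bool lockedByOther(const std::string& path) {
    int fd = ::open(path.c_str(), O_CREAT | O_RDWR, 0644);
    if (fd < 0) return false;
    bool locked = ::lockf(fd, F_TEST, 0) < 0;  // fails if locked by another process
    ::close(fd);
    return locked;
}

struct Observation { bool heldWhileAlive; bool heldAfterKill; };

// Simulate a crashing broker: a child locks the file and is killed with SIGKILL.
Observation observeCrash(const std::string& path) {
    Observation o = { false, false };
    int ready[2];
    if (::pipe(ready) != 0) return o;
    pid_t child = ::fork();
    if (child < 0) { ::close(ready[0]); ::close(ready[1]); return o; }
    if (child == 0) {
        ::close(ready[0]);
        // lockf needs a descriptor open for writing, so lock a regular file.
        int fd = ::open(path.c_str(), O_CREAT | O_RDWR, 0644);
        if (fd < 0 || ::lockf(fd, F_TLOCK, 0) != 0) ::_exit(1);
        char ok = 1;
        if (::write(ready[1], &ok, 1) != 1) ::_exit(1);  // tell the parent the lock is held
        for (;;) ::pause();                              // wait to be killed
    }
    ::close(ready[1]);
    char c;
    ssize_t n = ::read(ready[0], &c, 1);  // returns 1 once the child holds the lock
    ::close(ready[0]);
    o.heldWhileAlive = (n == 1) && lockedByOther(path);
    ::kill(child, SIGKILL);
    ::waitpid(child, 0, 0);
    o.heldAfterKill = lockedByOther(path);  // the file remains, the lock does not
    return o;
}
```

The lock file itself is still on disk after the child dies, but it no
longer blocks a restart, which is the property we want.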

We need to
 - create a qpid::sys::LockFile class that can be re-implemented on
different platforms.
 - use the Daemon.cpp code as the posix implementation.
 - Replace the locking code in Daemon.cpp and DataDir.cpp with the
common sys::LockFile.
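
As a starting point, the POSIX side could look roughly like the sketch
below. This is only an assumed interface, not existing Qpid code: the
namespace "sketch", the constructor arguments and the path() accessor
are all hypothetical, and the real class may need a different shape to
suit Daemon.cpp and DataDir.cpp.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, stands in for qpid::sys

// Holds a lockf lock on a file for the lifetime of the object.
class LockFile {
  public:
    // Opens (optionally creating) the file and locks it without blocking.
    // Throws std::runtime_error if the file cannot be opened or locked.
    LockFile(const std::string& path, bool create) : path_(path), fd_(-1) {
        int flags = create ? (O_RDWR | O_CREAT) : O_RDWR;
        fd_ = ::open(path_.c_str(), flags, 0644);
        if (fd_ < 0)
            throw std::runtime_error("Cannot open " + path_ + ": " + std::strerror(errno));
        if (::lockf(fd_, F_TLOCK, 0) < 0) {
            std::string msg = "Cannot lock " + path_ + ": " + std::strerror(errno);
            ::close(fd_);
            throw std::runtime_error(msg);
        }
    }

    // Closing the descriptor releases the lock; the file is left in place.
    ~LockFile() { if (fd_ >= 0) ::close(fd_); }

    const std::string& path() const { return path_; }

  private:
    LockFile(const LockFile&);             // not copyable
    LockFile& operator=(const LockFile&);  // not assignable
    std::string path_;
    int fd_;
};

}  // namespace sketch
```

Other platforms would supply their own implementation behind the same
header, and callers would just hold a LockFile for as long as they need
the directory.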

It's JIRA https://issues.apache.org/jira/browse/QPID-1158
Could you take this on, Manuel? I can do it, but it may take a couple
of days to get to it.

