[jira] Commented: (JCR-1605) RepositoryLock does not work on NFS sometimes

Thomas Mueller (JIRA) Fri, 16 May 2008 05:42:22 -0700

    [ 
https://issues.apache.org/jira/browse/JCR-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597450#action_12597450
 ]


Thomas Mueller commented on JCR-1605:
-------------------------------------

> Does this need to be configurable? 

Yes, unless we only want to support the new mechanism.

> Wouldn't it be sufficient to catch the Exception and then fall back to the 
> new approach?

No. The file system may not always throw an exception. The message "No locks 
available" suggests that the number of locks is limited, and that the exception 
only occurs once that limit is exceeded. This would mean that sometimes the new 
algorithm is used and sometimes the old one. This wouldn't work correctly, 
because a process using one mechanism would not necessarily notice a lock taken 
with the other.
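
For reference, the FileLock-based approach looks roughly like the sketch below. 
This is not the Jackrabbit RepositoryLock code; the class name FileLockProbe and 
the method tryNativeLock are made up for illustration. On an NFS mount without 
lock support, the tryLock call is where an IOException such as "No locks 
available" may surface, but it is not guaranteed to do so on every call.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Illustrative sketch only; names are hypothetical.
class FileLockProbe {

    /**
     * Tries to take and release an exclusive native lock on the file.
     * Returns false if another process holds the lock.
     * Throws IOException if the file system reports a locking error.
     */
    static boolean tryNativeLock(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) {
                return false; // held by another process
            }
            lock.release();
            return true;
        }
    }
}
```

A single successful probe like this says nothing about the next call, which is 
why switching mechanisms based on the exception alone is not reliable.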

The new mechanism I have in mind is a cooperative algorithm. This algorithm is 
already in use in the H2 database. It goes like this:

* When the lock file does not exist, it is created (using the atomic operation 
File.createNewFile). Then, the process waits a little bit (20 ms) and checks the 
file again. If the file was changed during this time, the operation is aborted. 
This protects against a race condition where one process deletes the lock file 
just after another has created it, and a third process creates the file again. 
This race cannot occur if there are only two writers.

* If the file can be created, a random number is written to it. Afterwards, a 
watchdog thread is started that checks regularly (once every second by default) 
whether the file was deleted or modified by another (challenger) thread or 
process. Whenever that occurs, the file is overwritten with the old data. The 
watchdog thread runs with high priority so that a change to the lock file does 
not go undetected even if the system is very busy. However, the watchdog thread 
uses very few resources (CPU time), because it waits most of the time. Also, 
unless the file was changed, the watchdog only reads from the hard disk and 
does not write to it.

* If the lock file exists, and it was modified within the last 20 ms, the 
process waits for some time and checks again (up to 10 times). If the file is 
still being changed, an exception is thrown ("locked"). This is done to 
eliminate race conditions with many concurrent writers. Once the file is no 
longer changing, it is overwritten with a new version (challenge). After that, 
the thread waits for 2 seconds. If there is a watchdog thread protecting the 
file, it will overwrite the change, and this process will fail to lock. 
However, if there is no watchdog thread, the lock file will still be as written 
by this thread. In this case, the file is deleted and atomically created again. 
The watchdog thread is then started and the file is locked.
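
The three steps above can be sketched in Java roughly as follows. This is a 
minimal illustration written for this comment, not the H2 or Jackrabbit code. 
The class name CooperativeFileLock, its methods, and the three timing 
parameters are assumptions; the description corresponds to 20, 1000 and 2000 ms.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; not the H2 or Jackrabbit implementation.
class CooperativeFileLock {
    private final File file;
    private final long settleMs;    // 20 ms in the description above
    private final long watchdogMs;  // 1000 ms in the description above
    private final long challengeMs; // 2000 ms in the description above
    private volatile byte[] token;
    private volatile boolean held;
    private Thread watchdog;

    CooperativeFileLock(File file, long settleMs, long watchdogMs, long challengeMs) {
        this.file = file;
        this.settleMs = settleMs;
        this.watchdogMs = watchdogMs;
        this.challengeMs = challengeMs;
    }

    synchronized void lock() throws IOException, InterruptedException {
        if (held) throw new IllegalStateException("already locked by this object");
        if (!file.createNewFile()) {
            challenge(); // file exists: test whether a watchdog still protects it
        }
        // The file was created atomically by us; write our random token.
        byte[] mine = writeRandom();
        Thread.sleep(settleMs);
        if (!Arrays.equals(mine, read())) {
            throw new IOException("locked: file changed right after creation");
        }
        token = mine;
        held = true;
        watchdog = new Thread(this::watch, "lock-watchdog");
        watchdog.setDaemon(true);
        watchdog.setPriority(Thread.MAX_PRIORITY);
        watchdog.start();
    }

    private void challenge() throws IOException, InterruptedException {
        int tries = 0;
        // Wait until the file has not been modified for settleMs (at most 10 retries).
        while (System.currentTimeMillis() - file.lastModified() < settleMs) {
            if (++tries > 10) throw new IOException("locked: file keeps changing");
            Thread.sleep(settleMs);
        }
        byte[] challenge = writeRandom();
        Thread.sleep(challengeMs);
        if (!Arrays.equals(challenge, read())) {
            throw new IOException("locked: a watchdog restored the file");
        }
        // Nobody defended the file: delete it and recreate it atomically.
        if (!file.delete() || !file.createNewFile()) {
            throw new IOException("locked: lost the race to recreate the file");
        }
    }

    private void watch() {
        while (held) {
            try {
                Thread.sleep(watchdogMs);
                if (held && !Arrays.equals(token, read())) {
                    Files.write(file.toPath(), token); // restore our data
                }
            } catch (InterruptedException e) {
                return;
            } catch (IOException e) {
                // transient I/O problem: try again on the next tick
            }
        }
    }

    synchronized void unlock() throws InterruptedException {
        if (!held) return;
        held = false;
        watchdog.interrupt();
        watchdog.join();
        file.delete();
    }

    private byte[] writeRandom() throws IOException {
        byte[] data = new byte[16];
        ThreadLocalRandom.current().nextBytes(data);
        Files.write(file.toPath(), data);
        return data;
    }

    private byte[] read() {
        try {
            return Files.readAllBytes(file.toPath());
        } catch (IOException e) {
            return null; // missing or unreadable counts as "changed"
        }
    }
}
```

Note that the sketch compares File.lastModified against the local clock, which 
assumes the clocks of the client and the file server are roughly in sync.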



> RepositoryLock does not work on NFS sometimes
> ---------------------------------------------
>
>                 Key: JCR-1605
>                 URL: https://issues.apache.org/jira/browse/JCR-1605
>             Project: Jackrabbit
>          Issue Type: Bug
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>
> The RepositoryLock mechanism currently used in Jackrabbit uses FileLock. This 
> doesn't work on some NFS file systems. It looks like only NFS version 4 and 
> newer support locking. Older implementations may throw an IOException "No 
> locks available", which means the NFS does not support byte-range locking.
> I propose to add a second locking mechanism, and add a configuration option 
> to use it. For example: <FileLocking class="acme" />. This second locking 
> mechanism is a cooperative locking protocol that uses a background (watchdog) 
> thread and only uses regular file operations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
