I think a LockFactory for Lucene that implemented the ideas you &
Marvin are discussing in LUCENE-1877,  and/or the approach you
implemented in the H2 DB, would be a useful addition to Lucene!

For many apps, the simple LockFactory impls suffice, but for apps
where multiple machines can become the writer, it gets hairy.  Having
an always correct Lock impl for these apps would be great.

Note that Lucene has some basic tools (in oal.store) for asserting
that a LockFactory is correct (see LockVerifyServer), so it's a useful
way to test that things are working from Lucene's standpoint.

Mike

On Fri, Nov 27, 2009 at 9:23 AM, Thomas Mueller
<thomas.tom.muel...@gmail.com> wrote:
> Hi,
>
> I'm wondering if you are interested in automatically releasing the
> write lock. See also my comments on
> https://issues.apache.org/jira/browse/LUCENE-1877 - I thought it's a
> problem worth solving, because it's also in the Lucene FAQ list at
> http://wiki.apache.org/lucene-java/LuceneFAQ#What_is_the_purpose_of_write.lock_file.2C_when_is_it_used.2C_and_by_which_classes.3F
>
> Unfortunately there seems to be no solution that 'always works', but
> delegating the task and responsibility to the application / to the
> user is problematic as well. For example, a user of the H2 database
> (which supports Lucene fulltext indexing) suggested automatically
> removing the write.lock file whenever the file is there:
> http://code.google.com/p/h2database/issues/detail?id=141 - sounds a
> bit dangerous in my view.
>
> So, if you are interested in solving the problem, then maybe I can help.
> If not, then I will not bother you any longer :-)
>
> Regards,
> Thomas
>
>
>
>> > > shouldn't active code like that live in the application layer?
>> > Why?
>> You can all but guarantee that polling will work at the app layer
>
> The application layer may also run with low priority. In operating
> systems, it's usually the lower layers that have more 'rights'
> (priority), and not the higher levels (I'm not saying it should be
> like that in Java). I just think the application layer should not have
> to deal with write locks or removing write locks.
>
>> by the time the original process realizes that it doesn't hold the lock 
>> anymore, the damage could already have been done.
>
> Yes, I'm not sure how to best avoid that (with any design). Asking the
> application layer or the user whether the lock file can be removed is
> probably more dangerous than trying the best in Lucene.
>
> Standby / hibernate: the question is, if the machine process is
> currently not running, does the process still hold the lock? I think
> no, because the machine might as well be turned off. How to detect
> whether the machine is turned off versus in hibernate mode? I guess
> that's a problem for all mechanisms (socket / file lock / background
> thread).
>
> When a hibernated process wakes up again, it thinks it owns the lock.
> Even if the process checks before each write, it is unsafe:
>
> if (isStillLocked()) {
>  write();
> }
>
> The process could wake up after isStillLocked() but before write().
> One protection is: The second process (the one that breaks the lock)
> would need to work on a copy of the data instead of the original file
> (it could delete / truncate the original file after creating a copy).
> On Windows, renaming the file might work (not sure); on Linux you
> probably need to copy the content to a new file. Like that, the awoken
> process can only destroy inactive data.
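A rough sketch of the copy-then-truncate idea described above, for the process that breaks the lock. The class name LockBreaker, the method takeOverData and the ".copy" naming scheme are made up for illustration and are not part of Lucene or H2. Note that the copy itself is not atomic, so this narrows the window for damage rather than closing it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Hypothetical helper, not a Lucene or H2 class.
final class LockBreaker {

    // Copies the data file to a fresh, uniquely named sibling and then
    // truncates the original. The new lock owner continues on the copy,
    // so a stale writer that wakes up later only touches the (now
    // inactive) original file.
    static Path takeOverData(Path original) throws IOException {
        Path copy = original.resolveSibling(
                original.getFileName() + "." + UUID.randomUUID() + ".copy");
        Files.copy(original, copy);
        Files.write(original, new byte[0],
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
        return copy;
    }
}
```

The lock breaker would call takeOverData(dataFile) once it has decided the old owner is gone, and from then on read and write only the returned path.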
>
> The question is: do we need to solve this problem? How big is the
> risk? Instead of solving this problem completely, you could detect it
> after the fact without much overhead, and throw an exception saying:
> "data may be corrupt now".
>
> PID: With the PID, you could check if the process still runs. But it
> could be another process with the same PID (is that possible?), or the
> same PID but on a different machine (when using a network share). It's
> probably safer if you can communicate with the lock owner (using
> TCP/IP or over the file system by deleting/creating a file).
>
> Unique id: The easiest solution is to use a UUID (a cryptographically
> secure random number). That problem _is_ solved (some systems have
> trouble generating entropy, but there are workarounds). If you have a
> communication channel to the process anyway, you could ask for this
> UUID. Once you have a communication channel, you can do a lot
> (reference counting, safely transferring the lock, ...).
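To make the unique id and the "detect it after the fact" ideas concrete, here is a minimal sketch. The class UuidLock and its methods are hypothetical and do not implement Lucene's Lock API. It assumes the lock file lives on a file system where creating a new file is atomic. Writing the token is a separate step from creating the file, so another process may briefly see an empty lock file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Hypothetical sketch, not part of Lucene.
final class UuidLock {
    private final Path lockFile;
    // Random (type 4) UUID that identifies this particular lock holder.
    private final String token = UUID.randomUUID().toString();

    UuidLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    // Creates the lock file with our token; returns false if it already exists.
    boolean obtain() throws IOException {
        try {
            Files.write(lockFile, token.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // True only if the lock file still contains our own token.
    boolean isStillOwner() throws IOException {
        try {
            byte[] content = Files.readAllBytes(lockFile);
            return token.equals(new String(content, StandardCharsets.UTF_8));
        } catch (NoSuchFileException e) {
            return false;
        }
    }

    // Call after a write: cannot prevent damage, only report it.
    void checkAfterWrite() throws IOException {
        if (!isStillOwner()) {
            throw new IOException("lock was broken; data may be corrupt now");
        }
    }
}
```

A writer would call obtain() once, then checkAfterWrite() after each batch of writes. The check does not close the race between isStillLocked() and write(); it only makes the failure visible.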
>
>> If the server and the client can't access each other
>
> How to find out that the server is still running? My point is: I would
> like to have a secure, automatic way to break the lock if the machine or
> process is stopped. And from my experience, native file locking is
> problematic for this.
>
> You could also combine solutions (such as: combine the 'open a server
> socket' solution with 'background thread' solution). I'm not sure if
> it's worth it to solve the 'hibernate' problem.
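One way the 'server socket' and 'background thread' approaches might be combined, as a sketch only. SocketHeartbeatLock and ownerAnswers are invented names, and this is not how H2 or Lucene actually implement locking. It assumes both processes run on the same machine (loopback address); for a network share the owner's host would have to be recorded as well. It also does not address the hibernate problem.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the owner listens on a port recorded in the lock
// file and also refreshes the file's timestamp from a background thread.
final class SocketHeartbeatLock implements AutoCloseable {
    private final Path lockFile;
    private final ServerSocket socket;
    private final ScheduledExecutorService heartbeat;

    // Assumes the caller has already decided that the lock is free.
    SocketHeartbeatLock(Path lockFile, long periodMillis) throws IOException {
        this.lockFile = lockFile;
        this.socket = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        Files.write(lockFile, Integer.toString(socket.getLocalPort())
                .getBytes(StandardCharsets.UTF_8));
        this.heartbeat = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-heartbeat");
            t.setDaemon(true);
            return t;
        });
        heartbeat.scheduleAtFixedRate(this::touch,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Heartbeat: update the lock file's last-modified time.
    private void touch() {
        try {
            Files.setLastModifiedTime(lockFile,
                    FileTime.fromMillis(System.currentTimeMillis()));
        } catch (IOException e) {
            // Lock file is gone or unreachable; a real implementation
            // would have to stop writing here.
        }
    }

    // Challenger side: does anything accept connections on the recorded port?
    static boolean ownerAnswers(Path lockFile, int timeoutMillis) {
        try {
            String text = new String(Files.readAllBytes(lockFile),
                    StandardCharsets.UTF_8).trim();
            int port = Integer.parseInt(text);
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(
                        InetAddress.getLoopbackAddress(), port), timeoutMillis);
                return true;
            }
        } catch (IOException | NumberFormatException e) {
            return false;
        }
    }

    @Override
    public void close() throws IOException {
        heartbeat.shutdownNow();
        socket.close();
        Files.deleteIfExists(lockFile);
    }
}
```

A challenger would break the lock only if ownerAnswers(...) returns false and the lock file's timestamp is older than a few heartbeat periods. Another process could reuse the port after the owner dies, so combining this check with a unique id would be safer.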
>
> Regards,
> Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>
