Re: Lockless commits -- great stuff!

Michael McCandless Fri, 12 Jan 2007 03:03:12 -0800

Marvin Humphrey wrote:

On Jan 11, 2007, at 6:48 AM, Michael McCandless wrote:
I too am happy that we have no more commit lock :)
Not just that.  :)
No more lock directory, since we can put write.lock in the index directory itself.
No more lock file name munging, since lock files from different indexes no longer need to avoid collisions within a shared namespace.
No more need to deal with any files outside of the index directory.
Those three changes have a bigger impact on Lucy than they do on Lucene, and since I'm writing a lot of KS 0.20 code with the notion that it will be submitted to Lucy, they're having an impact on what I'm doing right now. C doesn't provide a number of the dependencies needed to support the old lock system, so we would either have had to include them, write them ourselves, or supply the needed functionality via PITA callbacks to the host language (Perl, Ruby, etc).
Since the lock directory lived in the system's tmp directory, we needed code to discover where it was. Now we don't.
The lock file name munging required a checksum string generator. We don't need that now.
Lastly, a failure of imagination had left me blind to the fact that we didn't need sophisticated, portable filepath manipulating routines: just knowing a directory separator suffices. Previously, I'd wrapped Perl's File::Spec::Functions to make catfile() and canonpath() available from C. That hadn't been necessary, because we could have built up the lockfile paths given the location of the tmp directory and the dir_sep. However, as is often the case, simplifying the implementation reveals unnecessary cruft, and when all of a sudden everything ended up in one directory with a splash, it became obvious that generating filepaths didn't require heavy machinery.
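
To illustrate the point about needing only a directory separator, here is a minimal sketch in C. The function name build_lock_path and its signature are hypothetical, made up for this example; this is not code from KinoSearch, Lucy, or Lucene. It assumes the caller already knows the index directory and the platform's dir_sep.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join index_dir and file_name using dir_sep.
 * Returns a malloc'd string that the caller must free, or NULL if
 * allocation fails. No File::Spec-style machinery is involved. */
char *
build_lock_path(const char *index_dir, const char *dir_sep,
                const char *file_name)
{
    assert(index_dir != NULL && dir_sep != NULL && file_name != NULL);

    size_t dir_len = strlen(index_dir);
    size_t sep_len = strlen(dir_sep);

    /* Avoid doubling the separator when index_dir already ends with it. */
    int has_sep = sep_len > 0 && dir_len >= sep_len
        && strcmp(index_dir + dir_len - sep_len, dir_sep) == 0;

    size_t total = dir_len + (has_sep ? 0 : sep_len)
        + strlen(file_name) + 1; /* +1 for the terminating NUL */

    char *path = malloc(total);
    if (path == NULL) {
        return NULL;
    }
    snprintf(path, total, "%s%s%s", index_dir,
             has_sep ? "" : dir_sep, file_name);
    return path;
}
```

Under these assumptions, build_lock_path("/path/to/index", "/", "write.lock") produces "/path/to/index/write.lock", with no checksum and no tmp directory lookup.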
But I have to say the lockless changes pale in comparison to what you
have done/are doing with KinoSearch. Specifically, the clean merge
model with an external sorter and the other related file format changes
look very interesting.


Ooh, excellent points!

In fact, we haven't done this follow-through for Lucene, but I think we
now should.  Having only one directory (the index directory) where
things happen, and a simple file name for the write lock
("write.lock"), would be a great simplification for our users.

Now that readers are read-only, I think it makes sense to default the
write lock into the index directory, and as you describe, no longer
generate a "unique namespace" hash lock ID since the index dir gives
us that scoping.
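
As a rough illustration of a write lock that lives in the index directory under a fixed name, here is a sketch in C for POSIX systems. It is not Lucene's lock implementation; try_write_lock and release_write_lock are hypothetical names, and the sketch assumes that exclusive file creation is an acceptable way to take the lock.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: take the write lock by creating lock_path
 * exclusively. Returns 0 if the lock was obtained, 1 if another
 * writer already holds it, and -1 on any other error. */
int
try_write_lock(const char *lock_path)
{
    assert(lock_path != NULL);

    /* O_CREAT | O_EXCL makes open() fail with EEXIST if the file exists. */
    int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1) {
        return errno == EEXIST ? 1 : -1;
    }
    close(fd);
    return 0;
}

/* Hypothetical helper: release the lock by removing the lock file.
 * Returns 0 on success, -1 on failure. */
int
release_write_lock(const char *lock_path)
{
    assert(lock_path != NULL);
    return unlink(lock_path) == 0 ? 0 : -1;
}
```

Because the lock path is simply "write.lock" inside the index directory, two different indexes cannot collide, so no hashed lock ID is needed in this sketch.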

Are there any reasons not to do this?  I will open a JIRA issue to
track this.

Well, I look forward to seeing whether you can suggest improvements on some of the algos I'll bring up in this forum once KS 0.20_01 is out. :)


I will try, but I'm already behind just trying to understand how we
could improve Lucene based on your current KS release!  Is there any
preview/general summary of what's being done for KS 0.20/Lucy?  I tried
to quickly search the KS archives and look through Lucy's archives but
didn't find any solid hits.

Mike
