Re: Back Compatibility

robert engels Wed, 23 Jan 2008 10:59:02 -0800

I guess I don't understand what a commit lock is, or what its purpose is. It seems the write lock is all that is needed.

If you still need a write lock, then what is the purpose of "lockless" commits?

You can get consistency if all writers get the write lock before performing any read. It would seem this should be the requirement?

Is there a Wiki or some such thing that discusses the "lockless commits", their purpose and their implementation? I find the email thread a bit cumbersome to review.



On Jan 23, 2008, at 11:55 AM, Michael McCandless wrote:

robert engels wrote:
Maybe I don't understand lockless commits then.
I just don't think you can enforce transactional consistency without either 1) locking, or 2) optimistic collision detection. I could be wrong here, but this has been my experience. By effectively removing the locking requirement, I think you are going to have users developing code without thought as to what is going to happen when locking is added. This is going to break the backwards compatibility that people are striving for.
Lucene still has locking (write.lock), to only allow one writer at a time to make changes to the index (ie, it serializes writer sessions). Lock-less commits just replaced the old "commit.lock".
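For readers unfamiliar with the second option named above, here is a minimal sketch of optimistic collision detection. It is a toy in-memory model, not Lucene code: `VersionedStore`, `CollisionError`, and `run_two_writers` are hypothetical names introduced only for illustration. Each writer records the version it started from, and a commit is refused if another commit landed in between.

```python
class CollisionError(Exception):
    """Raised when the store changed after the transaction began."""


class VersionedStore:
    """Toy in-memory store with a version counter (hypothetical, not Lucene)."""

    def __init__(self):
        self.version = 0
        self.docs = []

    def begin(self):
        # Remember which version this transaction is based on.
        return self.version

    def commit(self, base_version, new_docs):
        # Optimistic check: refuse if anyone committed since begin().
        if base_version != self.version:
            raise CollisionError("store changed since version %d" % base_version)
        self.docs.extend(new_docs)
        self.version += 1


def run_two_writers():
    """Two writers start from the same version; only the first commit wins."""
    store = VersionedStore()
    a = store.begin()
    b = store.begin()
    store.commit(a, ["doc from A"])
    try:
        store.commit(b, ["doc from B"])
        b_committed = True
    except CollisionError:
        b_committed = False
    return store.docs, b_committed
```

In this model no writer ever blocks; the losing writer has to start over from the new version and retry.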
The lucene "writer" structure needs to be something like:

start tx for update
do work
commit
where commit is composed of (prepare and commit phases), but commit may fail.
Right, this is what IndexWriter does now. It's just that with autoCommit=false you have total control on when that commit takes place (only on closing the writer).
It is unknown if this can actually happen though, since there is no unique ID that could cause collisions, but there is the internal id (which would need to remain constant throughout the tx in order for queries and delete operations to work).
Yes but there are other errors that Lucene may hit, like disk full, which must (and do) rollback the commit to the start of the transaction (ie, index state when writer was first opened).
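The "start tx, do work, commit" shape and the rollback behaviour described above can be sketched roughly as follows. This is a toy model under stated assumptions, not Lucene's IndexWriter: `ToyWriter` and its `fail` flag are hypothetical, and the simulated "disk full" error is just a raised `OSError`. Changes are buffered while the writer is open, published only on close, and a failure during close leaves the index as it was when the writer was opened.

```python
class ToyWriter:
    """Hypothetical stand-in for a writer session (not Lucene's IndexWriter)."""

    def __init__(self, index):
        self.index = index   # committed state, a plain list of documents
        self.pending = []    # buffered changes, invisible until close()

    def add(self, doc):
        # "do work": nothing touches the committed state yet.
        self.pending.append(doc)

    def close(self, fail=False):
        # Prepare phase: build the new state on the side.
        prepared = self.index + self.pending
        if fail:
            # Simulated error such as disk full: drop the buffered work,
            # leaving the index exactly as it was when the writer opened.
            self.pending = []
            raise OSError("simulated disk full")
        # Commit phase: publish the prepared state in one step.
        self.index[:] = prepared
        self.pending = []
```

A failed close discards the whole session; a later writer starts again from the unchanged committed state.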
I am sure it is that I don't understand lockless commits, so I will give a scenario.
client A issues query looking for documents with OID (a field) = "some field";
client B issues same query
both queries return nothing found
client A inserts document with OID = "some field"
client B inserts document with OID = "some field"

client A commits and client B commits
unless B is blocked, once A issues the query, the index is going to end up with 2 different copies of the document.
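The scenario above can be replayed in a small in-memory model. This is only a sketch, not Lucene code: `ToyIndex`, `unlocked_interleaving`, and `insert_if_absent` are hypothetical names. The first function runs the steps in the order listed and ends up with two documents; the second takes the write lock before the read, so the check and the insert happen as one unit.

```python
import threading


class ToyIndex:
    """Hypothetical in-memory index with a single write lock (not Lucene)."""

    def __init__(self):
        self.docs = []
        self.write_lock = threading.Lock()

    def search(self, oid):
        return [d for d in self.docs if d["OID"] == oid]

    def add(self, doc):
        self.docs.append(doc)


def unlocked_interleaving(index, oid):
    """Replay the scenario: both clients query first, then both insert."""
    a_found = index.search(oid)  # client A issues the query
    b_found = index.search(oid)  # client B issues the same query
    if not a_found:
        index.add({"OID": oid, "client": "A"})
    if not b_found:
        index.add({"OID": oid, "client": "B"})


def insert_if_absent(index, oid, client):
    """Take the write lock before the read, then check and insert."""
    with index.write_lock:
        if index.search(oid):
            return False
        index.add({"OID": oid, "client": client})
        return True
```

With `insert_if_absent`, whichever client gets the lock second sees the first client's document and skips its own insert. Note that this model makes an insert visible immediately; with a real index, the second writer would also need to see the first writer's committed changes before checking.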
I understand that Lucene is not a database, and has no concept of unique constraints. It is my understanding that this has been overcome using locks and sequential access to the index when writing.
In a simple XA implementation, client A would open a SERIALIZABLE transaction, which would block B from even reading the index. Most simple XA implementations only support READ_COMMITTED, SERIALIZABLE, and NONE.
There are other ways of offering finer grained locking (based on internal id and timestamps), but most are going to need a "server based" implementation of lucene to pull off.
To summarize, I think the "shared filestore (NFS)" and "lockless commits" make implementing transactions very difficult. I am sure I am missing something here, I just don't see what.
Lucene hasn't ever supported that case above: it never blocks a reader from opening the index. But, you could easily build that on top of Lucene, right?
I'm still trying to understand what you feel is missing in the core that prevents you from building XA (or, your own transactions handling that involves another resource like a DB) on top of Lucene...
Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


