Re: Back Compatibility

Michael McCandless Wed, 23 Jan 2008 09:55:53 -0800


robert engels wrote:

Maybe I don't understand lockless commits then.
I just don't think you can enforce transactional consistency without either 1) locking, or 2) optimistic collision detection. I could be wrong here, but this has been my experience. By effectively removing the locking requirement, I think you are going to have users developing code without thought as to what is going to happen when locking is added. This is going to break the backwards compatibility that people are striving for.

Lucene still has locking (write.lock), to only allow one writer at a time to make changes to the index (i.e., it serializes writer sessions). Lock-less commits just replaced the old "commit.lock".

The lucene "writer" structure needs to be something like:

start tx for update
do work
commit
where commit is composed of prepare and commit phases, but commit may fail.

Right, this is what IndexWriter does now. It's just that with autoCommit=false you have total control on when that commit takes place (only on closing the writer).

It is unknown if this can actually happen though, since there is no unique ID that could cause collisions, but there is the internal id (which would need to remain constant throughout the tx in order for queries and delete operations to work).

Yes, but there are other errors that Lucene may hit, like disk full, which must (and do) roll back the commit to the start of the transaction (i.e., index state when writer was first opened).
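
The session model described above (changes stay invisible until the writer is closed, and a failure rolls back to the state when the writer was opened) can be illustrated with a toy class. This is only a sketch under stated assumptions: `TxWriterSketch` and its `addDocument`, `close` and `abort` methods are hypothetical names operating on an in-memory list. It is not Lucene's IndexWriter, and a real commit has to deal with files on disk.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a writer session with autoCommit=false semantics (hypothetical, not Lucene code). */
class TxWriterSketch {
    // State that readers can see; stands in for the committed index.
    private final List<String> committed;
    // Changes made during this session; invisible to readers until close().
    private final List<String> pending = new ArrayList<>();

    TxWriterSketch(List<String> committed) {
        this.committed = committed;
    }

    /** "do work": buffer a change without publishing it. */
    void addDocument(String doc) {
        pending.add(doc);
    }

    /** Commit point: publish everything buffered in this session. */
    void close() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Rollback: discard buffered changes, leaving the state seen when the writer was opened. */
    void abort() {
        pending.clear();
    }
}
```

A caller would typically wrap the work in a try block, call `close()` on success and `abort()` if anything fails, so that readers only ever see the state before or after the whole session.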

I am sure it is that I don't understand lockless commits, so I will give a scenario.
client A issues query looking for documents with OID (a field) = "some field";
client B issues same query
both queries return nothing found
client A inserts document with OID = "some field"
client B inserts document with OID = "some field"

client A commits and client B commits
unless B is blocked, once A issues the query, the index is going to end up with 2 different copies of the document.
I understand that Lucene is not a database, and has no concept of unique constraints. It is my understanding that this has been overcome using locks and sequential access to the index when writing.
In a simple XA implementation, client A would open a SERIALIZABLE transaction, which would block B from even reading the index. Most simple XA implementations only support READ_COMMITTED, SERIALIZABLE, and NONE.
There are other ways of offering finer grained locking (based on internal id and timestamps), but most are going to need a "server based" implementation of lucene to pull off.
To summarize, I think the "shared filestore (NFS)" and "lockless commits" make implementing transactions very difficult. I am sure I am missing something here, I just don't see what.

Lucene hasn't ever supported that case above: it never blocks a reader from opening the index. But, you could easily build that on top of Lucene, right?
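
One way to picture "building that on top" is to make the query and the insert a single step guarded by an application-level lock, so client B cannot run its query between client A's query and insert. The sketch below is a minimal illustration under stated assumptions: `UniqueInsertSketch`, `insertIfAbsent` and `count` are hypothetical names, the "index" is just an in-memory list of OID values, and the lock is the application's own, not Lucene's write.lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy check-then-insert guard layered over an index with no unique constraints (hypothetical). */
class UniqueInsertSketch {
    // Stand-in for the index: it happily accepts duplicate OIDs on its own.
    private final List<String> oids = new ArrayList<>();
    // Application-level lock that serializes the query + insert pair.
    private final ReentrantLock appLock = new ReentrantLock();

    /** Runs the query and the insert atomically; returns false if the OID is already present. */
    boolean insertIfAbsent(String oid) {
        appLock.lock();
        try {
            if (oids.contains(oid)) {
                return false; // another client already inserted this OID
            }
            oids.add(oid);
            return true;
        } finally {
            appLock.unlock();
        }
    }

    /** Counts the documents stored under the given OID. */
    int count(String oid) {
        appLock.lock();
        try {
            int n = 0;
            for (String s : oids) {
                if (s.equals(oid)) {
                    n++;
                }
            }
            return n;
        } finally {
            appLock.unlock();
        }
    }
}
```

With this guard, whichever of clients A and B arrives second gets `false` back instead of creating a second copy. An in-process lock like this only covers clients in the same JVM; clients on separate machines would need some shared coordination, which is outside what this sketch shows.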

I'm still trying to understand what you feel is missing in the core that prevents you from building XA (or, your own transactions handling that involves another resource like a DB) on top of Lucene...


Mike
