robert engels wrote:
Maybe I don't understand lockless commits then.
I just don't think you can enforce transactional consistency
without either 1) locking, or 2) optimistic collision detection. I
could be wrong here, but this has been my experience.
By effectively removing the locking requirement, I think you are
going to have users developing code without thought as to what is
going to happen when locking is added. This is going to break the
backwards compatibility that people are striving for.
Lucene still has locking (write.lock) to allow only one writer at a
time to make changes to the index (i.e., it serializes writer
sessions). Lock-less commits just replaced the old "commit.lock".
The Lucene "writer" structure needs to be something like:
start tx for update
do work
commit
where commit is composed of prepare and commit phases, but commit
may fail.
Right, this is what IndexWriter does now. It's just that with
autoCommit=false you have total control on when that commit takes
place (only on closing the writer).
It is unknown if this can actually happen though, since there is no
unique ID that could cause collisions, but there is the internal id
(which would need to remain constant throughout the tx in order for
queries and delete operations to work).
Yes, but there are other errors that Lucene may hit, like disk full,
which must (and do) roll back the commit to the start of the
transaction (i.e., index state when writer was first opened).
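
To make the "start tx / do work / commit, with rollback on failure"
shape concrete, here is a minimal sketch in plain Java. It is a toy
model, not Lucene's API: ToyTxIndex, ToyWriter and simulateDiskFull
are hypothetical names made up for illustration. It assumes one writer
session at a time (standing in for write.lock) and a commit that
happens only when the writer is closed, as with autoCommit=false.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single-writer, commit-on-close index. Not Lucene's API. */
class ToyTxIndex {
    // The committed state; readers only ever see this list.
    private List<String> committed = new ArrayList<>();
    // Stand-in for write.lock: at most one writer session at a time.
    private boolean writerOpen = false;

    /** Point-in-time copy of the committed documents. */
    List<String> snapshot() {
        return new ArrayList<>(committed);
    }

    /** Start the "transaction": fails if another writer session is open. */
    ToyWriter openWriter() {
        if (writerOpen) {
            throw new IllegalStateException("write lock already held");
        }
        writerOpen = true;
        return new ToyWriter();
    }

    class ToyWriter {
        private final List<String> pending = new ArrayList<>();
        private boolean failOnCommit = false;
        private boolean closed = false;

        /** Do work: buffered, invisible to readers until commit. */
        void add(String doc) {
            pending.add(doc);
        }

        /** Hypothetical hook that makes the next commit fail. */
        void simulateDiskFull() {
            failOnCommit = true;
        }

        /** Commit on close. On failure the committed state is untouched. */
        void close() {
            if (closed) {
                return;
            }
            closed = true;
            try {
                if (failOnCommit) {
                    throw new UncheckedIOException(
                            new IOException("simulated disk full"));
                }
                List<String> next = new ArrayList<>(committed);
                next.addAll(pending);
                committed = next; // single reference swap is the commit point
            } finally {
                pending.clear();
                writerOpen = false; // release the writer "lock" either way
            }
        }
    }
}
```

A writer whose close() fails leaves snapshot() exactly as it was when
that writer was opened, which is the rollback behavior described
above.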
I am sure it is that I don't understand lockless commits, so I will
give a scenario.
client A issues query looking for documents with OID (a field) =
"some field";
client B issues same query
both queries return nothing found
client A inserts document with OID = "some field"
client B inserts document with OID = "some field"
client A commits and client B commits
Unless B is blocked once A issues the query, the index is going to
end up with 2 copies of the same document.
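
The scenario above can be replayed deterministically in a few lines of
plain Java. This is only a toy model (the class name
DuplicateOidScenario and the list standing in for the index are
assumptions, not Lucene code): each client searches its own
point-in-time view, and the two write sessions then run one after the
other, as they would under write.lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy replay of the check-then-insert race. Not Lucene code. */
class DuplicateOidScenario {

    /** Returns how many copies of the OID end up committed. */
    static int run() {
        // Committed documents, each represented only by its OID value.
        List<String> committed = new ArrayList<>();
        String oid = "some field";

        // Clients A and B both query before either one has written.
        List<String> viewA = new ArrayList<>(committed);
        List<String> viewB = new ArrayList<>(committed);
        boolean foundByA = viewA.contains(oid);
        boolean foundByB = viewB.contains(oid);

        // Write sessions are serialized, but each decision is based on
        // a stale view, so serializing the writes alone does not help.
        if (!foundByA) {
            committed.add(oid); // A inserts and commits
        }
        if (!foundByB) {
            committed.add(oid); // B inserts and commits
        }
        return Collections.frequency(committed, oid);
    }
}
```

Serializing only the writes is not enough here: the query and the
insert have to sit inside the same critical section for the second
client to see the first client's document.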
I understand that Lucene is not a database, and has no concept of
unique constraints. It is my understanding that this has been overcome
using locks and sequential access to the index when writing.
In a simple XA implementation, client A would open a SERIALIZABLE
transaction, which would block B from even reading the index. Most
simple XA implementations only support READ_COMMITTED, SERIALIZABLE,
and NONE.
There are other ways of offering finer grained locking (based on
internal id and timestamps), but most are going to need a "server
based" implementation of lucene to pull off.
To summarize, I think the "shared filestore (NFS)" and "lockless
commits" make implementing transactions very difficult. I am sure I
am missing something here, I just don't see what.
Lucene hasn't ever supported that case above: it never blocks a
reader from opening the index. But, you could easily build that on
top of Lucene, right?
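
As a rough sketch of what "building that on top" might look like, the
application can hold its own lock across the query, the insert and the
commit. The code below is plain Java with a list standing in for the
index; UniqueOidGate and insertIfAbsent are hypothetical names, and
the lock only covers clients inside one JVM. Clients in separate
processes or on separate machines would need some shared lock instead,
which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy application-level gate around check-then-insert. Not Lucene code. */
class UniqueOidGate {
    private final ReentrantLock gate = new ReentrantLock();
    // Stand-in for the committed index contents (OID values only).
    private final List<String> committed = new ArrayList<>();

    /** Query, insert and commit as one critical section. */
    boolean insertIfAbsent(String oid) {
        gate.lock();
        try {
            if (committed.contains(oid)) {
                return false; // someone else already inserted it
            }
            committed.add(oid);
            return true;
        } finally {
            gate.unlock();
        }
    }

    int count(String oid) {
        gate.lock();
        try {
            return Collections.frequency(committed, oid);
        } finally {
            gate.unlock();
        }
    }

    /** Two clients race to insert the same OID; returns the copy count. */
    static int demo() {
        UniqueOidGate index = new UniqueOidGate();
        Thread clientA = new Thread(() -> index.insertIfAbsent("some field"));
        Thread clientB = new Thread(() -> index.insertIfAbsent("some field"));
        clientA.start();
        clientB.start();
        try {
            clientA.join();
            clientB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return index.count("some field");
    }
}
```

Whichever client gets the gate first inserts the document; the other
one finds it during its own check and skips the insert.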
I'm still trying to understand what you feel is missing in the core
that prevents you from building XA (or, your own transactions
handling that involves another resource like a DB) on top of Lucene...
Mike