[
https://issues.apache.org/jira/browse/LUCENE-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525377
]
Michael McCandless commented on LUCENE-992:
-------------------------------------------
> - addDocument(Document doc, Analyzer analyzer, Term delTerm): is it
> better to name it updateDocument?
Good, I'll make that change.
> - I didn't check all the variable accesses in DocumentsWriter, but
> it seems abort() should lock for some of the variables it
> accesses. Or make abort() a synchronized method.
OK, I will make abort() synchronized.
> - Observation: Large documents will block small documents from being
> flushed if addDocument of large documents is called before that of
> small ones. This is not the case before LUCENE-843.
Right, when multiple documents are in flight at once (because multiple
threads are adding documents), the documents must be flushed in order
of docID. Each one grabs a unique (sequential) docID at the start
(synchronized), does the indexing un-synchronized, then flushes
(synchronized), but only if it is that document's "turn" to flush (i.e.
it is the next docID to be written). So if a large doc grabs its docID
first and then a small doc comes through, the small doc can finish
indexing before the large doc does, in which case the small doc is
buffered, waiting for the large doc to flush first.
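To make the ordering concrete, here is a minimal sketch of the "flush in docID order" idea. It is not the actual DocumentsWriter code: the class name InOrderFlusher and all of its methods and fields are hypothetical, and a document is reduced to a String for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Lucene's DocumentsWriter.
public class InOrderFlusher {
    private int nextToAssign = 0;   // next docID to hand out
    private int nextToFlush = 0;    // docID whose "turn" it is to flush
    private final Map<Integer, String> waiting = new HashMap<>();
    private final List<String> flushed = new ArrayList<>();

    // Step 1 (synchronized): grab a unique, sequential docID.
    public synchronized int reserveDocId() {
        return nextToAssign++;
    }

    // Step 2 happens outside this class: the caller indexes the
    // document without holding any lock.

    // Step 3 (synchronized): flush, but only in docID order. A document
    // that finishes early stays buffered until every earlier docID has
    // been flushed.
    public synchronized void finish(int docId, String doc) {
        waiting.put(docId, doc);
        while (waiting.containsKey(nextToFlush)) {
            flushed.add(waiting.remove(nextToFlush));
            nextToFlush++;
        }
    }

    public synchronized List<String> flushedSoFar() {
        return new ArrayList<>(flushed);
    }

    public synchronized int bufferedCount() {
        return waiting.size();
    }
}
```

If a large doc reserves docID 0 and a small doc reserves docID 1, then calling finish(1, ...) first leaves the small doc buffered; only when finish(0, ...) arrives are both written, large doc first.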
> I also slightly changed the exception semantics in IndexWriter:
> previously if a disk full (or other) exception was hit when flushing
> the buffered docs, the buffered deletes were retained but the
> partially flushed buffered docs (if any) were discarded.
> - Observation: Before LUCENE-843, both buffered docs and buffered
> deletes were retained when such an exception occurs. Now both
> buffered docs and buffered deletes would be discarded if an exception
> is hit.
Right, although if the exception is hit after the commit point (e.g.,
while building the compound file) then the buffered docs & deletes
are added to the index.
I plan to commit this in a day or two.
> IndexWriter.updateDocument is no longer atomic
> ----------------------------------------------
>
> Key: LUCENE-992
> URL: https://issues.apache.org/jira/browse/LUCENE-992
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: 2.2
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.3
>
> Attachments: LUCENE-992.patch
>
>
> Spinoff from LUCENE-847.
> Ning caught that as of LUCENE-843, we lost the atomicity of the delete
> + add in IndexWriter.updateDocument.
> Ning suggested a simple fix: move the buffered deletes into
> DocumentsWriter and let it do the delete + add atomically. This has a
> nice side effect of also consolidating the "time to flush" logic in
> DocumentsWriter.