Chetan Mehrotra created OAK-6726:
------------------------------------

             Summary: Use addDocument instead of updateDocument while reindexing with Lucene
                 Key: OAK-6726
                 URL: https://issues.apache.org/jira/browse/OAK-6726
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: lucene
            Reporter: Chetan Mehrotra
            Assignee: Chetan Mehrotra
            Priority: Minor
             Fix For: 1.8


Currently the DefaultIndexWriter uses 
[updateDocument|https://lucene.apache.org/core/4_7_1/core/org/apache/lucene/index/IndexWriter.html#updateDocument(org.apache.lucene.index.Term, java.lang.Iterable)] 
when adding or updating a document in the index. This is fine for incremental 
indexing, where we cannot be sure whether the index already has that entry. 
This call first searches for an existing document matching the term, deletes 
it, and then adds the new document.

However, for the reindex case, where we start from an empty index, we can use 
[addDocument|https://lucene.apache.org/core/4_7_1/core/org/apache/lucene/index/IndexWriter.html#addDocument(java.lang.Iterable)]. 
This avoids the extra work of the search.
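
The idea can be sketched with a toy model. This is only an illustration 
under stated assumptions: ToyIndex, ReindexAwareWriter and the reindex flag 
are hypothetical names, not Lucene's IndexWriter API and not the actual 
DefaultIndexWriter code in Oak. The toy index counts how often the 
delete-by-key lookup runs, which stands in for the extra work done by 
updateDocument.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an index (hypothetical; NOT Lucene's IndexWriter). */
class ToyIndex {
    // Each entry is {path, body}; a list is used so that a plain add
    // never checks for an existing entry.
    private final List<String[]> docs = new ArrayList<>();

    // Number of delete-by-key lookups performed so far.
    long deleteLookups = 0;

    /** Appends the document without looking for an existing one. */
    void addDocument(String path, String body) {
        docs.add(new String[] {path, body});
    }

    /** Removes any document with the same path, then appends the new one. */
    void updateDocument(String path, String body) {
        deleteLookups++;
        docs.removeIf(d -> d[0].equals(path));
        docs.add(new String[] {path, body});
    }

    int size() {
        return docs.size();
    }
}

/** Chooses add or update depending on whether a reindex is in progress. */
class ReindexAwareWriter {
    private final ToyIndex index;
    private final boolean reindex; // assumed flag: true when starting from an empty index

    ReindexAwareWriter(ToyIndex index, boolean reindex) {
        this.index = index;
        this.reindex = reindex;
    }

    void write(String path, String body) {
        if (reindex) {
            // Empty index: no earlier version of the document can exist.
            index.addDocument(path, body);
        } else {
            // Incremental indexing: an earlier version may exist.
            index.updateDocument(path, body);
        }
    }
}

class ReindexDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ReindexAwareWriter writer = new ReindexAwareWriter(index, true);
        writer.write("/content/a", "first");
        writer.write("/content/b", "second");
        System.out.println("docs=" + index.size()
                + " deleteLookups=" + index.deleteLookups);
    }
}
```

The shortcut is only safe if each path is written at most once during the 
reindex. Calling addDocument twice for the same path would leave duplicate 
entries, which is why the update path is kept for incremental indexing.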

In a test where the index had ~70M entries, switching to addDocument reduced 
the reindexing time by 10 minutes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
