As far as I know, the updateDocument method on IndexWriter also does a delete followed by an add. See also the javadoc:
[..]
Updates a document by first deleting the document(s) containing term and then adding the new document. The delete and then add are atomic as seen by a reader on the same index (flush may happen only after the add).
[..]

On Tue, May 9, 2017 at 3:37 PM, Kudrettin Güleryüz <kudret...@gmail.com> wrote:

> I do update the entire document each time. Furthermore, this sometimes
> means deleting compressed archives which are stored as multiple documents
> for each compressed archive file and readding them.
>
> Is there an update method, is it better performance than remove then add? I
> was simply removing modified files from the index (which doesn't seem to
> take long), and readd them.
>
> On Tue, May 9, 2017 at 9:33 AM Rob Audenaerde <rob.audenae...@gmail.com>
> wrote:
>
> > Do you update each entire document? (vs updating numeric docvalues?)
> >
> > That is implemented as 'delete and add' so I guess that will be slower than
> > clean sheet indexing. Not sure if it is 3x slower, that seems a bit much?
> >
> > On Tue, May 9, 2017 at 3:24 PM, Kudrettin Güleryüz <kudret...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > For a 5.2.1 index that contains around 1.2 million documents, updating the
> > > index with 1.3 million files seems to take 3X longer than doing a scratch
> > > indexing. (Files are crawled over NFS, indexes are stored on a mechanical
> > > disk locally (Btrfs))
> > >
> > > Is this expected for Lucene's update index logic, or should I further debug
> > > my part of the code for update performance?
> > >
> > > Thank you,
> > > Kudret
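
To make the delete-then-add semantics from the javadoc quoted above concrete, here is a toy in-memory model in plain Java. It is not Lucene code and says nothing about Lucene's internals or performance: `ToyIndex`, its methods, and the `path` and `body` field names are all hypothetical and only mimic the observable behavior, namely that an update by term removes every document containing that term before the new one is added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of delete-then-add update semantics. NOT Lucene code.
class ToyIndex {
    // Each "document" is a simple map of field name to value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Hypothetical helper: builds a document from alternating field/value pairs.
    static Map<String, String> doc(String... fieldValuePairs) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i + 1 < fieldValuePairs.length; i += 2) {
            d.put(fieldValuePairs[i], fieldValuePairs[i + 1]);
        }
        return d;
    }

    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    // Removes every document whose field equals value; returns how many were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // Models an update by term: delete all matching documents, then add the new one.
    void updateDocument(String field, String value, Map<String, String> newDoc) {
        deleteDocuments(field, value);
        addDocument(newDoc);
    }

    // Counts documents whose field equals value.
    int count(String field, String value) {
        int n = 0;
        for (Map<String, String> d : docs) {
            if (value.equals(d.get(field))) {
                n++;
            }
        }
        return n;
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        // An archive stored as two documents sharing the same path, plus one plain file.
        index.addDocument(doc("path", "a.zip", "body", "entry 1"));
        index.addDocument(doc("path", "a.zip", "body", "entry 2"));
        index.addDocument(doc("path", "b.txt", "body", "hello"));

        // Updating by path replaces BOTH old archive documents with the single new one.
        index.updateDocument("path", "a.zip", doc("path", "a.zip", "body", "new entry"));

        System.out.println(index.count("path", "a.zip")); // prints 1
        System.out.println(index.size());                 // prints 2
    }
}
```

In Lucene itself the corresponding call would be along the lines of `writer.updateDocument(new Term("path", path), doc)`, assuming each file is indexed with a unique, untokenized `path` field (an assumption on my part, not something stated in this thread). For an archive that is stored as several documents, `IndexWriter.updateDocuments(Term, docs)` takes the whole group of replacement documents in one call.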