I do update the entire document each time. Furthermore, this sometimes means deleting compressed archives, which are stored as multiple documents per archive file, and re-adding them.
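A minimal sketch of that delete-and-re-add flow, assuming Lucene 5.x; the "archivePath" key field and the buildDocumentsFor helper are hypothetical names for illustration, not from the original code:

import java.io.IOException;
import java.util.Collections;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class ArchiveReindexer {

    static void reindexArchive(IndexWriter writer, String archivePath)
            throws IOException {
        // One compressed archive is stored as several documents, so delete
        // them all by a shared key field. "archivePath" must be indexed
        // (e.g. as a StringField) for the delete-by-term to match.
        writer.deleteDocuments(new Term("archivePath", archivePath));

        // Re-add one document per archive member.
        for (Document doc : buildDocumentsFor(archivePath)) {
            writer.addDocument(doc);
        }
    }

    // Hypothetical helper: extract the archive and build one Document per
    // member file. Stubbed so the sketch compiles on its own.
    static List<Document> buildDocumentsFor(String archivePath) {
        return Collections.emptyList();
    }
}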
Is there an update method, and does it perform better than remove-then-add? I was simply removing modified files from the index (which doesn't seem to take long) and re-adding them. (A sketch of the update API follows the quoted thread below.)

On Tue, May 9, 2017 at 9:33 AM Rob Audenaerde <rob.audenae...@gmail.com> wrote:
> Do you update each entire document? (vs. updating numeric docvalues?)
>
> That is implemented as 'delete and add', so I guess that will be slower
> than clean-sheet indexing. Not sure if it is 3x slower, though; that seems
> a bit much.
>
> On Tue, May 9, 2017 at 3:24 PM, Kudrettin Güleryüz <kudret...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > For a 5.2.1 index that contains around 1.2 million documents, updating
> > the index with 1.3 million files seems to take 3X longer than indexing
> > from scratch. (Files are crawled over NFS; indexes are stored locally on
> > a mechanical disk (Btrfs).)
> >
> > Is this expected for Lucene's update-index logic, or should I further
> > debug my part of the code for update performance?
> >
> > Thank you,
> > Kudret
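For reference, Lucene's IndexWriter does expose updateDocument(Term, Document); per its javadoc it is an atomic delete-then-add keyed on the term, so it is unlikely to be faster than an explicit remove and re-add, just safer. For the numeric-docvalues case Rob mentions, 5.x also has updateNumericDocValue, which rewrites only the per-document value without reindexing the document. A minimal sketch, again assuming a hypothetical unique "path" key field:

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateSketch {

    // Atomic delete-then-add: roughly equivalent in cost to
    // deleteDocuments(term) followed by addDocument(doc).
    static void updateFile(IndexWriter writer, String path, String contents)
            throws IOException {
        Document doc = new Document();
        // "path" is a hypothetical unique-key field; it must be indexed
        // (StringField) so the delete-by-term can find the old document.
        doc.add(new StringField("path", path, Field.Store.YES));
        doc.add(new TextField("contents", contents, Field.Store.NO));
        // Store mtime as docvalues so it can later be updated in place.
        doc.add(new NumericDocValuesField("mtime", 0L));
        writer.updateDocument(new Term("path", path), doc);
    }

    // In-place docvalues update: rewrites only the numeric value for the
    // matching document(s); the rest of the document is untouched. Much
    // cheaper when only metadata such as a modification time changes.
    static void touchMtime(IndexWriter writer, String path, long mtime)
            throws IOException {
        writer.updateNumericDocValue(new Term("path", path), "mtime", mtime);
    }
}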