Hi again, I found the difference causing the slow down.
It's NRTCachingDirectory#doCacheWrite method. With the implementation of 8.8 it's slow. With the version of 8.3 it's fast. Hope it helps, Markus -----Original Message----- From: Gietzen, Markus <markus.giet...@softwareag.com.INVALID> Sent: Wednesday, 19 May 2021 13:55 To: Michael McCandless <luc...@mikemccandless.com>; Lucene Users <java-user@lucene.apache.org> Subject: RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) Hi, thanks for reaching me that fast! Your hint that there were changes to NRTCachingDirectory were the right point: I copied the 8.3 NRTCachingDirectory implementation into the project (with a different classname, you get the idea😉 ) and used that one. And believe it or not: everything is fine. Now 8.8 performs as fast as 8.3! I will check the differences and put them in step by step to find out which change causes the slow-down. I’ll report here. Bye, Markus From: Michael McCandless <luc...@mikemccandless.com> Sent: Wednesday, 19 May 2021 13:39 To: Lucene Users <java-user@lucene.apache.org>; Gietzen, Markus <markus.giet...@softwareag.com> Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0) > The update showed no issues (e.g. compiled without changes) but I noticed > that our test-suites take a lot longer to finish. Hmm, that sounds bad. We need our tests to stay fast but also do a good job testing things ;) Does your production usage also slow down? Tests do other interesting things, e.g. enable assertions, randomly swap in different codecs, different Directory impls (if you are using Lucene's randomized test infrastructure<http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>), etc. > While in 8.8 files are opened for reading that do not (yet) exist. That is incredibly strange! Lucene should never do that (opening a non-existing file for "read", causing that file to become created through Windows CreateFile), that I am aware of. Can you share the full source code of the test case? Try to eliminate parts of the test maybe? Do you see the same slowdown if you don't use NRTCachingDirectory at all? (Just straight to MMapDirectory.) There were some recent fixes to NRTCachineDirectory, e.g. https://issues.apache.org/jira/browse/LUCENE-9115 and https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are related? Mike McCandless http://blog.mikemccandless.com On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus <markus.giet...@softwareag.com.invalid<mailto:markus.giet...@softwareag.com.invalid>> wrote: Hello, recently I updated the Lucene version in one of our products from 8.3 to 8.8.x (8.8.2 as of now). The update showed no issues (e.g. compiled without changes) but I noticed that our test-suites take a lot longer to finish. So I took a closer look at one test-case which showed a severe slowdown (it’s doing small update, flush, search cycles in order to stress NRT; the purpose is to see performance-changes in an early stage 😉 ): Lucene 8.3: ~2,3s Lucene 8.8.x: 25s This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and do a comparison. The gap is caused by different amount of calls to sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) WindowsNativeDispatcher.java (native) 8.3: about 150 calls 8.8: about 12500 calls In order to hunt down what is causing this, I took a look at the open() in NRTDirectory. Here I could see that the amount of calls to that open is in the same ballpark for 8.3 and 8.8 The difference is that in 8.3 nearly all files are available in the underlying RAMDirectory. While in 8.8 files are opened for reading that do not (yet) exist. This leads to a call to the WindowsNativeDispatcher.CreateFile0 Add the end of the mail I added two example-stacktraces that show this behavior. Has someone an idea what change might cause this or if I need to do something different in 8.8 compared to 8.3? Thanks for any help, Markus Here is an example stacktrace that is causing such a try of a read-access to non-existing file: Filename= _0.fdm (IOContext is READ) (I checked the directory on harddisk: it did not yet contain it nor in RAM-directory of the NRTCacheDir) openInput:100, FilterDirectory (org.apache.lucene.store) openInput:100, FilterDirectory (org.apache.lucene.store) openChecksumInput:157, Directory (org.apache.lucene.store) finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing) finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing) flush:81, StoredFieldsConsumer (org.apache.lucene.index) flush:239, DefaultIndexingChain (org.apache.lucene.index) flush:350, DocumentsWriterPerThread (org.apache.lucene.index) doFlush:476, DocumentsWriter (org.apache.lucene.index) flushAllThreads:656, DocumentsWriter (org.apache.lucene.index) getReader:605, IndexWriter (org.apache.lucene.index) doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index) openIfChanged:235, DirectoryReader (org.apache.lucene.index) In a consequence later accesses to such files also lead to the state that the file is not within the RAMDirectory but only on harddisk. Example: Filename _1.fdx Context = READ (file is on harddisk but not in RAMDirectory) openInput:100, FilterDirectory (org.apache.lucene.store) openInput:100, FilterDirectory (org.apache.lucene.store) openInput:100, FilterDirectory (org.apache.lucene.store) openChecksumInput:157, Directory (org.apache.lucene.store) write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50) createCompoundFile:5316, IndexWriter (org.apache.lucene.index) sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index) flush:395, DocumentsWriterPerThread (org.apache.lucene.index) doFlush:476, DocumentsWriter (org.apache.lucene.index) flushAllThreads:656, DocumentsWriter (org.apache.lucene.index) getReader:605, IndexWriter (org.apache.lucene.index) doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index) doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index) openIfChanged:235, DirectoryReader (org.apache.lucene.index) Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr. Elke Frank, Dr. Matthias Heiden, Dr. Stefan Sigg - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Karl-Heinz Streibich - http://www.softwareag.com --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org