Hello,

I recently updated the Lucene version in one of our products from 8.3 to 8.8.x 
(8.8.2 as of now).
The update itself went smoothly (the code compiled without changes), but I 
noticed that our test suites now take much longer to finish.

So I took a closer look at one test case that showed a severe slowdown
(it runs small update, flush, search cycles in order to stress NRT; the 
purpose is to catch performance changes at an early stage 😉 ):

Lucene 8.3:   ~2.3 s
Lucene 8.8.x:  25 s

This is a huge difference, so I profiled both 8.3 and 8.8 with YourKit and 
compared the results.

The gap is caused by a different number of calls to
sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) 
WindowsNativeDispatcher.java (native)
8.3:  about 150 calls
8.8:  about 12500 calls

To track down the cause, I took a look at the open() in 
NRTDirectory.
There I could see that the number of calls to that open() is in the same 
ballpark for 8.3 and 8.8.

The difference is that in 8.3 nearly all files are available in the underlying 
RAMDirectory, while in 8.8 files are opened for reading that do not (yet) exist.
Each such open leads to a call to WindowsNativeDispatcher.CreateFile0.
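
To make the observation above concrete, here is a minimal, stdlib-only Java 
model of a RAM cache in front of the file system. It is not Lucene code: 
CachingOpener, cache(), openForRead() and the counter are hypothetical names 
chosen for illustration only. Under that assumption, it shows why a read for a 
file that is not in the in-memory cache always falls through to a real 
file-system open attempt, whether or not the file exists on disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a RAM cache in front of an on-disk directory (not Lucene API). */
final class CachingOpener {
    private final Path dir;
    private final Map<String, byte[]> ramCache = new HashMap<>();
    private int diskOpenAttempts = 0;

    CachingOpener(Path dir) {
        this.dir = dir;
    }

    /** Keeps a file in memory only, standing in for a freshly flushed, cached file. */
    void cache(String name, byte[] content) {
        ramCache.put(name, content);
    }

    int diskOpenAttempts() {
        return diskOpenAttempts;
    }

    /** Returns the content, or null if the file is neither cached nor on disk. */
    byte[] openForRead(String name) throws IOException {
        byte[] cached = ramCache.get(name);
        if (cached != null) {
            return cached; // cache hit: the file system is never touched
        }
        diskOpenAttempts++; // cache miss: one real open attempt on the file system
        try {
            return Files.readAllBytes(dir.resolve(name));
        } catch (NoSuchFileException e) {
            return null; // the open attempt was still made
        }
    }

    /** Runs three reads and returns how many of them reached the file system. */
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("cache-demo");
            Files.write(dir.resolve("_1.fdx"), new byte[] {1, 2, 3}); // on disk only
            CachingOpener opener = new CachingOpener(dir);
            opener.cache("_0.cfs", new byte[] {4, 5}); // in RAM only
            opener.openForRead("_0.cfs"); // hit
            opener.openForRead("_1.fdx"); // miss, file exists on disk
            opener.openForRead("_0.fdm"); // miss, file does not exist anywhere
            return opener.diskOpenAttempts();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("file-system open attempts: " + demo());
    }
}
```

In this model two of the three reads reach the file system: the one for a file 
that exists only on disk and the one for a file that does not exist at all. 
That matches the two cases shown in the stack traces at the end of this mail.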

At the end of this mail I have added two example stack traces that show this 
behavior.

Does anyone have an idea which change might cause this, or whether I need to do 
something differently in 8.8 compared to 8.3?


Thanks for any help,

Markus

Here is an example stack trace of such an attempted read access to a 
non-existing file:

Filename = _0.fdm    (IOContext is READ)   (I checked: the file was not yet in 
the directory on the hard disk, nor in the RAM directory of the NRTCacheDir)

openInput:100, FilterDirectory (org.apache.lucene.store)
openInput:100, FilterDirectory (org.apache.lucene.store)
openChecksumInput:157, Directory (org.apache.lucene.store)
finish:140, FieldsIndexWriter (org.apache.lucene.codecs.compressing)
finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing)
flush:81, StoredFieldsConsumer (org.apache.lucene.index)
flush:239, DefaultIndexingChain (org.apache.lucene.index)
flush:350, DocumentsWriterPerThread (org.apache.lucene.index)
doFlush:476, DocumentsWriter (org.apache.lucene.index)
flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
getReader:605, IndexWriter (org.apache.lucene.index)
doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index)
openIfChanged:235, DirectoryReader (org.apache.lucene.index)


As a consequence, later accesses to such files also find that the file is not 
in the RAMDirectory but only on the hard disk.
Example:

Filename = _1.fdx    (IOContext is READ)   (file is on the hard disk but not in the RAMDirectory)

openInput:100, FilterDirectory (org.apache.lucene.store)
openInput:100, FilterDirectory (org.apache.lucene.store)
openInput:100, FilterDirectory (org.apache.lucene.store)
openChecksumInput:157, Directory (org.apache.lucene.store)
write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)
createCompoundFile:5316, IndexWriter (org.apache.lucene.index)
sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)
flush:395, DocumentsWriterPerThread (org.apache.lucene.index)
doFlush:476, DocumentsWriter (org.apache.lucene.index)
flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)
getReader:605, IndexWriter (org.apache.lucene.index)
doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)
doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)
openIfChanged:235, DirectoryReader (org.apache.lucene.index)
