Hi again,

I found the difference causing the slow down.

It's  NRTCachingDirectory#doCacheWrite  method.

With the implementation of 8.8 it's slow. With the version of 8.3 it's fast.  

Hope it helps,

Markus


-----Original Message-----
From: Gietzen, Markus <markus.giet...@softwareag.com.INVALID> 
Sent: Wednesday, 19 May 2021 13:55
To: Michael McCandless <luc...@mikemccandless.com>; Lucene Users 
<java-user@lucene.apache.org>
Subject: RE: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

Hi,

thanks for reaching me that fast!

Your hint that there were changes to NRTCachingDirectory were the right point:

I copied the 8.3 NRTCachingDirectory implementation into the project (with a 
different classname, you get the idea😉 ) and used that one.

And believe it or not: everything is fine. Now 8.8 performs as fast as 8.3!

I will check the differences and put them in step by step to find out which 
change causes the slow-down.
I’ll report here.

Bye,

Markus


From: Michael McCandless <luc...@mikemccandless.com>
Sent: Wednesday, 19 May 2021 13:39
To: Lucene Users <java-user@lucene.apache.org>; Gietzen, Markus 
<markus.giet...@softwareag.com>
Subject: Re: Performance decrease with NRT use-case in 8.8.x (coming from 8.3.0)

> The update showed no issues (e.g. compiled without changes) but I noticed 
> that our test-suites take a lot longer to finish.

Hmm, that sounds bad.  We need our tests to stay fast but also do a good job 
testing things ;)

Does your production usage also slow down?  Tests do other interesting things, 
e.g. enable assertions, randomly swap in different codecs, different Directory 
impls (if you are using Lucene's randomized test 
infrastructure<http://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html>),
 etc.

> While in 8.8 files are opened for reading that do not (yet) exist.

That is incredibly strange!  Lucene should never do that (opening a 
non-existing file for "read", causing that file to become created through 
Windows CreateFile), that I am aware of.  Can you share the full source code of 
the test case?

Try to eliminate parts of the test maybe?  Do you see the same slowdown if you 
don't use NRTCachingDirectory at all? (Just straight to MMapDirectory.)  There 
were some recent fixes to NRTCachineDirectory, e.g. 
https://issues.apache.org/jira/browse/LUCENE-9115 and 
https://issues.apache.org/jira/browse/LUCENE-8663 -- maybe they are related?

Mike McCandless

http://blog.mikemccandless.com


On Wed, May 19, 2021 at 7:23 AM Gietzen, Markus 
<markus.giet...@softwareag.com.invalid<mailto:markus.giet...@softwareag.com.invalid>>
 wrote:
Hello,

recently I updated the Lucene version in one of our products from 8.3 to 8.8.x 
(8.8.2 as of now).
The update showed no issues (e.g. compiled without changes) but I noticed that 
our test-suites take a lot longer to finish.

So I took a closer look at one test-case which showed a severe slowdown (it’s 
doing small update, flush, search  cycles in order to stress NRT; the purpose 
is to see performance-changes in an early stage 😉 ):

Lucene 8.3:   ~2,3s
Lucene 8.8.x:  25s

This is a huge difference. Therefore I used YourKit to profile 8.3 and 8.8 and 
do a comparison.

The gap is caused by different amount of calls to  
sun.nio.fs.WindowsNativeDispatcher.CreateFile0(long, int, int, long, int, int) 
WindowsNativeDispatcher.java (native)
8.3:  about 150 calls
8.8:  about 12500 calls

In order to hunt down what is causing this, I took a look at the open() in 
NRTDirectory.
Here I could see that the amount of calls to that open is in the same ballpark 
for 8.3 and 8.8

The difference is that in 8.3 nearly all files are available in the underlying 
RAMDirectory. While in 8.8 files are opened for reading that do not (yet) exist.
This leads to a call to the WindowsNativeDispatcher.CreateFile0

Add the end of the mail I added two example-stacktraces that show this behavior.

Has someone an idea what change might cause this or if I need to do something 
different in 8.8 compared to 8.3?


Thanks for any help,

Markus

Here is an example stacktrace that is causing such a try of a read-access to 
non-existing file:

Filename= _0.fdm    (IOContext is READ)   (I checked the directory on harddisk: 
it did not yet contain it nor in RAM-directory of the NRTCacheDir)

openInput:100, FilterDirectory (org.apache.lucene.store) openInput:100, 
FilterDirectory (org.apache.lucene.store) openChecksumInput:157, Directory 
(org.apache.lucene.store) finish:140, FieldsIndexWriter 
(org.apache.lucene.codecs.compressing)
finish:480, CompressingStoredFieldsWriter (org.apache.lucene.codecs.compressing)
flush:81, StoredFieldsConsumer (org.apache.lucene.index) flush:239, 
DefaultIndexingChain (org.apache.lucene.index) flush:350, 
DocumentsWriterPerThread (org.apache.lucene.index) doFlush:476, DocumentsWriter 
(org.apache.lucene.index) flushAllThreads:656, DocumentsWriter 
(org.apache.lucene.index) getReader:605, IndexWriter (org.apache.lucene.index) 
doOpenIfChanged:277, StandardDirectoryReader (org.apache.lucene.index) 
openIfChanged:235, DirectoryReader (org.apache.lucene.index)


In a consequence later accesses to such files also lead to the state that the 
file is not within the RAMDirectory but only on harddisk.
Example:



Filename _1.fdx  Context = READ   (file is on harddisk but not in RAMDirectory)



openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openInput:100, FilterDirectory (org.apache.lucene.store)

openChecksumInput:157, Directory (org.apache.lucene.store)

write:90, Lucene50CompoundFormat (org.apache.lucene.codecs.lucene50)

createCompoundFile:5316, IndexWriter (org.apache.lucene.index)

sealFlushedSegment:457, DocumentsWriterPerThread (org.apache.lucene.index)

flush:395, DocumentsWriterPerThread (org.apache.lucene.index)

doFlush:476, DocumentsWriter (org.apache.lucene.index)

flushAllThreads:656, DocumentsWriter (org.apache.lucene.index)

getReader:605, IndexWriter (org.apache.lucene.index)

doOpenFromWriter:290, StandardDirectoryReader (org.apache.lucene.index)

doOpenIfChanged:275, StandardDirectoryReader (org.apache.lucene.index)

openIfChanged:235, DirectoryReader (org.apache.lucene.index)

Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany 
– Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management 
Board: Sanjay Brahmawar (Vorsitzender/Chairman), Dr. Elke Frank, Dr. Matthias 
Heiden, Dr. Stefan Sigg - Aufsichtsratsvorsitzender/Chairman of the Supervisory 
Board: Karl-Heinz Streibich - http://www.softwareag.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to