[ 
https://issues.apache.org/jira/browse/LUCENE-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965678#comment-15965678
 ] 

Steve Mason commented on LUCENE-7778:
-------------------------------------

Thanks for suggesting {{MMapDirectory}} with 
{{MMapDirectory.setPreload(true)}}.  I've tried this out with a locally patched 
version of Luwak and performance suffers massively - it's at least 20% slower 
than RAMDirectory in its current form and 80% slower than RAMDirectory with my 
changes applied.

From looking at stack traces, it seems to spend most of its time writing to 
disk or doing direct memory access.  As Luwak is all about on-the-fly matching 
of documents to queries, writing to disk or going outside of the JVM seems 
redundant because we only need the index for less than a second before we're 
on to matching the next set of documents.

It seems that RAMDirectory fits this use-case pretty well - it's accurate and 
performs decently (better after these changes).  We aren't using it for massive 
in-memory indexes (I'm using batches from 10 to 200 documents) and the lifetime 
of them is very short so GC issues aren't a concern.

I've noticed that my assertion in the description that RAMFile doesn't lock 
during writes is wrong - {{addBuffer}} has a {{synchronized(this)}} block in it.  
Before I change my patch to use a more complicated ReadWriteLock arrangement, 
are there any further objections to fixing this issue?
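
To make the ReadWriteLock idea concrete, here is a minimal sketch of what such 
an arrangement might look like.  {{LockedBufferFile}} is a hypothetical 
stand-in written only for illustration; it is not the actual RAMFile code or 
the patch, and the method names merely mirror {{addBuffer}}, {{getBuffer}} and 
{{numBuffers}}.  Whether it actually outperforms plain {{synchronized}} under 
concurrent load would still need to be benchmarked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a RAMFile-like buffer list; NOT Lucene's actual class.
final class LockedBufferFile {
    private final List<byte[]> buffers = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive write lock, so an append never overlaps a read.
    byte[] addBuffer(int size) {
        byte[] buffer = new byte[size];
        lock.writeLock().lock();
        try {
            buffers.add(buffer);
        } finally {
            lock.writeLock().unlock();
        }
        return buffer;
    }

    // Readers share the read lock, so concurrent lookups do not block each other.
    byte[] getBuffer(int index) {
        lock.readLock().lock();
        try {
            return buffers.get(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Also a read-only operation, so it only needs the shared read lock.
    int numBuffers() {
        lock.readLock().lock();
        try {
            return buffers.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, many threads can call {{getBuffer}} and {{numBuffers}} at the 
same time, and they only wait while an {{addBuffer}} call holds the write lock.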

> Remove synchronized from high-contention methods on RAMFile
> -----------------------------------------------------------
>
>                 Key: LUCENE-7778
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7778
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Steve Mason
>            Priority: Minor
>
> When benchmarking RAMDirectory access via multiple threads the methods 
> {{RAMFile::numBuffers}} and {{RAMFile::getBuffer}} show up blocking threads 
> fairly frequently.
> By removing the {{synchronized}} keyword from these methods our internal 
> benchmarks show a 2x performance increase under concurrent load.
> I don't think removing {{synchronized}} from these methods is a problem as 
> they are read-only and write access to these fields is not synchronized.  
> LUCENE-2779 also implies that some of the locking on RAMDirectory is not 
> necessary.


