[
https://issues.apache.org/jira/browse/LUCENE-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965678#comment-15965678
]
Steve Mason commented on LUCENE-7778:
-------------------------------------
Thanks for suggesting {{MMapDirectory}} with
{{MMapDirectory.setPreload(true)}}. I've tried this out with a locally patched
version of Luwak and performance suffers massively - it's at least 20% slower
than RAMDirectory in its current form and 80% slower than RAMDirectory with my
changes applied.

From looking at stacks to see what it's doing, it seems to be spending most of
its time writing to disk or doing direct memory access. As Luwak is all about
on-the-fly matching of documents to queries, writing to disk or going outside
of the JVM seems redundant, because we only need the index for less than a
second before we're on to matching the next set of documents.

It seems that RAMDirectory fits this use-case pretty well - it's accurate and
performs decently (better after these changes). We aren't using it for massive
in-memory indexes (I'm using batches from 10 to 200 documents) and the lifetime
of them is very short so GC issues aren't a concern.

I've noticed that my assertion in the description that RAMFile doesn't lock
during writes is wrong - {{addBuffer}} has a {{synchronized(this)}} block in it.
Before I change my patch to use a more complicated ReadWriteLock arrangement,
are there any further objections to fixing this issue?
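
For illustration, here is a minimal sketch of the kind of ReadWriteLock
arrangement I have in mind. It is not the actual RAMFile code or the patch: the
class name BufferList is hypothetical, and only the method names {{addBuffer}},
{{getBuffer}} and {{numBuffers}} are taken from the discussion. It assumes the
buffers live in a plain list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for RAMFile's buffer bookkeeping; not Lucene code.
class BufferList {
  private final List<byte[]> buffers = new ArrayList<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  // Writers take the exclusive lock, so appends never interleave with reads.
  byte[] addBuffer(int size) {
    byte[] buffer = new byte[size];
    lock.writeLock().lock();
    try {
      buffers.add(buffer);
    } finally {
      lock.writeLock().unlock();
    }
    return buffer;
  }

  // Readers share the read lock, so concurrent reads do not block each other.
  byte[] getBuffer(int index) {
    lock.readLock().lock();
    try {
      return buffers.get(index);
    } finally {
      lock.readLock().unlock();
    }
  }

  int numBuffers() {
    lock.readLock().lock();
    try {
      return buffers.size();
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

A read-write lock has overhead of its own, so whether this actually beats plain
{{synchronized}} under read-heavy load would still need to be benchmarked.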
> Remove synchronized from high-contention methods on RAMFile
> -----------------------------------------------------------
>
> Key: LUCENE-7778
> URL: https://issues.apache.org/jira/browse/LUCENE-7778
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/store
> Reporter: Steve Mason
> Priority: Minor
>
> When benchmarking RAMDirectory access via multiple threads the methods
> {{RAMFile::numBuffers}} and {{RAMFile::getBuffer}} show up blocking threads
> fairly frequently.
> By removing the {{synchronized}} keyword from these methods our internal
> benchmarks show a 2x performance increase under concurrent load.
> I don't think removing {{synchronized}} from these methods is a problem as
> they are read-only and write access to these fields is not synchronized.
> LUCENE-2779 also implies that some of the locking on RAMDirectory is not
> necessary.