[
https://issues.apache.org/jira/browse/LUCENE-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171858#comment-13171858
]
Gerrit Jansen van Vuuren commented on LUCENE-3653:
--------------------------------------------------
OK, so for searching (after loading) the synchronized RAMFile methods would not
be called. Now bear with me for a moment:
This does not seem to be the case when you retrieve the Document, which is
probably to be expected.
In my code I do:
searcher.doc(matched[i].doc) // searcher is an IndexSearcher, matched is a ScoreDoc[]
In my code I see the following call trace (Top to Bottom):
IndexSearcher.doc
IndexReader.document
DirectoryReader.document
SegmentReader.document
FieldsReader.doc
RAMInputStream.seek
RAMInputStream.switchCurrentBuffer
RAMFile.getBuffer
This means I can search concurrently, but as soon as I try to retrieve a
document I hit contention again.
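
To make the contention point concrete, here is a minimal JDK-only sketch of the
pattern in the trace above. SyncBufferFile, SyncBufferReader and
ContentionSketch are hypothetical names for illustration, not Lucene classes;
the only assumption taken from the trace is that every seek ends in a
synchronized getBuffer call on a file object shared by all reader threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a RAMFile-style buffer list; not Lucene code. */
class SyncBufferFile {
    static final int BUFFER_SIZE = 1024;
    private final List<byte[]> buffers = new ArrayList<byte[]>();

    synchronized void addBuffer(byte[] b) { buffers.add(b); }

    // Every reader thread has to take the same monitor to fetch a buffer.
    synchronized byte[] getBuffer(int index) { return buffers.get(index); }
}

/** Hypothetical per-thread reader: each seek resolves its buffer through the shared file. */
class SyncBufferReader {
    private final SyncBufferFile file;
    private byte[] current;
    private int offset;

    SyncBufferReader(SyncBufferFile file) { this.file = file; }

    void seek(long pos) {
        // The shared lock is acquired here, on every seek.
        current = file.getBuffer((int) (pos / SyncBufferFile.BUFFER_SIZE));
        offset = (int) (pos % SyncBufferFile.BUFFER_SIZE);
    }

    byte readByte() { return current[offset++]; }
}

class ContentionSketch {
    static final int NUM_BUFFERS = 4;

    /** Runs `threads` readers doing `seeks` seek+read calls each; returns the sum of all bytes read. */
    static long run(int threads, int seeks) {
        final SyncBufferFile file = new SyncBufferFile();
        for (int i = 0; i < NUM_BUFFERS; i++) {
            byte[] b = new byte[SyncBufferFile.BUFFER_SIZE];
            Arrays.fill(b, (byte) (i + 1)); // buffer i holds the value i + 1
            file.addBuffer(b);
        }
        final AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                SyncBufferReader reader = new SyncBufferReader(file);
                long sum = 0;
                for (int k = 0; k < seeks; k++) {
                    long pos = (long) (k % NUM_BUFFERS) * SyncBufferFile.BUFFER_SIZE
                            + (k % SyncBufferFile.BUFFER_SIZE);
                    reader.seek(pos);
                    sum += reader.readByte();
                }
                total.addAndGet(sum);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("readers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long sum = run(8, 400);
        long micros = (System.nanoTime() - start) / 1000;
        System.out.println("sum=" + sum + " elapsed=" + micros + "us");
    }
}
```

The readers never modify the file, yet they all queue on the same monitor. How
much that costs in practice depends on the JVM and the thread count, so the
sketch only shows where the lock sits; it does not measure anything.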
Now I appreciate that with file I/O this is required, but a fully in-memory
index should not have these problems. I'm trying to change the RAMFile usage so
that it does not require synchronization.
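
One way such a change could look, as a sketch only: if the index is never
written after loading, the buffer list can be frozen into a final array and read
without any lock. FrozenBufferFile and FrozenBufferDemo are hypothetical names
and this is not the attached patch; it assumes that no buffer is added or
modified once the object has been constructed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

/** Hypothetical read-only buffer holder; not Lucene code. */
final class FrozenBufferFile {
    // A final field assigned in the constructor is safely visible to other
    // threads once construction completes, so readers need no lock.
    private final byte[][] buffers;

    FrozenBufferFile(List<byte[]> source) {
        // Copies the list of references; the byte arrays themselves are shared
        // and must not be modified after this point.
        this.buffers = source.toArray(new byte[0][]);
    }

    byte[] getBuffer(int index) { return buffers[index]; } // unsynchronized

    int numBuffers() { return buffers.length; }
}

class FrozenBufferDemo {
    /** Builds a sample file with 3 buffers of 4 bytes each, filled with 1, 2 and 3. */
    static FrozenBufferFile sampleFile() {
        List<byte[]> source = new ArrayList<byte[]>();
        for (int i = 0; i < 3; i++) {
            byte[] b = new byte[4];
            Arrays.fill(b, (byte) (i + 1));
            source.add(b);
        }
        return new FrozenBufferFile(source);
    }

    /** Lets `readers` parallel tasks each sum every byte of the file; returns the grand total. */
    static long parallelSum(FrozenBufferFile file, int readers) {
        return IntStream.range(0, readers).parallel().mapToLong(r -> {
            long sum = 0;
            for (int i = 0; i < file.numBuffers(); i++) {
                for (byte b : file.getBuffer(i)) {
                    sum += b;
                }
            }
            return sum;
        }).sum();
    }

    public static void main(String[] args) {
        System.out.println("total=" + parallelSum(sampleFile(), 5));
    }
}
```

The trade-off is that the file can no longer grow, so this only fits an index
that is loaded once and then searched, which is the situation described here.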
> Lucene Search not scalling
> --------------------------
>
> Key: LUCENE-3653
> URL: https://issues.apache.org/jira/browse/LUCENE-3653
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Gerrit Jansen van Vuuren
> Attachments: App.java,
> LUCENE-3653-VirtualMethod+AttributeSource.patch,
> LUCENE-3653-VirtualMethod+AttributeSource.patch, lucene-unsync.diff,
> profile_1_a.png, profile_1_b.png, profile_1_c.png, profile_1_d.png,
> profile_2_a.png, profile_2_b.png, profile_2_c.png
>
>
> I've noticed that when doing thousands of searches in a single thread the
> average time is quite low i.e. a few milliseconds. When adding more
> concurrent searches doing exactly the same search the average time increases
> drastically.
> I've profiled the search classes and found that the whole of Lucene blocks on
> org.apache.lucene.index.SegmentCoreReaders.getTermsReader
> org.apache.lucene.util.VirtualMethod
> public synchronized int getImplementationDistance
> org.apache.lucene.util.AttributeSource.getAttributeInterfaces
> These cause search times to increase from a few milliseconds to up to 2
> seconds when doing 500 concurrent searches on the same in-memory index. Note
> that the index is not being updated at all, so no refresh methods are called
> at any stage.
> Some questions:
> Why do we need synchronization here?
> There must be a non-locking solution for these; they basically cause
> Lucene to be OK for single-threaded applications but disastrous for any
> concurrent implementation.
> I'll do some experiments by removing the synchronization from the methods of
> these classes.