[ 
https://issues.apache.org/jira/browse/LUCENE-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171858#comment-13171858
 ] 

Gerrit Jansen van Vuuren commented on LUCENE-3653:
--------------------------------------------------

OK, so for searching (after loading), the synchronized RAMFile methods would 
not be called. Now bear with me for a moment:

This does not seem to be the case when you retrieve the Document, which is 
probably to be expected.
In my code I do:

searcher.doc(matched[i].doc) // searcher is an IndexSearcher, matched is a ScoreDoc[]

In my code I see the following call trace (Top to Bottom):


IndexSearcher.doc
IndexReader.document
DirectoryReader.document
SegmentReader.document
FieldsReader.doc
RAMInputStream.seek
RAMInputStream.switchCurrentBuffer
RAMFile.getBuffer


This means I can search concurrently, but as soon as I try to retrieve a 
document I hit contention again. 
I appreciate that with file I/O this is required, but a fully in-memory index 
should not have these problems. I'm trying to change the RAMFile usage so that 
it does not require synchronization.
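
To make the idea concrete, here is a minimal sketch of a buffer holder that 
needs no locking once the index is fully loaded. It is not Lucene's RAMFile: 
the class name FrozenBufferFile and its constructor are hypothetical, and it 
assumes the buffers are never written to after construction.

```java
import java.util.List;

/** Hypothetical stand-in for a fully loaded in-memory file; not Lucene's RAMFile. */
final class FrozenBufferFile {
    // Copied once in the constructor and published through final fields.
    // Because nothing mutates the array afterwards, readers see a
    // consistent view without taking a lock.
    private final byte[][] buffers;
    private final long length;

    FrozenBufferFile(List<byte[]> loaded, long length) {
        // Snapshot the list so later changes to 'loaded' do not leak in.
        this.buffers = loaded.toArray(new byte[0][]);
        this.length = length;
    }

    // Plain array read: no synchronized keyword, so concurrent readers
    // never block each other here.
    byte[] getBuffer(int index) {
        return buffers[index];
    }

    int numBuffers() {
        return buffers.length;
    }

    long getLength() {
        return length;
    }
}
```

This only works for the read-only case. As soon as a file can grow while 
readers are active, some form of safe publication (or a lock) is needed again.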

> Lucene Search not scalling
> --------------------------
>
>                 Key: LUCENE-3653
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3653
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Gerrit Jansen van Vuuren
>         Attachments: App.java, 
> LUCENE-3653-VirtualMethod+AttributeSource.patch, 
> LUCENE-3653-VirtualMethod+AttributeSource.patch, lucene-unsync.diff, 
> profile_1_a.png, profile_1_b.png, profile_1_c.png, profile_1_d.png, 
> profile_2_a.png, profile_2_b.png, profile_2_c.png
>
>
> I've noticed that when doing thousands of searches in a single thread the 
> average time is quite low i.e. a few milliseconds. When adding more 
> concurrent searches doing exactly the same search the average time increases 
> drastically. 
> I've profiled the search classes and found that the whole of lucene blocks on 
> org.apache.lucene.index.SegmentCoreReaders.getTermsReader
> org.apache.lucene.util.VirtualMethod
>   public synchronized int getImplementationDistance 
> org.apache.lucene.util.AttributeSource.getAttributeInterfaces
> These cause search times to increase from a few milliseconds to up to 2 
> seconds when doing 500 concurrent searches on the same in-memory index. Note 
> that the index is not being updated at all, so no refresh methods are called 
> at any stage.
> Some questions:
>   Why do we need synchronization here?
>   There must be a non-lockable solution for these, they basically cause 
> lucene to be ok for single thread applications but disastrous for any 
> concurrent implementation.
> I'll do some experiments by removing the synchronization from the methods of 
> these classes.
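
One common way to drop a synchronized keyword from a cached lookup is to back 
the cache with a ConcurrentHashMap. The sketch below illustrates that pattern 
only. ClassDepthCache and its depth computation are hypothetical and are not 
the code in VirtualMethod or AttributeSource.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical per-class cache; illustrates the pattern, not Lucene's code. */
final class ClassDepthCache {
    private final ConcurrentMap<Class<?>, Integer> cache = new ConcurrentHashMap<>();

    /** Returns the number of superclasses between clazz and Object. */
    int depth(Class<?> clazz) {
        // Cache hit: a plain concurrent read, no monitor is acquired.
        Integer cached = cache.get(clazz);
        if (cached != null) {
            return cached;
        }
        // Cache miss: two threads may both compute the value, which is
        // harmless here because the result is deterministic.
        int computed = computeDepth(clazz);
        Integer raced = cache.putIfAbsent(clazz, computed);
        return raced != null ? raced : computed;
    }

    private static int computeDepth(Class<?> clazz) {
        int steps = 0;
        for (Class<?> c = clazz.getSuperclass(); c != null; c = c.getSuperclass()) {
            steps++;
        }
        return steps;
    }
}
```

Note that this map holds strong references to its Class keys, so it can keep 
classes (and their class loaders) from being unloaded. Whether that trade-off 
is acceptable depends on the application.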
