[ 
https://issues.apache.org/jira/browse/LUCENE-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1609:
---------------------------------------

    Attachment: LUCENE-1609.patch

Attached patch.  This addresses this issue and LUCENE-1718.

I added two new expert static IndexReader.open methods that let you
pass in the TermInfos index divisor.  You can pass in -1 to disable
loading of the index entirely (e.g., IndexWriter does this when merging
segments).  I also added the param to IndexWriter.getReader, so you
can get an NRT reader with subsampled index terms.
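
To illustrate what the divisor does, here is a minimal standalone
sketch.  It is not Lucene code: the class SubsampledTermsIndex and its
methods are hypothetical names made up for this example, and it assumes
the divisor simply keeps every Nth index term, with -1 meaning "load
nothing".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; this is not Lucene's TermInfosReader. */
final class SubsampledTermsIndex {
    // null means the index was deliberately not loaded (divisor == -1)
    private final List<String> indexTerms;

    /** allIndexTerms must be sorted; divisor is -1 or >= 1. */
    SubsampledTermsIndex(List<String> allIndexTerms, int divisor) {
        if (divisor == -1) {
            indexTerms = null;
        } else if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be -1 or >= 1, got " + divisor);
        } else {
            List<String> kept = new ArrayList<String>();
            // keep every divisor'th term: less RAM, longer scan per lookup
            for (int i = 0; i < allIndexTerms.size(); i += divisor) {
                kept.add(allIndexTerms.get(i));
            }
            indexTerms = Collections.unmodifiableList(kept);
        }
    }

    boolean isIndexLoaded() {
        return indexTerms != null;
    }

    int size() {
        ensureLoaded();
        return indexTerms.size();
    }

    /** Position of the greatest loaded index term <= target, or -1 if none. */
    int seekFloor(String target) {
        ensureLoaded();
        int pos = Collections.binarySearch(indexTerms, target);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent
        return pos >= 0 ? pos : -pos - 2;
    }

    private void ensureLoaded() {
        if (indexTerms == null) {
            throw new IllegalStateException("terms index was not loaded");
        }
    }
}
```

Because the divisor is fixed in the constructor and the field is final,
nothing about the index can change after construction, which is what
makes lock-free reads possible in this sketch.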

This replaces the set/getTermInfosIndexDivisor methods (they are now
deprecated), i.e. you must now specify the divisor when opening the
reader.  If these methods are called, an UnsupportedOperationException
is thrown.  This is technically a break in back compat (previously you
could call them before the terms index was used, e.g. if no searches
had been run) but I think we should make an exception here.  Very few
users make use of these expert methods, and having these users switch
to specifying the index divisor up front is a small code change in
exchange for removing all synchronization from the terms dict.

I also made all fields in TermInfosReader final, and there is no
longer any synchronization.  To handle the case in IndexWriter, where a
merge first opens a segment (which does not need the index) and then an
NRT reader (or applyDeletes) needs to share the same pooled reader and
needs the terms index, I added a PrivateTermsDict static class to
SegmentReader.  This class just wraps a no-index-loaded
TermInfosReader, which merging will use, and can then open a new
index-is-loaded TermInfosReader when/if needed.
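
The wrapper idea can be sketched roughly as follows.  This is not the
code from the patch: ImmutableTermsReader and LazyTermsDict are
hypothetical stand-ins for TermInfosReader and PrivateTermsDict, and
the synchronized upgrade method is an assumption about how such a
wrapper might be guarded.

```java
/** Hypothetical stand-in for a reader whose fields are all final. */
final class ImmutableTermsReader {
    private final String segment;
    private final boolean indexLoaded;

    ImmutableTermsReader(String segment, boolean indexLoaded) {
        this.segment = segment;
        this.indexLoaded = indexLoaded;
    }

    boolean isIndexLoaded() {
        return indexLoaded;
    }

    /** Needs no locking: it only reads final fields. */
    String lookup(String term) {
        if (!indexLoaded) {
            throw new IllegalStateException("terms index not loaded");
        }
        return segment + ":" + term;
    }
}

/** Hypothetical wrapper: starts without an index, opens one on demand. */
final class LazyTermsDict {
    private final String segment;
    private final ImmutableTermsReader noIndex;
    private ImmutableTermsReader withIndex; // guarded by "this"

    LazyTermsDict(String segment) {
        this.segment = segment;
        this.noIndex = new ImmutableTermsReader(segment, false);
    }

    /** Merging never needs the terms index. */
    ImmutableTermsReader forMerging() {
        return noIndex;
    }

    /** Opens a second, index-loaded reader the first time one is needed. */
    synchronized ImmutableTermsReader forSearching() {
        if (withIndex == null) {
            withIndex = new ImmutableTermsReader(segment, true);
        }
        return withIndex;
    }
}
```

In this sketch the only lock is taken when a caller obtains a reader,
not on each term lookup, so the hot path stays unsynchronized.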


> Eliminate synchronization contention on initial index reading in 
> TermInfosReader ensureIndexIsRead 
> ---------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1609
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1609
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.9
>         Environment: Solr 
> Tomcat 5.5
> Ubuntu 2.6.20-17-generic
> Intel(R) Pentium(R) 4 CPU 2.80GHz, 2Gb RAM
>            Reporter: Dan Rosher
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>         Attachments: LUCENE-1609.patch, LUCENE-1609.patch, LUCENE-1609.patch
>
>
> The synchronized method ensureIndexIsRead in TermInfosReader causes 
> contention under heavy load.
> Simple to reproduce: e.g. under Solr, with all caches turned off, run a 
> simple range search such as id:[0 TO 999999] on even a small index (in my 
> case 28K docs) under a load/stress test application. Examining the thread 
> dump (kill -3) afterwards shows many threads blocked 'waiting for monitor 
> entry' on this method.
> Rather than using double-checked locking, which is known to have issues, 
> this implementation uses a state pattern, where only one thread can move 
> the object from the IndexNotRead state to IndexRead, and in doing so alters 
> the object's behavior, i.e. once the index is loaded, the index no longer 
> needs a synchronized method. 
> In my particular test, this increased throughput at least 30 times.
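
For readers unfamiliar with the state pattern described in the quoted
report, a minimal sketch follows.  It is not the reporter's patch: the
class LazyIndex, the int[] payload, and the use of a volatile field to
publish the new state are all assumptions made for illustration.  Note
that the attached patch takes a different route (divisor chosen up
front, all fields final).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a NotRead -> Read state switch. */
final class LazyIndex {
    private interface State {
        int[] index();
    }

    private final AtomicInteger loads = new AtomicInteger();
    // volatile so the switch to Read is visible to all threads (assumption)
    private volatile State state = new NotRead();

    /** Delegates to whichever state is current. */
    int[] index() {
        return state.index();
    }

    int loadCount() {
        return loads.get();
    }

    /** Initial state: synchronized, so only one thread performs the load. */
    private final class NotRead implements State {
        public synchronized int[] index() {
            if (state == this) {
                int[] loaded = load();
                state = new Read(loaded); // later callers skip this lock
                return loaded;
            }
            // another thread loaded the index while we waited for the lock
            return state.index();
        }
    }

    /** Final state: plain read of a final field, no lock. */
    private static final class Read implements State {
        private final int[] data;

        Read(int[] data) {
            this.data = data;
        }

        public int[] index() {
            return data;
        }
    }

    /** Stand-in for reading the terms index from disk. */
    private int[] load() {
        loads.incrementAndGet();
        return new int[] {1, 2, 3};
    }
}
```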


