[jira] Updated: (LUCENE-1316) Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer

Todd Feak (JIRA) Fri, 27 Jun 2008 13:25:09 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Todd Feak updated LUCENE-1316:
------------------------------


I applied the patch to the 2.3.0 file. I ran against an optimized and 
non-optimized (12 segment) index with 4700 entries.

2.3.0 non-optimized index  *104 tps*
2.3.0 patched non-optimized index *482 tps*

2.3.0 optimized index *21 tps*
2.3.0 patched optimized index *718 tps*

The patch improved throughput on both the optimized and the non-optimized 
index. Thanks again, Yonik.


> Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-1316
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1316
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Query/Scoring
>    Affects Versions: 2.3
>         Environment: All
>            Reporter: Todd Feak
>            Priority: Minor
>         Attachments: LUCENE_1316.patch, LUCENE_1316.patch, LUCENE_1316.patch, 
> MatchAllDocsQuery.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The isDeleted() method on IndexReader has been mentioned a number of times as 
> a potential synchronization bottleneck. However, the reason this bottleneck 
> occurs is actually at a higher level that wasn't focused on (at least in the 
> threads I read).
> In every case I saw where a stack trace was provided to show the lock/block, 
> higher in the stack you see the MatchAllScorer.next() method. In Solr 
> particularly, this scorer is used for "NOT" queries. We saw incredibly poor 
> performance (an order of magnitude worse) on our load tests for NOT queries, 
> due to this bottleneck. The problem is that every single document is run 
> through this isDeleted() method, which is synchronized. Having an optimized 
> index exacerbates this issue, as there is only a single SegmentReader to 
> synchronize on, causing a major thread pileup waiting for the lock.
> By simply having the MatchAllScorer check whether there have been any 
> deletions in the reader, much of this can be avoided, especially in a 
> read-only production environment where slaves do all the high-load 
> searching.
> I modified line 67 in the MatchAllDocsQuery
> FROM:
>   if (!reader.isDeleted(id)) {
> TO:
>   if (!reader.hasDeletions() || !reader.isDeleted(id)) {
> In our micro load test for NOT queries only, this was a major performance 
> improvement.  We also got the same query results. I don't believe this will 
> improve the situation for indexes that have deletions. 
> Please consider making this adjustment for a future bug fix release.
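
The one-line change quoted above can be illustrated with a small, self-contained 
sketch. StubReader and MatchAllSketch are hypothetical stand-ins written for 
this illustration; they are not Lucene classes and this is not the attached 
patch. Only the method names isDeleted() and hasDeletions() are taken from the 
description, and the sketch assumes hasDeletions() is cheap and takes no lock. 
It counts how often the synchronized path is entered with and without the 
short-circuit.

```java
import java.util.BitSet;

// Hypothetical stand-in for a segment reader; NOT the Lucene class.
class StubReader {
    private final BitSet deleted;  // null when the index has no deletions
    private final int maxDoc;
    private int lockedCalls = 0;   // how many times the synchronized method ran

    StubReader(int maxDoc, BitSet deleted) {
        this.maxDoc = maxDoc;
        this.deleted = deleted;
    }

    int maxDoc() {
        return maxDoc;
    }

    // Assumption of this sketch: the check is cheap and takes no lock.
    boolean hasDeletions() {
        return deleted != null;
    }

    // Every caller contends for the same monitor, as described in the issue.
    synchronized boolean isDeleted(int id) {
        lockedCalls++;
        return deleted != null && deleted.get(id);
    }

    synchronized int lockedCalls() {
        return lockedCalls;
    }
}

class MatchAllSketch {
    // Original check: every document goes through the synchronized call.
    static int countLiveOriginal(StubReader reader) {
        int live = 0;
        for (int id = 0; id < reader.maxDoc(); id++) {
            if (!reader.isDeleted(id)) {
                live++;
            }
        }
        return live;
    }

    // Proposed check: skip the synchronized call when nothing is deleted.
    static int countLivePatched(StubReader reader) {
        int live = 0;
        for (int id = 0; id < reader.maxDoc(); id++) {
            if (!reader.hasDeletions() || !reader.isDeleted(id)) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        StubReader before = new StubReader(4700, null);
        StubReader after = new StubReader(4700, null);
        System.out.println(countLiveOriginal(before) + " live docs, "
                + before.lockedCalls() + " synchronized calls");
        System.out.println(countLivePatched(after) + " live docs, "
                + after.lockedCalls() + " synchronized calls");
    }
}
```

Both loops return the same count. On a stub with no deletions the second loop 
never enters the synchronized method, while on a stub that does have deletions 
it still locks once per document, which matches the caveat in the description 
about indexes with deletions.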

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
