[jira] Commented: (LUCENE-1316) Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer

Hoss Man (JIRA) Wed, 25 Jun 2008 13:51:08 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608183#action_12608183
 ]


Hoss Man commented on LUCENE-1316:
----------------------------------

bq. if thread A deleted a document, and then thread B checked if it was 
deleted, thread B was guaranteed to see that it was in fact deleted.

Hmmm.... i'll take your word for it, but i don't follow the rational: the 
current synchronization just ensured that either the isDeleted() call will 
complete before the delete() call started or vice versa -- but you have no 
guarantee that thread B would run after thread A and get true.   .... unless... 
is your point that without synchronization on the null check there's no 
garuntee that B will ever see the change to deletedDocs even if it does execute 
after delete() ?

either way: robert's point about hasDeletions() needing to be synchronized 
seems like a bigger issue -- isn't that a bug in the current implementation?  
assuming we fix that then it seems like the original issue is back to square 
one: synchro bottlenecks when there are no deletions.





> Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-1316
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1316
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Query/Scoring
>    Affects Versions: 2.3
>         Environment: All
>            Reporter: Todd Feak
>            Priority: Minor
>         Attachments: MatchAllDocsQuery.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The isDeleted() method on IndexReader has been mentioned a number of times as 
> a potential synchronization bottleneck. However, the reason this  bottleneck 
> occurs is actually at a higher level that wasn't focused on (at least in the 
> threads I read).
> In every case I saw where a stack trace was provided to show the lock/block, 
> higher in the stack you see the MatchAllScorer.next() method. In Solr 
> paricularly, this scorer is used for "NOT" queries. We saw incredibly poor 
> performance (order of magnitude) on our load tests for NOT queries, due to 
> this bottleneck. The problem is that every single document is run through 
> this isDeleted() method, which is synchronized. Having an optimized index 
> exacerbates this issues, as there is only a single SegmentReader to 
> synchronize on, causing a major thread pileup waiting for the lock.
> By simply having the MatchAllScorer see if there have been any deletions in 
> the reader, much of this can be avoided. Especially in a read-only 
> environment for production where you have slaves doing all the high load 
> searching.
> I modified line 67 in the MatchAllDocsQuery
> FROM:
>   if (!reader.isDeleted(id)) {
> TO:
>   if (!reader.hasDeletions() || !reader.isDeleted(id)) {
> In our micro load test for NOT queries only, this was a major performance 
> improvement.  We also got the same query results. I don't believe this will 
> improve the situation for indexes that have deletions. 
> Please consider making this adjustment for a future bug fix release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[jira] Commented: (LUCENE-1316) Avoidable synchronization bottleneck in MatchAlldocsQuery$MatchAllScorer

Reply via email to