[ 
https://issues.apache.org/jira/browse/SOLR-12833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833192#comment-16833192
 ] 

Ishan Chattopadhyaya commented on SOLR-12833:
---------------------------------------------

There are some reproducible failures. Could it be due to this change?

{code}
ant test -Dtestcase=DistributedUpdateProcessorTest \
  -Dtests.method=testVersionAdd -Dtests.seed=5ED48C678637330F \
  -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=saq-KE \
  -Dtests.timezone=Pacific/Tahiti -Dtests.asserts=true \
  -Dtests.file.encoding=US-ASCII
{code}

More here:  https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/514/

> Use timed-out lock in DistributedUpdateProcessor
> ------------------------------------------------
>
>                 Key: SOLR-12833
>                 URL: https://issues.apache.org/jira/browse/SOLR-12833
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: update, UpdateRequestProcessors
>    Affects Versions: 7.5, 8.0
>            Reporter: jefferyyuan
>            Assignee: Mark Miller
>            Priority: Blocker
>             Fix For: 7.7, 8.0, 8.1
>
>         Attachments: SOLR-12833-noint.patch, SOLR-12833.patch, 
> SOLR-12833.patch, threadDump.txt
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is a synchronized block that blocks other update requests whose IDs 
> fall in the same hash bucket. An update waits indefinitely until it acquires 
> the lock on the synchronized block, which can be a problem in some cases.
>  
> Some add/update requests (for example, updates with spatial/shape analysis) 
> may take a long time (30+ seconds or even more), which would cause the 
> request to time out and fail.
> Clients may retry the same requests multiple times or for several minutes, 
> which makes things worse.
> The server side receives all the update requests, but all except one can do 
> nothing and have to wait. This wastes precious memory and CPU resources.
> We have seen cases where 2000+ threads are blocked on the synchronized lock 
> and only a few updates are making progress. Each thread takes 3+ MB of 
> memory, which causes OOM.
> Also, if the update can't get the lock within the expected time range, it's 
> better to fail fast.
>  
> We can have one configuration in solrconfig.xml: 
> updateHandler/versionLock/timeInMill, so users can specify how long they want 
> to wait for the version bucket lock.
> The default value can be -1, so it behaves the same as before: wait forever 
> until it gets the lock.
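
For readers unfamiliar with the proposal, the idea in the description above can be sketched roughly as follows. This is only an illustration, not the code from the attached patches: the class name TimedVersionBucket, the method runWithLock, and the choice of exception are all hypothetical, and only the "-1 means wait forever" convention comes from the issue text. It swaps the synchronized block for a java.util.concurrent ReentrantLock so that a bounded wait is possible.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a version bucket; not Solr's actual class. */
public class TimedVersionBucket {
    private final ReentrantLock lock = new ReentrantLock();
    // Assumed to mirror the proposed setting: a negative value means wait forever.
    private final long lockTimeoutMs;

    public TimedVersionBucket(long lockTimeoutMs) {
        this.lockTimeoutMs = lockTimeoutMs;
    }

    /** Runs the update while holding the bucket lock, failing fast on timeout. */
    public <T> T runWithLock(Callable<T> update) throws Exception {
        if (lockTimeoutMs < 0) {
            // Old behavior: block until the lock becomes available.
            lock.lock();
        } else if (!lock.tryLock(lockTimeoutMs, TimeUnit.MILLISECONDS)) {
            // Fail fast instead of piling up blocked threads.
            throw new IllegalStateException(
                "Unable to get version bucket lock in " + lockTimeoutMs + " ms");
        }
        try {
            return update.call();
        } finally {
            lock.unlock();
        }
    }
}
```

With a timeout of -1 the sketch behaves like the existing synchronized block; with a non-negative timeout, an update that cannot acquire the lock in time fails immediately rather than holding a thread and its memory while it waits.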



