[ 
https://issues.apache.org/jira/browse/SOLR-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955546#comment-13955546
 ] 

Mark Miller commented on SOLR-5935:
-----------------------------------

bq. Mark - thread dumps are attached in the zip file,

Sorry - was following along via email.

Yeah, these are all blocked in leaseConnection, waiting for a connection from 
the pool. Seems like a connection pool configuration issue. I think we recently 
exposed config for some of that to the user, but I'll have to go dig that up.
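
For anyone following along, here is a rough sketch of the behavior the dumps 
point at. It is a toy model only, not HttpClient's or Solr's code: a 
java.util.concurrent.Semaphore stands in for the pool's connection limit, and 
the class name BoundedPoolSketch and its methods are made up for illustration. 
Once every connection is leased, the next lease has to wait. In this sketch the 
wait is bounded by a timeout; the frames in the dumps show an untimed 
ConditionObject.await(), which suggests no lease timeout was in effect, so 
threads there would park until a connection is released.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool (hypothetical, for illustration only). */
public class BoundedPoolSketch {
    // Each permit represents one connection the pool is allowed to hand out.
    private final Semaphore permits;

    public BoundedPoolSketch(int maxConnections) {
        this.permits = new Semaphore(maxConnections, true);
    }

    /** Waits up to timeoutMs for a free connection; returns false on timeout. */
    public boolean lease(long timeoutMs) throws InterruptedException {
        return permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Returns a connection to the pool, letting one waiting caller proceed. */
    public void release() {
        permits.release();
    }

    /** Number of connections that can still be leased without waiting. */
    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedPoolSketch pool = new BoundedPoolSketch(2);
        pool.lease(100);
        pool.lease(100);
        // The pool is now exhausted, so this lease times out instead of succeeding.
        System.out.println("third lease succeeded: " + pool.lease(100));
        pool.release();
        System.out.println("lease after release succeeded: " + pool.lease(100));
    }
}
```

If the pool limit is small relative to the number of threads that each hold a 
connection while waiting on further requests, every thread can end up in that 
wait at once, which would look like the hang described in the issue.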

> SolrCloud hangs under certain conditions
> ----------------------------------------
>
>                 Key: SOLR-5935
>                 URL: https://issues.apache.org/jira/browse/SOLR-5935
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.6.1
>            Reporter: Rafał Kuć
>            Priority: Critical
>         Attachments: thread dumps.zip
>
>
> As discussed on the mailing list - let's try to find the reason why, under 
> certain conditions, SolrCloud can hang.
> I have an issue with one of the SolrCloud deployments. Six machines, a 
> collection with 6 shards with a replication factor of 3. It all runs on 6 
> physical servers, each with 24 cores. We've indexed about 32 million 
> documents and everything was fine until that point.
> Now, during performance tests, we run into an issue - SolrCloud hangs
> when querying and indexing are run at the same time. First we see a
> normal load on the machines, then the load starts to drop and a thread
> dump shows numerous threads like this:
> {noformat}
> Thread 12624: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=186 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() 
> @bci=42, line=2043 (Compiled frame)
>  - org.apache.http.pool.PoolEntryFuture.await(java.util.Date) @bci=50, 
> line=131 (Compiled frame)
>  - 
> org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(java.lang.Object, 
> java.lang.Object, long, java.util.concurrent.TimeUnit, 
> org.apache.http.pool.PoolEntryFuture) @bci=431, line=281 (Compiled frame)
>  - 
> org.apache.http.pool.AbstractConnPool.access$000(org.apache.http.pool.AbstractConnPool,
>  java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, 
> org.apache.http.pool.PoolEntryFuture) @bci=8, line=62 (Compiled frame)
>  - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, 
> java.util.concurrent.TimeUnit) @bci=15, line=176 (Compiled frame)
>  - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, 
> java.util.concurrent.TimeUnit) @bci=3, line=169 (Compiled frame)
>  - org.apache.http.pool.PoolEntryFuture.get(long, 
> java.util.concurrent.TimeUnit) @bci=38, line=100 (Compiled frame)
>  - 
> org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(java.util.concurrent.Future,
>  long, java.util.concurrent.TimeUnit) @bci=4, line=212 (Compiled frame)
>  - 
> org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(long,
>  java.util.concurrent.TimeUnit) @bci=10, line=199 (Compiled frame)
>  - 
> org.apache.http.impl.client.DefaultRequestDirector.execute(org.apache.http.HttpHost,
>  org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=259, 
> line=456 (Compiled frame)
>  - 
> org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.HttpHost,
>  org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=344, 
> line=906 (Compiled frame)
>  - 
> org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest,
>  org.apache.http.protocol.HttpContext) @bci=21, line=805 (Compiled frame)
>  - 
> org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest)
>  @bci=6, line=784 (Compiled frame)
>  - 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest,
>  org.apache.solr.client.solrj.ResponseParser) @bci=1175, line=395 
> (Interpreted frame)
>  - 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest)
>  @bci=17, line=199 (Compiled frame)
>  - 
> org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(org.apache.solr.client.solrj.impl.LBHttpSolrServer$Req)
>  @bci=132, line=285 (Interpreted frame)
>  - 
> org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(org.apache.solr.client.solrj.request.QueryRequest,
>  java.util.List) @bci=13, line=214 (Compiled frame)
>  - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=246, 
> line=161 (Compiled frame)
>  - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=1, 
> line=118 (Interpreted frame)
>  - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 
> (Interpreted frame)
>  - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=471 
> (Interpreted frame)
>  - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 
> (Interpreted frame)
>  - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1145 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 
> (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)
> {noformat}
> I've checked I/O statistics, GC activity, memory usage, networking and
> all of that - those resources are not exhausted during the test.
> Hard autocommit is set to 15 seconds with openSearcher=false and
> soft autocommit to 4 hours. We have a fairly high query rate, but until
> we start indexing everything runs smoothly.
> I've attached four thread dumps, stack_1 to stack_4. They were gathered 
> incrementally - stack_1 and stack_2 are from when Solr was still able to 
> respond, stack_3 is from when Solr was barely alive, and stack_4 is from when 
> Solr was not responding at all.
> If more information is needed, I can provide it.


