[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2016-04-28 Thread Stephan Lagraulet (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261732#comment-15261732
 ] 

Stephan Lagraulet commented on SOLR-6406:
-

Hi [~yo...@apache.org], did you make any progress on this issue?

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.5, master
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009016#comment-15009016
 ] 

Yonik Seeley commented on SOLR-6406:


So increasing maxConnectionsPerHost didn't fix the problem.
I instrumented the ConcurrentUpdateSolrServer to try to understand what is 
happening when, and am analyzing some of those failures now.
The logs are all beyond the max size that can be uploaded to JIRA though, so I'll 
just put up a summary based on what I find.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.4, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-11 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001429#comment-15001429
 ] 

Yonik Seeley commented on SOLR-6406:


I was analyzing another "shards-out-of-sync" failure on trunk.
It looks like certain updates are just not being forwarded from the leader 
to a certain replica.

Working theory: the max connections per host of the HttpClient is being hit, 
starving updates from certain update threads.
This could account for why shutdownNow on the update executor service is having 
such an impact.  In an orderly shutdown, all scheduled jobs will still be run 
(I think), which means that connections will be released, and the updates that 
were being starved will get to proceed.  But it's for exactly this reason that 
we should probably keep the shutdownNow... it mimics much better what will 
happen in real world situations when a node goes down.
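
To make the shutdown() versus shutdownNow() distinction concrete, here is a minimal sketch using plain java.util.concurrent. It is not Solr code: the class name ShutdownSketch, the method tasksRun, and the single-thread pool are assumptions made for illustration, and the queued tasks only stand in for jobs that would release a connection when they run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Solr update executor.
class ShutdownSketch {

  /** Starts one blocking task, queues `queued` more, shuts down, and returns how many tasks ran. */
  static int tasksRun(boolean useShutdownNow, int queued) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    AtomicInteger ran = new AtomicInteger();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);

    // The first task occupies the only worker until it is released or interrupted.
    pool.execute(() -> {
      started.countDown();
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // shutdownNow() interrupted this task
      }
      ran.incrementAndGet();
    });
    // These tasks wait in the executor's queue behind the first one.
    for (int i = 0; i < queued; i++) {
      pool.execute(ran::incrementAndGet);
    }

    try {
      started.await();
      if (useShutdownNow) {
        pool.shutdownNow(); // interrupts the running task and discards the queued ones
      } else {
        pool.shutdown();    // accepts no new work but still runs what is queued
      }
      release.countDown();
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("pool did not terminate");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return ran.get();
  }
}
```

With three queued tasks, tasksRun(false, 3) counts all four tasks, while tasksRun(true, 3) counts only the one that was already running. That matches the reasoning above: an orderly shutdown lets queued work finish, and shutdownNow does not.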

From this, it looks like max connections per host is 20:

{code}
13404 INFO  
(TEST-HdfsChaosMonkeyNothingIsSafeTest.test-seed#[A22375CC545D2B82]) [] 
o.a.s.h.c.HttpShardHandlerFactory created with socketTimeout : 9,urlScheme 
: ,connTimeout : 15000,maxConnectionsPerHost : 20,maxConnections : 
1,corePoolSize : 0,maximumPoolSize : 2147483647,maxThreadIdleTime : 
5,sizeOfQueue : -1,fairnessPolicy : false,useRetries : false,
{code}

The test used 12 nodes (and 2 shards)... increasing the chance of hitting the 
max connections (since all nodes run on the same host).


> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.4, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997248#comment-14997248
 ] 

Mark Miller commented on SOLR-6406:
---

Strange. I got over 300 runs without an out-of-sync failure with it originally. I have 
not tried on recent trunk or recent changes though.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.4, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995263#comment-14995263
 ] 

Yonik Seeley commented on SOLR-6406:


The other variants:
 4) trunk with only client changes reverted (i.e. DUH check enabled, 
shutdownNow used)
 5) trunk with client + DUH changes reverted (i.e. only shutdownNow enabled)
 6) trunk with alternate client changes (only changes to blockUntilFinished)
 7) trunk with alternate client changes 2 (only changes to blockUntilFinished, 
but using former isTerminated instead of isShutdown) 

I managed to get inconsistent shard runs with all of these.
The common element is shutdownNow being used on the shard update executor.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991961#comment-14991961
 ] 

Yonik Seeley commented on SOLR-6406:


Update: I looped 3 tests overnight...
1) trunk with shutdownNow on the update executor reverted
2) trunk with shutdownNow and the client changes in this patch reverted
3) plain trunk

Only #3 resulted in inconsistent shards.  I'm setting up some new variants to 
test now...

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985389#comment-14985389
 ] 

ASF subversion and git services commented on SOLR-6406:
---

Commit 1712045 from [~yo...@apache.org] in branch 'dev/trunk'
[ https://svn.apache.org/r1712045 ]

SOLR-6406: fix race/hang in ConcurrentUpdateSolrClient.blockUntilFinished when 
executor service is shut down

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985392#comment-14985392
 ] 

ASF subversion and git services commented on SOLR-6406:
---

Commit 1712047 from [~yo...@apache.org] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1712047 ]

SOLR-6406: fix race/hang in ConcurrentUpdateSolrClient.blockUntilFinished when 
executor service is shut down

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
>Assignee: Yonik Seeley
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-11-01 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984535#comment-14984535
 ] 

Yonik Seeley commented on SOLR-6406:


OK, so I haven't hit any hangs with this latest patch and shutdownNow() on the 
associated executor.  Interestingly enough though, this still results in 
inconsistent shard failures.  My guess is that the shutdown of the executor is 
done as one of the last steps in CoreContainer.shutdown(), which still gives 
time for streaming update requests to continue streaming.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983761#comment-14983761
 ] 

Mark Miller commented on SOLR-6406:
---

Ha, that's actually close to the first hack fix I made. Occasionally waking up 
in the wait and checking if empty again. 

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch, 
> SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980772#comment-14980772
 ] 

Mark Miller commented on SOLR-6406:
---

Just two threads stuck - not necessarily from the same client. Previously I had 
only ever seen 1 thread stuck. Just noting it, may not mean much.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980812#comment-14980812
 ] 

Yonik Seeley commented on SOLR-6406:


OK, I was able to reproduce... Interestingly, this is pretty easy to hit (and I 
also saw 2 threads stuck at the same point... which as you say must be 2 
different client objects).  There must be something more here than a 
subtle/little race condition.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974518#comment-14974518
 ] 

Yonik Seeley commented on SOLR-6406:


bq. Got a hang at the same spot - 2 threads stuck on it this time rather than 
the usual 1:

Hmmm, yeah, I only fixed one spot where runners are submitted. I'll try taking 
another crack at it...
Although having 2 threads stuck at the same place in blockUntilFinished should 
be impossible... it's a synchronized method.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973243#comment-14973243
 ] 

Mark Miller commented on SOLR-6406:
---

I'm testing Yonik's approach today and will see if it resolves this. My quick 
patch does not, FWIW.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973270#comment-14973270
 ] 

Mark Miller commented on SOLR-6406:
---

Got a hang at the same spot - 2 threads stuck on it this time rather than the 
usual 1:

{noformat}
   [junit4]>2) Thread[id=1071, name=qtp612828486-1071, state=WAITING, 
group=TGRP-HdfsChaosMonkeyNothingIsSafeTest]
   [junit4]> at java.lang.Object.wait(Native Method)
   [junit4]> at java.lang.Object.wait(Object.java:502)
   [junit4]> at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.blockUntilFinished(ConcurrentUpdateSolrClient.java:418)
   [junit4]> at 
org.apache.solr.update.StreamingSolrClients.blockUntilFinished(StreamingSolrClients.java:106)
   [junit4]> at 
org.apache.solr.update.SolrCmdDistributor.blockAndDoRetries(SolrCmdDistributor.java:231)
   [junit4]> at 
org.apache.solr.update.SolrCmdDistributor.finish(SolrCmdDistributor.java:89)
   [junit4]> at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:778)
   [junit4]> at 
org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1622)
   [junit4]> at 
org.apache.solr.update.processor.LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:183)
   [junit4]> at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)
   [junit4]> at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:151)
   [junit4]> at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:2079)
   [junit4]> at 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:667)
   [junit4]> at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
   [junit4]> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
   [junit4]> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
{noformat}

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972845#comment-14972845
 ] 

Mark Miller commented on SOLR-6406:
---

Ha - hadn't refreshed my browser.

I'll review this approach.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972813#comment-14972813
 ] 

Mark Miller commented on SOLR-6406:
---

A more recent set of stack traces:

{noformat}
   [junit4] ERROR   0.00s | HdfsChaosMonkeyNothingIsSafeTest (suite) <<<
   [junit4]> Throwable #1: java.lang.AssertionError: ERROR: 
SolrIndexSearcher opens=39 closes=38
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([71608A03B4692CB]:0)
   [junit4]>at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:468)
   [junit4]>at 
org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:234)
   [junit4]>at java.lang.Thread.run(Thread.java:745)Throwable #2: 
com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE 
scope at org.apache.solr.cloud.hdfs.HdfsChaosMonkeyNothingIsSafeTest: 
   [junit4]>1) Thread[id=243, name=qtp487431535-243, state=WAITING, 
group=TGRP-HdfsChaosMonkeyNothingIsSafeTest]
   [junit4]> at java.lang.Object.wait(Native Method)
   [junit4]> at java.lang.Object.wait(Object.java:502)
   [junit4]> at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient.blockUntilFinished(ConcurrentUpdateSolrClient.java:404)
   [junit4]> at 
org.apache.solr.update.StreamingSolrClients.blockUntilFinished(StreamingSolrClients.java:103)
   [junit4]> at 
org.apache.solr.update.SolrCmdDistributor.blockAndDoRetries(SolrCmdDistributor.java:231)
   [junit4]> at 
org.apache.solr.update.SolrCmdDistributor.finish(SolrCmdDistributor.java:89)
   [junit4]> at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:778)
   [junit4]> at 
org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1622)
   [junit4]> at 
org.apache.solr.update.processor.LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:183)
   [junit4]> at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)
   [junit4]> at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:151)
   [junit4]> at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:2079)
   [junit4]> at 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:667)
   [junit4]> at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
   [junit4]> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
   [junit4]> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
   [junit4]> at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
   [junit4]> at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:109)
   [junit4]> at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
   [junit4]> at 
org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
   [junit4]> at 
org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:300)
   [junit4]> at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
   [junit4]> at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
   [junit4]> at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
   [junit4]> at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
   [junit4]> at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
   [junit4]> at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
   [junit4]> at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
   [junit4]> at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
   [junit4]> at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
   [junit4]> at 
org.eclipse.jetty.server.Server.handle(Server.java:499)
   [junit4]> at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
   [junit4]> at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
   [junit4]> at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
   [junit4]> at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
   [junit4]> at 

[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-24 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972821#comment-14972821
 ] 

Yonik Seeley commented on SOLR-6406:


OK, here's one theory after a quick look:

{code}
  } finally {
    synchronized (runners) {
      if (runners.size() == 1 && !queue.isEmpty()) {
        // keep this runner alive
        scheduler.execute(this);
      } else {
        runners.remove(this);
        if (runners.isEmpty())
          runners.notifyAll();
      }
    }
{code}

What if the queue isn't empty, so we try to do "scheduler.execute", but the 
scheduler has been shut down?  That will throw an exception and the else block 
containing notifyAll() will never be executed.
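
The theory above can be reproduced outside Solr with a small sketch. Everything here is hypothetical: RunnerCleanupSketch, cleanup, and demo are illustrative names, the queue types are simplified, and the guarded variant is only one possible way to handle the rejection, not necessarily what the attached patches do. It assumes a standard java.util.concurrent executor, which rejects new work with RejectedExecutionException once it has been shut down.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-in for the runner bookkeeping above; not the Solr class.
class RunnerCleanupSketch {

  /**
   * Mirrors the shape of the finally block. Returns true if waiters were notified.
   * With guarded == false a rejected execute() propagates and the cleanup is skipped.
   */
  static boolean cleanup(ExecutorService scheduler, Queue<Runnable> runners,
                         Queue<String> queue, Runnable self, boolean guarded) {
    synchronized (runners) {
      boolean resubmitted = false;
      if (runners.size() == 1 && !queue.isEmpty()) {
        try {
          scheduler.execute(self); // keep this runner alive
          resubmitted = true;
        } catch (RejectedExecutionException e) {
          if (!guarded) {
            throw e; // original shape: remove/notifyAll below never runs
          }
          // guarded: fall through and clean up as if no resubmit was needed
        }
      }
      if (resubmitted) {
        return false;
      }
      runners.remove(self);
      if (runners.isEmpty()) {
        runners.notifyAll();
        return true;
      }
      return false;
    }
  }

  /** Runs the scenario against an executor that has already been shut down. */
  static String demo(boolean guarded) {
    ExecutorService scheduler = Executors.newSingleThreadExecutor();
    scheduler.shutdownNow();
    Runnable self = () -> {};
    Queue<Runnable> runners = new ArrayDeque<>();
    runners.add(self);
    Queue<String> queue = new ArrayDeque<>();
    queue.add("pending update");
    try {
      boolean notified = cleanup(scheduler, runners, queue, self, guarded);
      return "notified=" + notified + " runners=" + runners.size();
    } catch (RejectedExecutionException e) {
      return "rejected runners=" + runners.size();
    }
  }
}
```

In the unguarded case the runner stays registered in runners and nobody calls notifyAll(), so a thread waiting for runners to drain would have nothing to wake it. In the guarded case the runner is removed and waiters are notified.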

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-08-18 Thread Stephan Lagraulet (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701450#comment-14701450
 ] 

Stephan Lagraulet commented on SOLR-6406:
-

I have attached a CPU sample of a SolrCloud server which has had very poor update 
performance for the past few hours.
I guess it could be related to this problem.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-01-29 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297238#comment-14297238
 ] 

Timothy Potter commented on SOLR-6406:
--

Would it make sense to change the code to call wait(timeoutMs), so that we can 
recheck the state of things and go back to waiting only if that still makes 
sense, rather than waiting indefinitely as you're seeing?
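
For illustration only, here is a minimal sketch of that timed-wait idea. It is not the actual ConcurrentUpdateSolrServer code: the class `TimedWaitSketch`, its `runners` queue, and the `addRunner`/`removeRunner`/`pending` helpers are hypothetical stand-ins. The assumption is simply that the waiter blocks on a monitor until the queue is empty, and that a notify can be missed.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for illustration; NOT the real ConcurrentUpdateSolrServer.
class TimedWaitSketch {
    // Assumed shape: a queue of active runners that also serves as the monitor.
    private final Queue<Runnable> runners = new ArrayDeque<>();

    void addRunner(Runnable r) {
        synchronized (runners) { runners.add(r); }
    }

    // notify=false simulates the suspected bug: a runner leaves without notifying.
    void removeRunner(Runnable r, boolean notify) {
        synchronized (runners) {
            runners.remove(r);
            if (notify) runners.notifyAll();
        }
    }

    int pending() {
        synchronized (runners) { return runners.size(); }
    }

    // Waits until no runners remain, waking at least every timeoutMs to recheck.
    // Returns how many times the wait returned (timeouts, notifies, spurious wake-ups).
    int blockUntilFinished(long timeoutMs) {
        if (timeoutMs <= 0) {
            // wait(0) means "wait forever", which is exactly what we want to avoid.
            throw new IllegalArgumentException("timeoutMs must be > 0");
        }
        int wakeups = 0;
        synchronized (runners) {
            while (!runners.isEmpty()) {
                try {
                    runners.wait(timeoutMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    break;
                }
                wakeups++;
            }
        }
        return wakeups;
    }
}
```

With a bounded wait, a missed notify costs at most one timeout interval instead of hanging the caller. It would mask the underlying race rather than explain it, so the root cause would still need tracking down.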

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-01-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297242#comment-14297242
 ] 

Mark Miller commented on SOLR-6406:
---

Maybe. I've been trying to work out how it can happen without a runner still 
going (and I don't see one), but so far I can't spot it.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-01-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297214#comment-14297214
 ] 

Mark Miller commented on SOLR-6406:
---

I still see this happen in tests. It hangs at runners.wait(); no notify or 
anything else ever arrives, so it's just an ugly hang.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2014-08-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106906#comment-14106906
 ] 

Mark Miller commented on SOLR-6406:
---

{noformat}
   1) Thread[id=55, name=qtp823025155-55, state=WAITING, group=TGRP-ChaosMonkeyNothingIsSafeTest]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.blockUntilFinished(ConcurrentUpdateSolrServer.java:374)
        at org.apache.solr.update.StreamingSolrServers.blockUntilFinished(StreamingSolrServers.java:103)
        at org.apache.solr.update.SolrCmdDistributor.blockAndDoRetries(SolrCmdDistributor.java:228)
        at org.apache.solr.update.SolrCmdDistributor.finish(SolrCmdDistributor.java:89)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:766)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1662)
        at org.apache.solr.update.processor.LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:179)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:83)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
{noformat}

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, 4.11
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2014-08-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106909#comment-14106909
 ] 

Mark Miller commented on SOLR-6406:
---

This was on a nightly of ChaosMonkeyNothingIsSafeTest. It's fairly rare.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, 4.11
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.


