Hi Roland, sorry for providing such bad input/information :S
> The numbers are wrong, but the situation is actually worse.

So the problem is that we have only 2 running threads (checked out
connections) but an increasing number of waiting threads, right?

> Well, it looks like connections are leaking. So my next guess is
> that there is some kind of error situation in which the connection
> is not correctly released. The solrj code looks clean though.
> Is there any other application or daemon thread that uses the
> same connection manager?

This should not be the case, as we're creating the
CommonsHttpSolrServer simply with a URL, so that it creates a new
HttpClient with a new MTHCM (MultiThreadedHttpConnectionManager).

Whoo, I just realized that DefaultMaxConnectionsPerHost is NOT
increased in the version of CommonsHttpSolrServer we're running in
production, and is therefore still 2. So having two threads processing a
response is correct, and the problem might be related to the response
processing...

Btw., we currently have a new production release in the pipeline with the
new version of solrj (CommonsHttpSolrServer with
DefaultMaxConnectionsPerHost set to 32). This new version of solrj
also contains changes to the XMLResponseParser, one of which is threading
related (https://issues.apache.org/jira/browse/SOLR-360).
So we could wait for the release and see if we encounter the problem
again or if it's already fixed - hopefully :)

Thanks a lot for your help,
cheers,
Martin


On Wed, 2008-02-06 at 21:41 +0100, Roland Weber wrote:
> Hello Martin,
>
> >> Hm, I didn't get the last part about not reaching the wait. You have
> >> more than 300 threads and just 128 connections, so I don't see a
> >> problem with 200 threads waiting at the same time if the machine
> >> is busy.
> > Then I wasn't clear enough (sorry for my bad english :)), we do not have
> > 200 but 300 threads waiting at the same time...
>
> I was talking in orders of magnitude, not in specific numbers.
>
> > The threaddump (http://senduit.com/93f7d2) shows 302 waiting threads and
> > 6 that are running.
> >
> > I count 302 by doing this:
> >
> > grep "doGetConnection(MultiThreadedHttpConnectionManager.java:518)"
> > threaddump.txt | wc -l
>
> > I count 6 with this:
> >
> > grep CommonsHttpSolrServer threaddump.txt | grep -v
> > CommonsHttpSolrServer.java:222 | wc -l
>
> The 6 matches are for the following threads:
>
> "http-8080-2" daemon prio=10 tid=0x00002aab40860800 nid=0x1210 runnable
> [0x0000000041725000..0x0000000041728dc0]
> "TP-Processor45" daemon prio=10 tid=0x00002aab4103cc00 nid=0x11c9 runnable
> [0x0000000042210000..0x0000000042213d40]
> "http-8080-2" daemon prio=10 tid=0x00002aab40860800 nid=0x1210 runnable
> [0x0000000041725000..0x0000000041728dc0]
> "TP-Processor45" daemon prio=10 tid=0x00002aab4103cc00 nid=0x11c9 runnable
> [0x0000000042210000..0x0000000042213d40]
> "http-8080-2" daemon prio=10 tid=0x00002aab40860800 nid=0x1210 runnable
> [0x0000000041725000..0x0000000041728dc0]
> "TP-Processor45" daemon prio=10 tid=0x00002aab4103cc00 nid=0x11c9 runnable
> [0x0000000042210000..0x0000000042213d40]
>
> Looks to me as if you're counting duplicates.
> Same is true for the waiting threads:
>
> grep -A 10 -B 2 "Thread.State: WAITING" threaddump.txt | grep Processor149
>
> returns three matches for TP-Processor149.
> Maybe that is because there are three thread dumps in your file?
>
> grep -B 1 "Full thread dump" threaddump.txt
>
> 2008-01-29 11:00:02
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.6.0_02-b05 mixed mode):
> --
> 2008-01-29 11:12:01
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.6.0_02-b05 mixed mode):
> --
> 2008-01-29 11:22:01
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.6.0_02-b05 mixed mode):
>
>
> > Asuming that not, we do not have 128 checked out connections but only 6.
>
> The numbers are wrong, but the situation is actually worse.
>
> > The effect is also, that these 302 thread are blocking "forever" and we
> > have to restart the server, as no new requests are being served...
>
> Well, it looks like connections are leaking. So my next guess is
> that there is some kind of error situation in which the connection
> is not correctly released. The solrj code looks clean though.
> Is there any other application or daemon thread that uses the
> same connection manager?
>
> You probably cannot easily switch to a 1.5 JVM to see whether that
> makes a difference?
>
> cheers,
> Roland
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
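
Since threaddump.txt holds three dumps, a plain grep | wc -l over the whole
file counts each thread once per dump. The following is only a rough sketch
of a per-dump count, assuming each dump starts with a "Full thread dump"
line and each thread entry starts with its quoted name, as in the excerpts
above. The function name waiting_threads_per_dump is made up for
illustration; the frame string is the one from the grep in the quoted mail.

```python
import re

# Marker strings taken from the quoted mail; the rest is an illustrative sketch.
DUMP_HEADER = "Full thread dump"
WAIT_FRAME = "doGetConnection(MultiThreadedHttpConnectionManager.java:518)"
THREAD_NAME = re.compile(r'^"([^"]+)"')  # thread entries start with "name"


def waiting_threads_per_dump(text, frame=WAIT_FRAME):
    """Return a list with one set of thread names per dump found in text.

    A thread is included if its stack contains the given frame, so the
    same thread showing up in several dumps is counted once per dump.
    """
    dumps = []      # one set of matching thread names per dump
    current = None  # name of the thread whose stack is being read
    for line in text.splitlines():
        if DUMP_HEADER in line:
            dumps.append(set())  # a new dump begins
            current = None
            continue
        match = THREAD_NAME.match(line)
        if match:
            current = match.group(1)
        elif frame in line and current is not None and dumps:
            dumps[-1].add(current)
    return dumps


# Possible usage (assumes threaddump.txt is in the working directory):
#   with open("threaddump.txt") as fh:
#       for i, names in enumerate(waiting_threads_per_dump(fh.read()), 1):
#           print(i, len(names))
```

If the per-dump numbers grow from one dump to the next, that would fit the
picture of an increasing number of threads waiting for a connection.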
