Do these machines have a firewall in-between?

On Fri, 8 Jun 2018, 20:29 Markus Jelsma, <markus.jel...@openindex.io> wrote:
> Hello Shawn,
>
> The logs appear useless, they are littered with these:
>
> 2018-06-08 14:02:47.382 ERROR (qtp1458849419-1263) [ ] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://idx2:8983/solr/search/admin/ping
>     at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> ..
>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>     at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.eclipse.jetty.io.EofException
>     at org.eclipse.jetty.server.HttpConnection$SendCallback.reset(HttpConnection.java:704)
> ..
>     at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:509)
>
> Regarding the versions, it is a bit hard to recall, but I do not think I
> have seen this on 7.2, and most certainly not on 7.1.
>
> We operate three distinct types of Solr collections; they only share the
> same ZooKeeper quorum. The other two collections do not seem to have this
> problem, but I don't restart those as often as I restart this collection,
> as I am STILL trying to REPRODUCE the dreaded memory leak I reported having
> on 7.3 about two weeks ago. Sorry, but it drives me nuts!
>
> Thanks,
> Markus
>
> -----Original message-----
> > From:Shawn Heisey <apa...@elyograg.org>
> > Sent: Friday 8th June 2018 16:47
> > To: solr-user@lucene.apache.org
> > Subject: Re: 7.3.1 creates thousands of threads after start up
> >
> > On 6/8/2018 8:17 AM, Markus Jelsma wrote:
> > > Our local test environment mini cluster goes nuts right after start
> > > up. It is a two node/shard/replica collection that starts up normally
> > > if only one node starts up. But as soon as the second node attempts to
> > > join the cluster, both nodes go crazy, creating thousands of threads
> > > with identical stack traces.
> > >
> > > "qtp1458849419-4738" - Thread t@4738
> > >    java.lang.Thread.State: TIMED_WAITING
> > >     at sun.misc.Unsafe.park(Native Method)
> > >     - parking to wait for <6ee32168> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> > >     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> > >     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> > >     at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:600)
> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:663)
> > >     at java.lang.Thread.run(Thread.java:748)
> > >
> > >    Locked ownable synchronizers:
> > >     - None
> > >
> > > It does not always happen, but most of the time I am unable to boot
> > > the cluster normally. Sometimes, apparently right now for the first
> > > time, the GUI is still accessible.
> > >
> > > Is this a known issue?
> >
> > It's not a problem that I've heard of. There are no Solr classes in the
> > stacktrace, only Jetty and Java classes. I won't try to tell you that a
> > bug in Solr can't be the root cause, because it definitely can. The
> > threads appear to be created by Jetty, but the supplied info doesn't
> > indicate WHY it's happening.
> >
> > Presumably there's a previous version you've used where this problem did
> > NOT happen. What version would that be?
> >
> > Can you share the solr.log file from both nodes when this happens?
> > There might be a clue there.
> >
> > It sounds like you probably have a small number of collections in the
> > dev cluster. Can you confirm that?
> >
> > Thanks,
> > Shawn
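
One way to quantify the "thousands of threads with identical stack traces" reported above is to group the entries of a saved thread dump by their stack frames. The following is only a minimal sketch, not something posted in the thread: it assumes the dump is plain text in the same layout as the quoted excerpt (a line starting with a quoted thread name, followed by indented `at ...` frames), and the function name `group_threads_by_stack` is hypothetical.

```python
from collections import Counter


def group_threads_by_stack(dump_text):
    """Count threads in a text thread dump that share an identical stack.

    Assumes each thread entry begins with a line starting with a double
    quote (the thread name) and that stack frames start with "at ".
    """
    counts = Counter()
    frames = None  # frames of the thread entry currently being read
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.startswith('"'):
            # A quoted thread name opens a new entry; store the previous one.
            if frames is not None:
                counts[tuple(frames)] += 1
            frames = []
        elif line.startswith("at ") and frames is not None:
            frames.append(line[3:])
    if frames is not None:
        counts[tuple(frames)] += 1
    return counts
```

Calling `group_threads_by_stack(text).most_common(3)` on the contents of a dump lists the most frequent stacks with their thread counts, which makes it easy to see whether the idle `QueuedThreadPool` stack dominates on both nodes.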