A client had a Solr instance doing absolutely nothing for a month:
literally a test system that sat idle. When they finally tried to do
something with it, they couldn't. The Solr process had over 16K threads,
even though no indexing or querying was going on.
They investigated and found that Solr couldn't connect to
Zookeeper and had a zillion threads (well, actually about 16K, which
was their limit) with the stack trace shown at the end of this message.
Admittedly the client had a weird situation where Solr couldn't talk
to Zookeeper, and admittedly Solr can't do much if it can't talk to
ZK.
Even so, this seems odd. I'm also a bit worried about what happens if they
fix whatever is keeping Solr from talking to ZK (maybe it's a firewall
issue? plug the cable back in?): when all those threads suddenly get to do
their thing, what will happen? Not to mention any effects on other processes.
Anyway, if this is worth a JIRA, I can create one if there isn't one already.
Solr 4.10
Here's the stack trace:
"main-EventThread" daemon prio=10 tid=0x000000000a38d000 nid=0xeb51 in Object.wait() [0x00007e8c15c89000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:215)
    - locked <0x00000003c5306fc0> (a org.apache.solr.common.cloud.ConnectionManager)
    at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:138)
    at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:56)
    at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
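
In case anyone wants to check whether their own node is in the same state, here is a rough sketch that counts how many threads in a saved jstack dump have a given frame on their stack. The class and method names (ThreadDumpCounter, countThreadsWithFrame) are made up for illustration and are not part of Solr, and it assumes every thread entry in the dump starts with a line beginning with the quoted thread name, as in the trace above.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Solr or ZooKeeper.
public class ThreadDumpCounter {

    // Counts thread entries whose stack contains the given frame text.
    // Assumption: a new thread entry starts with a line beginning with '"'.
    public static int countThreadsWithFrame(List<String> dumpLines, String frame) {
        int count = 0;
        boolean inThread = false;
        boolean matched = false;
        for (String line : dumpLines) {
            if (line.startsWith("\"")) {
                // Header of the next thread: close out the previous entry.
                if (inThread && matched) {
                    count++;
                }
                inThread = true;
                matched = false;
            } else if (inThread && line.contains(frame)) {
                matched = true;
            }
        }
        // Close out the last entry in the dump.
        if (inThread && matched) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java ThreadDumpCounter <jstack-dump-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        int waiting = countThreadsWithFrame(lines, "ConnectionManager.waitForConnected");
        System.out.println("threads in ConnectionManager.waitForConnected: " + waiting);
    }
}
```

Run it against a file produced with something like "jstack <pid> > dump.txt". Matching on the frame text rather than the thread name is deliberate, since the name alone does not say where a thread is parked.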
Erick