[ 
https://issues.apache.org/jira/browse/SOLR-9135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300443#comment-15300443
 ] 

Ronald Braun commented on SOLR-9135:
------------------------------------

I can run "uname -a" successfully if I ssh into the running container and try
it.  It seems the container is in a state where any exec attempted by Solr
itself fails (the process immediately enters D state), after which the admin
page hangs for a few seconds before timing out and trying again.  This is
almost certainly a by-product of a problematic setup on our part, which we are
sorting out.  My main concern is that the admin page forks an external process
at all on a simple page load, and that when the fork fails it keeps retrying
ad infinitum, consuming our process pool and blocking any access to admin
functions.  It seems like an unnecessary coupling to external system state
given the data being fetched, and best avoided.
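
One way to limit the damage described above is to keep the exec off the
request thread entirely.  The following is only a minimal sketch under that
assumption, not how SystemInfoHandler works or was later fixed: BoundedLookup
and its fetch method are hypothetical names, and the lookup passed in stands
for whatever code runs "uname -a".

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not Solr code): keeps a slow lookup off the request thread. */
final class BoundedLookup {
    private final Callable<String> lookup;
    // A single daemon worker: a hung lookup can occupy at most this one thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sysinfo-lookup");
        t.setDaemon(true);
        return t;
    });
    private Future<String> inFlight; // guarded by "this"

    BoundedLookup(Callable<String> lookup) {
        this.lookup = lookup;
    }

    /** Returns the lookup result, or fallback if it is not ready within timeoutMillis. */
    String fetch(long timeoutMillis, String fallback) {
        Future<String> f;
        synchronized (this) {
            // Reuse a call that is still pending instead of queueing another one.
            if (inFlight == null) {
                inFlight = worker.submit(lookup);
            }
            f = inFlight;
        }
        try {
            String result = f.get(timeoutMillis, TimeUnit.MILLISECONDS);
            clear(f);
            return result;
        } catch (TimeoutException e) {
            return fallback; // still pending; later callers wait on the same Future
        } catch (ExecutionException e) {
            clear(f);
            return fallback; // the lookup itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    private synchronized void clear(Future<String> finished) {
        if (inFlight == finished) {
            inFlight = null;
        }
    }
}
```

With this shape, a request thread waits at most timeoutMillis and then
returns the fallback, and repeated page loads share the one pending call
rather than each forking a new process.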

> SystemInfoHandler can poison / consume Jetty thread pool
> --------------------------------------------------------
>
>                 Key: SOLR-9135
>                 URL: https://issues.apache.org/jira/browse/SOLR-9135
>             Project: Solr
>          Issue Type: Bug
>         Environment: Solr 6.0.0
>            Reporter: Ronald Braun
>            Priority: Minor
>
> We are running Solr 6.0.0 in SolrCloud mode within a Docker container.  We
> encountered an issue whereby the SystemInfoHandler was forking processes
> that would immediately enter D state (uninterruptible sleep) due to a
> container volume issue after we hit the admin manager in a browser.  The
> thread stays in the runnable state:
> {noformat}
> "qtp43368234-13611" #13611 prio=5 os_prio=0 tid=0x00007f0260011800 nid=0x36fb runnable [0x00007efa0bce1000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.UNIXProcess.forkAndExec(Native Method)
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>         at java.lang.Runtime.exec(Runtime.java:620)
>         at java.lang.Runtime.exec(Runtime.java:450)
>         at java.lang.Runtime.exec(Runtime.java:347)
>         at org.apache.solr.handler.admin.SystemInfoHandler.execute(SystemInfoHandler.java:244)
>         at org.apache.solr.handler.admin.SystemInfoHandler.getSystemInfo(SystemInfoHandler.java:198)
>         at org.apache.solr.handler.admin.SystemInfoHandler.handleRequestBody(SystemInfoHandler.java:111)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>         at org.apache.solr.handler.admin.InfoHandler.handleRequestBody(InfoHandler.java:86)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>         at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:518)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The problematic command being executed was 'uname -a'.  The admin manager
> would show a "Lost connection to solr" message but presumably retried the
> connection periodically (at least a couple of times a minute).  Before we
> figured out what was going on, we had 600+ threads in D state:
> {noformat}
> 4433 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4434 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4439 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4440 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.04 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4461 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4462 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4467 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4470 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4486 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4487 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.06 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4488 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4489 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4496 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4497 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> 4501 solr 20 0 0.399t 0.105t 0.057t D 0.0 86.2 0:00.00 /usr/lib/jvm/java-8-oracle/bin/java -server -Xms48g -Xmx48g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThre+
> etc.
> {noformat}
> An OS exec call is a bit heavy for loading the admin page.  Might you
> consider either:
> - loading this info once at startup and storing it
> - using a collapsed panel for display and fetching only on expansion / request
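
The first suggestion (load once at startup and store) could look roughly
like the sketch below.  This is an illustration under stated assumptions, not
the change that went into Solr: StartupInfoCache and runOnce are hypothetical
names, and the "unavailable" placeholder is invented for the example.  Note
that the timeout only covers a child that starts and then stalls; a hang
inside the fork itself, as in the stack trace above, would still block the
caller, which at startup is at least not a Jetty request thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical cache (not Solr code): computes a value once and serves it afterwards. */
final class StartupInfoCache {
    private final String value;

    /** Invokes the loader exactly once, at construction time. */
    StartupInfoCache(Supplier<String> loader) {
        this.value = loader.get();
    }

    /** Returns the stored value; never forks a process. */
    String get() {
        return value;
    }

    /** Runs a command once and returns its output, or "unavailable" on failure or timeout. */
    static String runOnce(long timeoutSeconds, String... command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            // Waiting before reading is fine for tiny outputs such as "uname -a";
            // a command with large output would need to be drained concurrently.
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();
                return "unavailable";
            }
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                return r.lines().collect(Collectors.joining("\n")).trim();
            }
        } catch (IOException e) {
            return "unavailable";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "unavailable";
        }
    }
}
```

A caller might build it once with
new StartupInfoCache(() -> StartupInfoCache.runOnce(5, "uname", "-a")) and
have every later admin page load call get(), so a broken container state
costs one failed exec instead of one per request.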


