Re: Proactively going down in anticipation of high load / bad state
Hi Erick,

I have submitted a patch at https://issues.apache.org/jira/browse/SOLR-7121
I will add tests to it if the approach is acceptable.

Thanks,
Sachin

On Tue, Jan 6, 2015 at 12:26 PM, Erick Erickson wrote:
> If you have done some coding, it's always appropriate to open a JIRA
> and attach the patch for discussion.
>
> Yonik's Law of Patches:
>
> "A half-baked patch in Jira, with no documentation, no tests
> and no backwards compatibility is better than no patch at all."
>
> Even if the approach is shot down, it may spawn alternative approaches
> or stimulate thinking.
>
> Best,
> Erick
>
> On Tue, Jan 6, 2015 at 11:50 AM, S G wrote:
> > Hi,
> >
> > For a SolrCloud cluster, is there a setting that allows a core to
> > proactively go down if it is able to detect temporary issues such as
> > high GC, high thread-counts, temporary network slowdowns, etc.?
> > Currently we see that a node gets into a distributed deadlock because
> > it is not able to detect such situations.
> >
> > I am exploring the Solr code to see if it is possible to take some
> > proactive action in such cases.
> > One way could be to have configurable limits for GC time, thread-count,
> > response-time, 5-minute-rate, etc. and make a core shut down if it
> > senses problems.
> > Once that happens, a background thread will monitor the trouble-causing
> > parameters and recover the downed core when the situation improves.
> >
> > My current patch can bring down a core for:
> > 1) High thread-counts,
> > 2) High 95thPcRequestTime,
> > 3) Huge # of heavy queries in a given time.
> >
> > The patch also recovers the core when its health improves.
> >
> > If the above seems doable, then I can create a JIRA for more discussion
> > and implementation.
> >
> > Thanks
> > Sachin
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Proactively going down in anticipation of high load / bad state
If you have done some coding, it's always appropriate to open a JIRA
and attach the patch for discussion.

Yonik's Law of Patches:

"A half-baked patch in Jira, with no documentation, no tests
and no backwards compatibility is better than no patch at all."

Even if the approach is shot down, it may spawn alternative approaches
or stimulate thinking.

Best,
Erick

On Tue, Jan 6, 2015 at 11:50 AM, S G wrote:
> Hi,
>
> For a SolrCloud cluster, is there a setting that allows a core to
> proactively go down if it is able to detect temporary issues such as
> high GC, high thread-counts, temporary network slowdowns, etc.?
> Currently we see that a node gets into a distributed deadlock because
> it is not able to detect such situations.
>
> I am exploring the Solr code to see if it is possible to take some
> proactive action in such cases.
> One way could be to have configurable limits for GC time, thread-count,
> response-time, 5-minute-rate, etc. and make a core shut down if it
> senses problems.
> Once that happens, a background thread will monitor the trouble-causing
> parameters and recover the downed core when the situation improves.
>
> My current patch can bring down a core for:
> 1) High thread-counts,
> 2) High 95thPcRequestTime,
> 3) Huge # of heavy queries in a given time.
>
> The patch also recovers the core when its health improves.
>
> If the above seems doable, then I can create a JIRA for more discussion
> and implementation.
>
> Thanks
> Sachin
Proactively going down in anticipation of high load / bad state
Hi,

For a SolrCloud cluster, is there a setting that allows a core to
proactively go down if it is able to detect temporary issues such as
high GC, high thread-counts, temporary network slowdowns, etc.?
Currently we see that a node gets into a distributed deadlock because
it is not able to detect such situations.

I am exploring the Solr code to see if it is possible to take some
proactive action in such cases.
One way could be to have configurable limits for GC time, thread-count,
response-time, 5-minute-rate, etc. and make a core shut down if it
senses problems.
Once that happens, a background thread will monitor the trouble-causing
parameters and recover the downed core when the situation improves.

My current patch can bring down a core for:
1) High thread-counts,
2) High 95thPcRequestTime,
3) Huge # of heavy queries in a given time.

The patch also recovers the core when its health improves.

If the above seems doable, then I can create a JIRA for more discussion
and implementation.

Thanks
Sachin
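
To make the proposal above concrete, here is a minimal sketch of the
threshold check in plain Java. It is not the code from the patch and it
uses no Solr APIs: the class name CoreHealthGate, the Metrics holder, and
all limit values are hypothetical and only illustrate the idea of
configurable limits for the three signals listed (thread count, 95th
percentile request time, heavy queries in a window).

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical sketch only: none of these names come from Solr or from
 * the patch. It marks a core "down" while any configured limit is
 * exceeded and "up" again once all signals are back within limits.
 */
class CoreHealthGate {

    /** Snapshot of the three signals named in the message. */
    static final class Metrics {
        final int threadCount;
        final double p95RequestTimeMs;
        final int heavyQueriesInWindow;

        Metrics(int threadCount, double p95RequestTimeMs, int heavyQueriesInWindow) {
            this.threadCount = threadCount;
            this.p95RequestTimeMs = p95RequestTimeMs;
            this.heavyQueriesInWindow = heavyQueriesInWindow;
        }
    }

    // Configurable limits (assumed to come from configuration in a real setup).
    private final int maxThreads;
    private final double maxP95Ms;
    private final int maxHeavyQueries;

    // Current state of the core as seen by this gate; starts as "up".
    private final AtomicBoolean up = new AtomicBoolean(true);

    CoreHealthGate(int maxThreads, double maxP95Ms, int maxHeavyQueries) {
        this.maxThreads = maxThreads;
        this.maxP95Ms = maxP95Ms;
        this.maxHeavyQueries = maxHeavyQueries;
    }

    /** True if at least one signal is above its limit. */
    boolean breachesLimit(Metrics m) {
        return m.threadCount > maxThreads
                || m.p95RequestTimeMs > maxP95Ms
                || m.heavyQueriesInWindow > maxHeavyQueries;
    }

    /**
     * Intended to be called periodically by a monitoring thread.
     * Updates the up/down state and returns the new state.
     */
    boolean evaluate(Metrics m) {
        boolean healthy = !breachesLimit(m);
        up.set(healthy);
        return healthy;
    }

    boolean isUp() {
        return up.get();
    }
}
```

The background monitor described above could call evaluate() at a fixed
interval, for example from a java.util.concurrent.ScheduledExecutorService,
and trigger the actual shutdown or recovery of the core on a state change.
A real implementation would probably also want some hysteresis (separate
"go down" and "come back" limits) so that a core does not flap when a
signal hovers around its threshold.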