Hi,

thanks for the quick response. In the meantime we tried removing group.facet=true from the set of parameters and couldn't reproduce the problem using the same stress test, so I'd say there is an 80% chance this is the root cause.

We have also tried Solr 6.4.1; the same problem occurs.

There is only a small number of documents (100-200K), and the index on disk is only 544 MB.
Very low traffic, definitely less than 10 QPS.
No OOM errors.

The GC log shows:

2017-03-04T18:40:48.056+0000: 407.215: Total time for which application threads were stopped: 19.2522394 seconds, Stopping threads took: 0.0003600 seconds
2017-03-04T18:40:49.135+0000: 408.294: Total time for which application threads were stopped: 0.0256060 seconds, Stopping threads took: 0.0240290 seconds
2017-03-04T18:40:50.146+0000: 409.305: Total time for which application threads were stopped: 0.0106780 seconds, Stopping threads took: 0.0090890 seconds

19 seconds is worrying. I will try some traces when I'm not under stress test myself.

Thanks
Marek

> The "Unable to write response, client closed connection or we are
> shutting down" bits mean you're timing out. Or maybe something much
> more serious. You can up the timeouts, but that's not particularly
> useful since the response is so long anyway.
>
> Before jumping to conclusions, I'd _really_ recommend you figure out
> the root cause. First set up jmeter or the like so you can create a
> stress test and reproduce this at will on a test machine.
>
> Things I'd check:
>
>> At what point do things get slow? 10 QPS? 100 QPS, 1,000 QPS? Let's get a
>> benchmark here for a reality check. If you're throwing 1,000 QPS at a single
>> Solr instance that's simply unrealistic. 100 QPS/node is on the high side of
>> what I'd expect.
>> how many docs do you have on a node?
>> look at your Solr logs for any anomalies, particularly OOM errors.
>> turn on GC logs and see if you're spending an inordinate amount of time in
>> GC. Note you can get a clue if this is the issue by just increasing the JVM
>> heap as a quick test. Not conclusive, but if you give the app another 4G and
>> your timings change radically, problem identified.
>> That JIRA you pointed to is unlikely to be the real issue since your
>> performance is OK to start. It's still possible, but..
>> attach a profiler to see where the time is being spent. Must be on a test
>> machine since profilers are generally intrusive.
>> Grab a couple of stack traces and see if that sheds a clue.
> I really have to emphasize, though, that until you do a Root Cause
> Analysis, you're just guessing. Going to 6.4 and using JSON facets is a
> shot in the dark.
>
> Best,
> Erick
>
>
>
> On Sat, Mar 4, 2017 at 8:45 AM, Marek Tichy <ma...@gn.apc.org> wrote:
>> Hi,
>>
>> I'm in a bit of a crisis here. Trying to deploy a new search on an
>> ecommerce website which has been tested (but not stress tested).
>> The core has been running for years without any performance problems but we
>> have now changed two things:
>>
>> 1) started using group.facet=true in a rather complicated query - see below
>>
>> 2) added a new core with suggester component
>>
>> Solr version was 5.2, upgraded to 5.5.4 to try, no improvement.
>>
>> What happens under real load is the query response times start getting
>> higher > 10000 and most requests end up like this:
>> org.apache.solr.servlet.HttpSolrCall; Unable to write response, client
>> closed connection or we are shutting down
>>
>> Could it be this issue https://issues.apache.org/jira/browse/SOLR-4763 ?
>> And if so, would upgrading to 6.4 help or changing the app to start
>> using JSON.facet ?
>>
>> Any help would be greatly appreciated.
>>
>> Thanks
>>
>> Marek
>>
>>
>> INFO - 2017-03-04 16:04:42.619; [ x:kcore]
>> org.apache.solr.core.SolrCore; [kcore] webapp=/solr path=/select
>> params={f.ebook_formats.facet.mincount=1&f.languageid.facet.limit=10&f.ebook_formats.facet.limit=10&fq=((type:knihy)+OR+(type:defekty))&fq=authorid:(27544)&f.thematicgroupid.facet.mincount=1&group.ngroups=true&group.ngroups=true&f.type.facet.limit=10&group.facet=true&f.articleparts.facet.mincount=1&f.articleparts.facet.limit=10&group.field=edition&group=true&facet.field=categoryid&facet.field={!ex%3Dat}articletypeid_grouped&facet.field={!ex%3Dat}type&facet.field={!ex%3Dsw}showwindow&facet.field={!ex%3Dtema}thematicgroupid&facet.field={!ex%3Dformat}articleparts&facet.field={!ex%3Dformat}ebook_formats&facet.field={!ex%3Dlang}languageid&f.categoryid.facet.mincount=1&group.limit=30&start=0&f.type.facet.mincount=1&f.thematicgroupid.facet.limit=10&sort=score+desc&rows=12&version=2.2&f.languageid.facet.mincount=1&q=&group.truncate=false&group.format=grouped&f.showwindow.facet.mincount=1&f.articletypeid_grouped.facet.mincount=1&f.categoryid.facet.limit=100&f.showwindow.facet.limit=10&f.articletypeid_grouped.facet.limit=10&facet=true}
>> hits=1 status=0 QTime=19214
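
For anyone who wants to scan a whole GC log for pauses like the 19-second one quoted at the top of this message, here is a minimal sketch in Python. It assumes the log lines have exactly the "Total time for which application threads were stopped" format shown there. The names PAUSE_RE and long_pauses and the 1-second default threshold are made up for illustration; none of this comes from Solr or the JVM.

```python
import re

# Matches the "application threads were stopped" lines in the format
# pasted in the GC log excerpt (ISO timestamp, uptime, stopped time).
PAUSE_RE = re.compile(
    r"^(?P<stamp>\S+): [\d.]+: Total time for which application threads "
    r"were stopped: (?P<stopped>[\d.]+) seconds"
)


def long_pauses(lines, threshold_s=1.0):
    """Return (timestamp, seconds) for every pause longer than threshold_s.

    threshold_s is an arbitrary illustrative default, not a recommendation.
    """
    found = []
    for line in lines:
        m = PAUSE_RE.match(line.strip())
        if m and float(m.group("stopped")) > threshold_s:
            found.append((m.group("stamp"), float(m.group("stopped"))))
    return found


# The three lines from the GC log excerpt in this thread.
sample = [
    "2017-03-04T18:40:48.056+0000: 407.215: Total time for which application "
    "threads were stopped: 19.2522394 seconds, Stopping threads took: 0.0003600 seconds",
    "2017-03-04T18:40:49.135+0000: 408.294: Total time for which application "
    "threads were stopped: 0.0256060 seconds, Stopping threads took: 0.0240290 seconds",
    "2017-03-04T18:40:50.146+0000: 409.305: Total time for which application "
    "threads were stopped: 0.0106780 seconds, Stopping threads took: 0.0090890 seconds",
]

for stamp, secs in long_pauses(sample):
    print(f"{stamp}  stopped for {secs:.3f}s")
```

To run it against a real log, pass an open file object instead of the sample list, e.g. long_pauses(open(path)), where path is wherever your GC log is written. Lining up the reported timestamps with the slow QTime entries in the Solr log should show whether the slow requests coincide with the pauses.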