Hi,
thanks for the quick response.
In the meantime we have tried removing group.facet=true from the set of
parameters and could not reproduce the problem with the same stress
test, so I'd say there's an 80% chance this is the root cause.
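In case it helps anyone reproduce: the A/B test is essentially the same
request with and without group.facet, something like the (heavily
simplified) pair below - host, port and core name are just our test
setup, the full parameter set is in the quoted log at the bottom.

  # grouping + group.facet (slow under the stress test)
  curl "http://localhost:8983/solr/kcore/select?q=*:*&group=true&group.field=edition&group.ngroups=true&group.facet=true&facet=true&facet.field=categoryid&rows=12"

  # identical query with group.facet removed (problem not reproducible)
  curl "http://localhost:8983/solr/kcore/select?q=*:*&group=true&group.field=edition&group.ngroups=true&facet=true&facet.field=categoryid&rows=12"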
We have also tried Solr 6.4.1; the same problem occurs.
There is only a small number of documents (100 - 200K); the index on
disk is only 544 MB. Very low traffic - definitely less than 10 qps.
No OOM errors.
The GC log shows:
2017-03-04T18:40:48.056+0000: 407.215: Total time for which application
threads were stopped: 19.2522394 seconds, Stopping threads took:
0.0003600 seconds
2017-03-04T18:40:49.135+0000: 408.294: Total time for which application
threads were stopped: 0.0256060 seconds, Stopping threads took:
0.0240290 seconds
2017-03-04T18:40:50.146+0000: 409.305: Total time for which application
threads were stopped: 0.0106780 seconds, Stopping threads took:
0.0090890 seconds

19 seconds is worrying.
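For anyone following along, those "Total time for which application
threads were stopped" lines come from -XX:+PrintGCApplicationStoppedTime;
the full set of GC logging flags we'd expect on Java 8 is something like
this (the log path is just an example):

  -Xloggc:/var/solr/logs/solr_gc.log
  -verbose:gc
  -XX:+PrintGCDetails
  -XX:+PrintGCDateStamps
  -XX:+PrintGCTimeStamps
  -XX:+PrintGCApplicationStoppedTime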

I will try some traces when I'm not under stress test myself.
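Most likely just a few jstack thread dumps against the Solr pid while
the stress test is running, roughly:

  # three dumps ~10s apart (replace <solr-pid> with the actual pid)
  for i in 1 2 3; do jstack <solr-pid> > /tmp/solr-threads.$i.txt; sleep 10; done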

Thanks
 Marek
 



> The "Unable to write response, client closed connection or we are
> shutting down" bits mean you're timing out. Or maybe something much
> more serious. You can up the timeouts, but that's not particularly
> useful since the response is so long anyway.
>
> Before jumping to conclusions, I'd _really_ recommend you figure out
> the root cause. First set up jmeter or the like so you can create a
> stress test and reproduce this at will on a test machine.
>
> Things I'd check:
>
> - At what point do things get slow? 10 QPS? 100 QPS? 1,000 QPS? Let's get a
>   benchmark here for a reality check. If you're throwing 1,000 QPS at a single
>   Solr instance that's simply unrealistic. 100 QPS/node is on the high side of
>   what I'd expect.
> - How many docs do you have on a node?
> - Look at your Solr logs for any anomalies, particularly OOM errors.
> - Turn on GC logs and see if you're spending an inordinate amount of time in
>   GC. Note you can get a clue if this is the issue by just increasing the JVM
>   heap as a quick test. Not conclusive, but if you give the app another 4G and
>   your timings change radically, problem identified.
> - That JIRA you pointed to is unlikely to be the real issue since your
>   performance is OK to start. It's still possible, but...
> - Attach a profiler to see where the time is being spent. Must be on a test
>   machine since profilers are generally intrusive.
> - Grab a couple of stack traces and see if that sheds a clue.
> I really have to emphasize, though, that until you do a Root Cause
> Analysis, you're just guessing. Going to 6.4 and using JSON facets is a
> shot in the dark.
>
> Best,
> Erick
>
>
>
> On Sat, Mar 4, 2017 at 8:45 AM, Marek Tichy <ma...@gn.apc.org> wrote:
>> Hi,
>>
>> I'm in a bit of a crisis here. Trying to deploy a new search on an
>> ecommerce website which has been tested (but not stress tested). The
>> core has been running for years without any performance problems but we
>> have now changed two things:
>>
>> 1) started using group.facet=true in a rather complicated query - see below
>>
>> 2) added a new core with suggester component
>>
>> Solr version was 5.2, upgraded to 5.5.4 to try, no improvement.
>>
>> What happens under real load is that query response times climb
>> above 10000 ms and most requests end up like this:
>> org.apache.solr.servlet.HttpSolrCall; Unable to write response, client
>> closed connection or we are shutting down
>>
>> Could it be this issue: https://issues.apache.org/jira/browse/SOLR-4763 ?
>> And if so, would upgrading to 6.4 help, or changing the app to start
>> using JSON.facet?
>>
>> Any help would be greatly appreciated.
>>
>> Thanks
>>
>> Marek
>>
>>
>> INFO  - 2017-03-04 16:04:42.619; [   x:kcore]
>> org.apache.solr.core.SolrCore; [kcore]  webapp=/solr path=/select
>> params={f.ebook_formats.facet.mincount=1&f.languageid.facet.limit=10&f.ebook_formats.facet.limit=10&fq=((type:knihy)+OR+(type:defekty))&fq=authorid:(27544)&f.thematicgroupid.facet.mincount=1&group.ngroups=true&group.ngroups=true&f.type.facet.limit=10&group.facet=true&f.articleparts.facet.mincount=1&f.articleparts.facet.limit=10&group.field=edition&group=true&facet.field=categoryid&facet.field={!ex%3Dat}articletypeid_grouped&facet.field={!ex%3Dat}type&facet.field={!ex%3Dsw}showwindow&facet.field={!ex%3Dtema}thematicgroupid&facet.field={!ex%3Dformat}articleparts&facet.field={!ex%3Dformat}ebook_formats&facet.field={!ex%3Dlang}languageid&f.categoryid.facet.mincount=1&group.limit=30&start=0&f.type.facet.mincount=1&f.thematicgroupid.facet.limit=10&sort=score+desc&rows=12&version=2.2&f.languageid.facet.mincount=1&q=&group.truncate=false&group.format=grouped&f.showwindow.facet.mincount=1&f.articletypeid_grouped.facet.mincount=1&f.categoryid.facet.limit=100&f.showwindow.facet.limit=10&f.articletypeid_grouped.facet.limit=10&facet=true}
>> hits=1 status=0 QTime=19214
>>
>>
>>
>>
>>
>>

