To calculate percentiles we need all the data points. If there is a lot of data, it could be sampled.
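As a rough illustration of the sampling idea, here is a minimal Python sketch, not anything taken from Solr. It keeps a fixed-size uniform sample of response times with reservoir sampling (Algorithm R) and reads a nearest-rank percentile off that sample. The names `reservoir_sample` and `percentile`, and the sample size, are made up for this example.

```python
import math
import random


def reservoir_sample(values, k, rng=None):
    """Keep a uniform random sample of at most k items from a stream (Algorithm R)."""
    rng = rng or random.Random()
    sample = []
    for i, value in enumerate(values):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(value)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = value
    return sample


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("percentile of empty data")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical response times in seconds, skewed toward fast responses.
    rng = random.Random(42)
    times = [rng.expovariate(1.0) for _ in range(100_000)]
    sample = reservoir_sample(times, 1_000, rng)
    print("95th percentile, all data:", percentile(times, 95))
    print("95th percentile, sample:  ", percentile(sample, 95))
```

The sampled estimate will not match the full-data value exactly. A larger reservoir narrows the gap at the cost of more memory per handler.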
An average can be calculated from the total time and the number of requests. Snapshots of those two values allow snapshots of averages.

But averages are the wrong metric for a one-sided distribution like response time. Let’s assume that any response longer than 10 seconds is a bad experience. Percentiles will tell you what response time 95% of customer searches are getting. With averages, a single 30-second response will increase the metric, even though it is “just as broken” as a 15-second response.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Nov 15, 2016, at 7:27 AM, Ryan Josal <rjo...@gmail.com> wrote:
>
> I haven't tried for 95th percentile, but generally with those collection start stats you would monitor based on calculated deltas. You can figure out the average response time for any given window of time not smaller than your snapshot polling interval. I don't see why 95th percentile would be any different.
>
> Ryan
>
> On Monday, November 14, 2016, Walter Underwood <wun...@wunderwood.org> wrote:
> Because the current stats are not usable. They really should be removed from the code.
>
> They calculate percentiles since the last collection load. We need to know 95th percentile during the peak hour last night, not the 95th for the last month.
>
> Right now, we run eleven collections in our Solr 4 cluster. In each collection, we have several different handlers. Usually, one for autosuggest (instant results), one for the SRP, and one for mobile, though we also have SEO requests and so on. We can track performance for each of these.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
>> On Nov 14, 2016, at 3:54 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>> Point taken, and thanks for the link. The stats I'm referring to in this thread are available now, and would (I think) be a quick win. I don't have a huge amount of investment in it though, more "why didn't we think of this before?" followed by "maybe there's a very good reason not to bother". This may be it since we now standardize on Jetty. My question of course is whether this would be supported moving forward to netty or whatever...
>>
>> Best,
>> Erick
>>
>> On Mon, Nov 14, 2016 at 3:44 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>> I’m not fond of polling for performance stats. I’d rather have the app report them.
>>>
>>> We could integrate existing Jetty monitoring:
>>>
>>> http://metrics.dropwizard.io/3.1.0/manual/jetty/
>>>
>>> From our experience with a similar approach, we might need some Solr-specific metric conflation. SolrJ sends a request to /solr/collection/handler as /solr/collection/select?qt=/handler. In our code, we fix that request to the intended path. We’ve been running a Tomcat metrics search filter for three years.
>>>
>>> Also, see:
>>>
>>> https://issues.apache.org/jira/browse/SOLR-8785
>>>
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/ (my blog)
>>>
>>>
>>> On Nov 14, 2016, at 3:25 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>> What do people think about exposing a Collections API call (name TBD, but the sense is PERFORMANCESTATS) that would simply issue the admin/mbeans call to each replica of a collection and report them back. This would give operations monitors the ability to see, say, anomalous replicas that had poor average response times for the last 5 minutes and the like.
>>>
>>> Seems like an easy enhancement that would make ops people's lives easier.
>>>
>>> I'll raise a JIRA if there's interest, but sure won't make progress on it until I clear my plate of some other JIRAs that I've let linger for far too long.
>>>
>>> Erick
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org