On 3 January 2014 23:22, Philippe Mouawad <[email protected]> wrote:
> Thanks Rainer, this is of great help!
> Regarding Graphite, it offers a function to compute percentiles, so we
> could, as sebb proposed, send it the raw results.

No, I was not proposing that. I did not know that Graphite could do the
calculations. I was proposing that the calculations be done in the
back-end thread before sending to Graphite. The point was to separate
the collection of the data from its processing.

However, it would be nice if Graphite supported aggregated data; that
could reduce the amount that has to be sent to it.

> My concern was about the memory and network impact of sending all
> sample results to the backend.

Likewise.

> So, if we create what you propose, do you know an existing library
> that already implements it?

See my reply to Rainer. StatCalculator does it already.
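For the record, a rough sketch of the split I have in mind, reusing
jorphan's StatCalculatorLong (I am quoting the class and method names
from org.apache.jorphan.math from memory, so treat the exact signatures
as approximate; the queue and reporting wiring here is purely
illustrative, not a proposed API):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    import org.apache.jorphan.math.StatCalculatorLong;

    // Sketch: sampler threads only enqueue elapsed times; a single
    // consumer thread owns the (non-thread-safe) StatCalculatorLong,
    // so the statistics never need locking.
    public class StatsConsumer implements Runnable {

        private final BlockingQueue<Long> samples = new LinkedBlockingQueue<>();
        private final StatCalculatorLong calc = new StatCalculatorLong();

        // Called from sampler threads: a cheap, non-blocking hand-off.
        public void record(long elapsedMillis) {
            samples.offer(elapsedMillis);
        }

        @Override
        public void run() {
            long nextReport = System.currentTimeMillis() + 5000;
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Long sample = samples.poll(500, TimeUnit.MILLISECONDS);
                    if (sample != null) {
                        calc.addValue(sample.longValue());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
                if (System.currentTimeMillis() >= nextReport) {
                    // Only this thread ever touches calc, so reading it
                    // here is safe; e.g. send calc.getPercentPoint(0.90)
                    // to Graphite at this point.
                    nextReport += 5000;
                }
            }
        }
    }

Sampler threads never block on the statistics, and since only one
thread ever touches the calculator, its lack of thread-safety stops
mattering.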
> On Friday, January 3, 2014, Rainer Jung wrote:
>
>> On 03.01.2014 13:57, [email protected] wrote:
>> > https://issues.apache.org/bugzilla/show_bug.cgi?id=55932
>> >
>> > --- Comment #6 from Sebb <[email protected]> ---
>> > I have been having a look at the implementation.
>> >
>> > I don't really see that it needs Commons Math; we already have
>> > StatCalculator, which handles percentiles and more.
>> >
>> > Likewise, does it really need Commons Pool?
>> > It seems wrong to have to have 2 separate pools of SocketOutputStream
>> > instances. How many of these would there be?
>> >
>> > Also, DescriptiveStatistics is not thread-safe (nor is StatCalculator).
>> >
>> > If we do implement something like this, I think the data processing
>> > needs either to be carefully synchronised, or the raw data should be
>> > sent to a separate singleton background thread.
>>
>> FWIW: I always get a bit nervous when percentiles are calculated.
>> Percentiles are expensive to calculate if one needs exact results for
>> given percentages (50%, 99%, 99.9%, etc.). In that case one needs to
>> keep all values as an ordered list to calculate the percentiles. For a
>> long-running test that would be expensive in terms of memory but also
>> in terms of CPU (sorting). There is no way of exactly merging
>> percentiles from interim statistical data.
>>
>> Sometimes approximations are enough. By approximation I don't mean
>> estimated data, but percentages which are not exactly the ones you are
>> after. E.g. you would get a 48% value instead of a 50% value, or a
>> 99.02% value instead of a 99% value.
>>
>> Suppose you knew (configured) that only very few samples will take
>> longer than 1000ms; then one could create fixed bins for e.g. 10ms,
>> 15ms, 20ms, 25ms, 30ms, 40ms, 50ms, 75ms, 100ms, 150ms, 200ms, 250ms,
>> 300ms, 400ms, 500ms, 750ms and 1000ms. Now whenever a sample finishes,
>> you count it in the bin it belongs to and do not save the value itself
>> (of course you can still log it). At any time you can then look at the
>> bin counters and cheaply produce quantiles. For example, suppose the
>> bin counters look like this:
>>
>> Duration  binCount
>> 10ms      3
>> 15ms      2
>> 20ms      5
>> 25ms      10
>> 30ms      8
>> 40ms      20
>> 50ms      28
>> 75ms      100
>> 100ms     230
>> 150ms     610
>> 200ms     780
>> 250ms     530
>> 300ms     220
>> 400ms     200
>> 500ms     80
>> 750ms     90
>> 1000ms    50
>> >1000ms   30
>>
>> Now we sum the bin counts into cumulative counts and take percentages:
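To make the bin idea concrete, a quick sketch in plain Java (the bin
limits are the example values from the table above; nothing here is
JMeter-specific):

    import java.util.concurrent.atomic.AtomicLongArray;

    public class DurationHistogram {

        // Upper bounds of the bins, in milliseconds (the example values).
        private static final long[] LIMITS_MS = {
            10, 15, 20, 25, 30, 40, 50, 75, 100, 150,
            200, 250, 300, 400, 500, 750, 1000
        };

        // One extra slot counts everything above the last limit (">1000ms").
        private final AtomicLongArray counts =
            new AtomicLongArray(LIMITS_MS.length + 1);

        // O(bins) per sample, and no sample values are retained.
        public void record(long elapsedMillis) {
            int i = 0;
            while (i < LIMITS_MS.length && elapsedMillis > LIMITS_MS[i]) {
                i++;
            }
            counts.incrementAndGet(i);
        }

        // Returns the bin limit whose cumulative percentage first reaches
        // the requested fraction (e.g. 0.50): the "48% instead of 50%"
        // approximation described above.
        public long approxPercentile(double fraction) {
            long total = 0;
            for (int i = 0; i < counts.length(); i++) {
                total += counts.get(i);
            }
            if (total == 0) {
                return 0;
            }
            long threshold = (long) Math.ceil(fraction * total);
            long cumulative = 0;
            for (int i = 0; i < LIMITS_MS.length; i++) {
                cumulative += counts.get(i);
                if (cumulative >= threshold) {
                    return LIMITS_MS[i];
                }
            }
            return Long.MAX_VALUE; // the answer lies in the ">1000ms" bin
        }
    }

A binary search over the limits would be better with many more bins,
but the point stands: memory use is fixed no matter how long the test
runs.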
>> Duration  binCount  Cum.Count  Cum.Pct  binPct
>> 10ms      3         3          0.10%    0.10%
>> 15ms      2         5          0.17%    0.07%
>> 20ms      5         10         0.33%    0.17%
>> 25ms      10        20         0.67%    0.33%
>> 30ms      8         28         0.93%    0.27%
>> 40ms      20        48         1.60%    0.67%
>> 50ms      28        76         2.54%    0.93%
>> 75ms      100       176        5.87%    3.34%
>> 100ms     230       406        13.55%   7.68%
>> 150ms     610       1016       33.91%   20.36%
>> 200ms     780       1796       59.95%   26.03%
>> 250ms     530       2326       77.64%   17.69%
>> 300ms     220       2546       84.98%   7.34%
>> 400ms     200       2746       91.66%   6.68%
>> 500ms     80        2826       94.33%   2.67%
>> 750ms     90        2916       97.33%   3.00%
>> 1000ms    50        2966       99.00%   1.67%
>> >1000ms   30        2996       100.00%  1.00%
>>
>> The Cum.Pct column could be used instead of the percentiles. One does
>> not need to keep all sample values around and sort them, but one also
>> does not get equidistant percentiles (10%, 11%, 12%, ...).
>>
>> Making the table more fine-grained by using more rows is cheap. I have
>> chosen a short table here to make the example easier to understand. The
>> duration limits I chose for the bins were human-friendly (integral
>> numbers, often divisible by 10 or 100), but one could also use a
>> mathematically strict series, e.g. 100 logarithmic steps, or 10
>> logarithmic steps for each factor of 10 in duration. The bin limits
>> would then increase by about 26% from bin to bin, e.g. 100ms, 126ms,
>> 158ms, 200ms, 251ms, 316ms, 398ms, 501ms, 631ms, 794ms, 1000ms.
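Generating such a series is cheap, by the way. A throwaway sketch: with
10 steps per decade the ratio is 10^(1/10) ≈ 1.259, which is the ~26%
step mentioned above:

    // Sketch: log-spaced bin limits, stepsPerDecade steps per factor of
    // 10 in duration. For stepsPerDecade = 10 each limit is ~26% above
    // the previous one, reproducing the series 100, 126, ..., 794, 1000.
    public class LogBins {
        public static long[] limits(long startMillis, long endMillis,
                                    int stepsPerDecade) {
            double ratio = Math.pow(10, 1.0 / stepsPerDecade);
            int n = (int) Math.round(stepsPerDecade
                    * Math.log10((double) endMillis / startMillis)) + 1;
            long[] limits = new long[n];
            double limit = startMillis;
            for (int i = 0; i < n; i++) {
                limits[i] = Math.round(limit);
                limit *= ratio;
            }
            return limits;
        }

        public static void main(String[] args) {
            // Prints: 100 126 158 200 251 316 398 501 631 794 1000
            for (long l : limits(100, 1000, 10)) {
                System.out.print(l + " ");
            }
            System.out.println();
        }
    }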
>> Users who need integral percentiles, like exactly 50% or 99% values
>> instead of 48.2% or 99.05% values (for example), would need either to
>> post-process the sample logs or to choose more expensive live
>> processing (in terms of memory and CPU), be it inside JMeter or in
>> another backend. I'd expect that the non-integral percentiles above
>> are often good enough to follow a running test and get sufficient data
>> about its behavior (and are cheap enough to produce on the fly), and
>> that the integral percentiles are OK to generate during
>> post-processing and full report generation after a run.
>>
>> Just my 2c.
>>
>> Regards,
>>
>> Rainer
>
> --
> Cordialement.
> Philippe Mouawad.