On Sun, Feb 25, 2001 at 10:54:16PM +0100, Alex van den Bogaerdt wrote:
>
> Hi folks,
>
> Suppose I can create an array with 1234 elements, sort the values
> in this array, I should be able to fetch the 95th percentile from
> the array by selecting the correct element.
>
> However, I need to know two things:
>
> - 95% * 1234 == 1172.30
>   Do I take element 1171 or 1172 in this case?  (1172-1, or 1173-1,
>   as the array is zero based)
I believe there are statistical formulae for this; it seems to me I
recall for maximum accuracy you interpolate between the two samples
based on the fraction of the way through that it falls, so you would
take v(1171) + 0.30*(v(1172)-v(1171)).  (Similar to what you do when
calculating the median of an even number of samples, you take halfway
between the middle two values.)

> - If there are unknown entries in the database, it seems to me that
>   these unknowns should be processed in favor of the one paying the
>   bill (correct?)  If so, should I keep them in the array?

This question lies on some hazy boundary between statistics and ethics
and I am not sure I know the best answer.  The two alternatives I see
are: keep the samples in and treat as 0 (sort to bottom of the array),
or count only the number of samples for which you have data stored, and
take the 95%ile of those samples, ignoring those for which there is no
data.

  -- Clifton

-- 
Clifton Royston -- LavaNet Systems Architect -- [EMAIL PROTECTED]
WWJD? "JWRTFM!" - Scott Dorsey (kludge) "JWG" - Eddie Aikau
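
For reference, the interpolation suggested above and the two ways of
handling unknown samples can be sketched in Python roughly as follows.
This is only a minimal sketch under the thread's own convention (the
position is 0.95 * N, counted from 1, then shifted to a zero-based
index); other percentile definitions exist and give slightly different
values.  The function names and the use of None to mark an unknown
sample are assumptions made for illustration and are not part of
rrdtool.

```python
import math


def percentile_interpolated(samples, fraction=0.95):
    """Interpolate linearly between the two sorted samples around fraction * N."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)

    # Position counted from 1, as in the thread: 0.95 * 1234 == 1172.30
    position = fraction * n
    whole = math.floor(position)
    part = position - whole

    # Positions outside the array collapse onto the first or last sample.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]

    # Zero-based: position 1172.30 sits between v[1171] and v[1172].
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return lower + part * (upper - lower)


def percentile_with_unknowns(samples, fraction=0.95, unknown_as_zero=True):
    """Apply one of the two policies for unknown (None) samples.

    unknown_as_zero=True  keeps unknowns and counts them as 0, so they
                          sort to the bottom of the array.
    unknown_as_zero=False drops unknowns and uses only stored data.
    """
    if unknown_as_zero:
        cleaned = [0.0 if s is None else s for s in samples]
    else:
        cleaned = [s for s in samples if s is not None]
    return percentile_interpolated(cleaned, fraction)
```

With 1234 samples this picks v[1171] + 0.30 * (v[1172] - v[1171]), as
in the reply.  Counting unknowns as 0 can only lower the result or
leave it unchanged, so that policy is the one that favors whoever is
paying the bill.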
