Re: How do you (not how do I) calculate 95th percentile?

2006-02-23 Thread Daniel Roesen

On Wed, Feb 22, 2006 at 05:46:01PM -0500, Russell, David wrote:
 I personally think that 5 minute sampling is so last century

s/5 minute sampling/polling/

RWSL[1] do deliver their accounting data via scp or FTP to
collector hosts by themselves. Push instead of pull/poll.

SNMP counter polling for accounting is a real pain.


Regards,
Daniel

[1] Routers Which Suck Less

-- 
CLUE-RIPE -- Jabber: [EMAIL PROTECTED] -- [EMAIL PROTECTED] -- PGP: 0xA85C8AA0


How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Jo Rhett

I am wondering what other people are doing for 95th percentile calculations
these days.  Not how you gather the data, but how often you check the
counter? Do you use averages or maximums over time periods to create the 
buckets used for the 95th percentile calculation?

A lot of smaller folks check the counter every 5 min and use that same
value for the 95th percentile.  Most of us larger folks need to check more 
often to prevent 32bit counters from rolling over too often.  Are you larger
folks averaging the retrieved values over a larger period?  Using the
maximum within a larger period?  Or just using your saved values?
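To make the averages-versus-maximums question concrete, here is a minimal sketch in Python. The 1-minute rates are invented for illustration, and bucketize() and mean() are hypothetical helpers, not anything a particular provider is known to use. It only shows how the same samples give different 5-minute buckets depending on the reduction, and those buckets are what would then feed the 95th percentile calculation.

```python
def bucketize(samples, size, reducer):
    """Collapse consecutive runs of `size` samples into one value each."""
    return [reducer(samples[i:i + size]) for i in range(0, len(samples), size)]


def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


# Hypothetical 1-minute rates in Mbps: 15 minutes with one short burst.
one_minute = [10, 10, 10, 10, 10,
              10, 80, 10, 10, 10,
              20, 20, 20, 20, 20]

five_min_avg = bucketize(one_minute, 5, mean)  # burst is smoothed away
five_min_max = bucketize(one_minute, 5, max)   # burst survives in its bucket

print("5-minute averages:", five_min_avg)
print("5-minute maximums:", five_min_max)
```

With averages the one-minute burst is diluted across its bucket; with maximums it sets the bucket value outright.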

This is curiosity only.  A few years ago we compared the same data and the
answers varied wildly.  It would appear from my latest check that it is
becoming more standardized on 5-minute averages, so I'm asking here on Nanog 
as a reality check.

Note: I have AboveNet, Savvis, Verio, etc calculations.  I'm wondering
if there are any other odd combinations out there.

Reply to me offlist.  If there is interest I'll summarize the results
without identifying the source.

-- 
Jo Rhett
senior geek
SVcolo : Silicon Valley Colocation


Re: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Tom Sands




Jo Rhett wrote:


I am wondering what other people are doing for 95th percentile calculations
these days.  Not how you gather the data, but how often you check the
counter? Do you use averages or maximums over time periods to create the 
buckets used for the 95th percentile calculation?




We use maximums, every 5 minutes.


A lot of smaller folks check the counter every 5 min and use that same
value for the 95th percentile.  Most of us larger folks need to check more 
often to prevent 32bit counters from rolling over too often. 


Actually, a lot of people do 5 minutes... and I would say that larger 
companies don't check them more often because they are using 64 bit 
counters, as should anyone with over about 100Mbps of traffic.



Are you larger folks averaging the retrieved values over a larger period?
Using the maximum within a larger period?  Or just using your saved values?



In our setup, as with a lot of people likely, any data that is older 
than 30 days is averaged.  However, we store the exact maximums for the 
most current 30 days.






--
--
Tom Sands   
Chief Network Engineer  
Rackspace Managed Hosting   
(210)447-4065   
--


Re: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Warren Kumari



On Feb 22, 2006, at 10:12 AM, Jo Rhett wrote:




A lot of smaller folks check the counter every 5 min and use that same
value for the 95th percentile.  Most of us larger folks need to check more
often to prevent 32bit counters from rolling over too often.  Are you larger
folks averaging the retrieved values over a larger period?  Using the
maximum within a larger period?  Or just using your saved values?


Most people are using 64 bit counters. This avoids the wrapping
problem (assuming you don't have 100GE and poll more than once every
5 years :-)).


This is curiosity only.  A few years ago we compared the same data and the
answers varied wildly.  It would appear from my latest check that it is
becoming more standardized on 5-minute averages, so I'm asking here on Nanog
as a reality check.


Yup, 5 min seems to be the accepted time.







Re: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Alex Rubenstein



(I did this fast, and, who knows; I could be off by an order or two of
magnitude)


Most people are using 64 bit counters. This avoids the wrapping problem
(assuming you don't have 100GE and poll more than once every 5 years :-)).


2^64 is 18,446,744,073,709,551,616 bytes.

100 GE (100,000,000,000 bits/sec) is 12,500,000,000 bytes/sec.

It would take 1,475,739,525 seconds, or 46.79 years for a counter wrap.
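The arithmetic above can be checked with a short Python sketch. It assumes an octet counter (as with ifInOctets/ifOutOctets) and a link running at full line rate the whole time; wrap_seconds() is a hypothetical helper name, and the year length is taken as 365 days.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: ignore leap days


def wrap_seconds(counter_bits, link_bits_per_sec):
    """Seconds for an octet counter of the given width to wrap at line rate."""
    octets_per_sec = link_bits_per_sec / 8
    return 2 ** counter_bits / octets_per_sec


for label, rate in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("100 Gbps", 100e9)]:
    t32 = wrap_seconds(32, rate)
    t64_years = wrap_seconds(64, rate) / SECONDS_PER_YEAR
    print(f"{label}: 32-bit wraps in {t32:,.1f} s, "
          f"64-bit in {t64_years:,.2f} years")
```

The same function shows why 32-bit counters stop being usable with 5-minute polling at around 100 Mbps: the wrap time at that rate is only a little over 300 seconds.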


--
Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben
Net Access Corporation, 800-NET-ME-36, http://www.nac.net




Re: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Warren Kumari


Doh! You are 100% correct.

I didn't take into account the fact that the counters are
if(In|Out)*Octets* and NOT if(In|Out)*Bits*.


The point is that 64-bit counters are not likely to roll :-)

Warren


On Feb 22, 2006, at 12:24 PM, Alex Rubenstein wrote:




(I did this fast, and, who knows; I could be off by an order or two
of magnitude)


Most people are using 64 bit counters. This avoids the wrapping
problem (assuming you don't have 100GE and poll more than once
every 5 years :-)).


2^64 is 18,446,744,073,709,551,616 bytes.

100 GE (100,000,000,000 bits/sec) is 12,500,000,000 bytes/sec.

It would take 1,475,739,525 seconds, or 46.79 years for a counter wrap.



--
Alex Rubenstein, AR97, K2AHR, [EMAIL PROTECTED], latency, Al Reuben
Net Access Corporation, 800-NET-ME-36, http://www.nac.net






Re: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Tom Sands




David W. Hankins wrote:


On Wed, Feb 22, 2006 at 12:50:34PM -0600, Tom Sands wrote:


A lot of smaller folks check the counter every 5 min and use that same
value for the 95th percentile.  Most of us larger folks need to check more 
often to prevent 32bit counters from rolling over too often. 


Actually, a lot of people do 5 minutes... and I would say that larger 
companies don't check them more often because they are using 64 bit 
counters, as should anyone with over about 100Mbps of traffic.



Counter size is an incomplete reason for polling interval.



Possibly incomplete, but a reason for some nonetheless, if all they
can do is 32 bit counters.



If you need a 5 minute average and poll your routers once every five
minutes, what happens if an SNMP packet gets lost?



No one said it was needed, just that it is what is done. And I agree that
yours is a better reason for more frequent polling than counter roll.



In the best case, a retransmission over Y seconds sees it through, but
now you've got 300+Y seconds in what was supposed to be a 300 second
average...your next datapoint will also now be a 300-Y average unless
you schedule it into the future.

In the worst case, you've lost the datapoint entirely.  This loses not
just the one datapoint ending in that five minute span, but also the
next datapoint.  Sure, you can synthesize two 5 minute averages from
one 10 minute average (presuming your counters wouldn't roll), but this
is still a loss in data - one of those two datapoints should have been
higher than the other.
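The loss described above can be illustrated with a small Python sketch. The counter readings are invented, and rates_bps() is a hypothetical helper. It compares the per-interval averages from a complete set of polls with what is left when the middle poll is lost.

```python
def rates_bps(samples):
    """Average bits/sec between consecutive (unix_time, octet_counter) samples."""
    return [
        (c1 - c0) * 8 / (t1 - t0)
        for (t0, c0), (t1, c1) in zip(samples, samples[1:])
    ]


# Hypothetical readings: a quiet 5 minutes followed by a busy 5 minutes.
complete = [(0, 0), (300, 375_000_000), (600, 3_750_000_000)]
# Same traffic, but the poll at t=300 never arrived.
with_gap = [complete[0], complete[2]]

print("all polls received:", rates_bps(complete))
print("middle poll lost:  ", rates_bps(with_gap))
```

With the middle poll missing, the only thing recoverable is a single 10-minute average, so splitting it back into two 5-minute datapoints gives two identical values and the busy interval is understated.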






In our setup, as with a lot of people likely, any data that is older 
than 30 days is averaged.  However, we store the exact maximums for the 
most current 30 days.



You keep no record?  What do you do if a customer challenges their
bill?  Synthesize 5 minute datapoints out of the larger averages?



This isn't for customer billing.  We don't bill customers on Mbps, but
rather on total volume of GB transferred.  That is an easy number to
collect and doesn't depend on 5 minute intervals being successful.  Right
up until someone clears the counters  ;)



I recommend keeping the 5 minute averages in perpetuity, even if that
means having an operator burn the data to CD and store it in a safe (not
under his desk in the pizza boxes, nor under his soft drink as a coaster).



--
--
Tom Sands   
Chief Network Engineer  
Rackspace Managed Hosting   
(210)447-4065   
--


RE: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Russell, David






I think that we have two (partially) unrelated issues in this thread:
1) how often you should sample and 2) what you do with the results.

I personally think that 5 minute sampling is so last century because it is
better suited for batch load types that do not change very quickly than for
interactive web applications. If your users' web performance is being
affected by a particular link, they are going to notice it in the 10 second
range. Congestion events lasting 1-3 minutes can be a problem. After five
minutes they have forgotten what they were doing :)

How often you check the counter should be driven by how granularly you want
to measure the network. Pick the right counter so that it does not wrap on
you during your sampling interval.

The initial downside is that you have 10-30 times as much data. Network data
has chaotic (aka self-similar) characteristics that make simple statistics
such as max, min or average somewhat useless.

My understanding of the reason to calculate a 95th percentile is to try to
reduce the dataset size and to make some sense out of the random performance
data. For example, I could take some range of data and figure out the 95%
threshold and save that as a data point (e.g. 95% of the samples are less
than X Mbps).

Read the counter value, compute the rate for the interval, then compute the
95th percentile threshold for 20+ samples and save that as the value for
that longer period.
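As a rough sketch of that last step, the following Python uses the nearest-rank definition of a percentile. That is one common definition and an assumption here, not something this thread specifies. The sample rates are invented and percentile_nearest_rank() is a hypothetical helper.

```python
def percentile_nearest_rank(values, pct=95):
    """Smallest sample such that at least pct percent of samples are <= it.

    `pct` is an integer percentage; integer arithmetic avoids float rounding
    when computing the rank.
    """
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct * n / 100, 1-based
    return ordered[max(rank, 1) - 1]


# Hypothetical per-interval rates in Mbps: 20 samples with two bursts.
rates_mbps = [12, 15, 11, 40, 13, 14, 90, 16, 12, 13,
              15, 14, 13, 12, 18, 17, 16, 15, 14, 13]

p95 = percentile_nearest_rank(rates_mbps)
print("95th percentile:", p95, "Mbps")
```

With 20 samples, the top 5% is exactly one sample, so the single highest value is discarded and the second highest becomes the threshold.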

The basic assumption is that you can ignore or not bill the 5% of the time
that you had higher values. That is 30 minutes during a 10 hour business
window or 72 minutes over a 24 hour period. One could argue that 95 should
be 98 or 92, or that it matters whether the 5% is continuous. But it's a
reasonable starting point for making a decision about whether link
utilization is too high.



David Russell










RE: How do you (not how do I) calculate 95th percentile?

2006-02-22 Thread Greenhagen, Robin

Database triggers are a marvelous thing.  Is this wrong?

((InOctetsCurrent - InOctetsLastTime) * 8) / (TimeCurrent - TimeLastTime) = Inbound bits/sec

We chose this because it doesn't matter if it is 30 seconds or 8 minutes
between sample points, it is all normalized in that period.  I do
realize that it is an average within that window, but all we need to do
is tune the cron job (how often we poll) to increase resolution.
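For reference, the same normalization can be sketched outside a database trigger in Python. The argument names follow the formula above; the counter_bits parameter and the wrap handling are additions assumed for illustration, and rate_bps() is a hypothetical name.

```python
def rate_bps(prev_octets, prev_time, cur_octets, cur_time, counter_bits=32):
    """Average inbound bits/sec between two counter readings.

    Times are in seconds. A single counter wrap between the readings is
    handled by modular subtraction; a counter that was cleared cannot be
    told apart from a wrap and would produce a bogus value.
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta_octets = (cur_octets - prev_octets) % (2 ** counter_bits)
    return delta_octets * 8 / elapsed


# 30 seconds between samples, no wrap.
print(rate_bps(1_000, 0, 376_000, 30))
# 8 minutes between samples, with the 32-bit counter wrapping once.
print(rate_bps(2 ** 32 - 1_000, 0, 2_000, 480))
```

Dividing by the measured elapsed time rather than the nominal polling interval is what makes a late or retried poll harmless to the rate, although the result is still an average over whatever window was actually covered.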

Robin