Re: Cacti (Was: Re: Load Balancing: How Busy are the servers?)

2006-01-02 Thread Francisco Reyes

Marc G. Fournier writes:

You can set up Graph Trees, so you can group Graphs together .. i.e. all 
the CPU Usage graphs for all (or groups of) servers, so that you can 
compare them ...


Great report.
Have you seen anything yet about disk performance?
That would be very useful too... especially for people who use rsync. I have 
found that rsync can do a significant amount of disk I/O with very little CPU 
utilization.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Cacti (Was: Re: Load Balancing: How Busy are the servers?)

2006-01-02 Thread Marc G. Fournier

On Mon, 2 Jan 2006, Francisco Reyes wrote:


Marc G. Fournier writes:

You can set up Graph Trees, so you can group Graphs together .. i.e. all 
the CPU Usage graphs for all (or groups of) servers, so that you can 
compare them ...


Great report.
Have you seen anything yet about disk performance?


I haven't ... this is something I'm going to have to figure out the SNMP MIBs 
for ... just gotta find time to do an snmpwalk and read through the output 
... it looks like something easy to integrate into cacti though ...
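
As a rough illustration of "reading through the output", here is a minimal Python sketch that groups textual snmpwalk lines of the form `name.index = TYPE: value` by index. The sample lines are hypothetical, and the use of UCD-DISKIO-MIB is an assumption: that MIB exists in net-snmp, but whether a given agent actually exposes it has to be checked with a real snmpwalk.

```python
import re

# Matches lines such as: UCD-DISKIO-MIB::diskIODevice.1 = STRING: ad0
LINE_RE = re.compile(
    r"^(?P<name>\S+)\.(?P<index>\d+) = (?P<type>[\w-]+): (?P<value>.*)$"
)


def parse_snmpwalk(text):
    """Group `name.index = TYPE: value` lines into {index: {name: value}}."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a simple scalar line
        index = int(match["index"])
        rows.setdefault(index, {})[match["name"]] = match["value"]
    return rows


# Hypothetical sample output; real device names and counters will differ.
sample = """UCD-DISKIO-MIB::diskIODevice.1 = STRING: ad0
UCD-DISKIO-MIB::diskIONRead.1 = Counter32: 123456
UCD-DISKIO-MIB::diskIONWritten.1 = Counter32: 654321
"""

disks = parse_snmpwalk(sample)
for index, fields in sorted(disks.items()):
    print(index, fields)
```

Grouping by index makes it easier to see which counters belong to which device before deciding what to graph.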



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: Load Balancing: How Busy are the servers?

2006-01-01 Thread Marc G. Fournier


For all the technology, I was kinda hoping for some 'scientific formula' 
:)


Now, I really hate to ask, but how do you use vmstat to get a feel for how 
busy the disk subsystem is?  What are you looking for?


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: Load Balancing: How Busy are the servers?

2006-01-01 Thread Francisco Reyes

Marc G. Fournier writes:

For all the technology, I was kinda hoping for some 'scientific formula' 
:)


There are..

Now, I really hate to ask, but how do you use vmstat to get a feel for how 
busy the disk subsystem is?


For me, reading Absolute BSD by Michael Lucas was very helpful.
In particular Chapter 18, System performance.

The three vmstat columns I look at are r and b on the left, and 
fault.


r shows how many processes are waiting for CPU, b shows how many 
processes are waiting for disk. The fault column(s) show how badly your 
system is accessing swap.


Quick example:
r b w
2 5 0
1 5 0
2 4 0
2 5 0
3 4 0
1 5 0
1 5 0


That's from my home machine as I am doing some backups.
The machine at this point is more disk bound than CPU bound, with 4 to 5 
processes waiting for disk access at any point in time.


I am also falling behind in CPU, but not as badly.

On the far right of vmstat you also have CPU stats.. in my case the vmstat 
from the above lines showed 70% to 90% idle, which confirmed I was disk 
bound at that point. 

The fault column shows you how actively you are using swap. The lines 
above had between 30 and 200 approximately. If you look at swapinfo and you 
have a large amount of swap in use and then you see a high number in vmstat 
for fault, the machine is short on RAM for the load you have on it.


So far in my experience nothing hurts a machine as badly as hitting swap 
(given that you have adequate CPU/disks). Once you start to hit swap heavily 
you need to do something (if you can...) such as moving services to another 
machine or putting in more memory.


Instead of looking for fixed numbers I think that relative figures are more 
important.. like looking at your machines at their lowest usage and then at 
their busiest.. or at spikes.. If at slow times of activity the machines are 
already falling behind on b and r in vmstat.. then that machine is 
overloaded.


One possible quick way to start benchmarking your machines, until you can do 
something better, is to capture snapshots of vmstat every 15 to 30 minutes 
and take a look.. perhaps even write a short script to summarize it. On my 
list of things to do.. is to do a simple setup of that nature.. just because 
it would be easy to set up and can provide very valuable information until 
you set up something more feature rich. 
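
A short script of that kind might look like the following Python sketch. It is only an illustration, assuming the captured vmstat output is available as text and that data rows begin with the integer r and b columns; the function name `summarize_vmstat` is made up for this example.

```python
def summarize_vmstat(text):
    """Summarize the r and b columns from captured vmstat output."""
    r_vals, b_vals = [], []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with integer r and b columns; header rows do not.
        if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        r_vals.append(int(fields[0]))
        b_vals.append(int(fields[1]))
    if not r_vals:
        return None
    count = len(r_vals)
    return {
        "samples": count,
        "r_avg": sum(r_vals) / count,
        "r_max": max(r_vals),
        "b_avg": sum(b_vals) / count,
        "b_max": max(b_vals),
    }


# The r/b/w figures from the quick example earlier in this message.
sample = """r b w
2 5 0
1 5 0
2 4 0
2 5 0
3 4 0
1 5 0
1 5 0
"""

summary = summarize_vmstat(sample)
print(summary)
```

Comparing such summaries between quiet and busy periods gives the relative figures described above, rather than a single absolute threshold.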



top in the 5.X branch and up is also very useful. If you hit m it shows 
you per-process I/O so you can see what programs are doing the most I/O.


One thing to watch out for in top when using 'm': if you see all low 
numbers (hit 'o' to sort and then type 'total').. you may still have lots 
of programs each doing little I/O whose combined load is a problem for your 
disk subsystem, like having 200+ IMAP connections. Each single IMAP 
connection may not be doing more than a handful of transactions per second, 
but all of them combined can give a disk subsystem a pretty good workout.
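
To make the "many small consumers" point concrete, here is a small Python sketch that adds up per-process I/O counts by command name. The (command, operations) pairs are hypothetical figures, not real top output, and parsing top itself is left out.

```python
from collections import defaultdict


def total_io_by_command(rows):
    """Sum I/O operation counts per command and rank them, busiest first."""
    totals = defaultdict(int)
    for command, ops in rows:
        totals[command] += ops
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical snapshot: 200 IMAP processes doing 4 ops/s each,
# plus one rsync and one sshd.
rows = [("imapd", 4)] * 200 + [("rsync", 150), ("sshd", 2)]

ranked = total_io_by_command(rows)
for command, ops in ranked:
    print(command, ops)
```

In this made-up snapshot no single imapd process looks busy, yet together they account for far more I/O than the one rsync.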


The load averages from 'w' are also good figures to do comparative tests. I 
started to work on a script (but it needs more work) that dumps 'w' and 
'vmstat' .. next I have to work on parsing them and giving summaries. In 
particular one wants to know peak times.. since that is the best time to 
determine if the machine can handle its load.. and more importantly spikes. 
If a machine is usually under 2.. and it spikes at 5+.. that machine is 
possibly able to do normal loads, but may not be able to handle spikes in 
traffic (i.e. a customer doing a mailing list, or a site just got press.. and 
there are a larger number than usual of people going to their URL).
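
One way to pick spikes out of collected load averages is sketched below in Python. The median as the "usual" load and the factor of 2.5 (roughly "usually under 2, spikes at 5+") are assumptions chosen for illustration, and the sample values are hypothetical.

```python
from statistics import median


def find_spikes(samples, factor=2.5):
    """Return (baseline, spikes); spikes are samples >= factor * baseline."""
    baseline = median(samples)
    spikes = [s for s in samples if baseline > 0 and s >= factor * baseline]
    return baseline, spikes


# Hypothetical 1-minute load averages collected over a day.
loads = [1.2, 1.5, 1.8, 1.4, 5.3, 1.6, 6.1, 1.3]

baseline, spikes = find_spikes(loads)
print("baseline:", baseline)
print("spikes:", spikes)
```

Using the machine's own median as the baseline keeps the comparison relative to that machine instead of to someone else's numbers.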


I still think I have MUCH, MUCH to learn.. but I would be glad to expand on 
anything mentioned above.. or anything else. Ultimately each machine/company 
is unique enough that absolute numbers from other people (i.e. what is a good 
value for 'r' and 'b' to be around most of the time) may be less important 
than learning what the different figures are for your different machines 
under normal operation.



Re: Load Balancing: How Busy are the servers?

2006-01-01 Thread Marc G. Fournier


I just installed cacti, which seems fairly useful for 'long term views' of 
how a server is doing ... now I have to figure out what SNMP MIBs related 
to all of the important things :(




Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: Load Balancing: How Busy are the servers?

2006-01-01 Thread Francisco Reyes

Marc G. Fournier writes:

I just installed cacti, which seems fairly useful for 'long term views' of 
how a server is doing


Have not played with it, but have read good/favorable comments about it.

It would be nice if you did a mini report of your early impressions later.. 
In particular I think it would be good to know how easy it is to set up and 
what it covers.




Cacti (Was: Re: Load Balancing: How Busy are the servers?)

2006-01-01 Thread Marc G. Fournier

On Mon, 2 Jan 2006, Francisco Reyes wrote:


Marc G. Fournier writes:

I just installed cacti, which seems fairly useful for 'long term views' of 
how a server is doing


Have not played with it, but have read good/favorable comments about it.

It would be nice if you did a mini report of your early impressions 
later.. In particular I think it would be good to know how easy it is to 
set up and what it covers.


'k, I'm terrible at 'reports', but ... to be totally honest, this has 
gotta be one of the nicer pieces of software I've played with as far as 
documentation *especially* for OSS ...


I installed it out of ports, initially directly on one of our servers, 
mistakenly thinking I needed to do one install per server ... ended up 
moving it into a vServer so that I can easily move it around as I get more 
powerful servers, instead of having it tied to a specific machine ...


On all our other servers, I just had to install the net-snmp port, to give 
it something to talk to ...


The hardest part about setting things up was setting up snmpd, but I ended 
up running snmpconf -i to do this (snmpconf -g basic_setup is apparently 
slightly easier too) ... once I built the initial snmpd.conf file, I just 
copied that to the other servers, instead of building one for each ...


The Cacti port ends up with a short message that tells you step by step 
what needs to be done ... it has one error, in that the crontab entry it 
tells you to create appears to be wrong ... does Linux support a 'user to 
run as' arg within their crontab?


After that, so far, I've just used the 'default net-snmp' settings that 
come with Cacti ... haven't had a chance to dive into snmp yet, to figure 
out what else can be monitored ...


I currently have it monitoring CPU, Load, Traffic and Memory Usage

You can set up Graph Trees, so you can group Graphs together .. i.e. all 
the CPU Usage graphs for all (or groups of) servers, so that you can 
compare them ...


So far, at least, definitely a tool I'd recommend ... seems to 
work well ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: Load Balancing: How Busy are the servers?

2005-12-31 Thread Francisco Reyes

Marc G. Fournier writes:

1. What variables on a server should be monitored to determine how busy a 
server is? 


I am a fairly new sysadmin.. who inherited nearly 20 machines, so take my 
comments with a grain of salt. Before that the most I ever had was 7, mostly 
DB, FreeBSD machines :-) 
.. and.. Hi Marc. :)


I think it comes down to primarily 3 factors
* RAM
* CPU
* DISK

If you are hitting swap, you are either running too many programs/services 
or have too many users.


Same for CPU.

Disks are different in that the same number of disks can perform differently 
depending on the RAID controller and the type of RAID.


I use top and load average to determine if a machine is up to capacity in 
memory/cpu.


I use vmstat to determine if the disk subsystem is falling behind.

BIG NOTE: 
The one thing that I have yet to really pay much attention to is network 
performance. Fortunately we just hired someone who has significantly more 
experience in that area. :-)



2. Are there any tools that I can run to give me a point in time summary 
of how busy a server is based on these several factors?


I think there are lots of tools. They vary from SNMP capture/graphing to 
custom made tools done in-house. I think it's a combination of how difficult 
it is to set up vs what you need to monitor. 

At work we are just starting to roll out an SNMP tool. The new hire is 
leading the effort so I am not very familiar with the setup.. the one thing 
I see so far is that ultimately, there usually are things one needs to 
monitor that are unique to your organization and you need to either integrate 
a program into the tool or do your own independent monitoring of that 
particular resource.


I think the ISP list may be a good resource since the needs of the average 
user are different from ISPs/companies with numerous machines. 

Basically, I'd like to keep track of multiple servers and be able to say 
this server is running 75% of capacity, time to upgrade or move things 
off of it ... if it's possible ... ?


In my opinion, for the most part, the answer is yes. The problem is usually 
how long it's going to take you to set up the environment to monitor the 
servers.


The program we went with was chosen because the new hire was familiar with 
it, but a search on the archives for monitoring tools will give you a long 
list of programs and opinions of which are easier.


If I had the time, I think I would likely write my own tool. This way I would 
be able to measure exactly what I want. Right now I think we will cover most 
basics with the tool we are going with, but will still need to do our own 
custom apps to monitor a number of resources and metrics.




Re: Load Balancing: How Busy are the servers?

2005-12-28 Thread Matthew Seaman

Chuck Swiger wrote:

Marc G. Fournier wrote:


Basically, I'd like to keep track of multiple servers and be able to 
say this server is running 75% of capacity, time to upgrade or move 
things off of it ... if it's possible ... ?


Take a look at Big Brother, www.bb4.org, it will at least give warnings 
for high load average, disk space, and so forth.


Yep -- BB, or for my preference, Nagios is good, but by their nature they
alert you to abnormal conditions -- peaks in load or traffic, failure of
connectivity etc.  That's important, but it doesn't really help with getting
an overview of how hard a server is running over time.

For that purpose I find that programs like Cacti or Cricket do much better.
Essentially anything you can express as a number and that you can get an SNMP
daemon to present as an OID can be graphed.  Cacti comes with pre-canned
queries on system load, disk space, memory usage, swap usage, number of
processes, as well as graphing all of the network traffic.  Adding extensions
to do stuff like monitor how much work MySQL is doing is not too difficult.


Not only that, it produces graphs pretty enough to convince even the
pointiest-haired management.

Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
 Kent, CT11 9PW




Load Balancing: How Busy are the servers?

2005-12-27 Thread Marc G. Fournier


Two part question here ... first leads into the second, and the second 
might answer the first ...


1. What variables on a server should be monitored to determine how busy a 
server is?  For instance, I've always been taught that 'loadavg' is not an 
indication of how busy a server is, since a high loadavg on a single CPU 
server might be an overloaded server, but moderately loaded on a dual CPU 
server ... disk i/o, cpu usage, ethernet throughput ... what else?


2. Are there any tools that I can run to give me a point in time summary 
of how busy a server is based on these several factors?


Basically, I'd like to keep track of multiple servers and be able to say 
this server is running 75% of capacity, time to upgrade or move things 
off of it ... if it's possible ... ?


thanks ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: Load Balancing: How Busy are the servers?

2005-12-27 Thread Chuck Swiger

Marc G. Fournier wrote:
1. What variables on a server should be monitored to determine how busy 
a server is?  For instance, I've always been taught that 'loadavg' is 
not an indication of how busy a server is, since a high loadavg on a 
single CPU server might be an overloaded server, but moderately loaded 
on a dual CPU server ...


If the load average is greater than the number of CPUs, it's likely that adding 
another CPU would improve throughput.  If the load average is more than twice 
the number of CPUs, it's extremely likely that adding more CPUs would improve 
throughput.
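
That rule of thumb can be written down as a small Python sketch. The function name and the wording of the three labels are made up for illustration; `os.getloadavg()` and `os.cpu_count()` are standard library calls, though `getloadavg` is not available on every platform.

```python
import os


def classify_load(load_avg, ncpu):
    """Apply the load-average-per-CPU rule of thumb described above."""
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    ratio = load_avg / ncpu
    if ratio > 2:
        return "more CPUs extremely likely to help"
    if ratio > 1:
        return "another CPU likely to help"
    return "within CPU capacity"


try:
    one_minute, _, _ = os.getloadavg()
    print(classify_load(one_minute, os.cpu_count() or 1))
except (AttributeError, OSError):
    # getloadavg is unavailable on some platforms
    print("load average not available here")
```

For example, a load average of 1.5 falls in the middle category on a single-CPU machine but is within capacity on a dual-CPU one.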



disk i/o, cpu usage, ethernet throughput ... what else?


The primary resources are CPU, memory, and I/O.  If you measure the ones you've 
listed and pay attention to the VM stats, you should have a starting point. 
Don't forget to pay attention to running out of disk space, SysV shmem 
semaphores, and anything else which is being used by the tasks being run.


2. Are there any tools that I can run to give me a point in time 
summary of how busy a server is based on these several factors?


vmstat 5 and iostat 5 come pretty close, but you have to calibrate some of 
the I/O measurements they return against the maximum throughput possible for each 
specific system.
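
The calibration step amounts to dividing an observed figure by the maximum measured for that system. A minimal Python sketch, with hypothetical throughput numbers standing in for real iostat readings and a real benchmark result:

```python
def percent_of_capacity(measured, calibrated_max):
    """Express a measurement as a percentage of the system's measured maximum."""
    if calibrated_max <= 0:
        raise ValueError("calibrated_max must be positive")
    return 100.0 * measured / calibrated_max


# Hypothetical: 45 MB/s observed, 60 MB/s maximum measured on this machine.
pct = percent_of_capacity(45.0, 60.0)
print(f"{pct:.0f}% of calibrated disk throughput")
```

The maximum has to be measured per machine, since the same disks behave differently behind different controllers.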


Basically, I'd like to keep track of multiple servers and be able to say 
this server is running 75% of capacity, time to upgrade or move things 
off of it ... if it's possible ... ?


Take a look at Big Brother, www.bb4.org, it will at least give warnings for 
high load average, disk space, and so forth.


--
-Chuck