Re: Cacti (Was: Re: Load Balancing: How Busy are the servers?)
Marc G. Fournier writes:

> You can set up Graph Trees, so you can group Graphs together .. ie. all the CPU Usage graphs for all (or groups of) servers, so that you can compare them ...

Great report. Have you seen anything yet about disk performance? That would be very useful too, especially for people who use rsync. I have found that rsync can do a significant amount of disk I/O with very little CPU utilization.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Cacti (Was: Re: Load Balancing: How Busy are the servers?)
On Mon, 2 Jan 2006, Francisco Reyes wrote:

> Marc G. Fournier writes:
>> You can set up Graph Trees, so you can group Graphs together .. ie. all the CPU Usage graphs for all (or groups of) servers, so that you can compare them ...
>
> Great report. Have you seen anything yet about disk performance?

I haven't ... this is something I'm going to have to figure out SNMP MIBs for ... just gotta find time to do an snmpwalk and read through the output for ... it looks like something easy to integrate into cacti though ...

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy   ICQ: 7615664
Re: Load Balancing: How Busy are the servers?
For all the technology, I was kinda hoping for some 'scientific formula' :)

Now, I really hate to ask, but how do you use vmstat to get a feel for how busy the disk subsystem is? What are you looking for?

On Sat, 31 Dec 2005, Francisco Reyes wrote:

> I use top and load average to determine if a machine is up to capacity in memory/cpu. I use vmstat to determine if the disk subsystem is falling behind.

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy   ICQ: 7615664
Re: Load Balancing: How Busy are the servers?
Marc G. Fournier writes:

> For all the technology, I was kinda hoping for some 'scientific formula' :)

There are..

> Now, I really hate to ask, but how do you use vmstat to get a feel for how busy the disk subsystem is?

For me, reading Absolute BSD by Michael Lucas was very helpful, in particular Chapter 18, System performance.

The three vmstat columns I look at are r and b on the left, and fault. r shows how many processes are waiting for CPU, b shows how many processes are waiting for disk. The fault column(s) show how badly your system is accessing swap.

Quick example:

  r b w
  2 5 0
  1 5 0
  2 4 0
  2 5 0
  3 4 0
  1 5 0
  1 5 0

That's from my home machine while I am doing some backups. The machine at this point is more disk bound than CPU bound, with 4 to 5 processes waiting for disk access at any point in time. I am also falling behind on CPU, but not as badly. On the far right of vmstat you also have CPU stats.. in my case the vmstat lines above showed 70% to 90% idle, which confirmed I was disk bound at that point.

The fault column shows you how actively you are using swap. The lines above had between 30 and 200, approximately. If you look at swapinfo and you have a large amount of swap in use, and then you see a high number in vmstat for fault, the machine is short on RAM for the load you have on it.

So far in my experience nothing hurts a machine as badly as hitting swap (given that you have adequate CPU/disks). Once you start to hit swap heavily you need to do something (if you can...) such as moving services to another machine or putting in more memory.

Instead of looking for fixed numbers, I think relative figures are more important.. like looking at your machines at their lowest usage and then at their busiest.. or at spikes.. If at slow times of activity the machines are already falling behind on b and r in vmstat.. then that machine is overloaded.
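The r/b reading described above can be sketched as a small summary script. This is only an illustration, not something from the thread: the function names and the "leaning" label are made up here, and it assumes r and b are the first two columns of each data row, as in the sample quoted in the message. Note that vmstat(8) describes b more broadly as processes blocked on resources (I/O, paging and so on), so treat the result as a hint rather than a verdict.

```python
from statistics import mean

# The r/b/w sample quoted in the message, header line included.
SAMPLE = """\
 r b w
 2 5 0
 1 5 0
 2 4 0
 2 5 0
 3 4 0
 1 5 0
 1 5 0
"""


def parse_procs(text):
    """Return a list of (r, b) tuples from vmstat-style rows.

    Assumes r and b are the first two columns; header and blank
    lines (anything not starting with two integers) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        rows.append((int(fields[0]), int(fields[1])))
    return rows


def summarize(rows):
    """Average and peak of r and b, plus a rough disk-vs-CPU hint."""
    if not rows:
        raise ValueError("no vmstat data rows found")
    r_vals = [r for r, _ in rows]
    b_vals = [b for _, b in rows]
    avg_r, avg_b = mean(r_vals), mean(b_vals)
    return {
        "samples": len(rows),
        "avg_r": avg_r,
        "max_r": max(r_vals),
        "avg_b": avg_b,
        "max_b": max(b_vals),
        # Hypothetical label: more blocked than runnable suggests disk;
        # a tie is reported as "cpu".
        "leaning": "disk" if avg_b > avg_r else "cpu",
    }


if __name__ == "__main__":
    print(summarize(parse_procs(SAMPLE)))
```

On the quoted sample, b averages about 4.7 against about 1.7 for r, which matches the "more disk bound than CPU bound" reading in the message. To use it on real snapshots, feed it the saved text of a vmstat run instead of SAMPLE.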
One possible quick way to start benchmarking your machines, until you can do something better, is to capture snapshots of vmstat every 15 to 30 minutes and take a look.. perhaps even write a short script to summarize them. On my list of things to do.. is a simple setup of that nature.. just because it would be easy to set up and can provide very valuable information until you set up something more feature rich.

top in the 5.X branch and up is also very useful. If you hit 'm' it shows you disk processes, so you can see what programs are doing the most I/O. One thing to watch out for in top when using 'm' is seeing all low numbers (hit 'o' to sort and then type 'total').. you may have lots of programs each doing little I/O, but their combined load is a problem for your disk subsystem, like having 200+ IMAP connections. Each single IMAP connection may not be doing more than a handful of transactions per second, but all of them combined can give a disk subsystem a pretty good workout.

The load averages from 'w' are also good figures for comparative tests. I started to work on a script (it needs more work) that dumps 'w' and 'vmstat' .. next I have to work on parsing them and giving summaries. In particular one wants to know peak times.. since that is the best time to determine if the machine can handle its load.. and, more importantly, spikes. If a machine is usually under 2.. and it spikes at 5+.. that machine is possibly able to handle normal loads, but may not be able to handle spikes in traffic (ie a customer doing a mailing list, or a site that just got press.. and a larger number of people than usual are going to their URL).

I still think I have MUCH, MUCH to learn.. but I would be glad to expand on anything mentioned above.. or anything else.
Ultimately each machine/company is unique enough that absolute numbers from other people (ie what is a good value for 'r' and 'b' to be around most of the time) may be less important than learning what the figures are for your own machines under normal operation.
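The "usually under 2, spikes at 5+" comparison against a machine's own baseline could be sketched roughly as follows. Everything here is an assumption for illustration: the function names, the factor of 2 used to call something a spike, and the sample header lines, which are invented to resemble the load-average line printed by 'w' or 'uptime' rather than copied from a real machine.

```python
import re
from statistics import median

# Matches both "load averages: a, b, c" (BSD style) and
# "load average: a, b, c" (the singular form some systems print).
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_load(line):
    """Extract the (1 min, 5 min, 15 min) load averages from a header line."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError("no load averages found in: %r" % line)
    return tuple(float(value) for value in match.groups())


def spike_report(one_min_loads, factor=2.0):
    """Compare the peak load against the machine's own typical (median) load.

    `factor` is an arbitrary illustrative threshold, not a recommendation.
    """
    if not one_min_loads:
        raise ValueError("no samples")
    typical = median(one_min_loads)
    peak = max(one_min_loads)
    spiky = peak > typical and peak >= factor * typical
    return {"typical": typical, "peak": peak, "spiky": spiky}


if __name__ == "__main__":
    # Invented snapshots, one captured header line per sampling interval.
    snapshots = [
        " 1:00PM  up 12 days,  1:47, 3 users, load averages: 1.20, 1.10, 0.95",
        " 1:30PM  up 12 days,  2:17, 3 users, load averages: 1.50, 1.30, 1.05",
        " 2:00PM  up 12 days,  2:47, 4 users, load averages: 5.40, 3.10, 1.80",
        " 2:30PM  up 12 days,  3:17, 3 users, load averages: 1.30, 2.20, 1.90",
    ]
    loads = [parse_load(line)[0] for line in snapshots]
    print(spike_report(loads))
```

Using the median rather than the mean keeps one large spike from dragging the "typical" figure upward, which fits the idea of learning each machine's normal numbers first and judging peaks against them.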
Re: Load Balancing: How Busy are the servers?
I just installed cacti, which seems fairly useful for 'long term views' of how a server is doing ... now I have to figure out what SNMP MIBs relate to all of the important things :(

On Sun, 1 Jan 2006, Francisco Reyes wrote:

> Marc G. Fournier writes:
>> Now, I really hate to ask, but how do you use vmstat to get a feel for how busy the disk subsystem is?
>
> For me, reading Absolute BSD by Michael Lucas was very helpful, in particular Chapter 18, System performance.

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy   ICQ: 7615664
Re: Load Balancing: How Busy are the servers?
Marc G. Fournier writes:

> I just installed cacti, which seems fairly useful for 'long term views' of how a server is doing

I have not played with it, but have read good/favorable comments about it. It would be nice if you did a mini report of your early impressions later.. In particular I think it would be good to know how easy it is to set up and what it covers.
Cacti (Was: Re: Load Balancing: How Busy are the servers?)
On Mon, 2 Jan 2006, Francisco Reyes wrote:

> Marc G. Fournier writes:
>> I just installed cacti, which seems fairly useful for 'long term views' of how a server is doing
>
> I have not played with it, but have read good/favorable comments about it. It would be nice if you did a mini report of your early impressions later.. In particular I think it would be good to know how easy it is to set up and what it covers.

'k, I'm terrible at 'reports', but ... to be totally honest, this has gotta be one of the nicer pieces of software I've played with as far as documentation goes, *especially* for OSS ...

I installed it out of ports, initially directly on one of our servers, mistakenly thinking I needed to do one install per server ... ended up moving it into a vServer so that I can easily move it around as I get more powerful servers, instead of having it tied to a specific machine ...

On all our other servers, I just had to install the net-snmp port, to give it something to talk to ...

The hardest part about setting things up was setting up snmpd, but I ended up running snmpconf -i to do this (snmpconf -g basic_setup is apparently slightly easier too) ... once I built the initial snmpd.conf file, I just copied that to the other servers, instead of building one for each ...

The Cacti port ends with a short message that tells you step by step what needs to be done ... it has one error, in that the crontab entry it tells you to create appears to be wrong ... does Linux support a 'user to run as' arg within their crontab?

After that, so far, I've just used the 'default net-snmp' settings that come with Cacti ... haven't had a chance to dive into snmp yet, to figure out what else can be monitored ... I currently have it monitoring CPU, Load, Traffic and Memory Usage ...

You can set up Graph Trees, so you can group Graphs together .. ie. all the CPU Usage graphs for all (or groups of) servers, so that you can compare them ...

So far, at least, definitely a tool I'd recommend ... so far, seems to work well ...

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy   ICQ: 7615664
Re: Load Balancing: How Busy are the servers?
Marc G. Fournier writes:

> 1. What variables on a server should be monitored to determine how busy a server is?

I am a fairly new sysadmin.. who inherited nearly 20 machines, so take my comments with a grain of salt. Before that the most I ever had was 7, mostly DB, FreeBSD machines :-) .. and.. Hi Marc. :)

I think it comes down to primarily 3 factors:

* RAM
* CPU
* DISK

If you are hitting swap, you are either running too many programs/services or have too many users. Same for CPU. Disks are different in that the same number of disks can perform differently depending on the RAID controller and the type of RAID.

I use top and load average to determine if a machine is up to capacity in memory/CPU. I use vmstat to determine if the disk subsystem is falling behind.

BIG NOTE: The one thing that I have yet to really pay much attention to is network performance. Fortunately we just hired someone who has significantly more experience in that area. :-)

> 2. Are there any tools that I can run to give me a point in time summary of how busy a server is based on these several factors?

I think there are lots of tools. They range from SNMP capture/graphing to custom-made tools done in-house. I think it's a combination of how difficult it is to set up vs what you need to monitor. At work we are just starting to roll out an SNMP tool. The new hire is leading the effort so I am not very familiar with the setup.. the one thing I see so far is that ultimately there are usually things you need to monitor that are unique to your organization, and you need to either integrate a program into the tool or do your own independent monitoring of that particular resource.

I think the ISP list may be a good resource, since the needs of the average user are different from those of ISPs/companies with numerous machines.

> Basically, I'd like to keep track of multiple servers and be able to say this server is running 75% of capacity, time to upgrade or move things off of it ... if it's possible ... ?

In my opinion, for the most part, the answer is yes. The problem is usually how long it's going to take you to set up the environment to monitor the servers. The program we went with was chosen because the new hire was familiar with it, but a search of the archives for monitoring tools will give you a long list of programs and opinions on which are easier.

If I had the time, I think I would likely write my own tool. That way I would be able to measure exactly what I want. Right now I think we will cover most basics with the tool we are going with, but we will still need to do our own custom apps to monitor a number of resources and metrics.
Re: Load Balancing: How Busy are the servers?
Chuck Swiger wrote:

> Marc G. Fournier wrote:
>> Basically, I'd like to keep track of multiple servers and be able to say this server is running 75% of capacity, time to upgrade or move things off of it ... if it's possible ... ?
>
> Take a look at Big Brother, www.bb4.org?, it will at least give warnings for high load average, disk space, and so forth.

Yep -- BB, or for my preference, Nagios is good, but by their nature they alert you to abnormal conditions -- peaks in load or traffic, failure of connectivity etc. That's important, but it doesn't really help with getting an overview of how hard a server is running over time.

For that purpose I find that programs like Cacti or Cricket do much better. Essentially anything you can express as a number and that you can get an SNMP daemon to present as an OID can be graphed. Cacti comes with pre-canned queries for system load, disk space, memory usage, swap usage and number of processes, as well as graphing all of the network traffic. Adding extensions to do stuff like monitor how much work MySQL is doing is not too difficult.

Not only that, it produces graphs pretty enough to convince even the pointiest-haired management.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                  Kent, CT11 9PW
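As a rough illustration of "anything you can express as a number", here is a minimal sketch of a helper script that prints a single metric on one line, which is the usual shape for something an SNMP daemon is told to run and publish. The function name and the choice of metric are assumptions made for the example. How the script gets wired into snmpd (net-snmp documents an `extend` directive in snmpd.conf(5) for running external programs) and how Cacti is pointed at the resulting OID are not shown; check the documentation for your installed versions.

```python
import os


def format_metric(value):
    """Render one numeric metric as a single line of plain text."""
    return "%.2f" % value


def main():
    # os.getloadavg() returns the 1, 5 and 15 minute load averages on
    # Unix-like systems; only the 1 minute figure is reported here.
    one_minute, _five, _fifteen = os.getloadavg()
    print(format_metric(one_minute))


if __name__ == "__main__":
    main()
```

The same pattern applies to any figure a script can compute, such as a queue length or a connection count: emit one number and nothing else, and leave storage and graphing to the collector.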
Load Balancing: How Busy are the servers?
Two part question here ... first leads into the second, and the second might answer the first ...

1. What variables on a server should be monitored to determine how busy a server is? For instance, I've always been taught that 'loadavg' is not an indication of how busy a server is, since a high loadavg on a single CPU server might be an overloaded server, but moderately loaded on a dual CPU server ... disk i/o, cpu usage, ethernet throughput ... what else?

2. Are there any tools that I can run to give me a point in time summary of how busy a server is based on these several factors?

Basically, I'd like to keep track of multiple servers and be able to say this server is running 75% of capacity, time to upgrade or move things off of it ... if it's possible ... ?

thanks ...

Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy   ICQ: 7615664
Re: Load Balancing: How Busy are the servers?
Marc G. Fournier wrote:

> 1. What variables on a server should be monitored to determine how busy a server is? For instance, I've always been taught that 'loadavg' is not an indication of how busy a server is, since a high loadavg on a single CPU server might be an overloaded server, but moderately loaded on a dual CPU server ...

If the load average is greater than the number of CPUs, it's likely that adding another CPU would improve throughput. If the load average is more than twice the number of CPUs, it's extremely likely that adding more CPUs would improve throughput.

> disk i/o, cpu usage, ethernet throughput ... what else?

The primary resources are CPU, memory, and I/O. If you measure the ones you've listed and pay attention to the VM stats, you should have a starting point. Don't forget to pay attention to running out of disk space, SysV shmem semaphores, and anything else which is being used by the tasks being run.

> 2. Are there any tools that I can run to give me a point in time summary of how busy a server is based on these several factors?

vmstat 5 and iostat 5 come pretty close, but you have to calibrate some of the I/O measurements they return against the maximum throughput possible for each specific system.

> Basically, I'd like to keep track of multiple servers and be able to say this server is running 75% of capacity, time to upgrade or move things off of it ... if it's possible ... ?

Take a look at Big Brother, www.bb4.org?, it will at least give warnings for high load average, disk space, and so forth.

-- 
-Chuck
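The load-average rule of thumb and the calibration idea above can be written down as a short sketch. The function names and the wording of the returned labels are invented for illustration, and the calibrated maximum is something you would have to measure yourself on each system, as the message says.

```python
import os


def cpu_pressure(load_avg, ncpu):
    """Apply the rule of thumb: compare load average to the CPU count."""
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    if load_avg > 2 * ncpu:
        return "more CPUs extremely likely to help"
    if load_avg > ncpu:
        return "another CPU likely to help"
    return "within CPU capacity"


def percent_of_max(measured, calibrated_max):
    """Express a measurement as a percentage of a per-system maximum.

    `calibrated_max` is assumed to come from your own benchmark of that
    machine (for example, peak disk throughput); it is not a vmstat or
    iostat output.
    """
    if calibrated_max <= 0:
        raise ValueError("calibrated_max must be positive")
    return 100.0 * measured / calibrated_max


if __name__ == "__main__":
    one_minute = os.getloadavg()[0]
    ncpu = os.cpu_count() or 1  # cpu_count() can return None
    print(one_minute, ncpu, cpu_pressure(one_minute, ncpu))
```

This also shows why the same load average means different things on different hardware: a load of 1.5 is over capacity for one CPU but within capacity for two.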