tonight
I'm going to give Jim Drake a ride to Biddeford to pick up his Jeep, which is getting a new windshield. I expect to be home by 6. If it looks like we are running late, I will call.

Thank you,
Joe Bouchard
Factory Automation Engineer and Unix Support
CSC/PW North Berwick, Maine
Phone: (207)676-4100 x2255

-
This is a PRIVATE message. If you are not the intended recipient, please delete without copying and kindly advise us by e-mail of the mistake in delivery. NOTE: Regardless of content, this e-mail shall not operate to bind CSC to any order or other contract unless pursuant to explicit written agreement or government initiative expressly permitting the use of e-mail for such purpose.
-
Computer Sciences Corporation
Registered Office: 3170 Fairview Park Drive, Falls Church, Virginia 22042, USA
Registered in Nevada, USA No: C-489-59
Re: Mass update deployment strategy
I would say that if you let your machines blindly do an apt-get update; apt-get upgrade every day, most of the time it won't be a problem, but someday it may be, and you might render half your cluster unbootable. There are various modifications to this blind-update theme, as others have suggested, but I think the basic issue remains. Some of those packages will ask a lot of questions.

The thing I think you really need to wonder about is kernel packages. A few times I have used stock kernels (not ones I compiled), and when a new update comes out, apt-get upgrade tries to install the new kernel, update lilo.conf, run lilo, and then advises you to reboot ASAP. My personal track record with this has been less than perfect: a few times I've needed to revert to an old kernel, or use an emergency boot CD to fix the problem, and then I'm all set. Therefore, kernel upgrades are something I want to do manually at this point in my life, and I tend to stick with the same kernel for long periods of time.

I am getting ready to deploy hundreds of small embedded devices running Debian. Keeping them up to date is a potential nightmare, and I am considering it carefully. Here is my strategy thus far:

- My devices are the same, hardware wise, and it's my intent to keep them the same software wise.

- Remove anything I don't need. Anything that's purged won't need to be updated. A samba server doesn't need gcc, or X, or frozen bubble, or apache, or LaTeX. Eliminate those packages, and you remove the maintenance concerns on those packages.

- General security approaches which reduce exposure. Eliminate services, use tight firewall rules, network isolation, read-only filesystems, and all that.

- Look at the security updates and decide for yourself how exposed you are, and whether it's really necessary to upgrade a particular package right now. If they find a bug which is only exploitable at the console, and none of my systems have a console, I don't really need to worry about it, or at least I can put it off until I have a bunch of updates that matter. Or in the case of some exotic (potential) risk with a race condition, where my local users would have to do something ridiculously complex and I know they aren't smart enough to know how to do that, I can weigh the cost/benefit of inaction and potentially postpone the patch until I really need it.

- When I finally do have to make some updates, I will do a couple of machines by hand, make sure it works, and then write a script which hits each of the other boxes and does exactly the same.

That's my plan; we'll see how well it works. These systems are small data collection appliances with no proprietary data, and if one of them gets hijacked, we have taken steps to prevent it from spreading, so the consequences of a vulnerability are rather small. If you have a public web server, that's a whole 'nother story.

Good luck.

Joe

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
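As an illustration of the last step in the strategy above (do a couple of machines by hand, then script the rest), here is a minimal Python sketch. It is not the script from the message; it only assumes that the boxes accept non-interactive ssh logins with enough privilege to run apt-get. The function names, the host names, and the package list are all hypothetical. The command names an explicit package list rather than a blanket upgrade, so kernel packages are never pulled in by accident, and the rollout stops at the first failure so that one bad update cannot spread across the fleet.

```python
import shlex
import subprocess
from typing import Callable, Iterable, List, Optional, Tuple


def build_command(packages: Iterable[str]) -> str:
    """Build the one command that is replayed unchanged on every box.

    Only the packages already verified by hand are named, so nothing else
    (in particular no kernel package) gets upgraded as a side effect.
    """
    pkgs = [shlex.quote(p) for p in packages]
    if not pkgs:
        raise ValueError("refusing to build an open-ended upgrade command")
    return "apt-get update && apt-get install -y " + " ".join(pkgs)


def run_over_ssh(host: str, command: str) -> bool:
    """Run the command on one host; True means the remote exit status was 0.

    Assumption: key-based login is set up. BatchMode makes ssh fail
    instead of stopping to ask for a password.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def rollout(
    hosts: Iterable[str],
    command: str,
    runner: Callable[[str, str], bool] = run_over_ssh,
) -> Tuple[List[str], Optional[str]]:
    """Apply the command host by host and stop at the first failure.

    Returns the hosts that succeeded and the host that failed (or None).
    """
    done: List[str] = []
    for host in hosts:
        if not runner(host, command):
            return done, host
        done.append(host)
    return done, None
```

A call such as `rollout(["root@box03", "root@box04"], build_command(["samba"]))` would then replay the hand-tested update on the remaining machines. Passing a different `runner` makes it possible to dry-run the loop without touching any box.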
Re: Server slowdown...
On Sun, Apr 11, 2004 at 12:28:31AM +0200, Jaroslaw Tabor wrote:
> Hello!
>
> I've a strange problem with one of my servers. From time to time (once per 2-3 months), something strange happens, and the server starts working very slowly. What is strange is that CPU load (from top) is about 5%, but response time for network services is extremely high. Usually it gives a timeout. After a reboot, everything works perfectly.
>
> The question is where to start the investigation. Can someone suggest some tool to record statistics of CPU, network, and IO (drives) in correlation with processes? Since the problem occurs for all services, I suspect a kernel (2.2.26) problem, but how do I isolate it? I see that 2.2.27pre1 has some fixes for a tcp keepalive bug and a tcp seq nr wrapping bug. Can it be related?

I'll throw this out; I don't know if it is true or urban legend . . .

In a meeting at work (I'm part of the IT group at a large corporation), someone mentioned a particular kind of network hardware which would stop working correctly after a while. We have a pretty busy network with broadcasts and whatnot, and apparently this device would croak after x number of packets, perhaps 2^32 or something. The time frame was a few weeks for the device to get to that point.

Then someone else said some of the Dell office PCs had NICs with the same affliction, to which I joked: "That's what the sticker `made for Windows XX' means, they expect it to be rebooted frequently enough so you don't get to that point." :-)

At any rate, that story bears some similarity to your situation. That's all I'll say. You might try to find out if your particular NIC has any sort of limitation like this.

--
Thank you,
Joe Bouchard
Powered by Debian GNU/Linux
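To put rough numbers on the "croaks after 2^32 packets" story, the following Python sketch works out how long a packet counter takes to reach that limit. Both the 32-bit limit and the idea that the hardware fails at that point are taken from the anecdote as unverified assumptions. The function names are hypothetical, and nothing here measures a real NIC.

```python
SECONDS_PER_DAY = 86_400


def days_until_wrap(packets_per_second: float, counter_bits: int = 32) -> float:
    """Days for a counter of the given width to count 2**counter_bits packets."""
    if packets_per_second <= 0:
        raise ValueError("packet rate must be positive")
    return 2 ** counter_bits / packets_per_second / SECONDS_PER_DAY


def rate_for_wrap(days: float, counter_bits: int = 32) -> float:
    """Average packets per second needed to hit the limit in the given days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return 2 ** counter_bits / (days * SECONDS_PER_DAY)
```

Under these assumptions, reaching 2^32 packets in three weeks takes an average of about 2,400 packets per second (`rate_for_wrap(21)`), while a failure every 2-3 months, as in the original question, corresponds to only about 660 packets per second (`rate_for_wrap(75)`). Comparing that figure with the server's actual average packet rate is a quick way to judge whether the theory is even plausible.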