Re: Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once)
On Sat, Nov 14, 2009 at 5:09 PM, Alex Schuster wo...@wonkology.org wrote:
Alan McKinnon writes:
On Saturday 14 November 2009 19:36:06 Alex Schuster wrote:
Alan McKinnon wrote:
clusterssh will let you log into many machines at once and run emerge -avuND world everywhere
This is way cool. I just started using it on eight Fedora servers I am administrating. Nice, now this is an improvement over my 'for h in $HOSTS; do ssh $h yum install foo; done' approach.
I feel your pain :-) We used to have the same problem adding new admins to 87 machines. Now we have a bespoke provisioner that does it all.
Sorry, I just do not get 'bespoke provisioner'. Some sort of software, like clusterssh? Or a person, one admin instead of many?
What do you guys think about using Gentoo for servers? At the institute where I work part-time we chose Fedora. There is no special reason for that - we already had some Fedora machines, the setup seemed to work, the reputation was good, so we kept it. That was okay for me, why choose many different environments and learn everything again? I mentioned Gentoo, but did not really suggest actually using it. Maybe I should have.
I'm a huge fan of Gentoo
Now who would have thought of that!
and all my personal machines (except the new netbook) have run it for the last 5 years. But I will never install Gentoo on a production server at work. Why? Because it is too time consuming, because no two machines are set up the same, because I can't trust that other admins used the flags they should have. So updates become a case of logging into 80+ machines individually and doing emerge world by hand. Gentoo allows you to customize things to the nth degree - that is its strength - so people WILL use this one discriminating factor. If OTOH I had a server farm of 80+ machines, all identical, I'd put Gentoo on them in a flash. But I don't have that.
Of our 8 machines, 7 are essentially the same and differ only in hard drive space and CPU speed. The other machine is Intel, not AMD, and needs different IDE drivers. At the moment it has a different initrd (I set up a minimal Fedora install to generate it after the cloned system did not boot), the rest is - apart from some config files - identical. So I would make sure that about everything is exactly the same, well, maybe except for hostnames, udev net-persistent-rules, ssh keys... what more? The last, slightly different machine is a problem though. With optimized CFLAGS, this one would have to compile all stuff again, while for the others I could use binpkgs. Updating them all with clusterssh should not be much more work than updating a single one. Well, not completely true, I would have the double work, as I would upgrade one server first to test if there are problems, and then do it for the others. Maybe I could use the special machine to test stuff, and then update all the others. If they differed, Gentoo would of course be too much work. I already have this problem now... there is my desktop machine, my notebook running a Gentoo VM, a second desktop machine at my other home, the living-room machine of my flat share, the machine of a friend I also administrate, the server of my flat share I need to set up again... and clusterssh is no option here.
My potentially ill informed thoughts on the above issues/ideas:
1) Pick one machine to host your make.conf as well as your portage tree and distfiles, potentially splitting them into separate NFS mounts shared out to the rest of the hosts (having the portage tree itself read-only on all but its owning machine forces centralization of syncing).
2) /etc/make.conf should simply be a symlink to the centrally located copy. If you must use binpackages, set march to something that will run on every machine involved, then set mcpu to whatever machine is most common if you want to get just a bit more performance here or there. If you don't mind compiling on every host, though, set portage niceness (PORTAGE_NICENESS) to something friendly to your users and march to native (if you plan to use distcc, this is a BAD idea, use the binpackages).
3) Use a replaceable system for your testing (one otherwise identical to the others, and therefore able to be brought back online by just cloning over it), and keep frequent scheduled backups of whichever system plays host to your portage tree, binpackages, and distfiles.
4) Build your kernel with built-in drivers for every piece of boot-time essential hardware in your systems. You'll still be on a far cleaner setup than a mass-produced, distro-provided kernel, you'll only need to maintain one for all your systems, and you'll only have one kernel to worry about building against if you need any out-of-kernel modules as well.
5) Script the changing of ssh host keys (or even redistribution of them, if you ), removal of persistent net rules, and prompting for the setting of host name and you'll have a nice, tiny, postinstall tool for the rare
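A rough sketch of the post-install tool from point 5 might look like the following. It is only an illustration under a few assumptions: the function names are made up, the hostname is assumed to live in /etc/conf.d/hostname in `hostname="..."` form (older baselayout versions spell the variable differently), and the cloned host keys are simply deleted on the assumption that the sshd init script, or a manual ssh-keygen run, creates fresh ones afterwards. Every function takes the root directory as its first argument so it can be tried on a mounted clone or a scratch directory first.

```shell
#!/bin/sh
# Post-clone cleanup sketch. All function names here are hypothetical.
# Each function takes ROOT as its first argument (use / on the live system).

# Delete the host keys copied from the master image so the clone gets its own.
# Assumption: new keys are generated on the next sshd start or by hand.
reset_ssh_host_keys() {
    root=${1:-/}
    rm -f "$root"/etc/ssh/ssh_host_*key "$root"/etc/ssh/ssh_host_*key.pub
}

# Drop the udev rules that pin eth0, eth1, ... to the master's MAC addresses.
remove_persistent_net_rules() {
    root=${1:-/}
    rm -f "$root"/etc/udev/rules.d/70-persistent-net.rules
}

# Write the new hostname. Assumption: /etc/conf.d/hostname with hostname="...".
set_hostname() {
    root=${1:-/}
    name=$2
    case $name in
        ''|*[!A-Za-z0-9-]*)
            echo "invalid hostname: '$name'" >&2
            return 1
            ;;
    esac
    printf 'hostname="%s"\n' "$name" > "$root"/etc/conf.d/hostname
}

# Example, run as root on a freshly cloned machine:
#   printf 'New hostname: '; read -r new_name
#   reset_ssh_host_keys / && remove_persistent_net_rules / \
#       && set_hostname / "$new_name"
```

Keeping the three steps as separate functions makes it easy to skip one, for instance when host keys are redistributed from a central place instead of regenerated.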
Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once)
Alan McKinnon wrote:
clusterssh will let you log into many machines at once and run emerge -avuND world everywhere
This is way cool. I just started using it on eight Fedora servers I am administrating. Nice, now this is an improvement over my 'for h in $HOSTS; do ssh $h yum install foo; done' approach.
What do you guys think about using Gentoo for servers? At the institute where I work part-time we chose Fedora. There is no special reason for that - we already had some Fedora machines, the setup seemed to work, the reputation was good, so we kept it. That was okay for me, why choose many different environments and learn everything again? I mentioned Gentoo, but did not really suggest actually using it. Maybe I should have.
These 8 servers I mentioned are basically clones of the one I installed manually. Instead of doing this again, I boot a live-cd on a new one, create partitions, and extract tar files of the first server's partitions. Then I do some extra configuration, like hostname and network setup. Done.
My plan for updating them is to take the first server down, and upgrade the installation (if that works - I had some trouble with that before, so maybe it will be better to reinstall from scratch). Then I will create a snapshot of the new setup, transfer that to the other hosts, and unpack it in new logical volumes. I plan to script this so I do not have to do it manually every time - but that was before I knew ClusterSSH. When all is done and there is some time to take the servers down, I will reboot into the new system.
Now I am thinking about a Gentoo installation instead.
Pros:
- Continuous updates, no downtime for upgrading, only when I decide to install a new kernel. This is really really cool. I fear the upgrade from Fedora 10 to 12 which has to be done soon.
- Some improvement in speed. Those machines do A LOT of number crunching, with jobs often lasting for days, so even small improvements would be nice.
- Easier debugging. When things do not work, I think it's easier to dig into the problem. No fancy, but sometimes buggy GUIs hiding basic functionality.
- Heck, Gentoo is _cooler_ than typical distributions. And emerging with distcc on about 8*4 cores would be fun :)
- I am probably the only one who can administrate them.
Cons:
- If something does not work with this not so common (meta)distribution, people will say 'always trouble with your Gentoo Schmentoo, it works fine in Fedora'. Fedora is more mainstream; if something does not work there, people are okay with accepting it.
- I fear that big packages like Matlab are made for and tested on the typical distributions, and may have problems with the not-so-common Gentoo. I think someone here just had such a problem with Mathematica (which we currently do not use).
- I am probably the only one who can administrate them.
I think Gentoo is easier to maintain in the long run, but only when you take the time to learn it. With Fedora, you do not need much more than the 'yum install' command. There is no need to read complicated X.org upgrade guides and such.
I think I already made my decision, but I am still interested in your opinions, maybe some of you are in a similar position and would like to share your experiences. Whether I will be allowed to use Gentoo is another question, I guess my boss will not like my idea at first, and I am not even sure if he is right. But maybe I can test-install Gentoo on one machine in a chroot, and see if things work fine.
Wonko
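For comparison with clusterssh, the one-line loop quoted at the top of this message can be turned into a small function that also reports which hosts failed. This is only a sketch: `run_on_hosts` and the `SSH` override variable are made-up names, and it assumes key-based ssh logins so that nothing prompts for a password.

```shell
#!/bin/sh
# run_on_hosts COMMAND HOST...
# Runs COMMAND on every host in turn and lists the hosts where it failed.
# SSH is a hypothetical override (e.g. SSH="ssh -o BatchMode=yes").
run_on_hosts() {
    cmd=$1
    shift
    failed=""
    for h in "$@"; do
        # stdin comes from /dev/null so ssh cannot swallow the caller's input
        if ! ${SSH:-ssh} "$h" "$cmd" </dev/null; then
            failed="$failed $h"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed on:$failed" >&2
        return 1
    fi
}

# Example (HOSTS holds a space-separated host list, as in the one-liner):
#   run_on_hosts 'yum -y install foo' $HOSTS
```

Unlike clusterssh this runs the hosts one after another and gives no interactive terminal, so it suits commands that need no answers typed in.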
Re: Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once)
On Saturday 14 November 2009 19:36:06 Alex Schuster wrote:
Alan McKinnon wrote:
clusterssh will let you log into many machines at once and run emerge -avuND world everywhere
This is way cool. I just started using it on eight Fedora servers I am administrating. Nice, now this is an improvement over my 'for h in $HOSTS; do ssh $h yum install foo; done' approach.
I feel your pain :-) We used to have the same problem adding new admins to 87 machines. Now we have a bespoke provisioner that does it all.
What do you guys think about using Gentoo for servers? At the institute where I work part-time we chose Fedora. There is no special reason for that - we already had some Fedora machines, the setup seemed to work, the reputation was good, so we kept it. That was okay for me, why choose many different environments and learn everything again? I mentioned Gentoo, but did not really suggest actually using it. Maybe I should have.
I'm a huge fan of Gentoo and all my personal machines (except the new netbook) have run it for the last 5 years. But I will never install Gentoo on a production server at work. Why? Because it is too time consuming, because no two machines are set up the same, because I can't trust that other admins used the flags they should have. So updates become a case of logging into 80+ machines individually and doing emerge world by hand. Gentoo allows you to customize things to the nth degree - that is its strength - so people WILL use this one discriminating factor. If OTOH I had a server farm of 80+ machines, all identical, I'd put Gentoo on them in a flash. But I don't have that.
These 8 servers I mentioned are basically clones of the one I installed manually. Instead of doing this again, I boot a live-cd on a new one, create partitions, and extract tar files of the first server's partitions. Then I do some extra configuration, like hostname and network setup. Done.
My plan for updating them is to take the first server down, and upgrade the installation (if that works - I had some trouble with that before, so maybe it will be better to reinstall from scratch). Then I will create a snapshot of the new setup, transfer that to the other hosts, and unpack it in new logical volumes. I plan to script this so I do not have to do it manually every time - but that was before I knew ClusterSSH. When all is done and there is some time to take the servers down, I will reboot into the new system.
Now I am thinking about a Gentoo installation instead.
Pros:
- Continuous updates, no downtime for upgrading, only when I decide to install a new kernel. This is really really cool. I fear the upgrade from Fedora 10 to 12 which has to be done soon.
Do not upgrade, especially not with a version jump of 2 or more. If you have a lot of machines, I assume you are a decent shop, and that you have some form of formal process for upgrades and changes. What you do instead is a formal migration - copy the data off, reinstall, restore data. If you can't afford to do that every six or twelve months, then I have to ask - what the hell is the organization doing using a distro that is unsupported after 12 months?
- Some improvement in speed. Those machines do A LOT of number crunching, with jobs often lasting for days, so even small improvements would be nice.
Don't fool yourself. Unless you need what Google needs, there is very little speed difference between Gentoo and Fedora. Any I/O improvements you need can easily be had by fiddling with the kernel tuning knobs.
- Easier debugging. When things do not work, I think it's easier to dig into the problem. No fancy, but sometimes buggy GUIs hiding basic functionality.
Emm, Fedora does not require a GUI :-)
- Heck, Gentoo is _cooler_ than typical distributions. And emerging with distcc on about 8*4 cores would be fun :)
Can't argue with that. But that is your ego talking and the machines do not belong to you but to the institute. Your ego has no place in that.
- I am probably the only one who can administrate them.
This is not a benefit. It is a severe liability. Where I work, I would get fired for trying that :-(
Cons:
- If something does not work with this not so common (meta)distribution, people will say 'always trouble with your Gentoo Schmentoo, it works fine in Fedora'. Fedora is more mainstream; if something does not work there, people are okay with accepting it.
Those same people are likely to say the same about Linux vs Windows.
- I fear that big packages like Matlab are made for and tested on the typical distributions, and may have problems with the not-so-common Gentoo. I think someone here just had such a problem with Mathematica (which we currently do not use).
One or two persons had problems. Many many more replied that they had no problems at all. In Fedora-land, the ratio is the same.
- I am probably the only one who can administrate them.
I
Re: Gentoo for many servers (was: Re: [gentoo-user] executing commands on lots of servers at once)
Alan McKinnon writes:
On Saturday 14 November 2009 19:36:06 Alex Schuster wrote:
Alan McKinnon wrote:
clusterssh will let you log into many machines at once and run emerge -avuND world everywhere
This is way cool. I just started using it on eight Fedora servers I am administrating. Nice, now this is an improvement over my 'for h in $HOSTS; do ssh $h yum install foo; done' approach.
I feel your pain :-) We used to have the same problem adding new admins to 87 machines. Now we have a bespoke provisioner that does it all.
Sorry, I just do not get 'bespoke provisioner'. Some sort of software, like clusterssh? Or a person, one admin instead of many?
What do you guys think about using Gentoo for servers? At the institute where I work part-time we chose Fedora. There is no special reason for that - we already had some Fedora machines, the setup seemed to work, the reputation was good, so we kept it. That was okay for me, why choose many different environments and learn everything again? I mentioned Gentoo, but did not really suggest actually using it. Maybe I should have.
I'm a huge fan of Gentoo
Now who would have thought of that!
and all my personal machines (except the new netbook) have run it for the last 5 years. But I will never install Gentoo on a production server at work. Why? Because it is too time consuming, because no two machines are set up the same, because I can't trust that other admins used the flags they should have. So updates become a case of logging into 80+ machines individually and doing emerge world by hand. Gentoo allows you to customize things to the nth degree - that is its strength - so people WILL use this one discriminating factor. If OTOH I had a server farm of 80+ machines, all identical, I'd put Gentoo on them in a flash. But I don't have that.
Of our 8 machines, 7 are essentially the same and differ only in hard drive space and CPU speed. The other machine is Intel, not AMD, and needs different IDE drivers. At the moment it has a different initrd (I set up a minimal Fedora install to generate it after the cloned system did not boot), the rest is - apart from some config files - identical. So I would make sure that about everything is exactly the same, well, maybe except for hostnames, udev net-persistent-rules, ssh keys... what more? The last, slightly different machine is a problem though. With optimized CFLAGS, this one would have to compile all stuff again, while for the others I could use binpkgs. Updating them all with clusterssh should not be much more work than updating a single one. Well, not completely true, I would have the double work, as I would upgrade one server first to test if there are problems, and then do it for the others. Maybe I could use the special machine to test stuff, and then update all the others. If they differed, Gentoo would of course be too much work. I already have this problem now... there is my desktop machine, my notebook running a Gentoo VM, a second desktop machine at my other home, the living-room machine of my flat share, the machine of a friend I also administrate, the server of my flat share I need to set up again... and clusterssh is no option here.
Now I am thinking about a Gentoo installation instead.
Pros:
- Continuous updates, no downtime for upgrading, only when I decide to install a new kernel. This is really really cool. I fear the upgrade from Fedora 10 to 12 which has to be done soon.
Do not upgrade, especially not with a version jump of 2 or more. If you have a lot of machines, I assume you are a decent shop, and that you have some form of formal process for upgrades and changes.
Not really, I think. We are not very professional I must admit. We have two capable admins, but one is specialized in network stuff and Windows, the other has to do with our big Sun servers, huuge storage systems and such. They do not do much about the Linux cluster. Another user sometimes installs a package on a machine, but usually I do this. For me, it is not my main job, I work only about ten hours per week there, mostly being some 100 km away. We are a research institute. We do neurological research, PET and MRI tomography. The Linux servers do number crunching, and of course they should work and have good uptimes, but it is not as important as if we were an ISP.
What you do instead is a formal migration - copy the data off, reinstall, restore data.
Advice noted. Yes, this sounds like the better idea, giving a cleaner setup. And if some things break I do not have to wonder if it was some strange side effect from the upgrade process.
If you can't afford to do that every six or twelve months, then I have to ask - what the hell is the organization doing using a distro that is unsupported after 12 months?
Well, I do not think this was considered much. One machine was set up with Fedora for no specific reason, and we kept this distro then.
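The "update one machine first, then roll out binpkgs to the identical ones" idea from this message could be scripted roughly as below. Treat it as a sketch with assumptions spelled out: `update_farm` and the `SSH` override are made-up names, the binary package directory (PKGDIR) is assumed to be shared between the hosts, for example over NFS, and the one machine with different hardware would still need its own build if its CFLAGS differ.

```shell
#!/bin/sh
# update_farm BUILD_HOST CLIENT...
# Builds the world update once on BUILD_HOST, keeping binary packages,
# then installs those binaries on each CLIENT without compiling there.
# Assumption: PKGDIR is shared between all hosts (e.g. an NFS mount).
# SSH is a hypothetical override for the remote command runner.
update_farm() {
    build_host=$1
    shift
    ${SSH:-ssh} "$build_host" 'emerge --sync && emerge -uDN --buildpkg world' </dev/null || {
        echo "build failed on $build_host, leaving the other hosts alone" >&2
        return 1
    }
    rc=0
    for h in "$@"; do
        ${SSH:-ssh} "$h" 'emerge -uDN --usepkgonly world' </dev/null || {
            echo "update failed on $h" >&2
            rc=1
        }
    done
    return $rc
}

# Example with made-up host names:
#   update_farm testbox node1 node2 node3
```

Because the clients are only touched after the build host succeeds, a broken update stays on the test machine, which matches the test-then-roll-out order described above.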