On Sat, Nov 14, 2009 at 5:09 PM, Alex Schuster <wo...@wonkology.org> wrote:
> Alan McKinnon writes:
>
>> On Saturday 14 November 2009 19:36:06 Alex Schuster wrote:
>>> Alan McKinnon wrote:
>
>>>> clusterssh will let you log into many machines at once and run emerge
>>>> -avuND world everywhere
>>> This is way cool. I just started using it on eight Fedora servers I am
>>> administrating. Nice, now this is an improvement over my 'for $h in
>>> $HOSTS; do ssh $h "yum install foo"; done' approach.
>>
>> I feel your pain :-)
>>
>> We used to have the same problem adding new admins to 87 machines. Now
>> we have a bespoke provisioner that does it all.
>
> Sorry, I just do not get 'bespoke provisioner'. Some sort of software,
> like clusterssh? Or a person, one admin instead of many?
>
>
>>> What do you guys think about using Gentoo for servers? At the institute
>>> I partially work we chose Fedora. There is no special reason for that -
>>> we already had some Fedora machines, the setup seemed to work, the
>>> reputation was good, so we kept it. That was okay for me, why choose
>>> many different environments and learn everything again. I mentioned
>>> Gentoo, but did not really suggest to actually use it. Maybe I should
>>> have.
>>
>> I'm a huge fan of Gentoo
>
> Now who would have thought of that!
>
>> and all my personal machines (except the new netbook) have run it for the
>> last 5 years.
>>
>> But I will never install Gentoo on a production server at work.
>>
>> Why?
>>
>> Because it is too time consuming, because no two machines are set up the
>> same, because I can't trust that other admins used the flags they should
>> have. So updates become a case of logging into 80+ machines individually
>> and doing emerge world by hand. Gentoo allows you to customize things to
>> the nth degree - that is its strength - so people WILL use this one
>> discriminating factor.
>>
>> If OTOH I had a server farm of 80+ machines, all identical, I'd put
>> Gentoo on them in a flash. But I don't have that
>
> Of our 8 machines, 7 are essentially the same and differ only in hard
> drive space and CPU speed. The other machine is Intel, not AMD, and needs
> different IDE drivers. At the moment it has a different initrd (I set up a
> minimal fedora install to generate it after the cloned system did not
> boot), the rest is - apart from some config files - identical.
>
> So I would make sure that about everything is exactly the same, well,
> maybe except for hostnames, udev net-persistent-rules, ssh keys... what
> more?
> The last, a little different machine is a problem though. With optimized
> CFLAGS, this one would have to compile all stuff again, while for the
> others I could use binpkgs. Updating them all with clusterssh should not
> be much more work than updating a single one. Well, not completely true, I
> would have the double work, as I would upgrade one server first to test if
> there are problems, and then do it for the others. Maybe I could use the
> special machine to test stuff, and then update all the others.
>
> If they would differ, Gentoo would of course be too much work. I already
> have this problem now... there is my desktop machine, my notebook running
> a Gentoo VM, a second desktop machine at my other home, the living-room
> machine of my flat share, the machine of a friend I also administrate, the
> server of my flat share I need to set up again... and clusterssh is no
> option here.

My potentially ill-informed thoughts on the above issues/ideas:

1) Pick one machine to host your make.conf, portage tree, and
distfiles, potentially split into separate NFS mounts shared out to
the rest of the hosts (keeping the portage tree itself read-only on
all but its owning machine forces syncing to be centralized).

2) /etc/make.conf should simply be a symlink to the centrally located
copy. If you must use binpackages, set -march to something that will
run on every machine involved, then set -mtune (-mcpu is the
deprecated x86 spelling) to whatever machine is most common if you
want a bit more performance here or there. If you don't mind
compiling on every host, though, set portage niceness to something
friendly to your users and -march to native (if you plan to use
distcc, this is a BAD idea; use the binpackages).

3) Use a replaceable system (otherwise identical to the others, and
therefore able to be brought back online by just cloning it over) for
your testing, and keep frequent scheduled backups of whichever system
plays host to your portage tree, binpackages, and distfiles.

4) Build your kernel with built-in drivers for every piece of
boot-time essential hardware in your systems. You'll still have a far
cleaner setup than a mass-produced, distro-provided kernel, you'll
only need to maintain one kernel for all your systems, and you'll only
have one to build against if you need any out-of-kernel modules.

5) Script the changing of ssh host keys (or even redistribution of
them, if you ), the removal of persistent net rules, and a prompt for
the host name, and you'll have a nice, tiny postinstall tool for the
rare case in which you need to re-deploy a system. When re-deploying,
you may prefer to restore things like ssh host keys from backups
instead, since changing them means adjusting known hosts lists
elsewhere.
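
A rough sketch of the postinstall tool from point 5, in case it helps.
Everything specific here is my assumption rather than something I've
checked on your machines: the function names are made up, the paths
(etc/ssh/ssh_host_*, etc/udev/rules.d/70-persistent-net.rules,
etc/conf.d/hostname) should be verified against your layout, and the
variable name in conf.d/hostname differs between baselayout versions.
It only removes the old host keys; generating new ones or restoring
them from backup is left as a separate step.

```shell
#!/bin/sh
# Sketch of a post-clone cleanup helper. All paths are assumptions; check
# them against your own layout before pointing this at a real root.

# Delete the cloned SSH host keys under ROOT ($1) so fresh ones can be
# generated, or the originals restored from backup, before sshd starts.
remove_ssh_host_keys() {
    rm -f "$1"/etc/ssh/ssh_host_*
}

# Delete udev's persistent net rules under ROOT ($1) so the network
# interfaces are detected again on the next boot.
remove_persistent_net_rules() {
    rm -f "$1"/etc/udev/rules.d/70-persistent-net.rules
}

# Write host name NAME ($2) into ROOT's ($1) conf.d/hostname file.
set_host_name() {
    mkdir -p "$1/etc/conf.d"
    printf 'hostname="%s"\n' "$2" > "$1/etc/conf.d/hostname"
}

# Run all steps against ROOT ($1); prompt for the host name unless it is
# passed as $2.
reset_clone() {
    root=$1
    name=${2:-}
    if [ -z "$name" ]; then
        printf 'Host name for this machine: '
        read -r name
    fi
    [ -n "$name" ] || { echo "no host name given" >&2; return 1; }
    remove_ssh_host_keys "$root"
    remove_persistent_net_rules "$root"
    set_host_name "$root" "$name"
}
```

Source the file and call 'reset_clone /mnt/gentoo' against a freshly
cloned system mounted there (the mount point is only an example) to be
prompted for the name, or pass the name as a second argument to run it
unattended. Taking the root as an argument also means you can try it
on a scratch directory first.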

>>> Now I am thinking about a Gentoo installation instead.
>>>
>>> Pros:
>>>  - Continuous updates, no downtime for upgrading, only when I decide to
>>> install a new kernel. This is really really cool. I fear the upgrade
>>> from Fedora 10 to 12 which has to be done soon.
>>
>> Do not upgrade, especially not with a version jump of 2 or more. If you
>> have a  lot of machines, I assume you are a decent shop, and that you
>> have some form of formal process for upgrades and changes.
>
> Not really, I think. We are not very professional I must admit. We have
> two capable admins, but one is specialized in network stuff and Windows,
> the other has to do with our big Sun servers, huuge storage systems and
> such. They do not much about the Linux cluster. Another user sometimes
> installs a package on a machine, but usually I do this. For me, it is not
> my main job, I work only about ten hours per week there, mostly being some
> 100 km away.
> We are a research institute. We do neurological research, PET and MRI
> tomography. The Linux servers do number crunching, and of course they
> should work and have good uptimes, but it is not as important as if we
> were an ISP.
>
>> What you do instead is a formal migration - copy the data off,
>> reinstall, restore data.
>
> Advice noted. Yes, this sounds like the better idea, giving a cleaner
> setup. And if some things break I do not have to wonder if it was some
> strange side effect from the upgrade process.
>
>> If you can't afford to do that every six or twelve months, then
>> I have to ask - what the hell is the organization doing using a distro
>> that is unsupported after 12 months?
>
> Well, I do not think this was considered much. One machine was set up with
> Fedora for no specific reason, and we kept this distro then. This does not
> sound too professional, I know. BTW, what distro would you suggest?

In the times I've used it, Ubuntu handled updates quite gracefully,
though it is a bit overweight for my tastes in server work and needed
reboots somewhat often. You might get the same or better out of
Debian: it is less directly desktop-centric, and it is the source of
the package management that gives Ubuntu what advantages it might
have for the role.

>>> - Some improvement in speed. Those machines do A LOT of
>>> numbercrunching, with jobs often lasting for days, so even small
>>> improvements would be nice.
>>
>> Don't fool yourself. Unless you need what Google needs, there is very
>> little speed difference between Gentoo and Fedora. I/O improvements you
>> need can be  easily gotten by fiddling the kernel tuning knobs.
>
> I know the difference will not be huge, I see this as a little bonus -
> nice if is there, but nothing really important. But in the comparison with
> Ubuntu that came in a thread a few weeks ago, for some applications the
> speed increase was about 30 percent. Although I would not necessarily
> expect the difference to be noticeable, I would also not be surprised too
> much if it were noticeable for some number-crunching applications if they
> were optimized for the CPU.

Is the software you're using for the number crunching open source,
and will you be recompiling it on Gentoo with all the optimizations
as well? If not, in the long run you'll get far more out of the I/O
improvements Alan mentioned than you ever would out of aggressive use
of CFLAGS.
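
For what it's worth, the conservative, binpackage-friendly settings I
described in point 2 might look something like the fragment below. The
values are my assumptions, not anything I know about your hardware:
-march=x86-64 presumes all eight boxes are 64-bit, -mtune=k8 presumes
the AMD machines are K8-class, and the niceness and PKGDIR values are
just placeholders.

```shell
# /etc/make.conf (fragment), meant to be shared by all hosts

# Baseline instruction set every machine can run, tuned for the most
# common CPU. Both values are assumptions about the hardware.
CFLAGS="-O2 -pipe -march=x86-64 -mtune=k8"
CXXFLAGS="${CFLAGS}"

# Build a binary package for everything emerged on the build host.
FEATURES="buildpkg"
# Assumed location of the shared binary package directory.
PKGDIR="/usr/portage/packages"

# Keep emerges from starving interactive users.
PORTAGE_NICENESS="15"
```

The other hosts would then install from the shared PKGDIR with
'emerge --usepkg' rather than compiling anything themselves.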

>>>  - Easier debugging. When things do not work, I think it's easier to
>>> dig into the problem. No fancy, but sometimes buggy GUIs hiding basic
>>> functionality.
>>
>> Errrrrrrrrrrrrrrrmmmmmmmmmmmmmm, Fedora does not require a GUI :-)
>
> Right, and now that I think of it I do not use it anyway... Well, I did do
> some things with netsetup (or whatever it's called), now that I know the
> system a little better I edit things directly in /etc/sysconfig.
> But the installer is a GUI, right? And if I remember this correctly, I
> cannot even switch to a text console and do stuff there while installing.
> Or I could, but did not have utilities like LVM. Something like that. I
> have to use the installer and its capabilities.
>
>>> - Heck, Gentoo is _cooler_ than typical distributions. And emerging
>>> with distcc on about 8*4 cores would be fun :)
>>
>> Can't argue with that.
>>
>> But that is your ego talking and the machines do not belong to you but
>> to the institute. Your ego has no place in that.
>
> You're right, thanks for the reminder. But also note the smiley. I know my
> boss (who is also into geeky things) would also like this - as long as it
> would work.

If you've a moderately capable system sitting spare, throw VirtualBox
or similar on it and bring up a few VMs to test the setup in (since
with that, you can get away with ). My little Core 2 here can handle
3-4 VMs without fussing at all, and that's with

>>>  - I am probably the only one who can administrate them.
>>
>> This is not a benefit. It is a severe liability.
>
> That's why I listed it also on the contra side. Forgot to add a smiley
> here, it was not meant seriously.
> But when I think about it... the others also do not know much about
> Fedora. Not even I do this well. There you use 'yum install <package>',
> with Gentoo it's 'emerge <package>'. Daily work would be similar.
> Upgrades would be a different thing, though. Gentoo's portage blockers
> would not be understood easily, they would prefer to take the servers down
> and just install the current Fedora distro. Which hopefully would work.
>
>
>>> Cons:
>>> - If something will not work with this not so common
>>> (meta)distribution, people will say "always trouble with your Gentoo
>>> Schmentoo, it works fine in Fedora". Fedora is more mainstream, if
>>> something does not work there, then it's okay for the people to accept
>>> it.
>>
>> Those same people are likely to say the same about linux vs windows.
>
> Right, but we already have Linux, and we need it for our software. Gentoo
> would not really be needed.
>
>>> - I am probably the only one who can administrate them. I think Gentoo
>>> is easier to maintain in the long run, but only when you take the time
>>> to learn it. With Fedora, you do not need much more than the 'yum
>>> install' command. There is no need to read complicated X.org upgrade
>>> guides and such.
>>>
>>> I think I already made my decision, but I am still interested in your
>>> opinions, maybe some of you are in a similar position and like to share
>>> your experiences. Whether I will be allowed to use Gentoo is another
>>> question, I guess my boss will not like my idea at first, and I am not
>>> even sure if he is right. But maybe I can test-install Gentoo on one
>>> machine in a chroot, and see if things work fine.
>>
>> Depends how critical these machines are. If you want to change them just
>> because you feel like it, then I do not see how that can possibly be a
>> valid reason.
>>
>> Remember, the institute's needs and desires trump yours every time
>
> No, it's not just because I feel like it. The main advantages would be:
> - No downtime between upgrades. Our jobs run for several days, every
> downtime has to be planned in advance. People understand this, but they do
> not like it. They would be very happy if this were not longer necessary.
> And I would not fear that during the upgrade something breaks, and it
> would take me long to fix it.
> - I know this distro well, and this is not at all true about Fedora. I
> know how to fix problems, I know how things work here. I would feel better
> with Gentoo, more competent. It just does not feel so well to administrate
> Fedora.
>
> Thanks for your opinions, Alan. As always.
>
>        Wonko

As a final note: whatever path you take, whether implementing a new
setup or just updating the old one, document it, and especially
document guides for upkeep and general maintenance. Your boss, the
Windows guy, and the Sun guy are going to be like fish out of water if
you get this whole thing put in place and get hit by a bus the next
day. *This* is why the "I'm the only one that can..." bit is such a
dangerous thing. It's not the fear that you'll try to use it as a
bargaining chip down the road; if you did, they'd just take the hit,
replace you, and then replace the setup with a better documented one.
It's the fear that if you drop out of the picture for any reason,
they're stuck with the cost of doing that. Period. (This is also why,
when done actively and intentionally, it's a fireable offense in many
places.)

-- 
Poison [BLX]
Joshua M. Murphy
