On 5/13/2011 12:54 PM, Tom Perrine wrote:
> So, what's your most memorable command line typo, "think-o",
> "brain-fart", or "#$$@#$*&@#$" moment?
>
> What subtle opportunity for massive destruction would you pass on as a
> warning to the next generation of system administrators?
>
> Don't limit yourself to bash/csh, feel free to explore databases,
> storage and network catastrophes!

Before I discovered inode numbers, I was rather zealously trying to
delete a file with a bad file name. Can't for the life of me remember
what it was.
Nuked the current directory and everything below it. Thankfully nothing
important. Now I know to do 'ls -i' then 'find . -inum {number} -exec rm
{} \;'
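
For anyone who hasn't used the inode trick, here is a minimal sketch of
it wrapped in a function. The name rm_by_inode is made up for
illustration, and -maxdepth is a GNU/BSD find extension rather than
POSIX, so treat it as an assumption about your platform.

```shell
#!/bin/sh
# Hypothetical helper (not a standard tool): remove one regular file
# from a directory, selected by inode number instead of by name.
# usage: rm_by_inode DIR INODE
rm_by_inode() {
    dir=$1
    inum=$2
    # -maxdepth 1 : do not descend into subdirectories
    # -xdev       : stay on one filesystem (inode numbers are per-filesystem)
    # -type f     : never match a directory
    # Swap "-exec rm -- {} \;" for "-print" to preview what would go.
    find "$dir" -maxdepth 1 -xdev -type f -inum "$inum" -exec rm -- {} \;
}
```

Typical use would be 'ls -i' to read off the number, then something
like 'rm_by_inode . 123456'. The awkward name never gets typed, so the
shell never gets a chance to misinterpret it.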

I've also managed to trigger an rm -rf against /.
I meant to do rm in a specific location as part of a script run from
cron. The directory name changed, and so needed to be worked out each
time. For whatever reason, instead of doing rm /$i or whatever, I
decided to do cd /$i then run the rm. However, I didn't check that I'd
actually got into the directory. In fact I didn't even check that $i was
a valid location, or had even been worked out correctly. Just arrogantly
assumed it was. Unfortunately I'd made a mistake earlier in the script
and $i wasn't a valid directory. cd naturally failed, the script didn't
check the exit status, and it ended up running rm -rf against root, as
root. Three lessons I took home from that:
1) Never assume anything.
2) Fully qualify the path to every program you execute in a cron
script. Ties into 1 in that you shouldn't assume that $PATH is set. If
I'd fully qualified the path to a particular executable, $i would have
been valid.
3) Run as root? Was that really necessary?  Doh!
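
Roughly how those three lessons might look in a script. This is a
sketch, not the script I actually ran: clean_dir is a made-up name, and
/bin/rm is an assumption about where rm lives on your system.

```shell
#!/bin/sh
# Sketch only: clean_dir is a hypothetical function name.
set -eu   # stop on a failed command or an unset variable

# Absolute path, so nothing depends on cron's $PATH (assumed location).
RM=/bin/rm

clean_dir() {
    target=$1
    # Accept only an absolute path with something after the leading
    # slash; this rejects an empty value, a relative path, and "/".
    case "$target" in
        /?*) ;;
        *) echo "refusing to clean '$target'" >&2; return 1 ;;
    esac
    [ -d "$target" ] || { echo "not a directory: $target" >&2; return 1; }
    # rm only runs if cd succeeded and we did not land in the root
    # directory. The subshell leaves the caller's working directory
    # alone. Note that ./* does not match dotfiles.
    ( cd "$target" && [ "$(pwd -P)" != / ] && "$RM" -rf ./* )
}
```

Lesson 3 isn't something code can fix: run the job as an unprivileged
user that owns only the directory being cleaned, if at all possible.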

As a team we learned a hard lesson in one job: be smarter about how
you're using cfengine. We used cfengine to push out usernames/passwords
to servers, rather than using something like LDAP. The idea being we
could always log into any of our servers regardless of what happened
elsewhere. A script ran hourly that grabbed user details from a mysql
database, the data from which was used to create passwd files. If an
account had an expiry date that had passed, or had been deleted from a
certain database, the next time the script ran the passwd file would be
created without that user. Simple and straightforward. Cfengine was
always one of the first pieces of software to be installed on a server
so we never had UID/GID conflict problems. With 300+ servers it was an
easy and effective way of handling users, and also enabled us to easily
have separate root passwords for different platforms (sudo wasn't used
for the most part but was starting to be used more around the time I
moved on).

It worked a treat for a number of years until a sysadmin accidentally
changed the root password using a command line tool instead of changing
it in the database like we always did. That had the bad effect of
setting an expiry date, which defaulted to 3am a certain number of days
in the future. Of course no one noticed.
The expiry date came, cfengine ran, identified root as no longer a valid
user, generated passwd files without a root user in them and happily
pushed them out to all our servers.
That was a very interesting day. We learned all sorts of quirky things
about various applications: which ones were nicely self-contained (exim,
bind etc.) and which ones relied on there being a root user. Some
services fell over pretty quickly, others took a while, but thankfully
most of the critical ones were good, well-written applications.

I suspect it would have been worse if it had happened just before
logrotate, during which most apps would have been HUPed. That was not a
fun day. The sysadmin on call twigged what it was fairly quickly, and
corrected it, but cfengine needed a root account to be able to push out
passwd files, so for the most part we were screwed. That day entailed
lots of reboots into single-user mode, live disks etc. I was the
nearest sysadmin to the datacenter so ended up doing most of the on-site
stuff whilst the others tried to keep things ticking along, and took
advantage of toor accounts where possible. The passwd generating script
was rather swiftly altered so that the root account was always there,
regardless of expiry dates!
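
A related safeguard, sketched below, is to sanity-check a generated
passwd file before it goes anywhere. To be clear, this is not the change
we made (ours forced root into the output); it is a complementary check,
and the function names passwd_has_root and publish_passwd are invented
for the example.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the file has a "root" entry
# with UID 0 (field 1 is the login name, field 3 the UID).
passwd_has_root() {
    awk -F: '$1 == "root" && $3 == 0 { found = 1 } END { exit !found }' "$1"
}

# Hypothetical publish step: copy the candidate over the live file
# only when the check passes; otherwise leave the live file untouched.
# usage: publish_passwd CANDIDATE LIVE
publish_passwd() {
    candidate=$1
    live=$2
    if passwd_has_root "$candidate"; then
        cp "$candidate" "$live"
    else
        echo "refusing to publish $candidate: no root entry" >&2
        return 1
    fi
}
```

The point is that the generator and the check fail independently: a bad
expiry date in the database can still produce a bad file, but it stops
at the last hop instead of landing on 300+ servers.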

Paul

_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
