Joyent operations guy here;  no, it wasn't me that hit the Big Red
Button.  (Unrelatedly, tomorrow is my last day there, so... I can say
nice things about my team, right?)

Internal culture at Joyent is pretty damn responsible, especially on
the sysadmin side of the house. Don't f*ck with my coworkers, or I
will end you.  We've got each other's backs, in the best of ways.

The postmortem posted is quite specific and accurate -- the outage was
caused by a fairly complex sociotechnical situation, and some outright
code bugs, which are now being addressed.  Recovery time was
incredibly short given the difficult nature of the problems
encountered.

There's some relevant-to-this-group hiring going on at Joyent -- do
pop a resume in if you're looking for new opportunities.

best,

--e


On Thu, May 29, 2014 at 10:55 AM, Moose Finklestein <[email protected]> wrote:
> Oh, yes, we've all been there.  Typed 'reboot' in the wrong window. Done
> 'newfs' on the wrong dev.  Told someone, "Go press the alarm button" only to
> watch in horror as they push the EPO.  Oh, yeah.
>
> The best part of this tale, I think, is the company's attitude of
> "Well, the person screwed up and knows it; we don't see any need to beat
> them further than they're beating themself."  It's a refreshing and
> intelligent change from the typical "Of course we fired the person who did
> this!" that comes with a public disaster.
>
>
>
> _______________________________________________
> Discuss mailing list
> [email protected]
> https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
> This list provided by the League of Professional System Administrators
>  http://lopsa.org/
>