[Launchpad-dev] Updated policy around cronspam, oops reporting etc

Robert Collins Wed, 14 Sep 2011 13:51:16 -0700

We recently had a situation where we got overwhelmed with 'noise' on
our cron output: things that are meant to be 'silent unless things go
wrong' started outputting significant amounts of email. This
overwhelmed the folk who track the cron output, and a real issue was
missed (the librarian-gc script was in crisis).


During the team lead meeting we discussed this, and I've clarified our
policies to reflect the outcome: things that *support* our
identification of production issues are essential to our day-to-day
operations. Any [significant] disruption to them is now an immediate
operational incident.

I don't see this as an actual change, rather a formalisation of the
prioritisation many folk have had in the past, but formalising it
gives *explicit* support to anyone who notices such an issue and needs
to get the ball rolling.

I've updated the various docs I could see that were relevant:
https://dev.launchpad.net/BugTriage
https://dev.launchpad.net/PolicyAndProcess/ZeroOOPSPolicy
https://wiki.canonical.com/Launchpad/PolicyandProcess/DefinitionofCriticalPolicy
(sorry, internal only)

I'd love any feedback on clarity - or whether this is a crazy thing to
do :P - that you might have.

-Rob

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp
