Sean Whitton writes ("Bug#1130653: tag2upload signing key updates and expiry 
checks"):
> Ian Jackson <[email protected]> [13/Mar  8:45pm GMT] wrote:
> > * Arguably the cron job which sends the emails should *not* run on the
> >   manager, since the manager already has the normal cron job that is
> >   supposed to prompt us to do the updates.  If for some reason cron
> >   jobs on the manager don't run or can't email us, we'd miss the memo.
...
> Tbh I think this is overengineering.  Why not just add separate cron
> jobs for all of the places?

Three reasons:

Firstly, in my experience there can be failures affecting a single host or a
single cron job where no jobs run at all, or no emails from that host
get delivered.  In the absence of monitoring, such failures are not
noticed until they cause some other symptom.

We want that symptom to be "we get mail from cron job A that cron job
B has stopped working".  This is a pattern I have used elsewhere, and
I have indeed received such cron mails from it.

Or to put it another way, each monitoring cron job comes with an
inherent risk of ineffectiveness because "it didn't run" and "it
failed but the report went into an oubliette" are indistinguishable
from "it's working".  A design with only one *monitoring* cron job,
and a bunch of separate function jobs that actually do something,
reduces that risk and makes mitigations for it (manual testing) much
easier.

Secondly, "copy this key out to the central place" is a much simpler
thing to deploy and will almost never need to be updated.  If we put
the more complex check-it's-all-ok logic everywhere then we have to
keep it up to date everywhere.

Thirdly, a central script can do cross-checks (eg that the keys are
all the same, or that every public copy has the same set of
countersignatures).  That also means the setup needs less
configuration knowledge.
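
To make the shape of such a central check concrete, here is a minimal
sketch in Python.  It is an illustration under assumptions, not the
actual tag2upload tooling: the spool directory layout, the
"<host>.asc" file naming, the host names and the two-day staleness
threshold are all hypothetical.  It assumes each function job copies
its key file into one directory on the monitoring host, and it flags
copies that never arrived, copies that have stopped being refreshed,
and copies that differ from one another.

```python
import hashlib
import time
from pathlib import Path

# Hypothetical threshold: every function job should refresh its copy
# at least this often, so an older file means that job has stopped.
MAX_AGE_SECONDS = 2 * 24 * 60 * 60


def check_copies(spool_dir, expected_hosts, now=None, max_age=MAX_AGE_SECONDS):
    """Return a list of problem descriptions; an empty list means all is well."""
    now = time.time() if now is None else now
    problems = []
    digests = {}
    for host in expected_hosts:
        # Hypothetical naming scheme: one file per host, "<host>.asc".
        path = Path(spool_dir) / f"{host}.asc"
        if not path.is_file():
            problems.append(f"{host}: no copy has arrived")
            continue
        age = now - path.stat().st_mtime
        if age > max_age:
            problems.append(f"{host}: copy is stale ({age / 3600:.0f}h old)")
        digests[host] = hashlib.sha256(path.read_bytes()).hexdigest()
    # Cross-check: every host should be publishing the same key material.
    if len(set(digests.values())) > 1:
        detail = ", ".join(f"{h}={d[:12]}" for h, d in sorted(digests.items()))
        problems.append(f"copies differ: {detail}")
    return problems
```

Run from cron, a wrapper would print whatever check_copies() returns
and exit nonzero if the list is non-empty, so that cron's own mail is
the alert.  Because the inputs are just files with timestamps, the
failure cases are easy to simulate in a temporary directory.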

These kinds of reasons are why proper monitoring systems have a central
information aggregation point.

I might talk to DSA to see if they have some suggestions for tools (or
want to hook this into their munit or whatever) but my prior is to doubt
that we want to get entangled with a real monitoring system.

Overall this is a very small amount of scripting.  Most of these cron
jobs are shell one-liners and don't need full-on engineering, test
cases, etc., since if they go wrong the main cron job will scream.
The main cron job can be tested ad-hoc, or it can have automated tests
because its inputs are easy to simulate.

Ian.

-- 
Ian Jackson <[email protected]>   These opinions are my own.  

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
