== Layered juju-deployer ==

I ran into a problem over the weekend trying to build lp:ci-train (https://docs.google.com/a/canonical.com/presentation/d/1LiDK3nVWUFKPbOCOPQEWdpXuTVpIQNU2SWjTiQaIGxE/edit). You can tell the Jenkins charm to use non-ephemeral storage, but that introduces a race condition: if you don't run euca-attach-volume at the right moment during a juju-deployer run, you get an install hook error and the entire deploy falls over, partially finished.
I spoke with Tom about this; they handle this problem in lp:canonical-mojo by having multiple juju-deployer configurations for the same environment. The names of each deployed service and underlying charm line up perfectly across the configurations, but the configuration variables and relations differ. This lets you do a deployment in stages: https://code.launchpad.net/~canonical-sysadmins/canonical-mojo-specs/

Roughly:
1) Deploy the charms without settings.
2) Add the settings.
3) Add the relations.

They then have a script as part of mojo that handles attaching volumes and other juju-external tasks.

== Poll-based nagios ==

He also clarified that Nagios checks work by polling rather than pushing data. So if you have an error state like "lxc-stop failed for this container", you handle it by dropping a file at a known location. Your Nagios check then reports healthy when the file doesn't exist and alerts when it does. This polling means there is a window where failures have occurred but Nagios and PagerDuty don't yet know about them. On smaller Nagios deployments we can get away with increasing the polling frequency, but there does not appear to be any other way around this.

--
Mailing list: https://launchpad.net/~canonical-ci-engineering
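As a rough illustration of the staged deployment described in the first section, here is a hypothetical wrapper that runs juju-deployer once per stage config and leaves room for a hook between stages (where something like the volume-attach script would run). The config file names and hook are made up, and the exact juju-deployer command line varies by version, so treat this as a sketch rather than a working recipe:

```python
# Hypothetical sketch: drive a staged juju-deployer run, one config per
# stage, with a hook between stages for juju-external work such as a
# script that calls euca-attach-volume once the units exist.
import subprocess

STAGE_CONFIGS = [
    "ci-train-charms.yaml",     # 1) deploy the charms without settings
    "ci-train-settings.yaml",   # 2) add the settings
    "ci-train-relations.yaml",  # 3) add the relations
]

def deploy_in_stages(deployment, configs=STAGE_CONFIGS,
                     run=subprocess.check_call, between_stages=None):
    """Run juju-deployer once per stage config, calling the optional
    between_stages hook after each stage completes."""
    for config in configs:
        run(["juju-deployer", "-c", config, deployment])
        if between_stages is not None:
            between_stages(config)
```

Injecting `run` keeps the wrapper testable without a live Juju environment; in real use you would let it default to `subprocess.check_call`.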
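The flag-file pattern from the Nagios section fits in a few lines of Python. The flag path and message are made up for illustration; the exit codes follow the standard Nagios plugin convention (0 = OK, 2 = CRITICAL):

```python
# Hypothetical sketch of the flag-file pattern: the failing job drops a
# file at a known path, and the Nagios plugin polls for its existence.
import os
import sys

FLAG = "/var/run/nagios/lxc-stop-failed"  # illustrative location

def record_failure(path=FLAG, message="lxc-stop failed"):
    """Called by the job when something goes wrong: drop the flag file."""
    with open(path, "w") as f:
        f.write(message + "\n")

def check_flag(path=FLAG):
    """Nagios check: OK (0) when the flag is absent, CRITICAL (2) when
    present, reporting the recorded message."""
    if os.path.exists(path):
        with open(path) as f:
            print("CRITICAL: " + f.read().strip())
        return 2
    print("OK: no recorded failures")
    return 0

if __name__ == "__main__":
    sys.exit(check_flag())
```

Clearing the alert is then just removing the file once the underlying problem is fixed, which is why the window between failure and alert is bounded by the polling interval.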

