FWIW, I've just registered https://blueprints.launchpad.net/tripleo/+spec/re-assert-system-state and I'm about to start work on the spec.
Matt

> -----Original Message-----
> From: Clint Byrum [mailto:[email protected]]
> Sent: 27 June 2014 17:01
> To: openstack-dev
> Subject: Re: [openstack-dev] [TripleO] os-refresh-config run frequency
>
> Excerpts from Macdonald-Wallace, Matthew's message of 2014-06-27 00:14:49 -0700:
> > Hi Clint,
> >
> > > -----Original Message-----
> > > From: Clint Byrum [mailto:[email protected]]
> > > Sent: 26 June 2014 20:21
> > > To: openstack-dev
> > > Subject: Re: [openstack-dev] [TripleO] os-refresh-config run frequency
> > >
> > > So I see two problems highlighted above.
> > >
> > > 1) We don't re-assert ephemeral state set by o-r-c scripts. You're
> > > right, and we've been talking about it for a while. The right thing
> > > to do is have os-collect-config re-run its command on boot. I don't
> > > think a cron job is the right way to go, we should just have a file
> > > in /var/run that is placed there only on a successful run of the command.
> > > If that file does not exist, then we run the command.
> > >
> > > I've just opened this bug in response:
> > >
> > > https://bugs.launchpad.net/os-collect-config/+bug/1334804
> >
> > Cool, I'm more than happy for this to be done elsewhere, I'm glad that people
> > are in agreement with me on the concept and that work has already started on
> > this.
> >
> > I'll add some notes to the bug if needed later on today.
> >
> > > 2) We don't re-assert any state on a regular basis.
> > >
> > > So one reason we haven't focused on this, is that we have a stretch
> > > goal of running with a readonly root partition. It's gotten lost in
> > > a lot of the craziness of "just get it working", but with rebuilds
> > > blowing away root now, leading to anything not on the state drive
> > > (/mnt currently), there's a good chance that this will work relatively
> > > well.
> > >
> > > Now, since people get root, they can always override the readonly
> > > root and make changes. <golem>we hates thiss!</golem>.
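
On the /var/run marker file from point 1, here is a minimal sketch of how
the check could behave. The marker path and the function name
run_unless_marked are made up for illustration and are not anything
os-collect-config provides today. It relies on /var/run typically being
cleared at boot, so the marker vanishes and the command runs again.

```shell
#!/bin/sh
# Sketch only: MARKER and run_unless_marked are hypothetical names.
MARKER="${MARKER:-/var/run/os-collect-config.ran}"

# Run the given command unless a marker from an earlier successful run
# exists. The marker is created only when the command succeeds.
run_unless_marked() {
    if [ -e "$MARKER" ]; then
        # A previous run already succeeded since the marker was last cleared.
        return 0
    fi
    # Propagate the command's failure status and leave no marker behind.
    "$@" || return $?
    : > "$MARKER"
}
```

Something like "run_unless_marked os-refresh-config" at boot would then
re-assert state once, and a failed run would be retried on the next
invocation because no marker was written.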
> > >
> > > I'm open to ideas, however, os-refresh-config is definitely not the
> > > place to solve this. It is intended as a non-resident command to be
> > > called when it is time to assert state. os-collect-config is
> > > intended to gather configurations, and expose them to a command that
> > > it runs, and thus should be the mechanism by which os-refresh-config is
> > > run.
> > >
> > > I'd like to keep this conversation separate from one in which we
> > > discuss more mechanisms to make os-refresh-config robust. There are
> > > a bunch of things we can do, but I think we should focus just on "how do
> > > we re-assert state?".
> >
> > OK, that's fair enough.
> >
> > > Because we're able to say right now that it is only for running when
> > > config changes, we can wave our hands and say it's ok that we
> > > restart everything on every run. As Jan alluded to, that won't work
> > > so well if we run it every 20 minutes.
> >
> > Agreed, and chatting with Jan and a couple of others yesterday we came to
> > the conclusion that whatever we do here it will require "tweaking" of a number
> > of elements to safely restart services.
> >
> > > So, I wonder if we can introduce a config version into os-collect-config.
> > >
> > > Basically os-collect-config would keep a version along with its cache.
> > > Whenever a new version is detected, os-collect-config would set a
> > > value in the environment that informs the command "this is a new
> > > version of config". From that, scripts can do things like this:
> > >
> > > if [ -n "$OS_CONFIG_NEW_VERSION" ] ; then
> > >   service X restart
> > > else
> > >   if ! service X status ; then service X start ; fi
> > > fi
> > >
> > > This would lay the groundwork for future abilities to compare
> > > old/new so we can take shortcuts by diffing the two config versions.
> > > For instance if we look at old vs. new and we don't see any of the
> > > keys we care about changed, we can skip restarting.
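
To make the config-version idea concrete, here is a rough sketch of the
detection side, written as a shell wrapper rather than as a change to
os-collect-config itself. Only OS_CONFIG_NEW_VERSION comes from the
thread. The function name, the file arguments and the use of a cksum of
the config as the "version" are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: run_with_version and its arguments are hypothetical.
# Usage: run_with_version CONFIG_FILE VERSION_FILE COMMAND [ARGS...]
run_with_version() {
    config=$1
    version_file=$2
    shift 2

    # Use a checksum of the config as a stand-in for a real version.
    new=$(cksum < "$config") || return 1
    old=""
    if [ -f "$version_file" ]; then
        old=$(cat "$version_file")
    fi

    if [ "$new" != "$old" ]; then
        # Config changed: flag it to the command, and record the new
        # version only if the command succeeds.
        ( OS_CONFIG_NEW_VERSION=$new; export OS_CONFIG_NEW_VERSION; "$@" ) || return $?
        printf '%s\n' "$new" > "$version_file"
    else
        # Config unchanged: run without the flag so scripts only
        # re-assert state instead of restarting services.
        ( unset OS_CONFIG_NEW_VERSION; "$@" )
    fi
}
```

Scripts run this way can keep the [ -n "$OS_CONFIG_NEW_VERSION" ] test
quoted above. Because the version is recorded only after a successful run,
a failed run sees the flag again next time.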
> >
> > I like this approach - does this require a new spec? If so, I'll start an
> > etherpad to collect thoughts on it before writing it up for approval.
>
> I think this should be a tripleo spec. If you're volunteering to write it,
> hooray \o/. It will require several work items. Off the top of my head:
>
> - Add version awareness to os-collect-config
> - Add version awareness to all os-refresh-config scripts that do
>   disruptive things.
> - Add periodic command run to os-collect-config
>
> Let's call it 're-assert-system-state'. Sound good?
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
