Re: [lssconf-discuss] Testing configurations

Narayan Desai Thu, 02 Nov 2006 09:02:45 -0800

>>>>> "Paul" == Paul Anderson <[EMAIL PROTECTED]> writes:


  Paul> Can you give a simple example?
  >> 

  >> As for how this is used, consider the following case. You want to
  >> decommission a service, say ntp since it is pretty close to
  >> stateless. The three steps are to bring up the new service, make
  >> all clients use the new service instance, and decommission the
  >> old service instance. You want to build this as a transaction, so
  >> that clients won't begin using the service before it exists, or
  >> continue to use the old service after it has been turned off.
  >> 
  >> Here is how we implemented this. First, you commit three
  >> revisions to the svn repository. In the first, you enable the new
  >> ntp server. Say this gets repo revision 301. Then you commit a
  >> change that points clients at the new server; this gets revision
  >> 302. Then you commit a change that disables the old server; this
  >> gets revision 303.

  Paul> Oooh. We had endless discussions about doing exactly this kind
  Paul> of sequencing for the European DataGrid, but we never
  Paul> implemented it because we couldn't solve the problems of
  Paul> interference between updates with different priorities. I'd be
  Paul> very interested if you have something that works in
  Paul> non-trivial cases.

Right, this is a fundamental limitation of the approach. While we
haven't come up with a good way to compose workflows, we have
high-enough granularity information to figure out how to nudge the
state machine in the right direction (i.e. finding state transition
blockers, etc.).
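
To make the "state transition blockers" idea concrete, here is a
minimal sketch. It is not our actual implementation; the names Step,
current_step and blockers, and the host names, are made up for
illustration. It assumes each revision in the sequence has a known set
of target hosts and that hosts report back once they have applied it.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One revision in the sequence (hypothetical model)."""
    revision: int                               # repository revision, e.g. 301
    targets: frozenset                          # hosts that must apply it
    applied: set = field(default_factory=set)   # hosts that reported success

    def complete(self) -> bool:
        # A step is done once every target host has applied the revision.
        return self.targets <= self.applied


def current_step(steps):
    """Return the first incomplete step, or None if the sequence is done."""
    for step in steps:
        if not step.complete():
            return step
    return None


def blockers(steps):
    """Hosts that are holding up the transition out of the current step."""
    step = current_step(steps)
    return set() if step is None else set(step.targets - step.applied)


# The ntp example: new server up, one of two clients switched over.
steps = [
    Step(301, frozenset({"ntp-new"})),
    Step(302, frozenset({"client-a", "client-b"})),
    Step(303, frozenset({"ntp-old"})),
]
steps[0].applied.add("ntp-new")
steps[1].applied.add("client-a")

print(current_step(steps).revision, sorted(blockers(steps)))
# prints: 302 ['client-b']
```

Revision 303 is not released until the blocker list for 302 is empty,
which is exactly the point where an administrator may have to step in.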

  Paul> In practice, there are always multiple configuration changes
  Paul> happening. These are being made by different people, and they
  Paul> have different "urgency" levels. If some server has gone away
  Paul> and you need to reconfigure the clients, you can't wait - this
  Paul> means that you can't just do this kind of operation by
  Paul> considering the "revision" of the whole configuration - you
  Paul> have to deal with revisions on "aspects": For example ...

  Paul> Your sequence is exactly the one I used to use as an example:

  Paul> 1) bring up new server
  Paul> 2) point all clients at new server (and wait for them to
  Paul>    make the change)
  Paul> 3) decommission old server

  Paul> Let's say you are waiting for all your clients to change over
  Paul> and you need to make an urgent change for some other reason -
  Paul> like a security software update, or a change to your network
  Paul> topology. What do you do? Do you do the equivalent of a CVS
  Paul> branch and add the new change to releases (2) and (3)? If you
  Paul> do this, you have potentially explosive branching - very few
  Paul> of the branches will be tested. You also have the possibility that
  Paul> the changes you just made to branch 2 actually conflict with
  Paul> branch 3, so you can't actually complete your planned
  Paul> sequence. You can't wait for the sequence to complete because
  Paul> it may involve lots of clients, including ones which have been
  Paul> turned off for a month!

  Paul> Do you have a way round this?

Here is what we have discussed but not implemented. You could rewrite
the state machine as it is being executed to get the correct
result. It is sort of similar to what you are describing with the
branching. For example, say you need to apply a security patch. You
could then create revisions 302' and 303' that are the addition of
that patch to revisions 302 and 303, respectively. While you are
correct about the potentially exponential explosion, the approach as a
whole has made it convenient and safe to do simple things that weren't
reliable before, so we are willing to accept this and handhold the
process for the cases that are important enough to make us care.
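
Since this part is only discussed and not implemented, the following
is just a sketch of what the rewrite could look like. It assumes a
revision can be modelled as a dict of configuration key -> value
changes, which is a big simplification; rewrite_pending and all the
key names are hypothetical.

```python
def rewrite_pending(sequence, first_pending, patch):
    """Return a new sequence with `patch` merged into every pending revision.

    sequence      -- list of (name, changes) pairs, in execution order
    first_pending -- index of the first revision not yet fully applied
    patch         -- dict of key -> value changes for the urgent fix

    Revisions before first_pending are left untouched. A ValueError is
    raised if the patch disagrees with a pending revision, since that
    case needs a human decision.
    """
    rewritten = list(sequence[:first_pending])
    for name, changes in sequence[first_pending:]:
        clash = sorted(k for k in patch
                       if k in changes and changes[k] != patch[k])
        if clash:
            raise ValueError(f"patch conflicts with revision {name} on {clash}")
        # 302 becomes 302', carrying both the planned change and the patch.
        rewritten.append((name + "'", {**changes, **patch}))
    return rewritten


sequence = [
    ("301", {"ntp-new.enabled": True}),
    ("302", {"clients.ntp_server": "ntp-new"}),
    ("303", {"ntp-old.enabled": False}),
]
security_patch = {"openssl.version": "patched"}

# 301 is already complete, so only 302 and 303 are rewritten.
result = rewrite_pending(sequence, 1, security_patch)
print([name for name, _ in result])
# prints: ['301', "302'", "303'"]
```

The conflict check is where your concern shows up: if the patch touches
something that 303 also changes, the sequence cannot be completed
automatically and someone has to resolve it by hand.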

I guess that one of the philosophical points inherent in this
approach is that we are willing to accept partial automation for
complex problems. In some sense, this makes previously hard tasks
merely inconvenient, so we see it as a net win, even if it is
intensive in administrator time in some cases. If nothing else, it
tells you where the problems will occur when you go through
transitions unsafely.
 -nld









_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss
