Re: Testing Leader Election reconfiguration

2016-03-15 Thread Cory Johns
On Tue, Mar 15, 2016 at 4:36 PM, Tom Barber wrote:
> If I want to wait for a message is it something from the right side of
> status_set for example:
> https://github.com/OSBI/layer-pdi/blob/master/reactive/pdi.py#L83
> 'Configuration has changed, restarting Carte.'?

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Tom Barber
Hey Cory,

Not even I'm that crazy! :) I have recycled the bootstrapped test environment but the only nodes running are those used in this test suite. I tried to use wait_for_messages initially and was a little confused as to what a "message" equated to (and again in those tests I got a timeout as

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Cory Johns
Tom,

It's also important to note that sentry.wait() waits for *all* units in the deployment to settle for at least 30 seconds, so it might be possible that another unit that wasn't included in the status gist you provided is churning and causing it to time out. That's particularly possible if

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Tim Van Steenburgh
On Tue, Mar 15, 2016 at 12:30 PM, Tom Barber wrote:
> Hi Tim,
>
> Why would I need to increase the timeout when the status says all the units
> are operational?

The default wait time is 300s, with an "idle threshold" of 30s. That means it waits for everything to be
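
A minimal sketch of the wait semantics discussed in this thread: a 300s overall timeout, and a 30s idle threshold that every unit must satisfy at the same time. This is not Amulet's implementation. The function name `wait_until_idle`, the `get_statuses` callable (assumed to return a dict of unit name to idle flag), and the injectable `clock` and `sleep` parameters are all hypothetical, chosen only to make the timing logic visible.

```python
import time


def wait_until_idle(get_statuses, timeout=300, idle_threshold=30,
                    poll_interval=1, clock=time.monotonic, sleep=time.sleep):
    """Return the elapsed seconds once all units stay idle long enough.

    get_statuses is a hypothetical callable returning {unit_name: is_idle}.
    Raises TimeoutError if the units do not settle within `timeout` seconds.
    """
    start = clock()
    idle_since = None  # time at which the current all-idle window began
    while True:
        now = clock()
        if now - start > timeout:
            raise TimeoutError("units did not settle within %ss" % timeout)
        if all(get_statuses().values()):
            if idle_since is None:
                idle_since = now
            if now - idle_since >= idle_threshold:
                return now - start
        else:
            # A single busy unit anywhere resets the idle window.
            idle_since = None
        sleep(poll_interval)
```

Under these assumptions, one churning unit that is not part of the test is enough to keep resetting the idle window until the timeout expires, even when the units under test already look operational.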

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Tom Barber
Hi Tim,

Why would I need to increase the timeout when the status says all the units are operational? The status dump came out of bundletester, which said that it failed on the first wait(). I assume the status dump arrived at the same time? Bugs are allowed; the test was hacked up from a previous

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Tim Van Steenburgh
Hey Tom,

1. You can increase the wait time until it doesn't time out:

   self.d.sentry.wait(timeout=1200)

2. At what point in this sequence of commands was the status dump captured?

3. There is a bug here. You take a reference to the pdi/0 info dict on line 1. It's the same object you use to get

Re: Testing Leader Election reconfiguration

2016-03-15 Thread Tom Barber
Okay back here again, so my nice leader election function looks like:

def test_leader_election_failover(self):
    unit = self.d.sentry['pdi'][0].info
    message = unit['workload-status'].get('message')
    ip = message.split(':', 1)[-1]
    self.d.add_unit('pdi', 2)
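
A small sketch of the message-parsing step on its own, assuming the workload-status message has a form like "Leader: 10.0.0.5". That format and the helper name `leader_ip_from_message` are hypothetical, since the thread does not show the actual message text. Note that `split(':', 1)[-1]` keeps any whitespace after the colon and returns the whole message when there is no colon, so the sketch strips the result and handles the missing cases explicitly.

```python
def leader_ip_from_message(message):
    """Return the text after the first ':' in a status message, or None.

    Assumes a hypothetical message format such as 'Leader: 10.0.0.5'.
    """
    if not message or ':' not in message:
        # .get('message') can return None, and some messages carry no address.
        return None
    # maxsplit=1 keeps any later colons (for example in an IPv6 address).
    return message.split(':', 1)[1].strip()
```

Returning None for a missing or colon-free message lets a test distinguish "no leader address published yet" from a real address instead of comparing against the raw message text.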

Re: Testing Leader Election reconfiguration

2016-03-09 Thread Tom Barber
Oh really? /me strokes his invisible beard. Okay I'll go back and try again.

Tom
--
Director Meteorite.bi - Saiku Analytics Founder
Tel: +44(0)5603641316
(Thanks to the Saiku community we reached our Kickstart

Re: Testing Leader Election reconfiguration

2016-03-09 Thread Tim Van Steenburgh
On Wed, Mar 9, 2016 at 6:31 AM, Tom Barber wrote:
> Thanks Stuart.
>
> I do put a note in my charm message indicating the leader IP address so
> that users know which to connect to.
>
> So with juju wait, would I destroy a unit then execute juju wait? At which
> point

Re: Testing Leader Election reconfiguration

2016-03-09 Thread Tom Barber
Thanks Stuart. I do put a note in my charm message indicating the leader IP address so that users know which to connect to. So with juju wait, would I destroy a unit then execute juju wait? At which point it will hang until the leader election stuff is over and all becomes stable again? Also,

Re: Testing Leader Election reconfiguration

2016-03-09 Thread Stuart Bishop
On 9 March 2016 at 20:31, Tom Barber wrote:
> Morning all
>
> I'm trying to test for charm reconfiguration if the leader goes AWOL.

I put the role of the unit in its workload status, so it is easy for operators to see which unit is master. And this also makes it easy

Testing Leader Election reconfiguration

2016-03-09 Thread Tom Barber
Morning all,

I'm trying to test for charm reconfiguration if the leader goes AWOL. Adam suggested that I watch the status, waiting for the next leader election, hook the wait on that, and then check my service configs. Sounds sane! Implementation though has been a pain. From Amulet I assume I