Re: New flakey juju tests

2014-07-10 Thread Dimiter Naydenov
Thanks Martin,

I'll take a look at both of these, since I approved the first and
landed the second. Sorry about the flakiness :/

On 10.07.2014 00:26, Martin Packman wrote:
 We've done a lot of work recently improving the reliability of our 
 test suite. Unfortunately, feature work has been introducing new
 tests that are intermittently failing.
 
 http://juju-ci.vapour.ws:8080/job/github-merge-juju/1/console
 
 FAIL: machine_test.go:1750: MachineSuite.TestWatchInterfaces
 ...
 machine_test.go:1808:
     wc.AssertOneChange()
 testing/watcher.go:76:
     c.Fatalf("watcher sent unexpected change: (_, %v)", ok)
 ... Error: watcher sent unexpected change: (_, true)
 
 ... FAIL  github.com/juju/juju/state  123.265s
 
 This test was added in PR 207.
 
 https://github.com/juju/juju/pull/207
 
 
 http://juju-ci.vapour.ws:8080/job/github-merge-juju/3/console
 
 FAIL: server_test.go:96: serverSuite.TestAPIServerCanListenOnBothIPv4AndIPv6
 ...
 server_test.go:104:
     c.Assert(err, gc.IsNil)
 ... value *net.OpError = net.OpError{Op:"listen", Net:"tcp",
 Addr:(*net.TCPAddr)(0xc2107e9f00),
 Err:(*os.SyscallError)(0xc21027f560)}
 ("listen tcp :54321: bind: address already in use")
 
 ... FAIL  github.com/juju/juju/state/apiserver  30.369s
 
 This test was added in PR 224.
 
 https://github.com/juju/juju/pull/224
 
 
 Can we have another look at these tests and fix them up to be
 properly robust? I don't want to back out changes that have been in
 trunk for a while, but we can't leave unreliable tests on trunk.
 Thanks,
 
 Martin
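
The second failure above ("listen tcp :54321: bind: address already in use") is the usual symptom of a test binding a fixed port. A minimal sketch of the common alternative follows, assuming the test only needs some free port rather than 54321 specifically. The helper name listenOnFreePort is hypothetical and not taken from the juju code base.

```go
package main

import (
	"fmt"
	"net"
)

// listenOnFreePort (hypothetical helper) binds to port 0, which asks the
// kernel to pick any unused TCP port, instead of hard-coding a port that
// another process or a parallel test run may already hold.
func listenOnFreePort() (net.Listener, int, error) {
	l, err := net.Listen("tcp", ":0")
	if err != nil {
		return nil, 0, err
	}
	// The listener reports the port the kernel actually assigned.
	port := l.Addr().(*net.TCPAddr).Port
	return l, port, nil
}

func main() {
	l, port, err := listenOnFreePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer l.Close()
	fmt.Println("listening on port", port)
}
```

The test can then pass the reported port to whatever client it starts, so two runs on the same machine should not collide.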
 


-- 
Dimiter Naydenov dimiter.nayde...@canonical.com
juju-core team

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Current handling of failed upgrades is screwy

2014-07-10 Thread Menno Smits
So I've noticed that the way we currently handle failed upgrades in the
machine agent doesn't make a lot of sense.

Looking at cmd/jujud/machine.go:821, an error is created if
PerformUpgrade() fails but nothing is ever done with it. It's not returned
and it's not logged. This means that if upgrade steps fail, the agent
continues running with the new software version, probably with partially
applied upgrade steps, and there is no way to know.
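
To illustrate the pattern described above, here is a minimal sketch that is not the actual code in cmd/jujud/machine.go. The functions performUpgrade, runSilently and runLogged are hypothetical stand-ins. It contrasts an error that is built and dropped with one that is logged and handed back to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// performUpgrade is a hypothetical stand-in for the real upgrade call.
// It always fails here so the two handling styles can be compared.
func performUpgrade() error {
	return errors.New("upgrade step failed")
}

// runSilently mirrors the problem: the error is wrapped and then dropped,
// so the caller cannot tell that anything went wrong.
func runSilently() error {
	if err := performUpgrade(); err != nil {
		_ = fmt.Errorf("cannot perform upgrade: %v", err) // built, never used
	}
	return nil
}

// runLogged makes the failure visible: it logs the error and returns it.
func runLogged() error {
	if err := performUpgrade(); err != nil {
		err = fmt.Errorf("cannot perform upgrade: %v", err)
		log.Printf("ERROR %v", err)
		return err
	}
	return nil
}

func main() {
	fmt.Println("silent:", runSilently())
	fmt.Println("logged:", runLogged())
}
```

What the caller should then do with the returned error is the open question in this thread.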

I have a unit-tested fix ready which causes the machine agent to exit (by
returning the error as a fatalError) if PerformUpgrade fails, but before
proposing it I realised that's not the right thing to do. The agent's upstart
script will restart the agent and probably cause the upgrade to run and
fail again, so we end up with an endless restart loop.

The error could also be returned as a non-fatal (to the runner) error but
that will just cause the upgrade-steps worker to continuously restart,
attempting the upgrade and failing.

Another approach could be to set the global agent-version back to the
previous software version before killing the machine agent, but other agents
may have already upgraded and we can't currently roll them back in any
reliable way.

Our upgrade story will be improving in the coming weeks (I'm working on
that). In the meantime, what should we do?

Perhaps the safest thing to do is just log the error and keep the agent
running the new version and hope for the best? There is a significant
chance of problems but this is basically what we're doing now (except
without logging that there's a problem).

Does anyone have a better idea?

- Menno


Re: Current handling of failed upgrades is screwy

2014-07-10 Thread John Meinel
I think it fundamentally comes down to whether the reason the upgrade failed
is transient or permanent. If we can try again later, do so; otherwise log at
Error level and keep on with your life, because that is the only chance of
recovery (from what you've said, at least).
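
A minimal sketch of that transient-versus-permanent split follows, assuming failures can be classified at all. The transientError type, runWithRetry and the two sample steps are all hypothetical and are not part of juju.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"time"
)

// transientError (hypothetical) marks a failure that may succeed later.
type transientError struct{ cause error }

func (e *transientError) Error() string { return e.cause.Error() }

// runWithRetry calls step up to maxAttempts times. Transient errors are
// retried after delay; any other error is treated as permanent, logged at
// error level, and returned so the caller can carry on.
func runWithRetry(step func() error, maxAttempts int, delay time.Duration) error {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = step(); err == nil {
			return nil
		}
		var te *transientError
		if !errors.As(err, &te) {
			log.Printf("ERROR upgrade failed permanently: %v", err)
			return err
		}
		log.Printf("WARNING attempt %d failed, may retry: %v", attempt, err)
		if attempt < maxAttempts {
			time.Sleep(delay)
		}
	}
	log.Printf("ERROR upgrade still failing after %d attempts: %v", maxAttempts, err)
	return err
}

// failNTimes returns a sample step that fails transiently n times, then succeeds.
func failNTimes(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return &transientError{errors.New("state server not ready")}
		}
		return nil
	}
}

// permanentFailure is a sample step that can never succeed.
func permanentFailure() error {
	return errors.New("incompatible data")
}

func main() {
	fmt.Println("flaky step:", runWithRetry(failNTimes(2), 5, 10*time.Millisecond))
	fmt.Println("broken step:", runWithRetry(permanentFailure, 5, 10*time.Millisecond))
}
```

Capping the number of attempts keeps this from turning into the endless restart loop that a fatal error would cause.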

John
=:->






Re: New flakey juju tests

2014-07-10 Thread Martin Packman
On 10/07/2014, Dimiter Naydenov dimiter.nayde...@canonical.com wrote:

 I'll take a look at both of these, since I approved the first and
 landed the second. Sorry about the flakiness :/

Thanks Dimiter!

Martin



Re: New flakey juju tests

2014-07-10 Thread Frank Mueller
I've created an issue for it; let's see what happens.

https://bugs.launchpad.net/juju-core/+bug/1340156

mue

-- 
** Frank Mueller frank.muel...@canonical.com
** Software Engineer - Juju Development
** Canonical


juju stable 1.20.1 is released

2014-07-10 Thread Curtis Hovey-Canonical
juju-core 1.20.1

A new stable release of Juju, juju-core 1.20.1, is now available.
This release replaces 1.20.0.


Getting Juju

juju-core 1.20.1 is available for utopic and backported to earlier
series in the following PPA:

https://launchpad.net/~juju/+archive/stable


Noteworthy

This release fixes several issues seen in slower environments. The
performance improvements also improved reliability. Juju CI saw a 35%
speed improvement in the test suite.  While we had planned to release
1.20.1 on 2014-07-17, the performance improvements were just too good to
delay a whole week.


Resolved issues

* Juju 1.20 consistently fails to bootstrap a MAAS environment
  Lp 1339240

* Juju bootstrap fails because mongodb is unreachable
  Lp 1337340

* Juju 1.20.x slow bootstrap
  Lp 1338179

* API-endpoints fails if run just after bootstrap
  Lp 1338511

* Machines are killed if mongo fails
  Lp 1339770

* Restore doesn't
  Lp 1336967


Finally

We encourage everyone to subscribe to the mailing list at
juju-...@lists.canonical.com, or join us on #juju-dev on freenode.


-- 
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui



Re: Proposal: making apt-get upgrade optional

2014-07-10 Thread Tim Penhey
On 11/07/14 02:47, Nate Finch wrote:
 Late to the party, but +1 for OS-neutral names.  Keep in mind, there's
 no separate update/upgrade steps on Windows.  There's no list of
 software that exists that needs to get updated on Windows, as that is
 done automatically.  Luckily, it sounds like we want update to always
 happen on Ubuntu anyway, so we don't need to find a name for it.

Not necessarily.  We want to default it to always update, especially for
cloud machines, but we want to be able to turn it off for the local
provider when demoing as it will be much faster.
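
A minimal sketch of such a default-on switch follows, assuming an OS-neutral option name. ProvisionConfig, RefreshPackages and shouldRefresh are hypothetical names chosen for illustration, not settings that exist in juju.

```go
package main

import "fmt"

// ProvisionConfig holds a hypothetical, OS-neutral provisioning option.
type ProvisionConfig struct {
	// RefreshPackages is a pointer so that "not set" can be told apart
	// from an explicit false.
	RefreshPackages *bool
}

// shouldRefresh defaults to true when the option is unset, so machines
// update unless someone explicitly turns it off (for example, for a demo
// on the local provider).
func shouldRefresh(cfg ProvisionConfig) bool {
	if cfg.RefreshPackages == nil {
		return true
	}
	return *cfg.RefreshPackages
}

func main() {
	off := false
	fmt.Println("default:", shouldRefresh(ProvisionConfig{}))
	fmt.Println("disabled:", shouldRefresh(ProvisionConfig{RefreshPackages: &off}))
}
```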

Tim
