Re: New flakey juju tests
Thanks Martin,

I'll take a look at both of these, since I approved the first and landed the second. Sorry about the flakiness :/

On 10.07.2014 00:26, Martin Packman wrote:
> We've done a lot of work recently improving the reliability of our test suite. Unfortunately, feature work has been introducing new tests that are intermittently failing.
>
> http://juju-ci.vapour.ws:8080/job/github-merge-juju/1/console
>
> FAIL: machine_test.go:1750: MachineSuite.TestWatchInterfaces
> ...
> machine_test.go:1808:
>     wc.AssertOneChange()
> testing/watcher.go:76:
>     c.Fatalf("watcher sent unexpected change: (_, %v)", ok)
> ... Error: watcher sent unexpected change: (_, true)
> ...
> FAIL github.com/juju/juju/state 123.265s
>
> This test was added in pr 207.
> https://github.com/juju/juju/pull/207
>
> http://juju-ci.vapour.ws:8080/job/github-merge-juju/3/console
>
> FAIL: server_test.go:96: serverSuite.TestAPIServerCanListenOnBothIPv4AndIPv6
> ...
> server_test.go:104:
>     c.Assert(err, gc.IsNil)
> ... value *net.OpError = net.OpError{Op:listen, Net:tcp, Addr:(*net.TCPAddr)(0xc2107e9f00), Err:(*os.SyscallError)(0xc21027f560)} (listen tcp :54321: bind: address already in use)
> ...
> FAIL github.com/juju/juju/state/apiserver 30.369s
>
> This test was added in pr 224.
> https://github.com/juju/juju/pull/224
>
> Can we have another look at these tests and fix them up to be properly robust? I don't want to back out changes that have been in trunk for a while, but we can't leave unreliable tests on trunk.
>
> Thanks,
> Martin

--
Dimiter Naydenov dimiter.nayde...@canonical.com
juju-core team

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
Current handling of failed upgrades is screwy
So I've noticed that the way we currently handle failed upgrades in the machine agent doesn't make a lot of sense. Looking at cmd/jujud/machine.go:821, an error is created if PerformUpgrade() fails but nothing is ever done with it. It's not returned and it's not logged. This means that if upgrade steps fail, the agent continues running with the new software version, probably with partially applied upgrade steps, and there is no way to know.

I have a unit tested fix ready which causes the machine agent to exit (by returning the error as a fatalError) if PerformUpgrade fails, but before proposing it I realised that's not the right thing to do. The agent's upstart script will restart the agent and probably cause the upgrade to run and fail again, so we end up with an endless restart loop.

The error could also be returned as a non-fatal (to the runner) error, but that will just cause the upgrade-steps worker to continuously restart, attempting the upgrade and failing.

Another approach could be to set the global agent-version back to the previous software version before killing the machine agent, but other agents may have already upgraded and we can't currently roll them back in any reliable way.

Our upgrade story will be improving in the coming weeks (I'm working on that). In the meantime, what should we do?

Perhaps the safest thing to do is just log the error, keep the agent running the new version, and hope for the best? There is a significant chance of problems, but this is basically what we're doing now (except without logging that there's a problem).

Does anyone have a better idea?

- Menno
Re: Current handling of failed upgrades is screwy
I think it fundamentally comes down to whether the reason the upgrade failed is transient or permanent. If we can try again later, do so; otherwise log at Error level and keep on with your life, because that is the only chance of recovery (from what you've said, at least).

John =:-

On Thu, Jul 10, 2014 at 11:18 AM, Menno Smits menno.sm...@canonical.com wrote:
> So I've noticed that the way we currently handle failed upgrades in the machine agent doesn't make a lot of sense. Looking at cmd/jujud/machine.go:821, an error is created if PerformUpgrade() fails but nothing is ever done with it. It's not returned and it's not logged. This means that if upgrade steps fail, the agent continues running with the new software version, probably with partially applied upgrade steps, and there is no way to know.
>
> I have a unit tested fix ready which causes the machine agent to exit (by returning the error as a fatalError) if PerformUpgrade fails, but before proposing it I realised that's not the right thing to do. The agent's upstart script will restart the agent and probably cause the upgrade to run and fail again, so we end up with an endless restart loop.
>
> The error could also be returned as a non-fatal (to the runner) error, but that will just cause the upgrade-steps worker to continuously restart, attempting the upgrade and failing.
>
> Another approach could be to set the global agent-version back to the previous software version before killing the machine agent, but other agents may have already upgraded and we can't currently roll them back in any reliable way.
>
> Our upgrade story will be improving in the coming weeks (I'm working on that). In the meantime, what should we do?
>
> Perhaps the safest thing to do is just log the error, keep the agent running the new version, and hope for the best? There is a significant chance of problems, but this is basically what we're doing now (except without logging that there's a problem).
>
> Does anyone have a better idea?
>
> - Menno
Re: New flakey juju tests
On 10/07/2014, Dimiter Naydenov dimiter.nayde...@canonical.com wrote:
> I'll take a look at both of these, since I approved the first and landed the second. Sorry about the flakiness :/

Thanks Dimiter!

Martin
Re: New flakey juju tests
I've created an issue for it; let's see what happens.

https://bugs.launchpad.net/juju-core/+bug/1340156

mue

On Thu, Jul 10, 2014 at 3:32 PM, Martin Packman martin.pack...@canonical.com wrote:
> On 10/07/2014, Dimiter Naydenov dimiter.nayde...@canonical.com wrote:
>> I'll take a look at both of these, since I approved the first and landed the second. Sorry about the flakiness :/
>
> Thanks Dimiter!
>
> Martin

--
** Frank Mueller frank.muel...@canonical.com
** Software Engineer - Juju Development
** Canonical
juju stable 1.20.1 is released
juju-core 1.20.1

A new stable release of Juju, juju-core 1.20.1, is now available. This release replaces 1.20.0.

Getting Juju

juju-core 1.20.1 is available for utopic and backported to earlier series in the following PPA:
https://launchpad.net/~juju/+archive/stable

Noteworthy

This release fixes several issues seen in slower environments. The performance improvements also improved reliability. Juju CI saw a 35% speed improvement in the test suite. While we had planned to release 1.20.1 on 2014-07-17, the performance improvements were just too good to delay a whole week.

Resolved issues

* Juju 1.20 consistently fails to bootstrap a MAAS environment Lp 1339240
* Juju bootstrap fails because mongodb is unreachable Lp 1337340
* Juju 1.20.x slow bootstrap Lp 1338179
* API-endpoints fails if run just after bootstrap Lp 1338511
* Machines are killed if mongo fails Lp 1339770
* Restore doesn't Lp 1336967

Finally

We encourage everyone to subscribe to the mailing list at juju-...@lists.canonical.com, or join us on #juju-dev on freenode.

--
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui
Re: Proposal: making apt-get upgrade optional
On 11/07/14 02:47, Nate Finch wrote:
> Late to the party, but +1 for OS-neutral names. Keep in mind, there's no separate update/upgrade steps on Windows. There's no list of software that exists that needs to get updated on Windows, as that is done automatically. Luckily, it sounds like we want update to always happen on Ubuntu anyway, so we don't need to find a name for it.

Not necessarily. We want to default it to always update, especially for cloud machines, but we want to be able to turn it off for the local provider when demoing as it will be much faster.

Tim