With the LXD 2.0 release at the start of last week and the prospect of some stability, I spent a good chunk of the week testing Juju and LXD.

What CI has been doing so far this cycle is running our standard deployment tests with the lxd provider on a pre-prepared machine with a known-working package set. My two goals beyond this were:

- Validating the first install experience of the LXD provider on xenial
- Using lxd in place of lxc in containerised workloads across clouds

The conclusion is that, at present, we don't have an experience that works for either of these.

For the lxd provider, I understand we're resigned to the user having to manually configure a bridge for lxd before bootstrap can work. Currently the documentation is confused as to what exactly the steps are. The release notes refer to these two links:

<https://linuxcontainers.org/lxd/getting-started-cli/>
<http://insights.ubuntu.com/2016/04/07/lxd-networking-lxdbr0-explained/>

But I think the latest advice is as committed with this change to our documentation:

<https://github.com/juju/docs/pull/998/files>

Note that just running dpkg-reconfigure is not enough: you have to poke a service or run `lxc` afterwards, or you get this error with beta4:

    $ juju bootstrap --config default-series=xenial lxd-test lxd
    ERROR cannot find network interface "lxcbr0": route ip+net: no such network interface
    ERROR invalid config: route ip+net: no such network interface

That's probably the cause of the other confusion in the updated docs - now we *do* want the bridge named lxdbr0 not lxcbr0.

If the user already has lxc set up in some fashion, there's a different error from juju telling them to run the dpkg-reconfigure command. That should presumably always appear whenever there's no usable bridge.

This also presents a challenge for automated testing of the lxd provider in a clean environment: dpkg-reconfigure isn't the nicest thing to use non-interactively, and I can't find a clear reference to what the exact required pieces are for the juju provider.

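For automated runs, one way to fail early with a clearer message than the bootstrap error is to check for the bridge before calling juju. What follows is only a minimal sketch, assuming a Linux host where network interfaces are listed under /sys/class/net. The function names and the `sysfs_net` parameter are hypothetical and not part of juju or lxd. It only confirms that an interface with the expected name exists, not that lxd is actually configured to use it.

```python
import os


def bridge_exists(name, sysfs_net="/sys/class/net"):
    """Return True if a network interface called `name` is present.

    On Linux each interface shows up as an entry under /sys/class/net.
    `sysfs_net` is a hypothetical parameter so the check can be pointed
    at a fake directory in tests.
    """
    return os.path.exists(os.path.join(sysfs_net, name))


def missing_bridge_message(name="lxdbr0", sysfs_net="/sys/class/net"):
    """Return a hint for the user, or None when the bridge is present."""
    if bridge_exists(name, sysfs_net):
        return None
    return ("no usable bridge %r found; configure one for lxd "
            "before running juju bootstrap" % name)
```

A test script could call `missing_bridge_message()` before bootstrap and stop with the returned text if it is not None.
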
As part of the juju 2.0 packaging for Ubuntu, we need an autopkgtest that will run in a fresh xenial machine, so this script is what I added to do the lxd configuration:

<http://bazaar.launchpad.net/~juju-qa/ubuntu/xenial/juju/xenial-2.0-beta4/view/head:/debian/tests/setup-lxd.sh>

With the additional step afterwards to call `lxc finger`, that works (with caveats) for me. In the autopkgtest.ubuntu.com infrastructure however it does not, and it has also failed in two different ways for Steve Langasek and Martin Pitt:

"autopkgtest lxd provider tests fail for 2.0"
<https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1571082>

So, at present we don't have confidence that the LXD provider will work, even with the manual configuration step, for users installing Xenial for the first time.

When it comes to using lxd in clouds, as I understand it we've settled on retaining the 'lxc' and 'lxd' name distinction in 2.0 - which does mean bundles have to be manually changed at present to start using lxd. Most of the CI bundle testing is using real bundles out of the store, which all still say 'lxc' and therefore don't exercise the lxd container code at all.

We do have the container networking test which uses 'juju add-machine ... lxd:0' - and fails due to the networking setup:

"container networking lxd 'Missing parent for bridged type nic'"
<https://bugs.launchpad.net/juju-core/+bug/1571053>

That is probably less interesting than the default behaviour without the feature flag.

As a separate test, I updated one of our simple bundles just to say 'lxd' in two places where it had 'lxc' for a service before. The deployment timed out after 24 minutes, where the normal test with lxc takes 12 minutes.

The reason for that turns out to be pretty simple. Looking back at the lxd provider test, it hung for over 20 minutes just updating packages when setting up the first container.

In container /var/log/apt/history.log:

    Start-Date: 2016-04-15 22:11:16
    ...
    End-Date: 2016-04-15 22:33:03

Unlike other providers, lxd exposes no way to use the daily images instead of release images, so at present any machine using lxd containers with juju for the first time will get the xenial beta2 image and then upgrade basically every package. This issue goes away next week, but gets in the way of testing before then.

On a related note, the lxc container handling in juju manages images on the state server, but from what I see of the lxd code, each deployed machine will fetch images from cloud-images.ubuntu.com and keep a separate set of images. That makes the above problem much worse for any bundle with multiple machines that use containers.

Finally, we'll need to update the log gathering code in CI to know how to look inside lxd containers. At the moment, only the machine log seems to be linked into the /var/log/lxd/ directory, so the cloud-init logs and other pieces are currently missing. It does seem we can peek inside using paths like:

    /var/lib/lxd/containers/juju-d9c2c426-f268-47d9-8b96-4468b3f60b51-machine-0/rootfs/var/log/apt/term.log

But I'm not sure if that's behaviour we can rely on with all lxd configurations.

Martin
