Matthias reminded me that it would be worth giving people a consolidated update on the current state of Launchpad builders, and changes we intend to make in the near future.

== History ==

Dedicated watchers of https://launchpad.net/builders may have noticed quite a few changes recently. The overall trend is that we're working on moving all builders into OpenStack clouds, a system we call ScalingStack [1]. This is giving us much better density of builders - a single unit of hardware can support many builder nodes - allowing us to vastly increase our build capacity compared to a year or two ago while saving on rack space at the same time.

Here's a rough timeline of production changes so far in this project:

  2014-08: ScalingStack enabled for amd64/i386 PPAs on production
  2015-03 - 2015-07: Fixing various blockers for building Ubuntu in ScalingStack (new-style ddebs, modern sbuild, teaching Launchpad about virtualised-only architectures, several full test rebuilds)
  2015-08: Ubuntu amd64/i386 builds switched to ScalingStack
  2015-10: Ubuntu ppc64el builds switched to ScalingStack and PPAs enabled [2]; arm64/armhf undergoing testing

[1] https://insights.ubuntu.com/2014/10/30/scalingstack-2x-performance-in-launchpads-build-farm-with-openstack/
[2] http://blog.launchpad.net/ppa/ppas-for-ppc64el

== Design ==

The guest instances are reset to a clean state between each build. In fact, in order to minimise latency in the common case where the build farm is more than 0% idle, they're actually reset at the end of the previous build. This means that the guest configuration has to be generic: a given reset doesn't know whether the next build is going to be amd64 or i386 (the same guest images support both), or what kind of build it's going to do.

This is an intentional trade-off, but it does mean that we can't do things like giving certain builds more RAM or disk: we need to find a reasonable point that gives us a good density of builders across all of our compute nodes while also being able to build most packages. At the moment our guests have four virtual CPUs, 4GiB RAM, 4GiB swap, and 60GiB disk.
This can be tuned but at the cost of being able to support fewer concurrent instances, and the same cloud regions are used for autopkgtest workers and error retracers as well.

In cases where a build exceeds these limits, do consider whether it's possible to squash it down a bit with reasonable effort: for example, splitting up translation units or performing less aggressive optimisation are valid approaches, and may even be acceptable upstream.

== Common problems ==

 * Build fails without a log

   This means that the build failed catastrophically enough that Launchpad was unable to retrieve the build log from launchpad-buildd at the end of the build. There are various possible causes. Running out of RAM or disk can have this effect, as can crashing the builder instance by way of a kernel bug, or a few other cases where the builder fails very early in the build. If you run into such a case and it's reproducible (i.e. a simple retry doesn't clear it up), feel free to ask the Launchpad team for advice.

 * Builder stuck in Cleaning on the /builders page

   This means that the process of resetting the builder to a clean state failed. In most cases this will disable the builder with useful notes instead, but there are some cases where this doesn't happen. We keep an eye on this to ensure that we don't end up with too few builders available; you don't need to tell us about them unless build queues are backing up.

 * Builder disabled on the /builders page

   This is occasionally done by hand, but is usually automatic as a result of a failed reset. In either case there should be useful notes visible on the page describing the individual builder in question. Again, we keep an eye on this to ensure that we don't end up with too few builders available; you don't need to tell us about them unless build queues are backing up.

 * lcy01

   The lcy01-* builders frequently fail to reset at the moment, usually with copious error output from the host kernel.
   Our sysadmins are working on tracking down the root cause.

== Future plans ==

The next major change will be to switch arm64 over to ScalingStack; this is being tested at the moment. Once that's done, all architectures will have at least nine builders, which will make it rare to ever find yourself waiting for builds.

The remaining architectures are armhf (currently 19 builders on one physical chassis) and powerpc (currently nine builders on three physical machines). The plan for armhf is to share guests with arm64, which requires a kernel patch so that we can set the personality such that uname returns "armv7l" rather than "armv8l" as linux32 currently gives us. On powerpc, we can't share guests with ppc64el because of the different endianness, but once we have baseline cloud images (coming soon) we'll be able to bring up another set of guests alongside ppc64el on the same set of compute nodes.

-- 
Colin Watson
