LWN.net Weekly Edition for April 23, 2009
Faster updates with yum-presto
Keeping up with an active distribution like Fedora consumes a fair amount of time, but also bandwidth. Depending on how often a yum update is performed, hundreds of megabytes, or even gigabytes, can be required to bring the system up to date. A recent experiment in rawhide uses deltarpms and the yum Presto plugin to significantly reduce the size of the packages that need to be retrieved. The experiment looks to be largely successful, which means that Fedora will likely make the deltarpm files available more widely as part of Fedora 11.
The idea behind deltarpms is not a particularly new one, but the visibility has been raised by the recent Fedora Presto test day. The tools to build deltarpms were originally created by Michael Schröder of SUSE and have been around for a few years. Basically, the tools generate a binary difference (i.e. diff) between the new and old rpm files and create an rpm that just contains the differences (a drpm). Because package changes are typically fairly small and localized, the size difference between the new rpm and the drpm can be quite substantial.
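To make the size argument concrete, here is a toy sketch of a copy/insert delta in Python. It is not the deltarpm algorithm or the drpm file format; it only assumes the general idea described above, using the standard library's difflib, and the names make_delta, apply_delta, and literal_bytes are invented for illustration. The "package" payloads are synthetic hex text.

```python
import hashlib
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Encode `new` as a list of copy-from-old and insert-literal operations."""
    ops = []
    # autojunk=False: the default heuristic would discard frequent bytes.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse bytes already present
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))    # only these bytes are shipped
        # "delete" needs no operation: those old bytes are just not copied
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new payload from the old payload plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the bytes the delta has to carry itself."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


# Synthetic 2048-byte "old package" and a new one with a small local change.
old = "".join(hashlib.sha256(str(i).encode()).hexdigest() for i in range(32)).encode()
new = old[:1000] + b"PATCHED" + old[1010:]

ops = make_delta(old, new)
print(len(new), "bytes in the new payload;", literal_bytes(ops), "literal bytes in the delta")
```

Because the change is confined to a few bytes, nearly all of the new payload is expressed as references into the old one, which is why a drpm for a small update can be far smaller than the full rpm.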
The deltarpm tools do not require that the old rpm be present on the system when installing; instead, they can reconstruct the state of the old rpm from the installation itself. As long as there is a drpm corresponding to the difference between the version currently installed and the version that needs to be installed, Presto will choose the more bandwidth-efficient package to download. If the deltarpm tools are unable to reconstruct the new rpm from the installed files and the drpm (due to a local configuration file change, for example), Presto will fall back to downloading the full rpm of the updated package.
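The selection and fallback behavior described above can be sketched roughly as follows. This is a hypothetical model, not the Presto plugin's code: the Update class, the delta index keyed by (name, old version, new version), and the rebuild_ok callback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Update:
    name: str
    installed: str   # version currently on the system
    target: str      # version to update to
    full_size: int   # size of the full rpm, in bytes


# Hypothetical repository index: (name, old version, new version) -> drpm size
DeltaIndex = Dict[Tuple[str, str, str], int]


def plan_download(update: Update, deltas: DeltaIndex) -> Tuple[str, int]:
    """Pick the smaller download: a matching drpm if one exists, else the rpm."""
    size = deltas.get((update.name, update.installed, update.target))
    if size is not None and size < update.full_size:
        return ("drpm", size)
    return ("rpm", update.full_size)


def fetch_update(update: Update, deltas: DeltaIndex,
                 rebuild_ok: Callable[[Update], bool]) -> Tuple[str, int]:
    """Return how the package was obtained and the total bytes downloaded."""
    kind, size = plan_download(update, deltas)
    if kind == "drpm" and not rebuild_ok(update):
        # Rebuild failed (e.g. a locally modified file): fetch the full rpm
        # too, so the drpm bytes already downloaded were wasted.
        return ("rpm-after-failed-rebuild", size + update.full_size)
    return (kind, size)


deltas = {("foo", "1.0-1", "1.0-2"): 300_000}
upd = Update("foo", "1.0-1", "1.0-2", full_size=4_000_000)
print(fetch_update(upd, deltas, rebuild_ok=lambda u: True))
print(fetch_update(upd, deltas, rebuild_ok=lambda u: False))
```

Note that a drpm only helps when it was built against exactly the installed version, which is why the index in this sketch is keyed on both the old and new versions.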
For rawhide users, trying Presto out is quite simple:
    yum install yum-presto
which will install and enable the Presto plugin. Using it to update rawhide on April 22 would normally have required 68M, but using the drpms available (20 of 21 packages that needed updating) reduced that to 23M for a 66% reduction. There is a substantial pause after the packages have been downloaded while the deltarpm tools rebuild the rpms from drpms, in this case something on the order of one to two minutes. For someone at the end of a low-medium bandwidth link (or someone who pays by the amount transferred), that tradeoff is likely to be a good one.
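The April 22 figures can be checked with a little arithmetic, which also gives a rough idea of where the tradeoff tips. The sketch below assumes that "M" means megabytes and that the rebuild pause is pure added time (no overlap with downloading); the function names are invented for illustration.

```python
def percent_saved(full_mb: float, delta_mb: float) -> float:
    """Share of the download avoided by using drpms, as a percentage."""
    return 100.0 * (full_mb - delta_mb) / full_mb


def break_even_rate(full_mb: float, delta_mb: float, rebuild_seconds: float) -> float:
    """Link speed (MB/s) at which the download time saved equals the rebuild time.

    On slower links the drpm route finishes sooner; on faster links the
    full rpms would have been quicker, though still more data transferred.
    """
    return (full_mb - delta_mb) / rebuild_seconds


print(round(percent_saved(68, 23)))     # 66
print(break_even_rate(68, 23, 60))      # 0.75 MB/s for a one-minute rebuild
print(break_even_rate(68, 23, 120))     # 0.375 MB/s for a two-minute rebuild
```

Under those assumptions, a link slower than roughly 3 to 6 Mbit/s comes out ahead on elapsed time as well as on bytes transferred; anyone paying per megabyte saves the 45M regardless of link speed.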
There are still a few infrastructure glitches on the Fedora side. Part of the reason for the test day and publicizing the new feature was to find and fix those problems before Fedora 11 ships. Because of the way the deltarpm tools work—reading both rpms into memory before doing the diff—and how the Fedora infrastructure builds rpms for all architectures in parallel, only packages smaller than 200M are currently turned into drpms. There are also questions about whether it makes sense to build source and debuginfo drpms. Those types of packages are not widely used so spending repository space and build resources on drpm versions may not be warranted. From a user perspective, though, it all works quite smoothly: install a package and get a lot of bandwidth savings.
SUSE has been using drpms for some time, at least since SUSE Linux 9.3 was released in 2005. Users automatically get drpms when using the zypper tool for package updates and drpms are created for all package updates as long as the diff is smaller than the full rpm. For users that would rather get the full rpm when doing updates, drpms can be disabled in /etc/zypp/zypp.conf.
Presto development is, unsurprisingly, a Fedora Hosted project with a Trac page and Git repository. It would seem that there has been some collaboration with the openSUSE folks on the drpm format and tools so that yum and zypper will interoperate. Given that both are rpm-based tools, it is good to see the two distributions working together.
One could argue, as some have, that there is too much package churn in Fedora. On the other hand, Fedora users do tend to expect very recent, often bleeding-edge, packages. Since that is unlikely to change, Presto will be very welcome for folks whose bandwidth is limited in some way; those who are unconcerned need not install it. Meanwhile, with less fanfare, SUSE users have been getting those savings for some time.