On Thu, Feb 03, 2005 at 11:22:22PM +0100, Francesco Riosa wrote:
> Thank for pointing it out, never read about it!
> It was something similar to that, I was thinking about do it patching
> source not binaries, thinking that patching binaries is difficult (both
> for cpu and to reach good compression level).

It's preferable to do binpatching vs line based diffing for most
sources. Smaller delta/patch, and it can result in a perfect recreation
of the tarball- one time patching, instead of applying N patches to jump
to a version. While diff patches are compressible, the results of
diffball/xdelta/bdelta (aborted deltup differencer) blow away the
results of diff patches for the most part.

The main problem with diff based patches is that they're line based-
change a single char in the new version, and you have to encode both the
old version of the line, and the new version. That and diff storing
context of the patch (+/- 3 lines typically), which is extra fluff
(useful for fuzziness, but not version upgrades), etc.

> It seems that's not true, seeing his results,

E'yep :)

> also reading that
> remembered me that there is a md5sum problem (that can be resolved) in
> all this stuff.

The route deltup takes to resolve it is basically a hack; it's reactive-
the base problem (described in http://glep.gentoo.org/glep-0025.html) is
that updates to a compressor can result in a slightly different/smaller
file. Eg, with bzip2 v0.9, you get a slightly larger compressed file
than with bzip2 v1.0; differencing has to decompress, patch, then
recompress the tarball- deltup relies on you to have an older version of
bzip2 installed if you ever run into a file that was originally
compressed with v0.9. So... it has an exception added to sidestep that
particular case. Problem is, you need to keep adding exceptions in for
new cases, reactively.

So... it's not a huge issue, compressors don't change all that much (the
current breed used- gzip/bzip2). A new compressor (ppmd/rzip fex) would
result in issues.

An addendum, thinking about it: deltup has to determine how the file was
originally compressed, and then use the same settings- this is stored in
the fdtu iirc. Again, not really a perfect solution (it works, but it's
ugly), nor particularly scalable for generating a massive set of diffs.

So.. yeah. The md5 issue isn't really resolved, it's sidestepped with a
set of special case additions.

That ^^^ is the main beef I have with deltup: it works for the most
part, but the methods it takes to overcome md5 related issues are
fragile/slow/strike me as hacks. :)

> Also it seem that is an old aged idea, that has never take place, can
> someone please explain me why ?

I personally ran out of steam trying to resolve the issue of required
mirror space for it when I took a thwack at it-
http://www.gentoo.org/proj/en/glep/glep-0025.html#distfile-mirror-additions

If it's not a strictly opt-in feature (eg, some versions are offered on
gentoo mirrors only via patches), users who don't care for patching/have
a fast connection will get mad- reconstructing a file isn't the fastest
thing known to man.

Fex, say you're jumping from v 2.6, to v.10 of the linux kernel- w/
deltup/xdelta, you have to generate 3 intermediate files, since it can
only apply one patch at a time. So... there is a space issue also, aside
from the extra io. Diffball can apply multiple patches in a single run
(no intermediate files) for fdtu/xdelta patches, so that's slightly
sidestepped.

There also is the question of how to generate the diffs... a dedicated
box for it would be preferable, rather than foisting it off on devs.

Offhand, deltup kind of rose up again via the dynamic deltup project.
Not sure of the status of it-
http://forums.gentoo.org/viewtopic.php?t=215262&highlight=dynamic+deltup
last release as far as I can tell was in oct. The irc channel also is
deserted, and the forum thread kind of shows signs of it being
stopped/discontinued.

Note I'm not affiliated with/knowledgeable about that project- so, they
might still be kicking (in which case, if you're reading this ml kindly
rear your head and correct me if I'm wrong)...

~brian
--
[email protected] mailing list
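
To make the md5 issue described above a bit more concrete, here is a minimal Python sketch. It is not how deltup or diffball work internally; it only uses Python's bz2 module, with two different compression levels standing in for "a different compressor version/setting" (that substitution is an assumption made for illustration). The payload and the names md5_hex and rebuild are hypothetical. The patch step itself is left out, since only the recompression matters for the checksum.

```python
import bz2
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def rebuild(old_archive: bytes, level: int) -> bytes:
    """Decompress and recompress an archive.

    Stands in for decompress -> patch -> recompress; the patch step is
    omitted because only the recompression affects the checksum here.
    """
    return bz2.compress(bz2.decompress(old_archive), compresslevel=level)


# Hypothetical stand-in for an uncompressed tarball.
payload = b"".join(
    b"line %d of a pretend source file\n" % i for i in range(20000)
)

# The "upstream" archive, i.e. the bytes the published md5 was taken from.
upstream = bz2.compress(payload, compresslevel=9)

same_settings = rebuild(upstream, level=9)   # settings match upstream
other_settings = rebuild(upstream, level=1)  # settings differ from upstream

print(md5_hex(upstream) == md5_hex(same_settings))
print(md5_hex(upstream) == md5_hex(other_settings))
print(bz2.decompress(other_settings) == payload)
```

With the same settings the rebuilt archive hashes identically; with different settings the md5 differs even though the decompressed contents are byte-for-byte the same. That is why a reconstructing tool has to know (or guess) how the original was compressed.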
