On Thu, Feb 03, 2005 at 11:22:22PM +0100, Francesco Riosa wrote:
> Thank for pointing it out, never read about it!
> It was something similar to that, I was thinking about do it patching 
> source not binaries, thinking that patching binaries is difficult (both 
> for cpu and to reach good compression level).

Binary patching is preferable to line-based diffing for most sources.  The
delta/patch is smaller, and it can result in a perfect recreation of the
tarball with one-time patching, instead of applying N patches to jump to a
version.  While diff patches are compressible, the results of
diffball/xdelta/bdelta (the aborted deltup differencer) blow away the
results of diff patches for the most part.

The main problem with diff-based patches is that they're line based: change
a single char in the new version, and you have to encode both the old
version of the line and the new version.  On top of that, diff stores
context for the patch (+/- 3 lines typically), which is extra fluff (useful
for fuzziness, but not for version upgrades), etc.
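
A rough sketch of that overhead, using Python's difflib purely as a
stand-in for diff (the ten-line "source file" is made up for illustration).
A one-character change still costs the whole old line, the whole new line,
and six lines of context:

```python
import difflib

# Made-up "source file": ten lines, purely for illustration.
old = [f"line {i}: some fairly long source text here\n" for i in range(10)]
new = list(old)
new[5] = new[5].replace("long", "lonG")  # change a single character

# n=3 is the usual +/- 3 lines of context.
patch = list(difflib.unified_diff(old, new, "old", "new", n=3))

# Split the hunk body into removed, added, and context lines
# (the "---"/"+++" lines are file headers, not content).
removed = [l for l in patch if l.startswith("-") and not l.startswith("---")]
added = [l for l in patch if l.startswith("+") and not l.startswith("+++")]
context = [l for l in patch if l.startswith(" ")]
patch_size = sum(len(l) for l in patch)

print("".join(patch))
print(f"{len(removed)} removed, {len(added)} added, {len(context)} context")
print(f"patch is {patch_size} chars for a 1 char change")
```

A binary differencer only has to record "copy this range from the old file,
then add these few new bytes", which is where the size win comes from.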

> It seems that's not true, seeing his results, 

E'yep :)

> also reading that 
> remembered me that there is a md5sum problem (that can be resolved) in 
> all this stuff.

The route deltup takes to resolve it is basically a hack, and a reactive
one.  The base problem (described in http://glep.gentoo.org/glep-0025.html)
is that an update to a compressor can result in a slightly
different/smaller compressed file.

Eg, with bzip2 v0.9, you get a slightly larger compressed file than with
bzip2 v1.0.  Differencing has to decompress, patch, then recompress the
tarball, so deltup relies on you having an older version of bzip2 installed
if you ever run into a file that was originally compressed with v0.9.
So... it has an exception added to sidestep that particular case.  Problem
is, you need to keep adding exceptions for new cases, reactively.
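
A minimal sketch of why the md5 breaks, assuming Python's bz2 module.  Two
bzip2 versions can't be run side by side here, so two compression levels
stand in for "old compressor" and "new compressor"; the payload is made up.
The decompressed contents match, the compressed files (and their md5s) do
not:

```python
import bz2
import hashlib

# Pretend this is the uncompressed tarball after patching.
payload = b"pretend this is an uncompressed tarball\n" * 2000

# Stand-in for two compressor versions: same data, different settings.
upstream = bz2.compress(payload, compresslevel=9)  # what the mirror has
rebuilt = bz2.compress(payload, compresslevel=1)   # what we recompressed

same_contents = bz2.decompress(upstream) == bz2.decompress(rebuilt)
same_md5 = (hashlib.md5(upstream).hexdigest()
            == hashlib.md5(rebuilt).hexdigest())

print("same contents:", same_contents)
print("same md5:     ", same_md5)
```

The digest check is against the compressed distfile, so a byte-identical
tarball inside a different-looking .bz2 still fails verification.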

So... it's not a huge issue, since compressors don't change all that much
(the current breed used being gzip/bzip2).  A new compressor (ppmd/rzip
fex) would result in issues.

An addendum, thinking about it: deltup has to determine how the file was
originally compressed, and then use the same settings.  This is stored in
the fdtu iirc.  Again, not really a perfect solution (it works, but it's
ugly), nor particularly scalable for generating a massive set of diffs.
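
To illustrate the "determine the original settings" step, here is a
hypothetical brute-force sketch in Python.  This is not how deltup does it;
`find_bzip2_level` is a made-up helper that simply tries each bzip2 level
until the recompressed file matches the wanted md5:

```python
import bz2
import hashlib

def find_bzip2_level(raw, wanted_md5):
    """Return the bzip2 level that reproduces wanted_md5, or None.

    Hypothetical helper: recompress the raw data at every level and
    compare the md5 of each result with the digest we need to match.
    """
    for level in range(1, 10):
        candidate = bz2.compress(raw, compresslevel=level)
        if hashlib.md5(candidate).hexdigest() == wanted_md5:
            return level
    return None

# Made-up data standing in for an upstream tarball compressed at level 6.
raw = b"pretend this is an uncompressed tarball\n" * 2000
upstream_md5 = hashlib.md5(bz2.compress(raw, compresslevel=6)).hexdigest()

print(find_bzip2_level(raw, upstream_md5))
```

Each guess costs a full recompression of the tarball, which gives a feel
for why this approach is slow when generating a massive set of diffs, and
it does nothing for a file produced by a different compressor version.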

So.. yeah.  The md5 issue isn't really resolved; it's sidestepped with a
set of special case additions.
That ^^^ is the main beef I have with deltup: it works for the most part,
but the methods it takes to overcome md5-related issues are
fragile/slow/strike me as hacks. :)

> Also it seem that is an old aged idea, that has never take place, can 
> someone please explain me why ?

I personally ran out of steam trying to resolve the issue of required
mirror space for it when I took a thwack at it-
http://www.gentoo.org/proj/en/glep/glep-0025.html#distfile-mirror-additions

If it's not a strictly opt-in feature (eg, some versions are offered on
gentoo mirrors only via patches), users who don't care for patching/have a
fast connection will get mad, since reconstructing a file isn't the fastest
thing known to man.  Fex, say you're jumping from v 2.6, to v.10 of the
linux kernel: w/ deltup/xdelta, you have to generate 3 intermediate files,
since it can only apply one patch at a time.  So... there is a space issue
also, aside from the extra io.
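
A toy sketch of the one-patch-at-a-time cost, in Python.  The copy/add
delta format below is invented for illustration (it is not the fdtu or
xdelta format), and the five "releases" are made-up strings.  Four patches
applied in sequence means three intermediate versions get built before the
final one:

```python
import difflib

def make_delta(old, new):
    """Build a toy delta: a list of ("copy", offset, length) / ("add", data)."""
    ops = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("add", new[j1:j2]))
        # "delete" needs no op: those old bytes are simply never copied.
    return ops

def apply_delta(old, ops):
    """Rebuild the new version from the old version plus a toy delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

def apply_chain(base, deltas):
    """Apply deltas one at a time, counting the intermediate versions."""
    current, intermediates = base, 0
    for n, delta in enumerate(deltas, start=1):
        current = apply_delta(current, delta)
        if n < len(deltas):
            intermediates += 1  # a full reconstruction we only throw away
    return current, intermediates

# Five made-up releases, so four patches to get from first to last.
versions = [f"shared header\nrelease {n} body\nshared footer\n".encode()
            for n in range(1, 6)]
deltas = [make_delta(a, b) for a, b in zip(versions, versions[1:])]

result, intermediates = apply_chain(versions[0], deltas)
print(result == versions[-1], intermediates)
```

On disk, each of those intermediates is a full-size tarball that has to be
written and read back, which is the space and io cost in question.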

Diffball can apply multiple patches in a single run (no intermediate files)
for fdtu/xdelta patches, so that's slightly sidestepped.

There also is the question of how to generate the diffs... a dedicated box
for it would be preferable, rather than foisting it off on devs.

Offhand, deltup kind of rose up again via the dynamic deltup project.  Not
sure of the status of it-

http://forums.gentoo.org/viewtopic.php?t=215262&highlight=dynamic+deltup

Last release as far as I can tell was in oct.  The irc channel also is
deserted, and the forum thread kind of shows signs of it being
stopped/discontinued.

Note I'm not affiliated with/knowledgeable about that project, so they
might still be kicking (in which case, if you're reading this ml, kindly
rear your head and correct me if I'm wrong)...

~brian

