On Mon, 2007-07-02 at 07:47 -0400, Dan Williams wrote:
>
> With updatinator, bsdiff-ing the changes between files and gzipping new
> files is the sweet spot. I tested plain blobs, gzipped blobs, bzip2-ed
> blobs, bsdiff + gzip, and bsdiff + bzip2. Unfortunately, bzip2's memory
> consumption on decompress isn't all that great according to other
> people's research, and gzip gives us the best balance between
> compression ratio and decompression mem/cpu usage.
>
> Scott, you may also want to re-test the rsync benchmarks using rsync
> compression to make the bandwidth numbers in the rsync benchmarks go
> down. You didn't mention anything specifically about using wire
> compression, but the numbers look like you hadn't used it?
>
> This patch adds bsdiff + gzip compression to updatinator:
>
> http://people.redhat.com/dcbw/updatinator-bsdiff-gzip.patch

Cool. I'll merge it.

> The difference sizes, using "du -csb" on the difference blob directory
> are as follows. This is the amount of data that would be transferred
> over the network, not including HTTP headers.
>
> 464->465: 6,869,799 bytes
> 465->466: 18,870,574 bytes
>
> The resulting image directories verify with both verify-manifest and
> diff -rua.
>
> As an improvement, we should provide a manifest-diff file for each
> update path (along with the manifests for the actual image) that lists
> the blobs with their own sha1 sum so that they are self-verifying. This
> would also simplify the patch quite a bit because there would be less
> path-munging. The diff-manifest tool that generates the blob diffs will
> output this data in a somewhat suitable format already.

As a different approach, I was thinking of making the "manifest file"
contain just a sha1 hash and a gpg signature, and then making the actual
manifest data a blob. That way we'd automatically reuse the gzip and
bsdiff features from the blobs.
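
To make the "manifest as a blob" idea concrete, here is a minimal Python
sketch. It is not updatinator code: the function names and the
"<sha1> <path>" line layout are assumptions made for illustration, the
gpg signature is left out entirely, and bsdiff is not shown because it
is not in the Python standard library.

```python
import gzip
import hashlib


def make_manifest_blob(entries):
    """Serialize (sha1, path) entries and gzip them like any other blob.

    The "<sha1> <path>" line layout is an assumption for this sketch.
    """
    ordered = sorted(entries, key=lambda entry: entry[1])
    text = "".join(f"{sha1} {path}\n" for sha1, path in ordered)
    # mtime=0 keeps the gzip output (and so its sha1) reproducible.
    return gzip.compress(text.encode("utf-8"), mtime=0)


def make_manifest_file(blob):
    """The small "manifest file": just the sha1 of the manifest blob.

    A real one would also carry a gpg signature, omitted here.
    """
    return hashlib.sha1(blob).hexdigest() + "\n"


def verify_and_load(manifest_file, blob):
    """Check the blob against the manifest file, then parse its entries."""
    expected = manifest_file.strip()
    if hashlib.sha1(blob).hexdigest() != expected:
        raise ValueError("manifest blob does not match the recorded sha1")
    text = gzip.decompress(blob).decode("utf-8")
    return [tuple(line.split(" ", 1)) for line in text.splitlines()]


if __name__ == "__main__":
    files = {"bin/a": b"first file", "lib/b": b"second file"}
    entries = [(hashlib.sha1(data).hexdigest(), path)
               for path, data in files.items()]
    blob = make_manifest_blob(entries)
    manifest_file = make_manifest_file(blob)
    for sha1, path in verify_and_load(manifest_file, blob):
        print(sha1, path)
```

Because the manifest data is an ordinary blob here, whatever compresses
and diffs the other blobs could in principle be applied to it unchanged,
and the client only has to trust the one small signed file.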
