Matt Taggart [EMAIL PROTECTED] writes:
I had an idea similar to the one Andrea Mennucc mentions in #372712 for the
problem of so many pdiffs. It resembles a scheme you might use for nightly
incremental backups. You might run a level-zero backup once a month, a
level one every 15 days, a level two every 7, a level three every 3 and a
level four every day. For example:
     July 2006                Aug 2006
Su Mo Tu We Th Fr Sa    Su Mo Tu We Th Fr Sa
                   0           0  4  4  3  2
 4  4  3  4  4  3  2     4  3  4  4  3  4  2
 4  4  3  4  4  3  2     3  4  4  1  4  4  2
 1  3  4  4  3  4  2     4  4  3  4  4  3  2
 3  4  4  3  4  4  2     4  3  4  4  1
 4  1
On any given day you'd need at most 5 patches, and on many days far fewer
than that. The reason for doing this is not just to reduce the number of
files but also the overall data, as a lot of the data in the diffs is
redundant. Consider the case of a package that is updated every day for a
month. Under the current scheme, a client not updating for that month would
need to download the differences for that package 30 times, right? Under an
incremental scheme the worst case is 5 diffs for that package. It's an even
bigger win for longer periods of time; the current scheme will start really
falling down once we get a few more months of pdiffs.
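
To make the "at most 5" claim concrete, here is a minimal Python sketch of
the backup analogy. It assumes that a level-N dump holds every change since
the most recent dump of a lower level; how exactly that would map onto
pdiffs is not spelled out above, so treat it as an illustration only. The
function name restore_chain is made up, and the list is the July column of
the calendar.

```python
def restore_chain(levels, target):
    """Return the day indices whose dumps are needed to reach day `target`.

    levels[i] is the dump level of day i. Assumption: a level-N dump covers
    all changes since the most recent dump of a strictly lower level.
    """
    chain = []
    bound = float("inf")
    # Walk backwards, keeping each dump whose level is lower than the last kept.
    for day in range(target, -1, -1):
        if levels[day] < bound:
            chain.append(day)
            bound = levels[day]
            if bound == 0:
                break
    chain.reverse()
    return chain

# July 2006 from the calendar above; index 0 is July 1.
july = [0,
        4, 4, 3, 4, 4, 3, 2,
        4, 4, 3, 4, 4, 3, 2,
        1, 3, 4, 4, 3, 4, 2,
        3, 4, 4, 3, 4, 4, 2,
        4, 1]

chain_for_july_28 = restore_chain(july, 27)
worst_case = max(len(restore_chain(july, d)) for d in range(len(july)))
```

For July 28 the chain is the dumps of July 1, 16, 22, 26 and 28, which is
the five-patch worst case for this schedule.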
Thanks,
But then again why have incremental diffs at all?
Two patches can be merged by taking a file with enough unique lines,
applying both patches to it, and diffing the result against it again.
There is no need to work from the actual Packages files, so they don't
have to be stored for this.
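
A minimal sketch of that merge in Python, under a simplifying assumption:
each patch is a list of (start, stop, new_lines) slice replacements ordered
from the bottom of the file upwards, the way ed-style scripts are. Real
pdiffs are ed scripts, and parsing them is left out here. The names
apply_patch and merge_patches are made up for illustration.

```python
import difflib

def apply_patch(lines, patch):
    """Apply (start, stop, new_lines) slice replacements, bottom of file first."""
    out = list(lines)
    for start, stop, new_lines in patch:
        out[start:stop] = new_lines
    return out

def merge_patches(first, second, n_lines):
    """Combine two patches without ever looking at the real file."""
    # Stand-in file of unique marker lines (assumed never to occur in real content).
    markers = [f"\0marker-{i}" for i in range(n_lines)]
    result = apply_patch(apply_patch(markers, first), second)
    # Diff the stand-in against the result; every surviving marker matches itself.
    matcher = difflib.SequenceMatcher(None, markers, result, autojunk=False)
    merged = [(i1, i2, result[j1:j2])
              for tag, i1, i2, j1, j2 in matcher.get_opcodes()
              if tag != "equal"]
    merged.reverse()  # bottom-up order, so applying one edit doesn't shift the next
    return merged

packages = ["a", "b", "c", "d", "e"]
day1 = [(3, 4, ["D1"]), (1, 1, ["x"])]  # replace "d", insert "x" after "a"
day2 = [(4, 5, ["D2"]), (0, 1, [])]     # replace "D1", delete "a"

# n_lines only has to cover the highest line either patch touches.
combined = merge_patches(day1, day2, n_lines=10)
one_step = apply_patch(packages, combined)
two_steps = apply_patch(apply_patch(packages, day1), day2)
```

Applying combined to the real file gives the same lines as applying day1
and then day2, and the intermediate "D1" never shows up in the merged
patch.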
It is true that the patch files will all grow with every day they cover
(less any packages updated more than once in that time, which only count
once), but they aren't that big, and compression gets better for larger
files.
Given the crawling speed of the rred method, downloading more than a few
days' worth of patches (~300k) is slower than downloading the full file
(3 MB), even on a slow DSL line. A combined patch would need only one
download, one gunzip and one rred run. I think that would be worth the
space increase for the patch files.
I would recommend naming the combined patch files after the md5sum (or
sha1) of the Packages/Sources file they patch. That way no index needs to
be downloaded.
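
As a rough sketch, the client side of that could look like the following.
The "comb.<sha1>.gz" pattern is only a guess modelled on the file names
listed further down, and combined_patch_name is a made-up helper.

```python
import hashlib

def combined_patch_name(packages_bytes):
    """File name of the combined patch for a client holding exactly this file.

    Hypothetical naming scheme: comb.<sha1 of the file to be patched>.gz
    """
    digest = hashlib.sha1(packages_bytes).hexdigest()
    return f"comb.{digest}.gz"

# A client would hash its local Packages file and request this name directly.
example_name = combined_patch_name(b"")
```

If the mirror has no file of that name, the client's copy is either too old
or already current, and it can fall back to the full download.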
Regards,
Goswin
---
Sizes for combined patches:
-rw-r--r-- 1 reprepro nogroup 26K Jul 27 13:55 comb.2006-07-26-1318.02.gz
-rw-r--r-- 1 reprepro nogroup 54K Jul 27 13:55 comb.2006-07-25-1313.19.gz
-rw-r--r-- 1 reprepro nogroup 90K Jul 27 13:55 comb.2006-07-24-1338.19.gz
-rw-r--r-- 1 reprepro nogroup 132K Jul 27 13:55 comb.2006-07-24-0235.54.gz
-rw-r--r-- 1 reprepro nogroup 170K Jul 27 13:55 comb.2006-07-22-1308.51.gz
-rw-r--r-- 1 reprepro nogroup 186K Jul 27 13:55 comb.2006-07-21-1255.40.gz
-rw-r--r-- 1 reprepro nogroup 206K Jul 27 13:55 comb.2006-07-20-1302.38.gz
-rw-r--r-- 1 reprepro nogroup 226K Jul 27 13:56 comb.2006-07-19-1301.33.gz
-rw-r--r-- 1 reprepro nogroup 246K Jul 27 13:56 comb.2006-07-18-1311.49.gz
-rw-r--r-- 1 reprepro nogroup 289K Jul 27 13:56 comb.2006-07-17-1328.22.gz
-rw-r--r-- 1 reprepro nogroup 332K Jul 27 13:56 comb.2006-07-16-2314.28.gz
-rw-r--r-- 1 reprepro nogroup 351K Jul 27 13:57 comb.2006-07-15-1308.02.gz
-rw-r--r-- 1 reprepro nogroup 370K Jul 27 13:57 comb.2006-07-14-1250.45.gz
-rw-r--r-- 1 reprepro nogroup 392K Jul 27 13:57 comb.2006-07-13-1257.25.gz
-rw-r--r-- 1 reprepro nogroup 424K Jul 27 13:57 comb.2006-07-12-1242.39.gz
-rw-r--r-- 1 reprepro nogroup 443K Jul 27 13:58 comb.2006-07-11-1246.14.gz
-rw-r--r-- 1 reprepro nogroup 462K Jul 27 13:58 comb.2006-07-10-1321.18.gz
-rw-r--r-- 1 reprepro nogroup 495K Jul 27 13:58 comb.2006-07-10-0029.06.gz
-rw-r--r-- 1 reprepro nogroup 538K Jul 27 13:59 comb.2006-07-08-1242.03.gz
-rw-r--r-- 1 reprepro nogroup 547K Jul 27 13:59 comb.2006-07-07-1233.30.gz