Hello community,

here is the log from the commit of package librsync for openSUSE:Factory checked in at 2013-03-18 07:07:41
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/librsync (Old)
 and      /work/SRC/openSUSE:Factory/.librsync.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "librsync", Maintainer is "[email protected]"

Changes:
--------
--- /work/SRC/openSUSE:Factory/librsync/librsync.changes 2012-02-16 10:06:01.000000000 +0100
+++ /work/SRC/openSUSE:Factory/.librsync.new/librsync.changes 2013-03-18 07:07:42.000000000 +0100
@@ -1,0 +2,8 @@
+Fri Mar 15 14:36:14 UTC 2013 - [email protected]
+
+- apply librsync-logn-search.patch, librsync-logn-sumset.patch
+  librsync-man-example.diff
+- refresh all patches
+- enable tests
+
+-------------------------------------------------------------------

New:
----
  librsync-logn-search.patch
  librsync-logn-sumset.patch
  librsync-man-example.diff
  series

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ librsync.spec ++++++
--- /var/tmp/diff_new_pack.G3S327/_old 2013-03-18 07:07:44.000000000 +0100
+++ /var/tmp/diff_new_pack.G3S327/_new 2013-03-18 07:07:44.000000000 +0100
@@ -1,7 +1,7 @@
 #
 # spec file for package librsync
 #
-# Copyright (c) 2012 SUSE LINUX Products GmbH, Nuernberg, Germany.
+# Copyright (c) 2013 SUSE LINUX Products GmbH, Nuernberg, Germany.
# # All modifications and additions to the file contributed by third parties # remain the property of their copyright owners, unless otherwise agreed @@ -31,6 +31,9 @@ Source: http://prdownloads.sourceforge.net/rproxy/%{name}-%{version}.tar.bz2 Patch0: %{name}-%{version}-strictalias.diff Patch1: librsync-0.9.7-largefiles.patch +Patch2: librsync-logn-search.patch +Patch3: librsync-logn-sumset.patch +Patch4: librsync-man-example.diff BuildRoot: %{_tmppath}/%{name}-%{version}-build %description @@ -72,8 +75,11 @@ %prep %setup -q -%patch0 -%patch1 +%patch0 -p1 +%patch1 -p1 +%patch2 -p1 +%patch3 -p1 +%patch4 -p1 %build autoreconf -fi @@ -84,6 +90,10 @@ %makeinstall %{__rm} %{buildroot}%{_libdir}/librsync.la +%check +pushd testsuite +make %{?_smp_mflags} check + %post -n %lname -p /sbin/ldconfig %postun -n %lname -p /sbin/ldconfig ++++++ librsync-0.9.7-largefiles.patch ++++++ --- /var/tmp/diff_new_pack.G3S327/_old 2013-03-18 07:07:44.000000000 +0100 +++ /var/tmp/diff_new_pack.G3S327/_new 2013-03-18 07:07:44.000000000 +0100 @@ -1,10 +1,10 @@ RCS file: /cvsroot/librsync/librsync/mdfour.h,v retrieving revision 1.7 retrieving revision 1.8 -Index: mdfour.h +Index: b/mdfour.h =================================================================== ---- mdfour.h.orig 2004-02-08 00:17:57.000000000 +0100 -+++ mdfour.h 2007-09-02 10:10:50.000000000 +0200 +--- a/mdfour.h ++++ b/mdfour.h @@ -1,7 +1,7 @@ /*= -*- c-basic-offset: 4; indent-tabs-mode: nil; -*- * @@ -23,10 +23,10 @@ #if HAVE_UINT64 uint64_t totalN; #else -Index: patch.c +Index: b/patch.c =================================================================== ---- patch.c.orig 2004-09-17 23:35:50.000000000 +0200 -+++ patch.c 2007-09-02 10:10:50.000000000 +0200 +--- a/patch.c ++++ b/patch.c @@ -1,7 +1,7 @@ /*= -*- c-basic-offset: 4; indent-tabs-mode: nil; -*- * @@ -50,10 +50,10 @@ if (!len) return RS_BLOCKED; -Index: doc/rdiff.1 +Index: b/doc/rdiff.1 =================================================================== ---- 
doc/rdiff.1.orig 2004-02-08 00:17:57.000000000 +0100 -+++ doc/rdiff.1 2007-09-02 10:10:50.000000000 +0200 +--- a/doc/rdiff.1 ++++ b/doc/rdiff.1 @@ -1,6 +1,6 @@ .\" .\" librsync -- dynamic caching and delta update in HTTP ++++++ librsync-0.9.7-strictalias.diff ++++++ --- /var/tmp/diff_new_pack.G3S327/_old 2013-03-18 07:07:44.000000000 +0100 +++ /var/tmp/diff_new_pack.G3S327/_new 2013-03-18 07:07:44.000000000 +0100 @@ -1,7 +1,7 @@ -Index: netint.c +Index: b/netint.c =================================================================== ---- netint.c.orig 2004-09-17 23:35:50.000000000 +0200 -+++ netint.c 2007-09-02 10:10:50.000000000 +0200 +--- a/netint.c ++++ b/netint.c @@ -121,7 +121,7 @@ rs_squirt_n4(rs_job_t *job, int val) rs_result rs_suck_netint(rs_job_t *job, rs_long_t *v, int len) @@ -11,7 +11,7 @@ int i; rs_result result; -@@ -130,13 +130,13 @@ rs_suck_netint(rs_job_t *job, rs_long_t +@@ -130,13 +130,13 @@ rs_suck_netint(rs_job_t *job, rs_long_t return RS_INTERNAL_ERROR; } @@ -27,10 +27,10 @@ } return RS_DONE; -Index: readsums.c +Index: b/readsums.c =================================================================== ---- readsums.c.orig 2004-02-08 00:17:57.000000000 +0100 -+++ readsums.c 2007-09-02 10:10:50.000000000 +0200 +--- a/readsums.c ++++ b/readsums.c @@ -111,15 +111,15 @@ static rs_result rs_loadsig_s_weak(rs_jo static rs_result rs_loadsig_s_strong(rs_job_t *job) { ++++++ librsync-logn-search.patch ++++++ From: Victor Denisov ( victordenisov ) - 2012-09-24 10:07:15 PDT URL: http://sourceforge.net/tracker/?func=detail&aid=3571263&group_id=56125&atid=479441 Subject: performance issue resolution for large files - ID: 3571263 When files being rsynced are hundreds of Gbytes size collisions in hash table kill librsync. So linear collision resolution has been replaced with log n collision resolution based on binary search. Size of hash table is 65536 buckets. 
So when files size is (block_size * 65536 * t) then linear collision resolution is t / (log t) slower than binary search resolution. If block size is 2048 bytes then for 1TB speed up is 630 times. for 100GB - 80 times. Index: b/search.c =================================================================== --- a/search.c +++ b/search.c @@ -48,57 +48,73 @@ #include "search.h" #include "checksum.h" - -#define TABLESIZE (1<<16) +#define TABLE_SIZE (1<<16) #define NULL_TAG (-1) - #define gettag2(s1,s2) (((s1) + (s2)) & 0xFFFF) #define gettag(sum) gettag2((sum)&0xFFFF,(sum)>>16) - -static int -rs_compare_targets(rs_target_t const *t1, rs_target_t const *t2) -{ - return ((int) t1->t - (int) t2->t); -} - - rs_result rs_build_hash_table(rs_signature_t * sums) { - int i; + int rs_compare_targets(void const *a1, void const *a2) { + rs_target_t const *t1 = a1; + rs_target_t const *t2 = a2; + + int v = (int) t1->t - (int) t2->t; + if (v != 0) + return v; + + rs_weak_sum_t w1 = sums->block_sigs[t1->i].weak_sum; + rs_weak_sum_t w2 = sums->block_sigs[t2->i].weak_sum; + + v = (w1 > w2) - (w1 < w2); + if (v != 0) + return v; + + return memcmp(sums->block_sigs[t1->i].strong_sum, + sums->block_sigs[t2->i].strong_sum, + sums->strong_sum_len); + } + + int i; - sums->tag_table = calloc(TABLESIZE, sizeof sums->tag_table[0]); + sums->tag_table = calloc(TABLE_SIZE, sizeof(sums->tag_table[0])); if (!sums->tag_table) return RS_MEM_ERROR; if (sums->count > 0) { sums->targets = calloc(sums->count, sizeof(rs_target_t)); - if (!sums->targets) + if (!sums->targets) { + free(sums->tag_table); + sums->tag_table = NULL; return RS_MEM_ERROR; + } for (i = 0; i < sums->count; i++) { sums->targets[i].i = i; sums->targets[i].t = gettag(sums->block_sigs[i].weak_sum); } - /* FIXME: Perhaps if this operating system has comparison_fn_t - * like GNU, then use it in the cast. But really does anyone - * care? 
*/ qsort(sums->targets, sums->count, sizeof(sums->targets[0]), - (int (*)(const void *, const void *)) rs_compare_targets); + rs_compare_targets); } - for (i = 0; i < TABLESIZE; i++) - sums->tag_table[i] = NULL_TAG; + for (i = 0; i < TABLE_SIZE; i++) { + sums->tag_table[i].l = NULL_TAG; + sums->tag_table[i].r = NULL_TAG; + } for (i = sums->count - 1; i >= 0; i--) { - sums->tag_table[sums->targets[i].t] = i; + sums->tag_table[sums->targets[i].t].l = i; } - rs_trace("done"); + for (i = 0; i < sums->count; i++) { + sums->tag_table[sums->targets[i].t].r = i; + } + + rs_trace("rs_build_hash_table done"); return RS_DONE; } @@ -119,44 +135,39 @@ rs_search_for_block(rs_weak_sum_t weak_s rs_signature_t const *sig, rs_stats_t * stats, rs_long_t * match_where) { - int hash_tag = gettag(weak_sum); - int j = sig->tag_table[hash_tag]; - rs_strong_sum_t strong_sum; - int got_strong = 0; + rs_strong_sum_t strong_sum; + int got_strong = 0; + int hash_tag = gettag(weak_sum); + tag_table_entry_t *bucket = &(sig->tag_table[hash_tag]); + int l = bucket->l; + int r = bucket->r + 1; + int v = 1; - if (j == NULL_TAG) { + if (l == NULL_TAG) return 0; - } - - for (; j < sig->count && sig->targets[j].t == hash_tag; j++) { - int i = sig->targets[j].i; - int token; - - if (weak_sum != sig->block_sigs[i].weak_sum) - continue; - token = sig->block_sigs[i].i; - - rs_trace("found weak match for %08x in token %d", weak_sum, token); - - if (!got_strong) { - rs_calc_strong_sum(inbuf, block_len, &strong_sum); - got_strong = 1; + while (l < r) { + int m = (l + r) >> 1; + int i = sig->targets[m].i; + rs_block_sig_t *b = &(sig->block_sigs[i]); + v = (weak_sum > b->weak_sum) - (weak_sum < b->weak_sum); + if (v == 0) { + if (!got_strong) { + rs_calc_strong_sum(inbuf, block_len, &strong_sum); + got_strong = 1; + } + v = memcmp(strong_sum, b->strong_sum, sig->strong_sum_len); } - - /* FIXME: Use correct dynamic sum length! 
*/ - if (memcmp(strong_sum, sig->block_sigs[i].strong_sum, - sig->strong_sum_len) == 0) { - /* XXX: This is a remnant of rsync: token number 1 is the - * block at offset 0. It would be good to clear this - * up. */ + if (0 == v) { + int token = b->i; *match_where = (rs_long_t)(token - 1) * sig->block_len; - return 1; - } else { - rs_trace("this was a false positive, the strong sig doesn't match"); - stats->false_matches++; + break; } - } - return 0; + if (v > 0) + l = m + 1; + else + r = m; + } + return !v; } ++++++ librsync-logn-sumset.patch ++++++ From: Victor Denisov ( victordenisov ) - 2012-09-24 10:07:15 PDT URL: http://sourceforge.net/tracker/?func=detail&aid=3571263&group_id=56125&atid=479441 Subject: performance issue resolution for large files - ID: 3571263 When files being rsynced are hundreds of Gbytes size collisions in hash table kill librsync. So linear collision resolution has been replaced with log n collision resolution based on binary search. Size of hash table is 65536 buckets. So when files size is (block_size * 65536 * t) then linear collision resolution is t / (log t) slower than binary search resolution. If block size is 2048 bytes then for 1TB speed up is 630 times. for 100GB - 80 times. Index: b/sumset.h =================================================================== --- a/sumset.h +++ b/sumset.h @@ -39,6 +39,11 @@ typedef struct rs_target { typedef struct rs_block_sig rs_block_sig_t; +typedef struct tag_table_entry { + int l; + int r; +} tag_table_entry_t ; + /* * This structure describes all the sums generated for an instance of * a file. 
It incorporates some redundancy to make it easier to @@ -50,8 +55,8 @@ struct rs_signature { int remainder; /* flength % block_length */ int block_len; /* block_length */ int strong_sum_len; - rs_block_sig_t *block_sigs; /* points to info for each chunk */ - int *tag_table; + rs_block_sig_t *block_sigs; /* points to info for each chunk */ + tag_table_entry_t *tag_table; rs_target_t *targets; }; ++++++ librsync-man-example.diff ++++++ From: Matt McCutchen ( hashproduct ) - 2009-11-20 17:06:56 PST URL: http://sourceforge.net/tracker/?func=detail&aid=2901539&group_id=56125&atid=479441 Subject: Add an example to the rdiff(1) man page - ID: 2901539 For a user who just wants to use rdiff(1) to perform the three phases of the delta-transfer algorithm, the man page is pretty sparse. The patch adds a complete example and also adds a reference to the rsync technical report in both the rdiff(1) and librsync(3) man pages. It is based on the following explanation I gave on the rsync list: http://lists.samba.org/archive/rsync/2009-November/024261.html Index: b/doc/librsync.3 =================================================================== --- a/doc/librsync.3 +++ b/doc/librsync.3 @@ -53,6 +53,8 @@ scriptable access to rsync functions. .PP .I rdiff and librsync Manual .PP +\fIhttp://rsync.samba.org/tech_report/\fP +.PP \fIhttp://rproxy.sourceforge.net/\fP or \fIhttp://linuxcare.com.au/rproxy/\fP. .PP \fIdraft-pool-rsync\fP Index: b/doc/rdiff.1 =================================================================== --- a/doc/rdiff.1 +++ b/doc/rdiff.1 @@ -33,7 +33,22 @@ rdiff \- compute and apply signature-bas You can use \fBrdiff\fP to update files, much like \fBrsync\fP does. However, unlike \fBrsync\fP, \fBrdiff\fP puts you in control. There are three steps to updating a file: \fBsignature\fP, \fBdelta\fP, and -\fBpatch\fP. +\fBpatch\fP. 
Here is an example of the entire process, assuming you start
+with files \fIsrc/file\fP and \fIdest/file.old\fP:
+.PP
+.RS
+(cd dest && rdiff signature file.old file.sig)
+.br
+cp dest/file.sig src/
+.br
+(cd src && rdiff delta file.sig file file.delta)
+.br
+cp src/file.delta dest/
+.br
+(cd dest && rdiff patch file file.delta file)
+.RE
+.PP
+Now \fIdest/file\fP is a copy of \fIsrc/file\fP.
 .SH DESCRIPTION
 In every case where a filename must be specified, \- may be used
 instead to mean either standard input or standard output as
@@ -46,6 +61,8 @@ found, invalid options, IO error, etc),
 an internal error or unhandled situation in librsync or rdiff.
 .SH "SEE ALSO"
 .BR librsync "(3)"
+.PP
+\fIhttp://rsync.samba.org/tech_report/\fP
 .SH "AUTHOR"
 Martin Pool <[email protected]>
 .PP
++++++ series ++++++
# Patch series file for quilt, created by quilt setup
# Source: librsync-0.9.7.tar.bz2
# Patchdir: librsync-0.9.7
#
librsync-0.9.7-strictalias.diff
librsync-0.9.7-largefiles.patch
librsync-logn-search.patch
librsync-logn-sumset.patch
librsync-man-example.diff
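
For readers who want the idea behind librsync-logn-search.patch without reading the hunks: each hash bucket becomes a range [l, r] in an array sorted by (weak sum, strong sum), and the lookup bisects that range instead of walking it. Below is a minimal standalone sketch of that lookup. It is not the librsync code: demo_sig_t, demo_cmp, demo_search and STRONG_LEN are hypothetical names invented for illustration, and the strong sum is passed in directly, whereas the patch computes it lazily on the first weak-sum match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative fixed length; the real library keeps the strong sum
 * length in the signature (strong_sum_len). */
#define STRONG_LEN 8

/* Hypothetical, simplified stand-in for one block signature. */
typedef struct {
    uint32_t      weak;               /* rolling (weak) checksum   */
    unsigned char strong[STRONG_LEN]; /* strong checksum bytes     */
    int           token;              /* block number, 1-based     */
} demo_sig_t;

/* Three-way compare of a probe against one entry: weak sum first,
 * then strong sum. The array must be sorted in this same order. */
static int demo_cmp(uint32_t weak, const unsigned char *strong,
                    const demo_sig_t *b)
{
    int v = (weak > b->weak) - (weak < b->weak);
    if (v != 0)
        return v;
    return memcmp(strong, b->strong, STRONG_LEN);
}

/* Binary search over the inclusive range sigs[l..r].
 * Returns the matching token, or -1 if nothing matches. */
static int demo_search(const demo_sig_t *sigs, int l, int r,
                       uint32_t weak, const unsigned char *strong)
{
    assert(sigs != NULL);

    int lo = l;
    int hi = r + 1;                      /* half-open upper bound */

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;    /* avoids overflow of lo + hi */
        int v = demo_cmp(weak, strong, &sigs[mid]);

        if (v == 0)
            return sigs[mid].token;
        if (v > 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

Calling demo_search(sigs, l, r, weak, strong) with the bucket bounds l and r costs O(log n) comparisons in the bucket size, where the old loop cost O(n). The sort order used when the table is built and the comparison used during the search have to agree, which is why the patch extends rs_compare_targets to compare weak and strong sums as well as the tag.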

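
The speed-up figures quoted in the description of the logn patches can be checked by hand. The working below assumes binary units (1 TB = 2^40 bytes, 100 GB = 100 * 2^30 bytes) and a base-2 logarithm; neither is stated in the patch description, but together they reproduce the quoted numbers.

```latex
% t = average number of entries per bucket, with 65536 = 2^{16} buckets
% and block_size = 2048 = 2^{11} bytes.
\[
  t = \frac{\text{file size}}{\text{block size} \times 65536},
  \qquad
  \text{speed-up} \approx \frac{t}{\log_2 t}
\]
\[
  \text{1 TB:}\quad
  t = \frac{2^{40}}{2^{11} \cdot 2^{16}} = 2^{13} = 8192,
  \qquad
  \frac{8192}{13} \approx 630
\]
\[
  \text{100 GB:}\quad
  t = \frac{100 \cdot 2^{30}}{2^{27}} = 800,
  \qquad
  \frac{800}{\log_2 800} \approx \frac{800}{9.64} \approx 83
\]
```

The second result is about 83, which matches the "80 times" in the description as a rounded figure.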