Re: [RFC PATCH] Add SHA1 support
On Tuesday, March 17, 2020 9:17:09 PM CET, Sebastian Andrzej Siewior wrote: On 2020-03-17 00:03:03 [+0100], Dimitrios Apostolou via rsync wrote: On Thursday, February 20, 2020 10:34:53 PM CET, Sebastian Andrzej Siewior via rsync wrote: I'm still not sure if rsync requires a cryptographic hash _or_ if a strong hash like xxHash64 would be just fine for the job. I'm fairly sure the hash should *not* be easy to spoof, so I'd say a cryptographic hash is needed. As an example, if a file is replaced by a file of the same size and same hash, rsync (if -c is in use) will consider the file is the same, and avoid copying it. correct. The same goes for currently used md5 which has known collision attacks. So if you intend to spoo it, you can manufacture the same hash for two different files for both algorithms. This was not the case in 2008 when rsync 3.0.0 came out defaulting to MD5. I still think you need a cryptographic hash, even though I am not sure of how strict the requirement is. MD4 was replaced by MD5 in rsync, despite MD4 being 2x faster. I would guess it was replaced because of its weaknesses. Dimitris -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [RFC PATCH] Add SHA1 support
On 2020-03-17 00:03:03 [+0100], Dimitrios Apostolou via rsync wrote: > On Thursday, February 20, 2020 10:34:53 PM CET, Sebastian Andrzej Siewior > via rsync wrote: > > > > I'm still not sure if rsync requires a cryptographic hash _or_ if a > > strong hash like xxHash64 would be just fine for the job. > > I'm fairly sure the hash should *not* be easy to spoof, so I'd say a > cryptographic hash is needed. > > As an example, if a file is replaced by a file of the same size and same > hash, > rsync (if -c is in use) will consider the file is the same, and avoid > copying it. correct. The same goes for currently used md5 which has known collision attacks. So if you intend to spoo it, you can manufacture the same hash for two different files for both algorithms. The question is how likely it is that this happens by chance. According to [0] xxhash64 scores a solid 10. It is better than crc32 which has been used a lot as a checksum for files. [0] https://github.com/Cyan4973/xxHash > Dimitris Sebastian -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [RFC PATCH] Add SHA1 support
On Thursday, February 20, 2020 10:34:53 PM CET, Sebastian Andrzej Siewior via rsync wrote: I'm still not sure if rsync requires a cryptographic hash _or_ if a strong hash like xxHash64 would be just fine for the job. I'm fairly sure the hash should *not* be easy to spoof, so I'd say a cryptographic hash is needed. As an example, if a file is replaced by a file of the same size and same hash, rsync (if -c is in use) will consider the file is the same, and avoid copying it. Dimitris -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [RFC PATCH] Add SHA1 support
On 2020-02-20 20:06:39 [+0100], Markus Ueberall wrote: > On 2020-02-09 23:19, Sebastian Andrzej Siewior wrote: > > [...] > > My primar motivation to use SHA1 for checksumming (by default) instead > > of MD5 is not the additional security bits but performance. On a decent > > x86 box the SHA1 performance is almost the same as MD5's but with > > acceleration it outperforms MD5. > > > > The other alternative would be to go for xxHash64 [0] which has the > > superior performance but provides a non-cryptographic hash so I though > > SHA1 would be better here. > > [...] > > With respect to *both* speed and security, wouldn't BLAKE3 be a better, > modern alternative if we're looking at checksumming? > It's "[r]eleased into the public domain with CC0 1.0. Alternatively, it is > licensed under the Apache License 2.0". And the performance (see the chart > at https://github.com/BLAKE3-team/BLAKE3) is *impressive* ... I'm still not sure if rsync requires a cryptographic hash _or_ if a strong hash like xxHash64 would be just fine for the job. > Kind regards, Markus Sebastian -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [RFC PATCH] Add SHA1 support
On 2020-02-09 23:19, Sebastian Andrzej Siewior wrote: [...] My primar motivation to use SHA1 for checksumming (by default) instead of MD5 is not the additional security bits but performance. On a decent x86 box the SHA1 performance is almost the same as MD5's but with acceleration it outperforms MD5. The other alternative would be to go for xxHash64 [0] which has the superior performance but provides a non-cryptographic hash so I though SHA1 would be better here. [...] With respect to *both* speed and security, wouldn't BLAKE3 be a better, modern alternative if we're looking at checksumming? It's "[r]eleased into the public domain with CC0 1.0. Alternatively, it is licensed under the Apache License 2.0". And the performance (see the chart at https://github.com/BLAKE3-team/BLAKE3) is *impressive* ... Kind regards, Markus -- Please use reply-all for most replies to avoid omitting the mailing list. To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[RFC PATCH] Add SHA1 support
From: Sebastian Andrzej Siewior This is a huge all-in-one patch and deserves a little cleanup and splitting. However, I wanted to get it out here for some feedback. My primar motivation to use SHA1 for checksumming (by default) instead of MD5 is not the additional security bits but performance. On a decent x86 box the SHA1 performance is almost the same as MD5's but with acceleration it outperforms MD5. The other alternative would be to go for xxHash64 [0] which has the superior performance but provides a non-cryptographic hash so I though SHA1 would be better here. For linking against OpenSSL as of today the rsync license would need an "OpenSSL exception" [1]. The master branch of OpenSSL is licensed under the Apache License 2.0 so we could wait until 3.0 is released and use the C version of the algorithm in the meantime. Here are numbers from a ryzen test box: small file: |$ dd if=/dev/zero of=/dev/shm/out bs=1073741824 count=1 |1+0 records in |1+0 records out |1073741824 bytes (1,1 GB, 1,0 GiB) copied, 0,503252 s, 2,1 GB/s Old hash: |$ time ./rsync -c /dev/shm/out --checksum-choice=md4 |-rw-r--r-- 1,073,741,824 2020/02/08 16:34:42 out | |real0m1,064s |user0m0,984s |sys 0m0,080s MD5 from openssl (should match built-in speed): |$ time ./rsync -c /dev/shm/out --checksum-choice=md5 |-rw-r--r-- 1,073,741,824 2020/02/08 16:34:42 out | |real0m1,433s |user0m1,293s |sys 0m0,140s SHA1 from openssl: |$ time ./rsync -c /dev/shm/out --checksum-choice=sha1 |-rw-r--r-- 1,073,741,824 2020/02/08 16:34:42 out | |real0m0,619s |user0m0,524s |sys 0m0,096s SHA1 from the built-in code: |time ./rsync -c /dev/shm/out --checksum-choice=sha1 |-rw-r--r-- 1,073,741,824 2020/02/08 16:34:42 out | |real0m1,561s |user0m1,465s |sys 0m0,096s [1] https://opensource.stackexchange.com/questions/2233/gpl-v3-with-openssl-exception [0] https://github.com/Cyan4973/xxHash Signed-off-by: Sebastian Andrzej Siewior --- Makefile.in | 4 +- checksum.c| 144 + configure.ac | 5 + lib/md32_common.h | 258 + lib/md5.c | 15 +- lib/mdigest.h | 77 - lib/sha1.c| 19 +++ lib/sha1.h| 20 +++ lib/sha_local.h | 401 ++ main.c| 2 + rsync.h | 2 +- 11 files changed, 902 insertions(+), 45 deletions(-) create mode 100644 lib/md32_common.h create mode 100644 lib/sha1.c create mode 100644 lib/sha1.h create mode 100644 lib/sha_local.h diff --git a/Makefile.in b/Makefile.in index 9bb977eb6b0a8..a390afe4ed829 100644 --- a/Makefile.in +++ b/Makefile.in @@ -32,7 +32,7 @@ VERSION=@RSYNC_VERSION@ GENFILES=configure.sh aclocal.m4 config.h.in proto.h proto.h-tstamp rsync.1 rsyncd.conf.5 HEADERS=byteorder.h config.h errcode.h proto.h rsync.h ifuncs.h itypes.h inums.h \ lib/pool_alloc.h -LIBOBJ=lib/wildmatch.o lib/compat.o lib/snprintf.o lib/mdfour.o lib/md5.o \ +LIBOBJ=lib/wildmatch.o lib/compat.o lib/snprintf.o lib/mdfour.o lib/md5.o lib/sha1.o \ lib/permstring.o lib/pool_alloc.o lib/sysacls.o lib/sysxattrs.o @LIBOBJS@ zlib_OBJS=zlib/deflate.o zlib/inffast.o zlib/inflate.o zlib/inftrees.o \ zlib/trees.o zlib/zutil.o zlib/adler32.o zlib/compress.o zlib/crc32.o diff --git a/checksum.c b/checksum.c index 3295252ba0120..77c36b59c93ec 100644 --- a/checksum.c +++ b/checksum.c @@ -32,6 +32,7 @@ extern char *checksum_choice; #define CSUM_MD4_OLD 3 #define CSUM_MD4 4 #define CSUM_MD5 5 +#define CSUM_SHA1 6 int xfersum_type = 0; /* used for the file transfer checksums */ int checksum_type = 0; /* used for the pre-transfer (--checksum) checksums */ @@ -54,6 +55,8 @@ int parse_csum_name(const char *name, int len) len = strlen(name); if (!name || (len == 4 && strncasecmp(name, "auto", 4) == 0)) { + if (protocol_version >= 31) + return CSUM_SHA1; if (protocol_version >= 30) return CSUM_MD5; if (protocol_version >= 27) @@ -68,6 +71,8 @@ int parse_csum_name(const char *name, int len) return CSUM_MD5; if (len == 4 && strncasecmp(name, "none", 4) == 0) return CSUM_NONE; + if (len == 4 && strncasecmp(name, "sha1", 4) == 0) + return CSUM_SHA1; rprintf(FERROR, "unknown checksum name: %s\n", name); exit_cleanup(RERR_UNSUPPORTED); @@ -88,6 +93,8 @@ int csum_len_for_type(int cst, BOOL flist_csum) return MD4_DIGEST_LEN; case CSUM_MD5: return MD5_DIGEST_LEN; + case CSUM_SHA1: + return SHA1_DIGEST_LEN; default: /* paranoia to prevent missing case values */ exit_cleanup(RERR_UNSUPPORTED); } @@ -121,30 +128,48 @@ uint32 get_checksum1(char *buf1, int32 len) return (s1 & 0x) + (s2 << 16); } +static void