Re: [librsync-users] MD4 second-preimage attack

2006-03-04 Thread rsync2eran
Hi,

On 2006-03-01 18:29, Donovan Baarda wrote:
 http://sourceforge.net/mailarchive/forum.php?forum_id=29760&max_rows=25&style=flat&viewmonth=200404&viewday=5
 
 If I understand correctly, provided we add random seeds for blocksums,
 these weaknesses would only make attack case 4) easier.

It indeed affects only case 4, regardless of seed randomization.


 MD4 (with known seed) is thus completely broken, making rsync batch mode
 and librsync unsafe to use in malicious environments. Please do consider
 phasing out MD4.
  
 If someone can provide detailed evidence that some other algorithm
 provides more useful-hash-bits/sec, taking into account the random seed,
 I'll be convinced :-)

OpenSSL's implementation of SHA-1 is as fast as its implementation of
MD4, and both are much faster than librsync's built-in MD4. For MD4, an
efficient second-preimage attack is known, whereas for SHA-1, even a
collision has not been demonstrated yet (the best attack presently known
takes ~2^63 work). Is this sufficient evidence?
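
For anyone who wants to check the speed claim on their own machine, here
is a rough sketch in Python. It measures whatever OpenSSL build Python's
hashlib is linked against, not librsync's built-in MD4, and the function
names are just mine. MD4 is disabled in many recent OpenSSL builds, so
the sketch reports it as unavailable instead of failing.

```python
import hashlib
import time


def throughput_mb_per_s(name, data, rounds=8):
    """Return approximate MB/s for one hashlib algorithm, or None if unavailable."""
    try:
        hashlib.new(name)
    except ValueError:
        # Assumption: an unsupported digest (e.g. MD4 on some builds) raises ValueError.
        return None
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return len(data) * rounds / elapsed / 1e6


def compare(names=("md4", "sha1", "sha256"), size=4 * 1024 * 1024):
    """Hash a zero-filled buffer with each algorithm and return {name: MB/s or None}."""
    data = bytes(size)
    return {name: throughput_mb_per_s(name, data) for name in names}


if __name__ == "__main__":
    for name, rate in compare().items():
        print(name, "unavailable" if rate is None else f"{rate:.0f} MB/s")
```

The numbers depend heavily on CPU and library version, so treat them as
a relative comparison only.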


 In any case, the first thing that will be tackled is changing to use the
 libmd/openssl API for hashes, which will make switching what hash we
 want to use painless... a configure option?

Makes sense, and while at it, please consider setting the default block
hash to SHA-1.


 The fastest hash function with no interesting known attacks is SHA-256,
 which is still somewhat expensive (though this can be partially
 addressed by the meta-hash idea discussed on the librsync list last
 July, Re: more info on 25gig files). SHA-1 may also be OK for a while
 despite the known collision attacks, and has acceptable speed.
 
 The whole file checksum is the real protection against attacks, and for
 this I think we should use something strong, which means enough useful
 hash bits to protect against attack. Remember we only calculate this
 once over the whole file, not once for every potential block match.

The whole-file hash would protect against undetected corruption, but
what about denial of service? It can be rather nasty, for example, to
find out in retrospect that one of the incremental backups has been
corrupted.


 Remember we only use a small part of the md4sum for the strong
 blocksum, and hence are already using less than the full md4sum. We
 don't really need that many bits of hash for the blocksum.

For incremental backup applications, you'd want even the block hash to
have full cryptographic strength. You can't get that even with the full,
untruncated MD4 hash.


 1) Add whole filesum checksums in the signature and delta. The signature
 will have the oldfile filesum, the delta will have the oldfile and
 newfile filesums. Applications will optionally be able to verify the
 oldfile filesum against the delta before applying patches, and patch
 will evaluate the newfile filesum and return an error if it doesn't
 match.

Sounds great. How about making the full file hash default to SHA-256?
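
To make the proposal concrete, here is a minimal sketch of the two
checks, assuming SHA-256 for the filesum. The Delta class and the
function names are hypothetical and are not the librsync API; the
"delta" here simply carries the whole new file so the example stays
short.

```python
import hashlib
from dataclasses import dataclass


def filesum(data: bytes) -> bytes:
    """Whole-file checksum; SHA-256 is an assumption, not a decided default."""
    return hashlib.sha256(data).digest()


@dataclass
class Delta:
    """Hypothetical delta container: both filesums plus a stand-in payload."""
    old_filesum: bytes
    new_filesum: bytes
    new_data: bytes  # a real delta would hold copy/literal commands instead


def make_delta(old: bytes, new: bytes) -> Delta:
    return Delta(filesum(old), filesum(new), new)


def patch(old: bytes, delta: Delta) -> bytes:
    # Optional pre-check: is this delta meant for the old file we have?
    if filesum(old) != delta.old_filesum:
        raise ValueError("delta was generated against a different old file")
    result = delta.new_data  # a real patch would replay commands against old
    # Post-check: did we reconstruct exactly the new file?
    if filesum(result) != delta.new_filesum:
        raise ValueError("patched output does not match the newfile filesum")
    return result
```

The post-check is what catches a bad block match: any wrong block
changes the output and therefore the newfile filesum.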


 2) Add a blocksum/filesum seed in the signature and delta. This will be
 used as the seed for both filesums and blocksums. It will default to
 random when generating signatures, but can optionally be specified on
 the command-line. The delta and patch operations will use the seed from
 the corresponding signature.

Excellent.
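
A rough sketch of what I have in mind, in Python. The details are my
assumptions, not the librsync format: the seed is simply prepended to
the block data, SHA-1 stands in for the block hash, and the signature is
a plain dict.

```python
import hashlib
import os


def blocksum(seed: bytes, block: bytes, nbytes: int = 8) -> bytes:
    """Truncated strong sum of one block, keyed by the signature's seed."""
    # Assumption: the seed is mixed in by prefixing it to the block data.
    return hashlib.sha1(seed + block).digest()[:nbytes]


def make_signature(data: bytes, block_len: int, seed: bytes = None, nbytes: int = 8):
    """Build a toy signature; the seed defaults to random but can be supplied."""
    if seed is None:
        seed = os.urandom(16)
    sums = [
        blocksum(seed, data[i:i + block_len], nbytes)
        for i in range(0, len(data), block_len)
    ]
    # The seed travels with the signature so delta and patch can reuse it.
    return {"seed": seed, "block_len": block_len, "sums": sums}
```

Because the seed is chosen when the signature is generated, an attacker
who prepares file contents in advance cannot know which blocksums they
will produce.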


 3) Add variable blocksum size in the signature. An optimal blocksum size
 can be calculated from the filesize, blocksize, and failure probability.
 The blocksize can be specified on the commandline, but will default to
 sqrt(filesize). The blocksum size will be calculated the same as rsync.
 Note that for this to work with streams, the truncated md4 blocksums
 will not be emitted when generating the signature until the oldfile
 input is completely read (to get the filesize). This ties in with
 feature 4)

There's no need to delay output if the block size is specified by the user.
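
For reference, the sizing in item 3 can be sketched like this. It is a
simple union bound against accidental (non-malicious) false matches, not
rsync's actual code: I assume the new file is about as long as the old
one, that every byte offset is compared against every block, and that
the rolling checksum contributes 32 ideal bits. All names and defaults
are mine.

```python
import math


def signature_params(file_size, p_fail=2.0 ** -10, weak_bits=32, min_block=256):
    """Return (block_len, strong_sum_bytes) for a target failure probability."""
    # Default block size: sqrt(filesize), with an assumed lower limit.
    block_len = max(min_block, math.isqrt(file_size))
    n_blocks = max(1, -(-file_size // block_len))  # ceiling division
    # Roughly one comparison per (offset in new file, block in old file).
    comparisons = max(1, file_size) * n_blocks
    # Need comparisons / 2**bits <= p_fail, so bits >= log2(comparisons / p_fail).
    total_bits = math.ceil(math.log2(comparisons / p_fail))
    strong_bits = max(0, total_bits - weak_bits)
    strong_bytes = -(-strong_bits // 8)  # round up to whole bytes
    return block_len, strong_bytes


if __name__ == "__main__":
    print(signature_params(2 ** 30))
```

With a user-supplied block size only the first line of the function
changes; the blocksum size still depends on the file size.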

 4) Add blocksum collision detection when generating the signature.
 This will involve keeping the whole md4 blocksums and checking that
 the truncated strongsum doesn't match against an existing block that
 has a different whole md4sum. If a collision is detected, an error
 will be returned. It is up to the application to do things like
 re-try with another seed (Note you can't retry for piped input).

In some applications the signature stage may have little memory. Such
applications would prefer to use a larger block hash, which makes
collisions unlikely, and to turn off the explicit collision detection.
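
Roughly what I mean, as a Python sketch with an on/off switch. SHA-1
with a prefixed seed stands in for the block hash, and the names are
hypothetical. Note the table of full sums grows with the number of
blocks, which is exactly the memory cost in question.

```python
import hashlib


class BlocksumCollision(Exception):
    """Two different blocks share the same truncated strong sum."""


def strong_sums(blocks, seed: bytes, nbytes: int, detect: bool = True):
    """Return the truncated sums, optionally checking them for collisions."""
    seen = {}  # truncated sum -> full sum of the first block that produced it
    out = []
    for index, block in enumerate(blocks):
        full = hashlib.sha1(seed + block).digest()
        short = full[:nbytes]
        if detect:
            first = seen.setdefault(short, full)
            if first != full:
                # Same truncated sum, different full sum: let the caller
                # decide, e.g. retry with another seed if input is seekable.
                raise BlocksumCollision(f"block {index} collides on {short.hex()}")
        out.append(short)
    return out
```

With detect=False nothing is kept besides the output, so the caller
would compensate by choosing a larger nbytes.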


 I'm not entirely sure about using the seed for the filesums too. It may
 be sufficient and convenient to use an unseeded SHA-1 or something, and
 the delta would not need to include the seed. However, it is a
 no-overhead addition that should help.

With SHA-1, the file hash should definitely use a randomized seed. For
SHA-256, it probably doesn't matter.

  Eran
-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: [librsync-users] MD4 second-preimage attack

2006-03-01 Thread Donovan Baarda
On Tue, 2006-02-21 at 14:58 -0800, [EMAIL PROTECTED] wrote:
 Hi,
 
 A year ago we discussed the strength of the MD4 hash used by rsync and
 librsync, and one of the points mentioned was that only collision
 attacks are known on MD4. Well, a recent paper by Wang et al [1] shows
 several second-preimage attacks. First, there's an algorithm which,
 given a random message and its MD4 hash, finds another message having
 the same hash with probability 2^-56. Second, if the attacker can also
 alter the original message (and thus its hash) slightly, he can find a
 second message having the same hash with just 2^27 MD4 invocations.

For the record, here it is in the SF mailarchive.

http://sourceforge.net/mailarchive/forum.php?forum_id=29760&max_rows=25&style=flat&viewmonth=200404&viewday=5

If I understand correctly, provided we add random seeds for blocksums,
these weaknesses would only make attack case 4) easier.

 Doubtless, even stronger attacks will soon be found.
 
 MD4 (with known seed) is thus completely broken, making rsync batch mode
 and librsync unsafe to use in malicious environments. Please do consider
 phasing out MD4.

Remember we only use a small part of the md4sum for the strong
blocksum, and hence are already using less than the full md4sum. We
don't really need that many bits of hash for the blocksum. The important
thing for the blocksum is speed. I think md4 + random seed gives pretty
much the best useful-hash-bits per second for blocksums. If we want, we
can increase the blocksum size to compensate for the fact that ~20% of
the bits are insecure.

If someone can provide detailed evidence that some other algorithm
provides more useful-hash-bits/sec, taking into account the random seed,
I'll be convinced :-)

In any case, the first thing that will be tackled is changing to use the
libmd/openssl API for hashes, which will make switching what hash we
want to use painless... a configure option?

 The fastest hash function with no interesting known attacks is SHA-256,
 which is still somewhat expensive (though this can be partially
 addressed by the meta-hash idea discussed on the librsync list last
 July, Re: more info on 25gig files). SHA-1 may also be OK for a while
 despite the known collision attacks, and has acceptable speed.

The whole file checksum is the real protection against attacks, and for
this I think we should use something strong, which means enough useful
hash bits to protect against attack. Remember we only calculate this
once over the whole file, not once for every potential block match.

 Also, as discussed in detail earlier: to thwart some attacks, rsync
 batch mode and librsync should be fixed to use a random seed.
[...]

There are a few things I'm proposing to add to the TODO:

1) Add whole filesum checksums in the signature and delta. The signature
will have the oldfile filesum, the delta will have the oldfile and
newfile filesums. Applications will optionally be able to verify the
oldfile filesum against the delta before applying patches, and patch
will evaluate the newfile filesum and return an error if it doesn't
match.

2) Add a blocksum/filesum seed in the signature and delta. This will be
used as the seed for both filesums and blocksums. It will default to
random when generating signatures, but can optionally be specified on
the command-line. The delta and patch operations will use the seed from
the corresponding signature.

3) Add variable blocksum size in the signature. An optimal blocksum size
can be calculated from the filesize, blocksize, and failure probability.
The blocksize can be specified on the commandline, but will default to
sqrt(filesize). The blocksum size will be calculated the same as rsync.
Note that for this to work with streams, the truncated md4 blocksums
will not be emitted when generating the signature until the oldfile
input is completely read (to get the filesize). This ties in with
feature 4)

4) Add blocksum collision detection when generating the signature. This
will involve keeping the whole md4 blocksums and checking that the
truncated strongsum doesn't match against an existing block that has a
different whole md4sum. If a collision is detected, an error will be
returned. It is up to the application to do things like re-try with
another seed (Note you can't retry for piped input).

I'm not entirely sure about using the seed for the filesums too. It may
be sufficient and convenient to use an unseeded SHA-1 or something, and
the delta would not need to include the seed. However, it is a
no-overhead addition that should help.

-- 
Donovan Baarda [EMAIL PROTECTED]
http://minkirri.apana.org.au/~abo/



Re: [librsync-users] MD4 second-preimage attack

2006-02-21 Thread Martin Pool
On Tue, 2006-02-21 at 14:58 -0800, [EMAIL PROTECTED] wrote:

 A year ago we discussed the strength of the MD4 hash used by rsync and
 librsync, and one of the points mentioned was that only collision
 attacks are known on MD4.

Could you please forward this into the bug tracker so it's not lost?

-- 
Martin


