Dear Dr. Anja Lehmann: On Wed, Apr 21, 2010 at 5:35 AM, Anja Lehmann <anja.lehm...@minicrypt.de> wrote: > > thanks for your email. It would indeed be great to see our combiners used in > practice or even in an IETF proposal! Just a few questions from my side ... > who is actually "behind" that project? Is is just an open community or is > there any company involved? And will all the implementations be open-source?
Why did you ask if there is a company involved? Did you apply for a patent on Comb4P? > From a theoretical point of view, truncating the final output harms a lot of > security properties. It was shown that by truncating the output of a > collision resistant hash function (considered as black-box) one would lose > the collision-resistance guarantee [1]. A similar problem seems to exist > with TCR and the MAC property as well. On the other hand, it is known that > the PRF-property is closed under truncation, i.e., any truncated output of a > PRF is still a pseudorandom value. I'm aware of the theoretical result that you cite that a combiner of two black-box hash functions can't preserve collision resistance while outputting less than double-length outputs [1]. But in practice we don't have two black boxes—we have two specific hash functions and a combiner of them could in fact have collision resistance even with an output length shorter than double-size. The performance and human-factors advantages of a smaller output (e.g. 256 bits instead of 512 bits) may turn out to be enough that we need to consider a construction which sacrifices those theoretical proofs but which seems to be safe in practice. For example, here is a notation for Comb4P (modelled on your thesis): <i>₈ is the binary representation of the integer i in 8 bits. Hⁿₓ(·)=Hₓ(<n>₈||·) Hⁿ⊻(·)=Hⁿ₀(·)⊕Hⁿ₁(·) Comb4P(M)= Tx←H⁰₁(M) Ty←Tx⊕H⁰₀(M) Tz←H²⊻(Ty) return Ty⊕H³⊻(Tz)||Tz (Note: the only difference from your thesis is that this uses 8 bits instead of 2 bits to separate the different uses of H.) So given that we have Comb4P(M), then we might consider Comb4Pshorthash(M): Comb4Pshorthash(M)=H⁴₂(Comb4P(M)) To me, it seems very unlikely that any real H⁴₂ (such as MD5, SHA1, SHA-256, Tiger, or a SHA-3 finalist) would be susceptible to collisions (much less other failures) when the inputs to it are so constrained and so difficult for the attacker to manipulate. A more efficient and simpler function would be Comb4Pshortxor: Comb4Pshortxor(M)=left_half_of(Comb4P(M)) ⊕ right_half_of(Comb4P(M)) equivalently: Comb4Pshortxor(M)= Tx←H⁰₁(M) Ty←Tx⊕H⁰₀(M) Tz←H²⊻(Ty) return Ty⊕H³⊻(Tz)⊕Tz I agree of course that Comb4P is clearly better than Comb4Pshorthash or Comb4Pshortxor when we can afford to store and use a full 512-bit output value. Comb4P provably preserves four properties, and neither Comb4Pshorthash nor Comb4Pshortxor can say that. On the other hand, compare Comb4Pshortxor to H₀. It would seem extremely unlikely that Comb4Pshortxor[H₀, H₁] is any weaker than H₀ in any real sense. (If it helps, imagine that H₀ could be SHA-256 and H₁ could be a SHA-3 candidate such as Shabal-256.) By the same argument, Comb4Pshortxor[H₀, H₁] is probably not any weaker than H₁. So, while we should certainly keep in mind that Comb4P has provable security properties that Comb4Pshort* lack, I think it is quite reasonable to assume that using Comb4Pshort* is safer than using a single hash function. How much it matters whether the output is 256-bits or 512-bits is an empirical question that remains to be seen. Our Google Summer of Code Student Yu Xue will hopefully write benchmarks for Comb4P before the summer is out [2]. Prof. Gligoroski has invited his students to measure SHA-3 candidates in the context of Tahoe-LAFS [3] (see also my follow-up: [4]). François Deppierraz has posted some measurements of the current Tahoe-LAFS (which uses SHA-256d and AES-256) on his low-power ARM system: [5]. Another reason that it might turn out to matter is human-factors rather than performance. We use the outputs of hash function in our keys, and the shorter those keys are, the better it is for user interfaces which choose to expose those keys directly to users: [6]. We use secure hash functions in various ways in Tahoe-LAFS but in most cases we use the secure hash function in a Merkle Tree. In addition to the Merkle Trees that we already use, I recently got very excited about the prospect of using a Merkle Signature Scheme (e.g. [7]) for our "100 Year Cryptography" project's digital signatures. Those schemes have traditionally been too inefficient in CPU, especially for key generation, but I'm hoping that we can overcome that performance problem. The concrete CPU performance of the underlying hash function could make a signficant difference in whether such a scheme is efficient enough for our use. If we were to use Comb4P as the underlying hash function in a Merkle Tree, the performance difference between a full 512-bit hash function output and a 256-bit hash function output could be almost a factor of two in the overall performance of the Merkle Tree. On the other hand, our experiments and measurements might tell us that the full Comb4P is efficient enough. I hope so! Regards, Zooko [1] Krzysztof Pietrzak. Compression from Collisions, or why CRHF Combiners have a Long Output. CRYPTO 2008. [2] http://tahoe-lafs.org/pipermail/tahoe-dev/2010-May/004377.html [3] http://tahoe-lafs.org/pipermail/tahoe-dev/2010-March/004108.html [4] http://tahoe-lafs.org/pipermail/tahoe-dev/2010-April/004272.html [5] http://tahoe-lafs.org/pipermail/tahoe-dev/2010-March/004150.html [6] http://tahoe-lafs.org/trac/tahoe-lafs/ticket/882# Tahoe URIs and gateway URLs are too long and ugly [7] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.95.1374&rep=rep1&type=pdf _______________________________________________ tahoe-dev mailing list tahoe-dev@allmydata.org http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev