Re: [Cryptography] Evaluating draft-agl-tls-chacha20poly1305

2013-09-11 Thread Adam Langley
On Tue, Sep 10, 2013 at 10:59 PM, William Allen Simpson
william.allen.simp...@gmail.com wrote:
 I suggest:

ChaCha20 is run with the given key and sequence number nonce and with

the two counter words set to zero.  The first 32 bytes of the 64 byte
output are saved to become the one-time key for Poly1305.  The next 8
bytes of the output are saved to become the per-record input nonce
for this ChaCha20 TLS record.

 Or you could use 16 bytes, and cover all the input fields  There's no
 reason the counter part has to start at 1.

 Of course, this depends on not having a related-key attack, as mentioned
 in my previous messages

It is the case that most of the bottom row bits will be zero. However,
ChaCha20 is assumed to be secure at a 256-bit security level when used
as designed, with the bottom row being counters. If ChaCha/Salsa were
not secure in this formulation then I think they would have to be
abandoned completely.

Nobody worries that AES-CTR is weak when the counter starts at zero, right?

Taking 8 bytes from the initial block and using it as the nonce for
the plaintext encryption would mean that there would be a ~50% chance
of a collision after 2^32 blocks. This issue affects AES-GCM, which is
why the sequence number is used here.

Using 16 bytes from the initial block as the full bottom row would
work, but it still assumes that we're working around a broken cipher
and it prohibits implementations which pipeline all the ChaCha blocks,
including the initial one. That may be usefully faster, although it's
not the implementation path that I've taken so far.

There is an alternative formulation of Salsa/ChaCha that is designed
for random nonces, rather than counters: XSalsa/XChaCha. However,
since we have a sequence number already in TLS I've not used it.


Cheers

AGL
___
The cryptography mailing list
cryptography@metzdowd.com
http://www.metzdowd.com/mailman/listinfo/cryptography


Re: [Cryptography] Evaluating draft-agl-tls-chacha20poly1305

2013-09-11 Thread Adam Langley
[attempt two, because I bounced off the mailing list the first time.]

On Tue, Sep 10, 2013 at 9:35 PM, William Allen Simpson
william.allen.simp...@gmail.com wrote:
ChaCha20 is run with the given key and nonce and with the two counter
words set to zero.  The first 32 bytes of the 64 byte output are
saved to become the one-time key for Poly1305.  The remainder of the
output is discarded.

 Why generate the ICV key this way, instead of using a longer key blob
 from TLS and dividing it?  Is there a related-key attack?

The keying material from the TLS handshake is per-session information.
However, for a polynomial MAC, a unique key is needed per-record and
must be secret. Using stream cipher output as MAC key material is a
trick taken from [1], although it is likely to have more history than
that. (As another example, UMAC/VMAC runs AES-CTR with a separate key
to generate the per-record keys, as did Poly1305 in its original
paper.)

Authenticated decryption is largely the reverse of the encryption
process: the Poly1305 key is generated and the authentication tag
calculated.  The calculated tag is compared against the final 16
bytes of the authenticated ciphertext in constant time.  If they
match, the remaining ciphertext is decrypted to produce the
plaintext.

 If AEAD, aren't the ICV and cipher text generated in parallel?  So how do
 you check the ICV first, then decipher?

The Poly1305 key (ICV in your terms?) is taken from a prefix of the
ChaCha20 stream output. Thus the decryption proceeds as:

1) Generate one block of ChaCha20 keystream and use the first 32 bytes
as a Poly1305 key.
2) Feed Poly1305 the additional data and ciphertext, with the length
prefixing as described in the draft.
3) Verify that the Poly1305 authenticator matches the value in the
received record. If not, the record can be rejected immediately.
4) Run ChaCha20, starting with a counter value of one, to decrypt the
ciphertext.

An alternative implementation is possible where ChaCha20 is run in one
go on a buffer that consists of 64 zeros followed by the ciphertext.
The advantage of this is that it may be faster because the ChaCha20
blocks can be pipelined. The disadvantage is that it may need memory
copies to setup the input buffer correctly. A moot advantage, in the
case of TLS, of the steps that I outlined is that forgeries are
rejected faster.

 Needs a bit more implementation details.  I assume there's an
 implementation in the works.  (Always helps define things with
 something concrete.)

I currently have Chrome talking to OpenSSL, although the code needs
cleanup of course.

[1] http://cr.yp.to/highspeed/naclcrypto-20090310.pdf


Cheers

AGL
___
The cryptography mailing list
cryptography@metzdowd.com
http://www.metzdowd.com/mailman/listinfo/cryptography