Re: AES Modes

2004-11-01 Thread David A. McGrew
DJ,
On Oct 13, 2004, at 10:59 AM, [EMAIL PROTECTED] wrote:
On the IEEE 802 standards track, CCM and GCM have traction. CCM has 
been
in 802.11 for a while and the 802.16-2004 was published last week,
supplanting the broken DES-CBC mode with AES-CCM. For wireless 
systems, we
know and like CCM and it's going to take a lot to oust it.

I'm aware of a handful of other wireless protocols in development that
appear to be all headed the CCM way.
GCM proposed for 802.1ae linksec. This is the 'fast' mode for wired 
ethernet.

For packetised traffic below 1Gbps, CCM works just great. The CTR and 
CBC
parts of CCM run in parallel in hardware, making the basic latency = 1 
AES
(which is 11 clocks in a simple implementation). With a bit of HW loop
unrolling and pipelining, I can do CCM upto several gigabits.

CCM nicely matches the structure of packets. Namely
1) Get header - Process additional authenticated data
2) Get payload - Process MAC and encryption in parallel.
So it is not a bear to implement in a typical communications datapath 
IC,
where things are presented to you in this order.

GCM allows block level parallelism up to a point.
I'm not sure what you mean here.  Is there some limitation that you 
have in mind?

Hence enabling me to put
lots of parallel AES blocks on a chip and go at multi gigabits without
breaking a sweat. It does however have all that Galois arithmetic to do
per block, which increases the path depth a bit.
There is however a fundamental speed problem with packet oriented AES
based modes. The parallelism you can achieve on things like GCM 
requires
that you have multiple blocks to run in parallel. If I get a large 
number
of small packets, each  128 bits long, then there's nothing to do in
parallel at the block level and so my speed limit is determined by how
fast I can run 11 rounds of AES.
I'm not quite sure what you mean here, but GCM can be implemented with 
a single AES pipeline and a single GF(2^128) multiplier such that data 
from multiple packets can be in the pipeline at the same time.  In 
fact, this was a design requirement for GCM; the other authenticated 
encryption modes lack this property and their performance suffers for 
it when used for packet encryption at high data rates.  There is an 
analysis of this issue in Section 3.1. of 
http://eprint.iacr.org/2004/193/ that includes a comparison between 
GCM, CWC, and OCB.  Here is the result: GCM is the only authenticated 
encryption mode that doesn't stall its AES pipeline each time a new 
packet comes in, and this enables it to go twice as fast as CWC and 
three times as fast as OCB on Internet data, in the case in which a 
single AES pipeline is used.  (Other modes aren't in the contest, since 
they use chaining.)

This may come to bite us in the future and when we start having to 
protect
data pushing the terabits, we either need larger packets or something
different in the crypto. One way is to protect over large packet
aggregates, but no 802 level protocol is set up to support it. Stream
ciphers look attractive here, we can make them go *really* fast in
hardware.
CTR and GCM can be fully pipelined; I'm not sure what a stream cipher 
could do that would be better.  I guess one potential improvement would 
be lower latency due to reduced pipeline depth.  But I don't understand 
where a stream cipher would get the increase in data throughput that 
you're alluding to.

Best regards,
David
Another frustrating aspect of the current crop of modes is frame
expansion. Throwing in an 8 byte nonce and an 8 byte ICV per packet is 
a
heavy overhead. Deriving nonces from the protocol state is generally 
not
wise since the frame counts are either non existant (802.3, 802.11) or 
not
secured (802.16).

In the coming years, I would like to see link protocols (I.E. 802.*) 
move
link security down to the PHY, to protect management and data traffic
equally, to secure the protocol state as well as data and so to reduce
packet overhead. I would also like to see the standardized crypto and
protocols be structured to work over larger aggregates of packets,
protecting the structure of transmission as well as the content and
allowing much greater levels of parallelism in the HW implementations.

Obviously, none of this is very relevant above layer 2.
Regards,
DJ
From: Ian Grigg [EMAIL PROTECTED]
Sent: Oct 10, 2004 11:11 AM
To: Metzdowd Crypto [EMAIL PROTECTED]
Subject: AES Modes

I'm looking for basic mode to encrypt blocks (using AES)
of about 1k in length, +/- an order of magnitude.  Looking
at the above table (2nd link) there are oodles of proposed
ones.

It would be nice to have a mode that didn't also require
a separate MAC operation - I get the impression that
this is behind some of the proposals?
I think CCM is just about perfect for this goal.  The MAC isn't free, 
but
it's integrated into the chaining mode.  There are also some patented
modes that provide a MAC for almost no extra computation(OCB, IACBC), 
and
some proposed modes

Re: AES Modes

2004-10-19 Thread Eric Young
Quoting Brian Gladman [EMAIL PROTECTED]:

 Ian Grigg wrote:
 
  Jack Lloyd also passed along lots of good comments I'd
  like to forward (having gained permission) FTR.  I've
  edited them for brevity and pertinence.
 
 [snip]
   I'm obviously being naive here ... I had thought that the combined 
  mode would
be faster, as it would run through the data once only, and that AES 
  seems to
clip along faster than SHA1.
  
  AFAIK all of the modes that use only one block cipher invocation per 
  block of
  input are patented. EAX+CCM both use two AES operations per block, and
  byte-for-byte SHA-1 is 2-5x faster than AES (at least in the 
  implementations
  I've seen/used/written), so using AES+HMAC is probably going to be 
  faster than
  AES/EAX or AES/CCM. The obvious exception being boxes with hardware AES 
  chips
  and slow CPUs (eg, an ARM7 with an AES coprocessor), where AES will of 
  course
  be much faster than SHA-1.
 
 Maybe my C implementation of SHA1 is hopeless but I get SHA1 on an x86 
 at about 17 cycles per byte (over 100,000 bytes) and AES in C at 21 
 cycles per byte.
 
 So I would put these two algorihms at about the same speed in C. In 
 consequence I rather suspect that the 'two encryptions per block' cost 
 might also apply to combined modes when AES is used with HMAC-SHA1.

Are you running on a P4?  ASM for sha1 on a P4 takes about 11.9 cycles 
per byte.  The P4 is a very touchy x86 implementation.
On most other architectures I nearly always see a bit less than 2 times faster
sha1 vs AES.  On AMD64, asm, I have
AES-cbc at 12.2 cycles per byte and sha1 at 6.8.  This is about
as good a CPU as it gets (IPC near 3 for both implementations).

eric


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-19 Thread Brian Gladman
Eric Young wrote:
Quoting Brian Gladman [EMAIL PROTECTED]:

Ian Grigg wrote:

Jack Lloyd also passed along lots of good comments I'd
like to forward (having gained permission) FTR.  I've
edited them for brevity and pertinence.
[snip]
I'm obviously being naive here ... I had thought that the combined 
mode would
 be faster, as it would run through the data once only, and that AES 
seems to
 clip along faster than SHA1.

AFAIK all of the modes that use only one block cipher invocation per 
block of
input are patented. EAX+CCM both use two AES operations per block, and
byte-for-byte SHA-1 is 2-5x faster than AES (at least in the 
implementations
I've seen/used/written), so using AES+HMAC is probably going to be 
faster than
AES/EAX or AES/CCM. The obvious exception being boxes with hardware AES 
chips
and slow CPUs (eg, an ARM7 with an AES coprocessor), where AES will of 
course
be much faster than SHA-1.
Maybe my C implementation of SHA1 is hopeless but I get SHA1 on an x86 
at about 17 cycles per byte (over 100,000 bytes) and AES in C at 21 
cycles per byte.

So I would put these two algorihms at about the same speed in C. In 
consequence I rather suspect that the 'two encryptions per block' cost 
might also apply to combined modes when AES is used with HMAC-SHA1.
Are you running on a P4?  ASM for sha1 on a P4 takes about 11.9 cycles 
per byte.  The P4 is a very touchy x86 implementation.
On most other architectures I nearly always see a bit less than 2 times faster
sha1 vs AES.  On AMD64, asm, I have
AES-cbc at 12.2 cycles per byte and sha1 at 6.8.  This is about
as good a CPU as it gets (IPC near 3 for both implementations).
The SHA1 figure is for a P3 using VC++ set to generate code that will 
run on all Pentium family machines.  I have not optimised the C code for 
any particular machine. 17/12 for C/ASM is a bit worse than I would have 
hoped for but is not that bad.

I would not be surprised to see an average AES/SHA1 speed comparison in 
the 1.5:2.5 range but I was a bit surprised to see Jack's 2.0:5.0 range.

I will have to see if VC++ can be coaxed down from 17 cycles per byte 
for SHA1 without giving up on code that runs on all Pentium compatible 
machines :-)

   Brian Gladman
-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-13 Thread Brian Gladman
Ian Grigg wrote:
Jack Lloyd also passed along lots of good comments I'd
like to forward (having gained permission) FTR.  I've
edited them for brevity and pertinence.
[snip]
 I'm obviously being naive here ... I had thought that the combined 
mode would
  be faster, as it would run through the data once only, and that AES 
seems to
  clip along faster than SHA1.

AFAIK all of the modes that use only one block cipher invocation per 
block of
input are patented. EAX+CCM both use two AES operations per block, and
byte-for-byte SHA-1 is 2-5x faster than AES (at least in the 
implementations
I've seen/used/written), so using AES+HMAC is probably going to be 
faster than
AES/EAX or AES/CCM. The obvious exception being boxes with hardware AES 
chips
and slow CPUs (eg, an ARM7 with an AES coprocessor), where AES will of 
course
be much faster than SHA-1.
Maybe my C implementation of SHA1 is hopeless but I get SHA1 on an x86 
at about 17 cycles per byte (over 100,000 bytes) and AES in C at 21 
cycles per byte.

So I would put these two algorihms at about the same speed in C. In 
consequence I rather suspect that the 'two encryptions per block' cost 
might also apply to combined modes when AES is used with HMAC-SHA1.

Rich Schroeppel's CS mode has been added to the NIST modes list earlier 
this year and is not patented. It seems to have a cost that is close to 
'one encryption per block' but it has the 'interesting' property of 
using the internal 'mid-point' state of the cipher algorithm that is in use.

   Brian Gladman
-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-12 Thread Ian Grigg
Jack Lloyd also passed along lots of good comments I'd
like to forward (having gained permission) FTR.  I've
edited them for brevity and pertinence.
Jack Lloyd wrote:
 If it's small messages, CCM would probably work pretty well. Personally I think
 CCM is really poorly designed (in terms of easy implementation/usage), but take
 a look. There is also EAX, which is IMO significantly nicer. There are a ton of
 others (most of the ones on the page you link to support encrypt+MAC), but it
 seems like EAX and CCM are the only two that are going anywhere (many of the
 others are patented and/or rather painful to implement).

 CCM and EAX are both going to be slower than AES+HMAC because they use AES in
 some variant of CBC-MAC. Some of the others have faster MACs, mostly ones based
 on universal hash functions, but the best of them (OCB in particular) have been
 patented.
I'm obviously being naive here ... I had thought that
the combined mode would be faster, as it would run through
the data once only, and that AES seems to clip along
faster than SHA1.
Are you saying that as far as speed goes, I may as well
do EAS (using CBC) and add a HMAC on the end?
Or are you saying that only the patented ones manage to
deliver the savings we all expect?  Hmm, reading about
OCB on Phil Rogaway's site does clarify this somewhat.
http://www.cs.ucdavis.edu/~rogaway/ocb/ocb-back.htm
iang
== To which jack replied:
I'm obviously being naive here ... I had thought that the combined mode would
 be faster, as it would run through the data once only, and that AES seems to
 clip along faster than SHA1.
AFAIK all of the modes that use only one block cipher invocation per block of
input are patented. EAX+CCM both use two AES operations per block, and
byte-for-byte SHA-1 is 2-5x faster than AES (at least in the implementations
I've seen/used/written), so using AES+HMAC is probably going to be faster than
AES/EAX or AES/CCM. The obvious exception being boxes with hardware AES chips
and slow CPUs (eg, an ARM7 with an AES coprocessor), where AES will of course
be much faster than SHA-1.
 Are you saying that as far as speed goes, I may as well do EAS (using CBC)
 and add a HMAC on the end?
At least on general purpose CPUs, yes.
 Or are you saying that only the patented ones manage to deliver the savings
 we all expect?  Hmm, reading about OCB on Phil Rogaway's site does clarify
 this somewhat.  http://www.cs.ucdavis.edu/~rogaway/ocb/ocb-back.htm
Pretty much. Though I just remembered that CWC has not been patented by it's
creators, but I wouldn't be at all surprised if it was covered by one of the
others. Even CWC is probably slower than AES+HMAC is software, though
apparently it's pretty fast in hardware.
-Jack
-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-12 Thread John Kelsey
From: Ian Grigg [EMAIL PROTECTED]
Sent: Oct 10, 2004 11:11 AM
To: Metzdowd Crypto [EMAIL PROTECTED]
Subject: AES Modes


I'm looking for basic mode to encrypt blocks (using AES)
of about 1k in length, +/- an order of magnitude.  Looking
at the above table (2nd link) there are oodles of proposed
ones.

It would be nice to have a mode that didn't also require
a separate MAC operation - I get the impression that
this is behind some of the proposals?

I think CCM is just about perfect for this goal.  The MAC isn't free, but it's 
integrated into the chaining mode.  There are also some patented modes that provide a 
MAC for almost no extra computation(OCB, IACBC), and some proposed modes that combine 
an efficient, parallelizeable MAC with encryption in a secure way (CWC,GCM), though 
none of these are standards yet.

iang

--John

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-11 Thread Brian Gladman
Ian Grigg wrote:
Has anyone kept up to date with AES modes?
http://csrc.nist.gov/CryptoToolkit/modes
http://csrc.nist.gov/CryptoToolkit/modes/proposedmodes/
I'm looking for basic mode to encrypt blocks (using AES)
of about 1k in length, +/- an order of magnitude.  Looking
at the above table (2nd link) there are oodles of proposed
ones.
It would be nice to have a mode that didn't also require
a separate MAC operation - I get the impression that
this is behind some of the proposals?
I provide some code and some speed comparison data for some of the AES 
modes here:

  http://fp.gladman.plus.com/AES/index.htm
I focus mainly on the combined encryption/authentication modes but I 
only cover those that I believe are free of licensing costs.

Brian Gladman
-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: AES Modes

2004-10-11 Thread Ian Grigg
Zooko provided a bunch of useful comments in private mail,
which I've edited and forward for list consumption.
Zooko Wilcox-O'Hearn wrote:
EAX is in the same class as CCM.  I think its slightly better.  Also 
there is GCM mode, which is perhaps a tiny bit faster, although maybe 
not if you have to re-key every datagram.  Not sure about the 
key-agility of these.

... I guess the IPv6 sec project has already specified such a thing in 
detail.  I'm not familiar with their solution.

If you really want interop and wide adoption, then the obvious thing to 
do is backport IPsec to IPv4.  Nobody can resist the authority of IETF!

Alternately, if you don't use a combined mode like EAX, then you 
should follow the generic composition cookbook from Bellare and 
Rogaway [1, 2].

Next time I do something like this for fun, I'll abandon AES entirely 
(whee!  how exciting) and try Helix [3].  Also, I printed out this 
intriguing document yesterday [4].  Haven't read it yet.  It focusses on 
higher-layer stuff -- freshness and sequencing.

Feel free to post to metzcrypt and give me credit for bringing the 
following four URLs to your attention.

[1] http://www.cs.ucdavis.edu/~rogaway/ocb/ocb-back.htm#alternatives
[2] http://www.cs.ucsd.edu/users/mihir/papers/oem.html
[3] http://citeseer.ist.psu.edu/561058.html
[4] http://citeseer.ist.psu.edu/661955.html

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]