Re: Fast MAC algorithms?
I recommend Poly1305 by DJB or VMAC by Ted Krovetz and Wei Dai. Both are much faster than HMAC and have security proven in terms of an underlying block cipher. VMAC is implemented in the nice Crypto++ library by Wei Dai; Poly1305 is implemented by DJB and is also in DJB's new nacl library.

http://cryptopp.com/benchmarks-amd64.html says that VMAC(AES)-64 takes 0.6 cycles per byte (although watch out for those 3971 cycles to set up key and IV), compared to HMAC-SHA1 taking 11.2 cycles per byte (after 1218 cycles to set up key and IV).

If you do any measurement comparing Poly1305 to VMAC, please report your measurement, at least to me privately if not to the list. I can use that sort of feedback to contribute improvements to the Crypto++ library. Thanks!

Regards,

Zooko Wilcox-O'Hearn
---
Tahoe, the Least-Authority Filesystem -- http://allmydata.org
store your data: $10/month -- http://allmydata.com/?tracking=zsig
I am available for work -- http://zooko.com/résumé.html
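For anyone who wants to try Poly1305 directly, a minimal sketch follows, using the pyca/cryptography package (an assumed stand-in; the post above points at DJB's own code, Crypto++, and nacl instead). One caveat worth baking into any test harness: a Poly1305 key is strictly one-time, which is why Poly1305-AES derives a fresh key from AES and a nonce for every message.

    # Minimal one-time Poly1305 sketch using pyca/cryptography (an assumed
    # stand-in for the DJB/nacl/Crypto++ implementations named above).
    import os
    from cryptography.hazmat.primitives.poly1305 import Poly1305

    key = os.urandom(32)   # 32-byte ONE-TIME key; never reuse for a second message
    msg = b"example packet payload"

    p = Poly1305(key)
    p.update(msg)
    tag = p.finalize()     # 16-byte authenticator

    v = Poly1305(key)
    v.update(msg)
    v.verify(tag)          # raises InvalidSignature on mismatch

The quoted benchmark numbers also give a feel for where VMAC's per-key setup pays off: with a fresh key per message, VMAC(AES)-64 beats HMAC-SHA1 once 3971 + 0.6n < 1218 + 11.2n, i.e. for messages longer than roughly (3971 - 1218) / (11.2 - 0.6) ≈ 260 bytes.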
Re: Fast MAC algorithms?
Joseph Ashwood wrote:
> > > RC-4 is broken when used as intended. ... If you take these into
> > > consideration, can it be used correctly?
> >
> > James A. Donald: Hence tricky
>
> By the same argument a Vigenère cipher is tricky to use securely, same
> with monoalphabetic and even Caesar. Not that RC4 is anywhere near the
> brokenness of Vigenère, etc., but the same argument can be applied, so
> the argument is flawed.

You cannot use a Vigenère cipher securely. You can use an RC4 cipher securely: to use RC4 securely, discard the first hundred bytes of output, and renegotiate the key every gigabyte.
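Spelled out in code, those two precautions look like the sketch below. This is purely an illustration of the advice above (textbook RC4 KSA/PRGA, with DROP and REKEY_LIMIT encoding the stated numbers), not a recommendation to deploy RC4:

    # Illustration only: textbook RC4 plus the two precautions stated above.
    DROP = 100                 # discard the first hundred bytes of output
    REKEY_LIMIT = 1 << 30      # renegotiate the key every gigabyte

    def rc4_keystream(key: bytes):
        # Key-scheduling algorithm (KSA)
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % len(key)]) & 0xFF
            S[i], S[j] = S[j], S[i]
        # Pseudo-random generation algorithm (PRGA)
        i = j = 0
        while True:
            i = (i + 1) & 0xFF
            j = (j + S[i]) & 0xFF
            S[i], S[j] = S[j], S[i]
            yield S[(S[i] + S[j]) & 0xFF]

    def rc4_encrypt_carefully(key: bytes, data: bytes) -> bytes:
        assert len(data) <= REKEY_LIMIT, "renegotiate the key before this point"
        ks = rc4_keystream(key)
        for _ in range(DROP):              # throw away the early, biased output
            next(ks)
        return bytes(b ^ next(ks) for b in data)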
Re: Fast MAC algorithms?
From: "James A. Donald" <jam...@echeque.com>
Subject: Re: Fast MAC algorithms?

> > > > RC-4 is broken when used as intended. ... If you take these into
> > > > consideration, can it be used correctly?
> > >
> > > James A. Donald: Hence tricky
> >
> > By the same argument a Vigenère cipher is tricky to use securely,
> > same with monoalphabetic and even Caesar. Not that RC4 is anywhere
> > near the brokenness of Vigenère, etc., but the same argument can be
> > applied, so the argument is flawed.
>
> You cannot use a Vigenère cipher securely. You can use an RC4 cipher
> securely: to use RC4 securely, discard the first hundred bytes of
> output, and renegotiate the key every gigabyte.

The way to use a Vigenère securely is to apply an All-Or-Nothing-Transform to the plaintext, then encrypt. This leaves the attacker facing more entropy in the system than the size of the text, and therefore an OTP. There are other ways, but this method is not significantly more complex than the efforts necessary to secure RC4, and it results in provable secrecy. It is just tricky to use a Vigenère securely.

Joe
Protocol Construction WAS Re: Fast MAC algorithms?
From: Ray Dillinger <b...@sonic.net>
Subject: Re: Fast MAC algorithms?

> I mean, I get it that crypto is rarely the weakest link in a secured
> application. Still, why are folk always designing and adopting
> cryptographic tools for the next decade or so instead of for the next
> few centuries?

Because we have no idea how to do that. If you had asked 6 months ago, we would've said AES-256 will last at least a decade, probably 50 years. A few years before that, we were saying that SHA-1 is a great cryptographic hash. Running the math a few years ago, I determined that, on the trajectory of cryptographic research at the time, it would've been necessary to create a well over 1024-bit hash, with behaviors that are perfect by today's knowledge, just to last a human lifetime. Since then the trajectory has changed significantly, and the same exercise today would probably result in 2000+ bits; extrapolating the trajectory of the trajectory, the size would be entirely unacceptable. So, in short, collectively we have no idea how to make something secure for that long.

> So far, evidence supports the idea that the stereotypical Soviet
> tendency to overdesign might have been a better plan after all, because
> the paranoia about future discoveries and breaks that motivated that
> overdesign is being regularly proven out.

And that is why Kelsey found an attack on GOST, and why there is a class of weak keys. That is the problem: all future attacks are, rather by definition, a surprise.

> This is fundamental infrastructure now! Crypto decisions now support
> the very roots of the world's data, and the cost of altering and
> reversing them grows ever larger.

By scheduling likely times for upgrades, the prices can be assessed better and scheduled better, and this works far better for business than the "oh no, our crypto is broken" experience that always results from trying to plan for longer than a few years at a time. It is far cheaper to build within the available knowledge, and design for a few years.

> If you can deploy something once, even something that uses three times
> as many rounds or key bits as you think now that you need,

Neither of those is a strong indicator of security. AES makes a great example: AES-256 has more rounds than AES-128, AES-256 has twice as many key bits as AES-128, and AES-256 has more attacks against it than AES-128. An increasing number of attack types are immune to the number of rounds, and key bits have rarely been a real issue.

There is no way to predict the far future of cryptography; it is hard enough to predict the reasonably near future.

Joe
Re: Fast MAC algorithms?
From: "James A. Donald" <jam...@echeque.com>
Subject: Re: Fast MAC algorithms?

> james hughes wrote:
> > On Jul 27, 2009, at 4:50 AM, James A. Donald wrote:
> > > No one can break arcfour used correctly - unfortunately, it is
> > > tricky to use it correctly.
> >
> > RC-4 is broken when used as intended. ... If you take these into
> > consideration, can it be used correctly?
>
> Hence tricky

By the same argument a Vigenère cipher is tricky to use securely, same with monoalphabetic and even Caesar. Not that RC4 is anywhere near the brokenness of Vigenère, etc., but the same argument can be applied, so the argument is flawed.

The question is: What level of heroic effort is acceptable before a cipher is considered broken? Is AES-256 still secure? 3DES? Right now, AES-256 seems to me to be about the line: it doesn't take significant effort to use it securely, and the impact on the security of modern protocols is effectively zero, so it doesn't need to be retired, but I wouldn't recommend it for most new protocol purposes. RC4 takes excessive heroic efforts to avoid the problems, and even teams with highly skilled members have gotten it horribly wrong. Generally, using RC4 is foolish at best.

Joe
Re: Fast MAC algorithms?
From: Nicolas Williams <nicolas.willi...@sun.com>
> For example, many people use arcfour in SSHv2 over AES because arcfour
> is faster than AES.

Joseph Ashwood wrote:
> I would argue that they use it because they are stupid. ARCFOUR should
> have been retired well over a decade ago, it is weak, it meets no
> reasonable security requirements,

No one can break arcfour used correctly - unfortunately, it is tricky to use it correctly.
Re: Fast MAC algorithms?
On Jul 27, 2009, at 4:50 AM, James A. Donald wrote:
> From: Nicolas Williams <nicolas.willi...@sun.com>
> > For example, many people use arcfour in SSHv2 over AES because
> > arcfour is faster than AES.
>
> Joseph Ashwood wrote:
> > I would argue that they use it because they are stupid. ARCFOUR
> > should have been retired well over a decade ago, it is weak, it
> > meets no reasonable security requirements,
>
> No one can break arcfour used correctly - unfortunately, it is tricky
> to use it correctly.

RC-4 is broken when used as intended. The output has a statistical bias and can be distinguished:
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/FluhrerMcgrew.pdf
and there is exceptional bias in the second byte:
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/bc_rc4.ps
The latter is the basis for breaking WEP:
http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/wep_attack.ps

These are not attacks on a reduced algorithm; they are on the full algorithm. If you take these into consideration, can it be used correctly? I guess tossing the first few words gets rid of the exceptional bias, and maybe changing the key often gets rid of the statistical bias? Is this what you mean by "used correctly"?
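The second-byte bias is easy to observe for yourself. Here is a sketch (the trial count is an arbitrary choice) that reimplements textbook RC4 and counts zeros in the second output byte; the Mantin-Shamir result cited above predicts roughly 1/128 rather than the ideal 1/256:

    # Empirical check of the second-byte bias: over random keys, byte 2 of
    # the RC4 keystream is 0 about twice as often as a random byte would be.
    import os

    def rc4_first_two(key: bytes):
        S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + S[i] + key[i % len(key)]) & 0xFF
            S[i], S[j] = S[j], S[i]
        out, i, j = [], 0, 0
        for _ in range(2):
            i = (i + 1) & 0xFF
            j = (j + S[i]) & 0xFF
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) & 0xFF])
        return out

    TRIALS = 200_000
    zeros = sum(rc4_first_two(os.urandom(16))[1] == 0 for _ in range(TRIALS))
    print(f"second byte zero in {zeros}/{TRIALS} trials "
          f"(unbiased ~{TRIALS // 256}, Mantin-Shamir ~{TRIALS // 128})")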
Re: Fast MAC algorithms?
> 2) If you throw TCP processing in there, unless you are consistently
> going to have packets on the order of at least 1000 bytes, your crypto
> algorithm is almost _irrelevant_.

This is my experience, too. And I would add "and lots of packets". The only crypto overhead that really mattered in a real application was the number of round-trip times it took to negotiate protocols and keys. Crypto's CPU time is very very seldom the limiting factor in real end-user application performance.

> Could the lack of support for TCP offload in Linux have skewed these
> figures somewhat? It could be that the caveat for the results isn't so
> much "this was done ten years ago" as "this was done with a TCP stack
> that ignores the hardware's advanced capabilities".

I have never seen a network card or chip whose advanced capabilities included the ability to speed up TCP. Most such advanced designs actually ran slower than merely doing TCP in the Linux kernel using an uncomplicated chip. I saw a Patent Office procurement of Suns in the '80s that demanded these slow TCP offload boards (I had to write the bootstrap code for the project) even though the motherboard came with an Ethernet chip and software stack that could run TCP *at wire speed* all day and night -- for free. The super whizzo board couldn't even send back-to-back packets, as I recall. Some government contractor had added the TCP offload requirement, presumably to inflate the price that they were adding a percentage markup to.

As a crypto-relevant aside, last year I looked at using the crypto offload engine in the AMD Geode CPU chip to speed up Linux crypto operations in the OLPC. There was even a nice driver for it. Summary: useless. It had been designed by somebody who had no idea of the architecture of modern software. The crypto engine used DMA for speed, used physical rather than virtual addresses, and stored the keys internally in its registers -- so it couldn't work with virtual memory, and couldn't conveniently be shared between two different processes. It was SO much faster to do your crypto by hand in a shared library in a user process than to cross into the kernel, copy the data to be in contiguous memory locations (or manually translate the addresses and lock down those pages into physical memory), copy the keys and IVs into the accelerator, do the crypto, copy the results back into virtual memory, and reschedule the user process. In typical applications (which don't always use the same key) you'd need to do this dance once for every block encrypted, or perhaps, if you were lucky, for every packet. Even kernel crypto wasn't worth doing through the thing. And the software libraries were not only faster, they were also portable, running on anything, not just one obsolete chip.

Hardware guys are just jerking off unless they spend a lot of time with software guys AT THE DESIGN STAGE, before they lay out a single gate. One stupid design decision can take away all the potential gain. Every TCP offloader I've seen has had at least one.

John
Re: Fast MAC algorithms?
[I realise this isn't crypto, but it's arguably security-relevant and arguably interesting :-)]

James Hughes <hugh...@mac.com> writes:
> TOEs that are implemented in a slow processor in a NIC card have been
> shown many times to be ineffective compared to keeping TCP in the
> fastest CPU (where it is now).

The problem with statements like this is that they smack of the Linux religious zealotry against TCP offload support in the kernel: "TOEs are bad because we say they are, and we'll keep asserting this until you go away". A decade ago, during the Win2K development, Microsoft were measuring a 1/3 reduction in CPU usage just from TCP checksum offload. Given the time frame this was probably on 300MHz PIIs, but then again it'd be with late-90s vintage NICs. On the other hand, I've seen even more impressive figures with their more recent TCP chimney offload (which just moves more of the NDIS stack onto the NIC; I think it came out around Server 2003).

Does this mean that MS figured out (a decade or so ago) how to make TOE work while the OSS community has been too occupied telling everyone it doesn't work to do anything about it? There must be some reason for the difference between the two camps.

Peter.
Re: Fast MAC algorithms?
On Jul 24, 2009, at 1:30 PM, Peter Gutmann wrote:
> [I realise this isn't crypto, but it's arguably security-relevant and
> arguably interesting :-)]

As long as we think this is interesting. (Although I respectfully disagree that there are any inherent security problems with TOE. Maybe there are insecure implementations...)

> James Hughes <hugh...@mac.com> writes:
> > TOEs that are implemented in a slow processor in a NIC card have
> > been shown many times to be ineffective compared to keeping TCP in
> > the fastest CPU (where it is now).
>
> The problem with statements like this is that they smack of the Linux
> religious zealotry against TCP offload support in the kernel: "TOEs
> are bad because we say they are, and we'll keep asserting this until
> you go away".

There were a dozen or so protocol offload research projects that the US government funded in the 90s. All failed. Is it the people who say TOEs are bad acting out of zealotry, or standing on the shoulders of the people who ran those projects?

At Network Systems, we partnered with HT Kung of CMU at the time to move TCP out of a really slow DECstation. Result? An accelerator that cost as much as the workstation, and that was faster only until the next processor version was available. Yes, we could have reduced it to a chip, but it wasn't. The take-away was that improving the software is the gift that keeps on giving: Moore's law means you get a faster TCP every time the clock ticks.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.1138

BTW, I am not a Linux bigot, just someone that got caught up in this issue more than a decade ago. I do not agree with your assertion, or the Wikipedia page, that this is Linux bigotry. I find that page horribly inaccurate and self-serving to the TOE manufacturing community. What I learned from participating in a project that spent $5M of taxpayer money was that "The protocol itself is a small fraction of the problem."

> A decade ago, during the Win2K development, Microsoft were measuring a
> 1/3 reduction in CPU usage just from TCP checksum offload. [...] Does
> this mean that MS figured out (a decade or so ago) how to make TOE
> work while the OSS community has been too occupied telling everyone it
> doesn't work to do anything about it? There must be some reason for
> the difference between the two camps.

Offloading features like checksumming, fragmentation/reassembly (aka Large Segment Offload), packet categorization, splitting flows to different threads, etc. is not TOE. TOE is offloading of the TCP stack. The thin line that is crossed is where the TCP state is kept. If the state is kept in the card, then the protocol to get the data reliably to the application has more corner cases (hence complexity), since the IP layer can be lossy and the socket layer can not. In all the research, this has always been the case. If there is something Windows has not learned, it could be that processing TCP should be simple and quick. Since the source code is not available, I don't know if their software falls into the too-complicated camp or not... In the case of Chimney partial stack offload, the state is in both places. Sounds simple and straightforward, right?

The case of iSCSI, where a complete protocol conversion is done (the card looks like a SCSI card, but the data goes out over TCP/IP), is a different story (which is also arguably still about solving the OS vendor's lack of software agility with hardware), but that is not the intent of this discussion.

I fully agree that offloading features that make the TCP processing easier is a good thing.

Back to crypto?

Jim
Re: Fast MAC algorithms?
From: Nicolas Williams <nicolas.willi...@sun.com>
Sent: Tuesday, July 21, 2009 10:43 PM
Subject: Re: Fast MAC algorithms?

> But that's not what I'm looking for here. I'm looking for the fastest
> MACs, with extreme security considerations (e.g., warning, warning!
> must rekey every 10 minutes)

There's a reason everyone is ignoring that requirement: rekeying in any modern system is more or less trivial. As an example, take AES: rekeying every 10 minutes will give a throughput of 99.999% of the original; there will be bigger differences depending on whether or not you move the mouse.

> being possibly OK, depending on just how extreme -- the sort of
> algorithm that one would not make REQUIRED to implement, but which
> nonetheless one might use in some environments simply because it's
> fast.

I would NEVER recommend it - let me repeat that, I would NEVER recommend it - but Panama is a higher-performing design, IIRC about 8x the speed of the good recommendations, but DON'T USE PANAMA. You wanted a bad recommendation; Panama is a bad recommendation. If you want a good recommendation that is faster, Poly1305-AES. You'll get some extra speed without compromising security.

> For example, many people use arcfour in SSHv2 over AES because arcfour
> is faster than AES.

I would argue that they use it because they are stupid. ARCFOUR should have been retired well over a decade ago: it is weak, it meets no reasonable security requirements, and in most situations it is not actually faster, due to the cache thrashing it frequently induces through its large key expansion.

> In the crypto world one never designs weak-but-fast algorithms on
> purpose, only strong-and-preferably-fast ones. And when an algorithm
> is successfully attacked it's usually deprecated,

The general preference is to permanently retire them. The better algorithms are generally at least as fast; that's part of the problem you seem to be having: you're not understanding that secure is not the same word as slow. In fact, everyone has worked very hard at making the secure options at least as fast as the insecure ones.

> new ones tend to be slower because resistance against new attacks
> tends to require more computation.

New ones tend to be faster than the old. New ones are designed with more recent CPUs in mind. New ones are designed with the best available knowledge of how to build security. New ones are simpler by design. New ones make use of everything that has been learned.

> I realized this would make my question seem a bit pointless, but hoped
> I might get a surprising answer :(

I think the answer surprised you more than you expected. You had hoped for some long-forgotten, extremely fast algorithm; what you've instead learned is that the long-forgotten algorithms were not only forgotten because of security, but that they were eclipsed on speed as well.

I've moved this to the end to finish on the point:

> The SSHv2 AES-based ciphers ought to be RTI and default choice, IMO,
> but that doesn't mean arcfour should not be available.

I very strongly disagree. One of the fundamental assumptions of creating secure protocols is that sooner or later someone will bet their life on your work. This isn't an idle overstatement; it is an observation. How many people bet their life and lost because Twitter couldn't protect their information in Iran? How many people bet their life's savings on SSL/TLS? How many people trusted various options with their complete medical history? How many people bet their life or freedom on the ability of PGP to protect them?

People bet their life on security all the time; it is a part of the job to make sure that bet is safe.

Joe
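The rekeying claim survives a back-of-envelope check. Every constant in the sketch below is an assumption picked for illustration, not a measurement:

    # Back-of-envelope: fraction of CPU time spent on 10-minute rekeying.
    # All constants are illustrative assumptions.
    KEY_SETUP_CYCLES = 2_000           # generous guess: AES key schedule + MAC init
    MAC_CYCLES_PER_BYTE = 12           # roughly HMAC-SHA1 class, per the thread
    LINK_BYTES_PER_SEC = 125_000_000   # a saturated 1 Gb/s link
    REKEY_SECONDS = 600                # rekey every 10 minutes

    bytes_per_key = LINK_BYTES_PER_SEC * REKEY_SECONDS
    mac_cycles = bytes_per_key * MAC_CYCLES_PER_BYTE
    overhead = KEY_SETUP_CYCLES / (mac_cycles + KEY_SETUP_CYCLES)
    print(f"rekey overhead: {overhead:.1e}")   # ~2e-9, far below 0.001%

Even with assumptions chosen to be pessimistic for rekeying, the setup cost vanishes next to the per-byte work, which is the point being made above.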
Re: Fast MAC algorithms?
mhey...@gmail.com <mhey...@gmail.com> writes:
> 2) If you throw TCP processing in there, unless you are consistently
> going to have packets on the order of at least 1000 bytes, your crypto
> algorithm is almost _irrelevant_. [...] for a Linux 2.2.14 kernel,
> remember, this was 10 years ago.

Could the lack of support for TCP offload in Linux have skewed these figures somewhat? It could be that the caveat for the results isn't so much "this was done ten years ago" as "this was done with a TCP stack that ignores the hardware's advanced capabilities".

Peter.
Re: Fast MAC algorithms?
On Thu, Jul 23, 2009 at 1:34 AM, Peter Gutmann <pgut...@cs.auckland.ac.nz> wrote:
> mhey...@gmail.com <mhey...@gmail.com> writes:
> > 2) If you throw TCP processing in there, unless you are consistently
> > going to have packets on the order of at least 1000 bytes, your
> > crypto algorithm is almost _irrelevant_. [...] for a Linux 2.2.14
> > kernel, remember, this was 10 years ago.
>
> Could the lack of support for TCP offload in Linux have skewed these
> figures somewhat? It could be that the caveat for the results isn't so
> much "this was done ten years ago" as "this was done with a TCP stack
> that ignores the hardware's advanced capabilities".

TCP offload would, of course, help reduce CPU load and make crypto algorithm choice have more of an effect. With our tests, however, to actually show an effect, we had to use large packet sizes, which reduced the impact of TCP - I know we were using 64K packets for some tests. Boosting the packet size also affected cycles-per-byte for NMAC-style algorithms, because the outer function gets run less often for a given amount of data (IPsec processing occurs outbound prior to fragmentation). We needed to reduce the impact of TCP because it still remained that, when doing something with the data, the cycles-per-byte of that processing greatly impacts the percentage of slowdown your MAC algorithm choice will have.

To throw another monkey wrench into the works: obviously, you may think, "But what if I have a low-power application - trying to be green, you know - so I want to use less processor-intensive cryptography to save energy?" Well, I sat in the middle of a group of people doing work for another DARPA project (SensIT) shortly after the ACSA project. The SensIT project was for low-energy wireless sensors, in which we experimented with different key exchange/agreement techniques in an attempt to economize energy. As a throw-in result, the SensIT people found it takes 3 orders of magnitude more energy to transmit or receive data on a per-bit basis than it does to do AES+HMAC-SHA1 (it came as a surprise to me back then that reception and transmission take similar amounts of energy). Moral: don't scrimp on crypto to save energy - at least for wireless; I don't know what it costs to send a bit down a twisted pair or fiber. The SensIT final report is available here:
http://www.cs.umbc.edu/courses/graduate/CMSC691A/Spring04/papers/nailabs_report_00-010_final.pdf

-Michael Heyman
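To see how lopsided that ratio is in practice, a tiny worked example; the absolute per-bit energy below is an assumed placeholder, and only the three-orders-of-magnitude ratio comes from the SensIT report:

    # Toy energy budget for a 1000-byte sensor packet. CRYPTO_NJ_PER_BIT is
    # an assumed placeholder; the 1000x radio/crypto ratio is the report's
    # finding.
    CRYPTO_NJ_PER_BIT = 1.0
    RADIO_NJ_PER_BIT = 1000 * CRYPTO_NJ_PER_BIT

    bits = 1000 * 8
    crypto_nj = bits * CRYPTO_NJ_PER_BIT
    radio_nj = bits * RADIO_NJ_PER_BIT
    saved = crypto_nj / (crypto_nj + radio_nj)
    print(f"dropping crypto saves {saved:.1%} of the packet's energy budget")

Dropping AES+HMAC-SHA1 entirely saves on the order of 0.1% of the energy of getting the packet on the air, which is the moral stated above.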
Re: Fast MAC algorithms?
On Thu, Jul 23, 2009 at 05:34:13PM +1200, Peter Gutmann wrote:
> mhey...@gmail.com <mhey...@gmail.com> writes:
> > 2) If you throw TCP processing in there, unless you are consistently
> > going to have packets on the order of at least 1000 bytes, your
> > crypto algorithm is almost _irrelevant_. [...] for a Linux 2.2.14
> > kernel, remember, this was 10 years ago.
>
> Could the lack of support for TCP offload in Linux have skewed these
> figures somewhat? It could be that the caveat for the results isn't so
> much "this was done ten years ago" as "this was done with a TCP stack
> that ignores the hardware's advanced capabilities".

How much NIC hardware does both ESP/AH and TCP offload? My guess: not much. A shame, that. Once you've gotten a packet off the NIC to do ESP/AH processing, you've lost the opportunity to use TOE.

Nico
Re: Fast MAC algorithms?
Note for Moderator: this is not crypto, but "TOE is the solution to networking performance problems" is a perception that is dangerous to leave standing in the crypto community.

On Jul 23, 2009, at 11:45 PM, Nicolas Williams wrote:
> How much NIC hardware does both ESP/AH and TCP offload? My guess: not
> much. A shame, that. Once you've gotten a packet off the NIC to do
> ESP/AH processing, you've lost the opportunity to use TOE.

IPSEC offload can have value. TOEs are far more controversial. TOEs that are implemented in a slow processor in a NIC card have been shown many times to be ineffective compared to keeping TCP in the fastest CPU (where it is now). For vendors that can't optimize their TCP implementation (because it is just too complicated for them?), TOE is a siren call that distracts them from their real problem.

Look at Van Jacobson's post of May 2000 entitled "TCP in 30 instructions":
http://www.pdl.cmu.edu/mailinglists/ips/mail/msg00133.html
There was a paper about this, but I am at a loss to find it.

One can go even farther back, to "An Analysis of TCP Processing Overhead" by Clark, Jacobson, Romkey and Salwen in 1989, which states "The protocol itself is a small fraction of the problem."
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.5741

Back to crypto please.
Re: Fast MAC algorithms?
From: Nicolas Williams <nicolas.willi...@sun.com>
Subject: Fast MAC algorithms?

> Which MAC algorithms would you recommend?

I didn't see the primary requirement: you never give a speed requirement. OMAC-AES-128 should function around 100MB/sec, HMAC-SHA-512 about the same, HMAC-SHA1 about 150MB/sec, HMAC-MD5 250MB/sec. I wouldn't recommend MD5, but in many situations it can be acceptable, and none of these make use of parallelism to achieve those speeds.

Joe
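Figures like these are easy to re-derive on your own hardware for the HMAC family using nothing but the Python standard library (OMAC/CMAC is not in the stdlib; buffer and total sizes below are arbitrary choices):

    # Rough single-threaded HMAC throughput measurement, stdlib only.
    import hmac
    import time

    def mb_per_sec(hash_name: str, total: int = 256 * 2**20,
                   bufsize: int = 2**20) -> float:
        mac = hmac.new(b"\x00" * 32, digestmod=hash_name)
        buf = bytes(bufsize)
        start = time.perf_counter()
        for _ in range(total // bufsize):
            mac.update(buf)
        mac.digest()
        return (total / 2**20) / (time.perf_counter() - start)

    for name in ("md5", "sha1", "sha256", "sha512"):
        print(f"HMAC-{name.upper():7s} {mb_per_sec(name):6.0f} MB/sec")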
Re: Fast MAC algorithms?
On Tue, Jul 21, 2009 at 07:15:02PM -0500, Nicolas Williams wrote:
> I've an application that is performance sensitive, which can re-key
> very often (say, every 15 minutes, or more often still), and where no
> MAC is accepted after 2 key changes. In one case the entity generating
> a MAC is also the only entity validating the MAC (but the MAC does go
> on the wire). I'm interested in any MAC algorithms which are fast, and
> it doesn't matter how strong they are, as long as they meet some
> reasonable lower bound on work factor to forge a MAC or recover the
> key, say 2^64, given current cryptanalysis, plus a comfort factor.
> [...]
> Which MAC algorithms would you recommend?

I'm getting the impression that key agility is important here, so one MAC that comes to mind is CMAC with a block cipher with a fast key schedule, like Serpent. (If for some reason you really wanted to make security auditors squirm, you could even cut Serpent down to 16 rounds, which would increase the message processing rate by about 2x and also speed up the key schedule. This seems like asking for it to me, though.)

Another plausible answer might be Skein - it directly supports keying and nonces (so you don't have to take the per-message overhead of the extra hash as with HMAC), and has very good bulk throughput on 64-bit CPUs.

-Jack
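A sketch of the CMAC suggestion, using the pyca/cryptography package (my assumption; no library is named above). That package ships AES rather than Serpent, so AES stands in below, which means the fast-Serpent-key-schedule argument doesn't carry over:

    # CMAC tag generation and verification with pyca/cryptography.
    # AES stands in for Serpent, which the library does not provide.
    import os
    from cryptography.hazmat.primitives.cmac import CMAC
    from cryptography.hazmat.primitives.ciphers import algorithms

    key = os.urandom(16)

    c = CMAC(algorithms.AES(key))
    c.update(b"message to authenticate")
    tag = c.finalize()                   # 16-byte tag

    v = CMAC(algorithms.AES(key))
    v.update(b"message to authenticate")
    v.verify(tag)                        # raises InvalidSignature on mismatch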
Re: Fast MAC algorithms?
On Wed, Jul 22, 2009 at 06:49:34AM +0200, Dan Kaminsky wrote:
> Operationally, HMAC-SHA-256 is the gold standard. There's wonky stuff
> all over the place -- Bernstein's polyaes work appeals to me -- but I
> wouldn't really ship anything but HMAC-SHA-256 at present time.

Oh, I agree in general. As far as new apps and standards work goes, I'd make HMAC-SHA-256, or AES in an AEAD cipher mode, REQUIRED to implement and the default.

But that's not what I'm looking for here. I'm looking for the fastest MACs, with extreme security considerations (e.g., warning, warning! must rekey every 10 minutes) being possibly OK, depending on just how extreme -- the sort of algorithm that one would not make REQUIRED to implement, but which nonetheless one might use in some environments simply because it's fast. For example, many people use arcfour in SSHv2 over AES because arcfour is faster than AES. The SSHv2 AES-based ciphers ought to be RTI and the default choice, IMO, but that doesn't mean arcfour should not be available.

In the crypto world one never designs weak-but-fast algorithms on purpose, only strong-and-preferably-fast ones. And when an algorithm is successfully attacked it's usually deprecated, put in the ash heap of history. But there is a place for weak-but-fast algos, as long as they're not too weak. Any weak-but-fast algos we might have now tend to be old algos that turned out to be weaker than designed to be, and new ones tend to be slower because resistance against new attacks tends to require more computation.

I realized this would make my question seem a bit pointless, but hoped I might get a surprising answer :(

Nico
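For reference, the gold-standard option is a few lines of Python standard library; the sketch below adds the one detail that's easy to get wrong, a constant-time tag comparison on verify (the 32-byte key size is an arbitrary choice):

    # HMAC-SHA-256 generate/verify, stdlib only.
    import hashlib
    import hmac
    import os

    key = os.urandom(32)

    def mac(message: bytes) -> bytes:
        return hmac.new(key, message, hashlib.sha256).digest()

    def verify(message: bytes, tag: bytes) -> bool:
        # compare_digest avoids leaking where the comparison first differs
        return hmac.compare_digest(mac(message), tag)

    tag = mac(b"hello")
    assert verify(b"hello", tag) and not verify(b"hullo", tag)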
Re: Fast MAC algorithms?
On Wed, Jul 22, 2009 at 1:43 AM, Nicolas Williams <nicolas.willi...@sun.com> wrote:
> But that's not what I'm looking for here. I'm looking for the fastest
> MACs, with extreme security considerations... In the crypto world one
> never designs weak-but-fast algorithms on purpose, only
> strong-and-preferably-fast ones. And when an algorithm is successfully
> attacked it's usually deprecated, put in the ash heap of history. But
> there is a place for weak-but-fast algos, as long as they're not too
> weak.

It just so happens that I worked on a DARPA-funded project about 10 years ago looking at the effects of any possible strength-vs-speed trade-off available for different MACing algorithms. We built the capability into FreeS/WAN's IPsec. Some of our MACs were so weak we called them Partial MACs (PMACs). PMACs authenticated only randomly selected pieces of the packet. We figured PMACs were good enough for video - who cares if Eve can feed you a frame or two of partially spoofed video, as long as she can't get enough through to be noticeable.
http://www.isso.sparta.com/documents/acsa_final_report.pdf

The major take-aways include:

1) HMAC-SHA1-96 can typically triple the amount of CPU required to move IP packets through the kernel over a no-crypto option. HMAC-MD5-96 can double it.

2) If you throw TCP processing in there, unless you are consistently going to have packets on the order of at least 1000 bytes, your crypto algorithm is almost _irrelevant_. TCP costs up to ~1000 cycles per byte on 10-byte packets, 100 cycles per byte on 100-byte packets, and only gets down to ~15 cycles per byte at 1000-byte packets. For reference, HMAC-SHA1-96 takes about 25 cycles per byte for ~1000-byte packets. These are Pentium II numbers for a Linux 2.2.14 kernel - remember, this was 10 years ago.

3) If your host is actually going to do something with the data you receive, it is really, really hard to find something that the crypto algorithm will affect. A coworker of mine struggled to find a real-world desktop application in which you could actually see a result (other than some numbers in a log file). Finally he found that viewing a video remotely in an X window (that's uncompressed video) would have occasional drops that become noticeable if you pick your video well. Our video was of a circular radar screen with a rotating update line (I think it came from a screen saver). With this contrived application, we could change the MAC algorithm and see more or less disturbance in the video.

I'd like to emphasize points 2 and 3. You need an application that either doesn't use TCP, or that only uses TCP with MTU-sized packets, to even want to care about crypto performance. I don't think the paper points it out, but all our testing was done with two machines connected directly to each other. Any out-of-order processing TCP needs to do will only decrease the effect a MAC algorithm has. Also, if you want to do _anything_ with the data other than ignore it, it will only further decrease the effect the MAC algorithm has. We tried timing FTP transfers, streaming an MPEG, and numerous other things that I don't remember, but all these things had too much overhead to allow the choice of MAC algorithm to be noticed. Ten years of kernel network stack development and CPU improvements may have changed the numbers slightly, but I believe you need a really specialized case, probably including real-time requirements on marginal CPUs, before you need to look at faster MAC algorithms.
Thanks for letting me reminisce about a really fun project (sprinkling rdtsc around the Linux kernel, and getting Steve Kent upset (not really) at our attempted subversion of IPsec intent - we ended up doing it the way he wanted, even though my way would have been cleaner <grin/>).

-Michael Heyman
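The partial-MAC idea is easy to sketch. Everything below - sample count, chunk size, offset derivation - is invented for illustration; the real construction is in the ACSA report linked above:

    # Toy "partial MAC": authenticate a keyed pseudo-random sample of the
    # packet. Illustration of the idea only; all parameters are invented here.
    import hashlib
    import hmac
    import os

    def pmac(key: bytes, seq: int, packet: bytes,
             samples: int = 8, chunk: int = 16) -> bytes:
        # Derive sample offsets from key and sequence number, so the verifier
        # (sharing the key) samples the same regions while an attacker cannot
        # tell which bytes are covered.
        seqb = seq.to_bytes(8, "big")
        seed = hmac.new(key, b"offsets" + seqb, hashlib.sha256).digest()
        mac = hmac.new(key, seqb, hashlib.sha256)
        for k in range(samples):
            off = int.from_bytes(seed[4 * k:4 * k + 4], "big") % max(1, len(packet) - chunk)
            mac.update(packet[off:off + chunk])
        return mac.digest()[:12]         # truncated tag, like HMAC-SHA1-96

    key, pkt = os.urandom(32), os.urandom(1500)
    assert pmac(key, 1, pkt) == pmac(key, 1, pkt)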