Re: [patch] make AES-cfb128-encrypt faster by uglifying it

2006-05-27 Thread Alex Dubov
At the risk of being too much of an annoyance, I would
like to pursue this topic a little further:

1. I'm using aes-128-cfb for media streaming and I
think it's a rather good choice for the job.
2. Currently, aes-128-cfb runs slower than it could (by
more than 20%, often more) and suffers from an
encrypt/decrypt speed asymmetry (36 MB/sec encryption
vs 30 MB/sec decryption on one of my machines, which can
be an issue in live media streaming).
3. In my experience with gcc on powerpc, gcc handles
large unaligned loads/stores by splitting them
(sometimes unnecessarily), but the generated code
remains correct and in working order.

Therefore, I would like to propose a patch that uses gcc
vector intrinsics when compiled with a newer gcc and
falls back to the current version otherwise.
(I don't mind adding an x86-only condition to the
#if defined line, if needed.)


__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com

--- aes_cfb.c.prev  2004-12-30 21:43:33.0 +1100
+++ aes_cfb.c   2006-05-28 02:03:43.414593000 +1000
@@ -121,6 +121,67 @@
  * 128bit block we have used is contained in *num;
  */
 
+#if defined (__GNUC__) && __GNUC__ >= 3 && __GNUC_MINOR__ >= 3
+typedef int __v16qi __attribute__ ((mode (V16QI)));
+
+void AES_cfb128_encrypt(const unsigned char *in, unsigned char *out,
+    const unsigned long length, const AES_KEY *key,
+    unsigned char *ivec, int *num, const int enc) {
+
+    unsigned int n, nr;
+    unsigned long l = 0;
+    unsigned char c;
+    __v16qi t_in;
+
+    assert(in && out && key && ivec && num);
+    n = *num;
+
+    if (enc) {
+        if (n) {
+            for (; l < length; l++) {
+                ivec[n] = out[l] = ivec[n] ^ in[l];
+                if(!(n = (n + 1) % AES_BLOCK_SIZE)) { l++; break; } /* block complete: step past this byte */
+            }
+        }
+
+        for (; l + AES_BLOCK_SIZE <= length; l += AES_BLOCK_SIZE) {
+            AES_encrypt(ivec, ivec, key);
+            t_in = *(__v16qi*)(in + l);
+            *(__v16qi*)(out + l) = *(__v16qi*)ivec ^ t_in;
+            *(__v16qi*)(ivec) = *(__v16qi*)(out + l);
+        }
+
+        if(l < length) AES_encrypt(ivec, ivec, key);
+        for (; l < length; l++) {
+            ivec[n] = out[l] = ivec[n] ^ in[l]; n++; /* n++ inside the expression would be unsequenced */
+        }
+    } else {
+        if (n) {
+            for (; l < length; l++) {
+                c = in[l];
+                out[l] = ivec[n] ^ in[l];
+                ivec[n] = c;
+                if(!(n = (n + 1) % AES_BLOCK_SIZE)) { l++; break; } /* block complete: step past this byte */
+            }
+        }
+
+        for (; l + AES_BLOCK_SIZE <= length; l += AES_BLOCK_SIZE) {
+            AES_encrypt(ivec, ivec, key);
+            t_in = *(__v16qi*)(in + l);
+            *(__v16qi*)(out + l) = *(__v16qi*)ivec ^ t_in;
+            *(__v16qi*)(ivec) = t_in;
+        }
+
+        if(l < length) AES_encrypt(ivec, ivec, key);
+        for (; l < length; l++) {
+            c = in[l];
+            out[l] = ivec[n] ^ in[l];
+            ivec[n++] = c;
+        }
+    }
+    *num = n;
+}
+#else
 void AES_cfb128_encrypt(const unsigned char *in, unsigned char *out,
     const unsigned long length, const AES_KEY *key,
     unsigned char *ivec, int *num, const int enc) {
@@ -155,6 +216,7 @@
 
     *num=n;
 }
+#endif
 
 /* This expects a single block of size nbits for both in and out. Note that
it corrupts any extra bits in the last byte of out */
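
For readers following the patch, here is a minimal, self-contained sketch of the same three-phase CFB128 structure: finish a block left open by the previous call, process whole blocks, then handle the tail. It is not the submitted patch. `toy_block` is a hypothetical stand-in for AES_encrypt() and is not a cipher, `cfb_block` and `toy_cfb128` are names invented for this illustration, and whole blocks go through memcpy rather than vector casts so that no alignment is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK 16

/* Hypothetical stand-in for AES_encrypt(): NOT a cipher, only a
 * deterministic 16-byte scramble that keeps the sketch self-contained. */
static void toy_block(unsigned char b[BLK])
{
    unsigned char t[BLK];
    size_t i;
    for (i = 0; i < BLK; i++)
        t[i] = (unsigned char)(b[(i + 5) % BLK] * 7 + 0x3d + i);
    memcpy(b, t, BLK);
}

/* Process one whole block through local buffers, so in, out and ivec
 * may be unaligned and in may equal out. */
static void cfb_block(const unsigned char *in, unsigned char *out,
                      unsigned char *ivec, int enc)
{
    unsigned char c[BLK], o[BLK];
    size_t i;
    memcpy(c, in, BLK);
    for (i = 0; i < BLK; i++)
        o[i] = (unsigned char)(ivec[i] ^ c[i]);
    /* CFB feeds the ciphertext back: the output when encrypting,
     * the input when decrypting. */
    memcpy(ivec, enc ? o : c, BLK);
    memcpy(out, o, BLK);
}

static void toy_cfb128(const unsigned char *in, unsigned char *out,
                       size_t length, unsigned char *ivec, int *num, int enc)
{
    unsigned int n = (unsigned int)*num;
    size_t l = 0;
    unsigned char c;

    /* 1. Finish a block left open by the previous call. */
    while (n && l < length) {
        c = in[l];
        out[l] = (unsigned char)(ivec[n] ^ c);
        ivec[n] = enc ? out[l] : c;
        l++;
        n = (n + 1) % BLK;
    }
    /* 2. Whole blocks (skipped if phase 1 used up the input). */
    for (; length - l >= BLK; l += BLK) {
        toy_block(ivec);
        cfb_block(in + l, out + l, ivec, enc);
    }
    /* 3. Tail: open a new block and consume the remaining bytes. */
    if (l < length)
        toy_block(ivec);
    for (; l < length; l++) {
        c = in[l];
        out[l] = (unsigned char)(ivec[n] ^ c);
        ivec[n] = enc ? out[l] : c;
        n++;
    }
    *num = (int)n;
}
```

Because *num records the position inside the open block, feeding the same data in several calls should produce the same output as a single call, and decrypting with the same starting ivec should return the input.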


Re: [patch] make AES-cfb128-encrypt faster by uglifying it

2006-05-26 Thread Alex Dubov
Ok. How about now?

--- Richard Levitte - VMS Whacker [EMAIL PROTECTED] wrote:

> In message [EMAIL PROTECTED]
> on Thu, 25 May 2006 22:50:15 -0700 (PDT), Alex Dubov
> [EMAIL PROTECTED] said:
>
> oakad> I thought all major compilers had some sort of long long,
> oakad> didn't they? After all, an emulated long long still takes
> oakad> only two integer xors, as opposed to 8 with char.
>
> If you look in the script Configure, you'll see what kinds of
> platforms we claim to support.  That means that we have to be careful
> with the kind of assumptions we make.  For example, your patch would
> fail miserably on VMS for VAX (which I know is still used out there).
>
> However, nothing stops you from making variants with different types
> of integers, maybe with some help from the macros used and defined in
> crypto/bn/bn.h, which are correctly defined for each platform, as far
> as we know.
 
> Cheers,
> Richard
>
> -
> Please consider sponsoring my work on free software.
> See http://www.free.lp.se/sponsoring.html for details.
>
> -- 
> Richard Levitte  [EMAIL PROTECTED]
> http://richard.levitte.org/
>
> When I became a man I put away childish things, including
> the fear of childishness and the desire to be very grown up.
>   -- C.S. Lewis
> __
> OpenSSL Project  http://www.openssl.org
> Development Mailing List  openssl-dev@openssl.org
> Automated List Manager  [EMAIL PROTECTED]
 


aes_cfb.c.diff
Description: 441793709-aes_cfb.c.diff
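
Richard's suggestion of variants built on different integer widths might look roughly like the sketch below. The assumptions are mine, not taken from the attached diff: the word type is chosen from <limits.h> rather than from the crypto/bn/bn.h macros (which are not reproduced here), `xor_word`, `XOR_BLOCK` and `xor_block` are hypothetical names, and data moves through memcpy so that the buffers need no particular alignment.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Use unsigned long long only where <limits.h> advertises it;
 * otherwise fall back to unsigned long, which every C compiler has. */
#if defined(ULLONG_MAX)
typedef unsigned long long xor_word;
#else
typedef unsigned long xor_word;
#endif

#define XOR_BLOCK 16

/* out = a XOR b over one 16-byte block, one word at a time.
 * out may be the same buffer as a or b. */
static void xor_block(unsigned char *out, const unsigned char *a,
                      const unsigned char *b)
{
    xor_word wa, wb;
    size_t i = 0;

    for (; i + sizeof(xor_word) <= XOR_BLOCK; i += sizeof(xor_word)) {
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        wa ^= wb;
        memcpy(out + i, &wa, sizeof wa);
    }
    /* Byte loop for any remainder if the word size does not divide 16. */
    for (; i < XOR_BLOCK; i++)
        out[i] = (unsigned char)(a[i] ^ b[i]);
}
```

With an 8-byte word the block takes two word XORs instead of sixteen byte XORs; with a 4-byte word it takes four. Whether that translates into the speedup reported in this thread depends on the compiler and platform.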


Re: [patch] make AES-cfb128-encrypt faster by uglifying it

2006-05-25 Thread Alex Dubov
I'm very sorry, that was not as easy as I thought. I
hope this one works as expected.

--- Alex Dubov [EMAIL PROTECTED] wrote:

> Oops, minor correction.
>
> --- Alex Dubov [EMAIL PROTECTED] wrote:
>
> > Hello.
> > I was working on an Apache project using OpenSSL and
> > found that using larger integers in the cfb128 xor
> > improves performance by more than 50% in most cases.
> > There are no drawbacks whatsoever, except the look.
  

aes_cfb.c.diff
Description: 441793709-aes_cfb.c.diff


Re: [patch] make AES-cfb128-encrypt faster by uglifying it

2006-05-25 Thread Alex Dubov
I thought all major compilers had some sort of long long,
didn't they? After all, an emulated long long still takes
only two integer xors, as opposed to 8 with char.

--- Tim Rice [EMAIL PROTECTED] wrote:

> On Thu, 25 May 2006, Alex Dubov wrote:
>
> > Hello.
> > I was working on an Apache project using OpenSSL and
> > found that using larger integers in the cfb128 xor
> > improves performance by more than 50% in most cases.
> > There are no drawbacks whatsoever, except the look.
> ...
> > --- aes_cfb.c.prev  2006-05-25 15:34:04.0 +1000
> > +++ aes_cfb.c  2006-05-26 14:27:43.0 +1000
> > @@ -125,34 +125,69 @@
> >      const unsigned long length, const AES_KEY *key,
> >      unsigned char *ivec, int *num, const int enc) {
> >
> > +#define u64 unsigned long long
> ...
>
> What about the platforms that don't have long long?
>
> -- 
> Tim Rice  Multitalents  (707) 887-1469
> [EMAIL PROTECTED]

