I forgot to mention that the implementation I tested was slightly different and contains a couple of other small optimizations. In the tls1_mac() function I combined the first two update calls into one call, which also saved a couple of ms. The numbers were TLS numbers.
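To illustrate the kind of change meant by combining the first two update calls, here is a minimal sketch. It is not the patch itself and does not use OpenSSL: toy_mac_ctx, toy_mac_update, mac_separate and mac_combined are hypothetical names, and a simple FNV-1a hash stands in for the real MAC. It only assumes the standard TLS MAC input layout (8-byte sequence number, then type, version and length, then the payload) and shows that hashing the sequence number and the 5-byte header from one 13-byte buffer gives the same result with one fewer update call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy incremental "MAC" context. Hypothetical stand-in, NOT the OpenSSL EVP API. */
typedef struct {
    uint32_t state;  /* FNV-1a running hash */
    unsigned calls;  /* number of update calls made so far */
} toy_mac_ctx;

void toy_mac_init(toy_mac_ctx *c)
{
    c->state = 2166136261u;
    c->calls = 0;
}

void toy_mac_update(toy_mac_ctx *c, const unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        c->state ^= p[i];
        c->state *= 16777619u;
    }
    c->calls++;
}

/* Fill the 5-byte record header: type, version (2 bytes), length (2 bytes). */
static void fill_header(unsigned char *hdr, unsigned char type,
                        unsigned version, size_t len)
{
    assert(len <= 0xffff); /* the length field is only two bytes wide */
    hdr[0] = type;
    hdr[1] = (unsigned char)(version >> 8);
    hdr[2] = (unsigned char)(version & 0xff);
    hdr[3] = (unsigned char)(len >> 8);
    hdr[4] = (unsigned char)(len & 0xff);
}

/* Three update calls: sequence number, header, payload. */
uint32_t mac_separate(const unsigned char seq[8], unsigned char type,
                      unsigned version, const unsigned char *data,
                      size_t len, unsigned *calls)
{
    toy_mac_ctx c;
    unsigned char hdr[5];

    fill_header(hdr, type, version, len);
    toy_mac_init(&c);
    toy_mac_update(&c, seq, 8);
    toy_mac_update(&c, hdr, 5);
    toy_mac_update(&c, data, len);
    *calls = c.calls;
    return c.state;
}

/* Two update calls: sequence number and header share one 13-byte buffer. */
uint32_t mac_combined(const unsigned char seq[8], unsigned char type,
                      unsigned version, const unsigned char *data,
                      size_t len, unsigned *calls)
{
    toy_mac_ctx c;
    unsigned char hdr[13];

    memcpy(hdr, seq, 8);
    fill_header(hdr + 8, type, version, len);
    toy_mac_init(&c);
    toy_mac_update(&c, hdr, 13);
    toy_mac_update(&c, data, len);
    *calls = c.calls;
    return c.state;
}
```

Both functions return the same value for the same inputs; only the number of update calls differs (3 versus 2). With a real engine-backed digest, each update call may carry a fixed per-call cost, which would be where a saving of this kind comes from.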
As for the question of record size: the smaller the record, the larger the percentage saving, since the saving is fixed.

----- Original Message -----
From: Deng Michael <[email protected]>
To: "[email protected]" <[email protected]>
Cc:
Sent: Friday, December 9, 2011 5:15 PM
Subject: Re: [openssl.org #2650] major ssl read/ write performance improvement - updated

Hi Andrey,

I measured on a chip (Cavium Octeon) that supports crypto acceleration and runs without an OS. My setup does not involve TCP IO, since the TCP data has already been received and is passed to SSL through a custom BIO (or mem BIO). I measured SSL_read or SSL_write (about 1K in size) in ms (aes256_cbc/sha1). The measurement is done through CPU ticks. The numbers are roughly:

without any change or crypto accel: 170ms (this is almost linear in the size of the record)
with crypto accel only: 54ms (or something like that; the acceleration is done on the same Cavium CPU through the engine interface)
with the patch: 25ms

Since there is no OS, the code runs to completion and IO is done separately. The memory allocation is based on Cavium-provided code. For me the saving is fixed, so the percentage depends on the other parts. I don't have a way of measuring with IO involved.

Regards,
Michael

----- Original Message -----
From: Andrey Kulikov <[email protected]>
To: [email protected]
Cc:
Sent: Thursday, December 8, 2011 4:11 PM
Subject: Re: [openssl.org #2650] major ssl read/ write performance improvement - updated

Hello Michael,

I have tested your patch. It is working stably, at least with the ccgost engine (and without any engine too, of course). Thanks for the contribution!

Could you please describe your test environment and test methodology? How did you measure the doubling of read/write speed? What tool/profiler did you use? How does it depend on SSL record size? What is the overall speed improvement if we count OS IO?
I'm asking because I'm trying to measure the performance improvement your changes can give with my crypto accelerator, and my results are not even close to double the read/write speed. But my test resources are limited at the moment, and it is possible this is due to those limitations. In any case, I guess the community will be grateful if you share your experience.

WBR,
Andrey

On 5 December 2011 14:33, Deng Michael via RT <[email protected]> wrote:
> Hi,
> I have changed the mac code which gives substantial improvement for both
> read and write (not handshake)
>
> The saving is fairly major, on cpu with crypto acceleration, the change
> can more than double the overall ssl read /write speed for 1K record
> excluding OS IO time.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
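The point made in this thread, that a fixed saving is a larger share of a smaller record, can be sketched with a little arithmetic. Only the 54ms and 25ms figures for a roughly 1K record come from the thread. Everything else here is an assumption for illustration: that "about 1K" means 1024 bytes, that the patched time scales in proportion to record size, and that the patch removes the same fixed cost at every size. The function names are hypothetical.

```c
#include <assert.h>

/* Values quoted in the thread for a ~1K record (aes256_cbc/sha1). */
#define ACCEL_ONLY_MS 54.0   /* crypto acceleration, no patch */
#define PATCHED_MS    25.0   /* crypto acceleration plus the patch */
#define MEASURED_SIZE 1024.0 /* ASSUMPTION: "about 1K" taken as 1024 bytes */

/* ASSUMPTION: patched time is proportional to record size. */
double patched_ms(double record_bytes)
{
    assert(record_bytes > 0.0);
    return PATCHED_MS * (record_bytes / MEASURED_SIZE);
}

/* ASSUMPTION: the patch removes the same fixed cost at every record size. */
double unpatched_ms(double record_bytes)
{
    return patched_ms(record_bytes) + (ACCEL_ONLY_MS - PATCHED_MS);
}

/* Percentage of the unpatched time that the fixed saving represents. */
double saving_percent(double record_bytes)
{
    double before = unpatched_ms(record_bytes);
    return 100.0 * (before - patched_ms(record_bytes)) / before;
}
```

Under this model saving_percent(1024.0) is about 53.7, which matches going from 54ms to 25ms (a factor of 2.16), and the percentage rises for smaller records and falls for larger ones. Real hardware may not follow the proportional assumption, so the other sizes are illustrative only.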
