Hi,

I have changed the MAC code, which gives a substantial improvement for both
read and write (not the handshake).

The saving is fairly major: on a CPU with crypto acceleration, the change
can more than double the overall SSL read/write speed for 1K records,
excluding OS I/O time. This implies that the change removes the majority of
the code overhead for read and write.

The basic idea is to remove all the EVP_MD_CTX duplications (which are very
CPU intensive) during read and write. The original code performs numerous
memory allocations and frees for each read or write, all due to the deep
copy of the ctx. The new way of keeping the ctx is to checkpoint and restore
its state instead of deep copying it. After this change there is NO memory
allocation or free for read and write. The changes are not too big either.
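
To illustrate the checkpoint/restore idea outside of OpenSSL, here is a
minimal sketch in C. It is not code from the patch: toy_md_ctx, toy_mac,
toy_update, toy_mac_init and toy_mac_record are hypothetical names, and a
simple FNV-1a hash stands in for the real digest. The only point it tries to
show is that the keyed state can be saved once into a fixed-size buffer and
copied back after every record, so the per-record path needs no allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy digest context: a hypothetical stand-in for EVP_MD_CTX. */
typedef struct {
    uint64_t state;   /* running FNV-1a hash value */
    uint64_t length;  /* number of bytes absorbed so far */
} toy_md_ctx;

/* MAC state for one direction (read or write) of one connection. */
typedef struct {
    toy_md_ctx live;        /* context that absorbs each record */
    toy_md_ctx checkpoint;  /* keyed state saved once at setup */
} toy_mac;

static void toy_update(toy_md_ctx *c, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        c->state ^= p[i];
        c->state *= 1099511628211ULL;   /* FNV-1a 64-bit prime */
    }
    c->length += len;
}

/* Absorb the key once, then checkpoint the keyed state. */
static void toy_mac_init(toy_mac *m, const void *key, size_t keylen)
{
    assert(m != NULL);
    m->live.state = 14695981039346656037ULL;  /* FNV-1a 64-bit offset basis */
    m->live.length = 0;
    toy_update(&m->live, key, keylen);
    m->checkpoint = m->live;   /* fixed-size struct copy, no malloc */
}

/* MAC one record, then restore the checkpoint for the next record. */
static uint64_t toy_mac_record(toy_mac *m, uint64_t seq,
                               const void *rec, size_t len)
{
    uint64_t tag;

    assert(m != NULL);
    toy_update(&m->live, &seq, sizeof seq);
    toy_update(&m->live, rec, len);
    tag = m->live.state ^ m->live.length;   /* toy finalization */
    m->live = m->checkpoint;   /* restore instead of a fresh deep copy */
    return tag;
}
```

In this sketch each direction would own its own toy_mac, so the live context
is shared state that only one thread at a time may touch. A real digest
context may hold pointers to separately allocated data, in which case a plain
struct copy like this would not be enough.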

One catch (it should not really be a catch) is that at the application level
NO MORE than one thread can work on the SAME SSL/TLS connection for read, and
no more than one for write (one read and one write can still be done at the
same time). But I would think most apps would NEVER allow more than one
thread to read or to write on the same connection (I don't think it would
work if you did that anyway, even without my change).

The patch file I attached is based on version 1.0.0e.

Andrey found a problem in the original version of the patch when a
PKEY_METHS engine is used, so this is an updated patch (complete, not
incremental) that fixes it.
Checkpoint/restore is disabled if a PKEY_METHS engine is used, UNLESS the
engine code implements the control interface to do the checkpoint/restore.

As pointed out by others, there can be other ways to achieve a similar
result, and the saving also depends on your system's memory allocation
routines. Also, part of the patch looks a bit like a hack.

Thanks to Andrey!
Regards,
Michael
checkpoint.patch
Description: Binary data