Andy Polyakov wrote:
> 
> > > But after examining SSL code I've realized that SSL layer *always* feeds
> > > libcrypto with *unaligned* data:-( So I'm considering tweaking the SSL
> > > code so that message payload gets aligned at 64-bit boundary. But before
> > > I do so I'd like to hear opinions on this.
> >
> > Sounds like a good idea, so long as it is optional... why? Coz people
> > keep wanting to cram it into embedded systems where space is at a
> > premium.
> >
> > It ought to be easy to provide an aligned_malloc() function that is
> > optionally just malloc(), right?
> Well, it probably ought to be called unaligned_malloc() then. Let me
> explain it a little bit better. As we know malloc() always returns a
> pointer "suitably aligned for any use." Now let's say I have SSLv3
> connection negotiated. I've allocated a buffer with malloc, it's (at
> least) 64 bit aligned, I read/write chunks of data... Now the point is
> that SSLv3 header is 5 bytes long and the rest is the message to be
> decrypted/encrypted which makes it (pointer argument passed to
> libcrypto.a) unaligned. My idea is to adjust pointer to buffer I
> read/write data to/from so that ptr+5 will be 64-bit aligned instead of
> ptr. Yes, I'll have to pass SSL3_RT_MAX_PACKET_SIZE+3 to malloc(). But
> would it increase memory consumption? Absolutely not because malloc()
> shall *pad* SSL3_RT_MAX_PACKET_SIZE up to the size I'm asking for
> anyway! The only penalties I face here would be those arising from
> copying between unaligned buffer in the user space and aligned buffer in
> the kernel space in read/write system calls. And that's what I'll have
> to weigh against each other. I mean penalties from unaligned<->aligned
> copy and gains from faster encryption/decryption pass. On the other hand
> one unaligned<->aligned copy is very likely to take place already,
> because (let's take receive case) after decryption plain text message
> gets copied to user supplied buffer which is very likely to be aligned,
> right? Now the latter copy becomes aligned<->aligned, i.e. the penalties
> simply get shifted to kernel mode...
> 
> SSLv2 is more tricky as the header is of variable length... And as a
> matter of fact I haven't looked at it very closely yet. As an alternative
> (to looking at it:-) we can simply declare SSLv3 as "high performance
> protocol" and one to be preferred to SSLv2 now even for performance
> reasons:-)

Well, since SSLv2 is deprecated, I wouldn't worry about it. How about a
function like this:

void *aligned_malloc(size_t total_size,size_t unaligned_size);

This returns a pointer, p, to a buffer of size total_size where
p+unaligned_size is aligned.
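
To make that concrete, here is a rough sketch of how such a function
could look. It is only an illustration, not existing OpenSSL code: the
8-byte PAYLOAD_ALIGN constant (the 64-bit boundary discussed above) and
the companion aligned_free() are my assumptions, the latter being needed
because the returned pointer can no longer be handed to free() directly.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed boundary: 64 bits, as in the discussion above. */
#define PAYLOAD_ALIGN 8u

/*
 * Returns p such that p + unaligned_size is PAYLOAD_ALIGN-aligned and
 * total_size bytes starting at p are usable. Release the buffer with
 * aligned_free() (hypothetical helper below), not with free().
 */
void *aligned_malloc(size_t total_size, size_t unaligned_size)
{
    /* Room for worst-case padding plus a saved copy of the raw pointer. */
    size_t extra = sizeof(void *) + PAYLOAD_ALIGN - 1;
    unsigned char *raw, *p;
    uintptr_t payload;

    if (total_size > SIZE_MAX - extra)
        return NULL;
    raw = malloc(total_size + extra);
    if (raw == NULL)
        return NULL;

    /* Earliest usable start leaves space for the saved raw pointer. */
    p = raw + sizeof(void *);

    /* Shift p forward until the payload address hits the boundary. */
    payload = (uintptr_t)p + unaligned_size;
    p += (PAYLOAD_ALIGN - payload % PAYLOAD_ALIGN) % PAYLOAD_ALIGN;

    /* p itself may be unaligned, so store the raw pointer via memcpy. */
    memcpy(p - sizeof(void *), &raw, sizeof raw);
    return p;
}

/* Recovers the pointer malloc() returned and frees it. */
void aligned_free(void *ptr)
{
    void *raw;

    if (ptr == NULL)
        return;
    memcpy(&raw, (unsigned char *)ptr - sizeof(void *), sizeof raw);
    free(raw);
}
```

For the SSLv3 case one would call something like
aligned_malloc(SSL3_RT_MAX_PACKET_SIZE, 5), so that the byte following
the 5-byte record header sits on a 64-bit boundary. Note that this
version spends a few extra bytes on bookkeeping, which a leaner variant
for embedded use might want to avoid.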

BTW, although malloc() returns a pointer "suitably aligned", that
doesn't mean it is fast. The comment refers to the fact that some CPUs
don't like fetching a long (say) except off a 4-byte boundary. In fact
an x86 will fetch them on any alignment, so malloc() is not obliged to
align at all, whereas for performance on an x86 you want to align on a
paragraph boundary (16 bytes?).

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]