>>> Running `make test` with Clang sanitizers results in some issues with
>>> unaligned pointers surrounding some uses of buffers cast to a size_t*.
>>> The sanitizers used were `-fsanitize=undefined -fsanitize=address`.
>> Those are conscious choices, based on the fact that some CPUs, x86_64
>> included, are perfectly capable of tolerating unaligned access, in the
>> sense that the code doesn't crash and produces the correct result. In
>> other words, it's legitimate platform-specific behaviour. As a
>> compromise, it's possible to arrange things so that the build doesn't
>> attempt to utilize this platform capability *if* compiled with
>> -DPEDANTIC. Would that be an acceptable compromise?
> 
> It already seems to be controlled by the STRICT_ALIGNMENT define,
> which looks like:
> #define STRICT_ALIGNMENT 1
> #if defined(__i386)     || defined(__i386__)    || \
>     defined(__x86_64)   || defined(__x86_64__)  || \
>     defined(_M_IX86)    || defined(_M_AMD64)    || defined(_M_X64) || \
>     defined(__aarch64__)                        || \
>     defined(__s390__)   || defined(__s390x__)
> # undef STRICT_ALIGNMENT
> #endif

Yes, and the suggestion is to put !defined(PEDANTIC) over the whole
thing, so that STRICT_ALIGNMENT stays defined when compiling with
-DPEDANTIC.
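
To make the proposal concrete, here is a minimal sketch of how the guard could look. The platform list is the one quoted above; PEDANTIC is the proposed opt-out macro, and strict_alignment_enabled() is a hypothetical helper added only so the selected path can be inspected. It is not part of the actual source.

```c
/* Sketch only: PEDANTIC is the opt-out macro proposed in this thread. */
#define STRICT_ALIGNMENT 1
#if !defined(PEDANTIC)
# if defined(__i386)    || defined(__i386__)   || \
     defined(__x86_64)  || defined(__x86_64__) || \
     defined(_M_IX86)   || defined(_M_AMD64)   || defined(_M_X64) || \
     defined(__aarch64__)                      || \
     defined(__s390__)  || defined(__s390x__)
#  undef STRICT_ALIGNMENT
# endif
#endif

/*
 * Hypothetical helper, for illustration only: returns 1 when the
 * alignment-safe code paths would be compiled in, 0 otherwise.
 */
static int strict_alignment_enabled(void)
{
#if defined(STRICT_ALIGNMENT)
    return 1;
#else
    return 0;
#endif
}
```

Building with -DPEDANTIC would then leave STRICT_ALIGNMENT defined on every platform, which is what a sanitizer run needs.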

> These are the results I get without strict alignment:
> without EVP (and AES-NI):
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128 cbc     109991.09k   121907.05k   124370.43k   125518.43k   125509.63k
> 
> With EVP:
> aes-128-cbc     659929.40k   717046.56k   726172.76k   731055.52k   732257.95k
> 
> And with strict alignment:
> without EVP:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128 cbc     111016.13k   121922.45k   124197.29k   125595.48k   125438.63k
> 
> With EVP:
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-cbc     655557.30k   714415.45k   726278.83k   730977.09k   731893.55k

These are all assembly cases: with or without AES-NI and/or EVP, AES
CBC is handled in assembly in all of them, and cbc128.c is not involved.
But even if cbc128.c were involved, the benefit can only be observed
when input or output is actually misaligned, while openssl speed
benchmarks ... aligned data. So the above results don't tell anything
about the benefit of leaving STRICT_ALIGNMENT undefined. It's usually
around 10%, and indeed, I just measured 12.5% on my computer. [You have
to configure with no-asm and rig apps/speed.c to use misaligned
buffers.]
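
For illustration, a rough sketch of the two kinds of code path being compared, exercised on deliberately misaligned buffers. This is not the cbc128.c code: all names here (BLOCK, aligned_buf, xor_block_bytes, xor_block_words, misaligned_paths_agree) are made up for the example, and the word-wise path uses memcpy instead of a size_t* cast so that the sketch itself stays well-defined C.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 16    /* AES block size in bytes */

/* Backing storage whose bytes[0] is aligned for size_t, so bytes + 1 is not. */
typedef union {
    size_t align;
    unsigned char bytes[BLOCK + sizeof(size_t)];
} aligned_buf;

/* Byte-at-a-time XOR: safe for any alignment (the strict path). */
static void xor_block_bytes(unsigned char *out, const unsigned char *a,
                            const unsigned char *b)
{
    size_t i;

    for (i = 0; i < BLOCK; i++)
        out[i] = a[i] ^ b[i];
}

/* Word-at-a-time XOR; memcpy stands in for the size_t* cast. */
static void xor_block_words(unsigned char *out, const unsigned char *a,
                            const unsigned char *b)
{
    size_t i, x, y;

    for (i = 0; i < BLOCK; i += sizeof(size_t)) {
        memcpy(&x, a + i, sizeof(x));
        memcpy(&y, b + i, sizeof(y));
        x ^= y;
        memcpy(out + i, &x, sizeof(x));
    }
}

/* Returns 1 if both paths give the same result at a 1-byte offset. */
static int misaligned_paths_agree(void)
{
    aligned_buf a, b, r1, r2;
    size_t i;

    for (i = 0; i < sizeof(a.bytes); i++) {
        a.bytes[i] = (unsigned char)(i * 7 + 1);
        b.bytes[i] = (unsigned char)(i * 13 + 5);
    }
    xor_block_bytes(r1.bytes + 1, a.bytes + 1, b.bytes + 1);
    xor_block_words(r2.bytes + 1, a.bytes + 1, b.bytes + 1);
    return memcmp(r1.bytes + 1, r2.bytes + 1, BLOCK) == 0;
}
```

Timing the two functions in a loop over such offset buffers is the kind of comparison the aligned openssl speed buffers cannot show.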


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org
