Hi Grant,
> Hey Emeric,
>
> Thank you very much for the information. Hopefully the s_server + qat issue
> could be addressed soon.
>
> Regards,
>
> Grant
>
>
>
The Intel guys told me that the bug is related to the PRF and asked me to recompile
the engine using '--disable_qat_prf'. With that I can run some tests with the
qat engine, but I'm facing stability issues:
[root@centos haproxy]# /usr/local/ssl/bin/openssl speed -engine qat -elapsed
-async_jobs 8 rsa2048
[WARNING][e_qat.c:1531:bind_qat()] QAT Warnings enabled.
engine "qat" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bit private rsa's for 10s: 13442 2048 bit private RSA's in 10.01s
Doing 2048 bit public rsa's for 10s: 290503 2048 bit public RSA's in 10.00s
OpenSSL 1.1.0e 16 Feb 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS
-DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2
-DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM
-DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM
-DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM
-DOPENSSLDIR="\"/usr/local/ssl/ssl\""
-DENGINESDIR="\"/usr/local/ssl/lib/engines-1.1\"" -Wa,--noexecstack
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000745s 0.000034s   1342.9  29050.3
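
As a sanity check on the summary line, the per-operation times and rates can be recomputed from the raw counts printed by openssl speed. This is only a small sketch: the four input figures are copied from the output above, and the function name is made up for illustration.

```python
# Recompute the "openssl speed" summary line from the raw counts above.
# Input figures are copied from the output; speed_summary is a hypothetical helper.

def speed_summary(ops, seconds):
    """Return (seconds per operation, operations per second)."""
    return seconds / ops, ops / seconds

sign_time, sign_rate = speed_summary(13442, 10.01)       # private RSA 2048
verify_time, verify_rate = speed_summary(290503, 10.00)  # public RSA 2048

print(f"sign   {sign_time:.6f}s  {sign_rate:.1f}/s")
print(f"verify {verify_time:.6f}s  {verify_rate:.1f}/s")
```

The recomputed values match the table, so the ~1343 signs/s figure is the number to keep in mind when looking at the haproxy results.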
A benchmark using haproxy with the qat engine stalls at ~450 connections/sec.
After stopping the injection, the haproxy process keeps consuming CPU while doing
nothing (top shows ~50% of one core, mainly in user).
Here is the trace:
[root@centos ~]# strace -p 27085
Process 27085 attached
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait wakes up once per second, which seems normal.
If I continue to inject while re-using the same key (session resumption, no RSA
computation), I observe ~1500 connections/sec.
But after stopping the injection, the process consumes 156% of CPU while doing
nothing (core 1: 20% in user and 80% in system; core 2: 76% in user).
Here is the trace:
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP,
{u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP,
{u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP,
{u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP,
{u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP,
{u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP,
{u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP,
{u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP,
{u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP,
{u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP,
{u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP,
{u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP,
{u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP,
{u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP,
{u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP,
{u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP,
{u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP,
{u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP,
{u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP,
{u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP,
{u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP,
{u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP,
{u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP,
{u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP,
{u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP,
{u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP,
{u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP,
{u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP,
{u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait returns immediately each time, in a very fast loop.
Once this point is reached, restarting the injection sometimes crashes
haproxy with a segfault.
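
For reference, one way to get this kind of spin is a level-triggered fd that stays readable and is never drained: epoll_wait then reports it on every call instead of sleeping for the timeout. The sketch below reproduces only that generic behaviour with a socketpair (Linux only). Whether this is what actually happens with the fds above is an assumption; the trace does not prove it. The function name and parameters are made up for illustration.

```python
# Sketch: level-triggered epoll keeps reporting an fd that is readable but
# never read. count_wakeups is a hypothetical helper, not haproxy code.
import select
import socket
import time

def count_wakeups(drain, rounds=5, timeout=0.2):
    """Poll a readable socket `rounds` times; return (wakeups, elapsed seconds)."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        b.setblocking(False)
        # Same event mask as in the strace output above.
        ep.register(b.fileno(), select.EPOLLIN | select.EPOLLRDHUP)
        a.send(b"x")  # make b readable
        wakeups = 0
        start = time.monotonic()
        for _ in range(rounds):
            events = ep.poll(timeout)
            if events:
                wakeups += 1
                if drain:
                    b.recv(4096)  # consuming the data clears the readiness
        return wakeups, time.monotonic() - start
    finally:
        ep.close()
        a.close()
        b.close()

if __name__ == "__main__":
    print("not drained:", count_wakeups(drain=False))
    print("drained:    ", count_wakeups(drain=True))
```

When the data is not drained, every poll returns at once, as in the second trace; when it is drained, only the first poll reports an event and the remaining ones sleep for the full timeout, as in the first trace.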
Here is my haproxy config:
global
    tune.ssl.default-dh-param 2048
    ssl-engine qat
    ssl-async

listen gg
    mode http
    bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
    redirect location /
R,
Emeric