Hi,

While evaluating the performance of Intel QuickAssist Technology (QAT) in conjunction with HAProxy, I acquired an Intel C62x chipset card for testing. I configured the QAT engine as follows:

* /etc/qat/c6xx_dev[012].conf

[GENERAL]
ServicesEnabled = cy
ConfigVersion = 2
CyNumConcurrentSymRequests = 512
CyNumConcurrentAsymRequests = 64
statsGeneral = 1
statsDh = 1
statsDrbg = 1
statsDsa = 1
statsEcc = 1
statsKeyGen = 1
statsDc = 1
statsLn = 1
statsPrime = 1
statsRsa = 1
statsSym = 1
KptEnabled = 0
StorageEnabled = 0
PkeServiceDisabled = 0
DcIntermediateBufferSizeInKB = 64

[KERNEL]
NumberCyInstances = 0
NumberDcInstances = 0

[SHIM]
NumberCyInstances = 1
NumberDcInstances = 0
NumProcesses = 16
LimitDevAccess = 0

Cy0Name = "UserCY0"
Cy0IsPolled = 1
Cy0CoreAffinity = 0

OpenSSL produces good results without warnings / errors:

* No QAT involved

$ openssl speed -elapsed rsa2048
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bits private rsa's for 10s: 10858 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 361207 2048 bits public RSA's in 10.00s
OpenSSL 1.1.1a FIPS  20 Nov 2018
built on: Tue Jan 22 20:43:41 2019 UTC
options:bn(64,64) md2(char) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DSYSTEM_CIPHERS_FILE="/opt/openssl/etc/crypto-policies/back-ends/openssl.config"
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000921s 0.000028s   1085.8  36120.7

* QAT enabled

$ openssl speed -elapsed -engine qat -async_jobs 32 rsa2048
engine "qat" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bits private rsa's for 10s: 205425 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 2150270 2048 bits public RSA's in 10.00s
OpenSSL 1.1.1a FIPS  20 Nov 2018
built on: Tue Jan 22 20:43:41 2019 UTC
options:bn(64,64) md2(char) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wa,--noexecstack -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DSYSTEM_CIPHERS_FILE="/opt/openssl/etc/crypto-policies/back-ends/openssl.config"
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000049s 0.000005s  20542.5 215027.0
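
For a quick comparison of the two runs, the per-second figures above can be turned into speedup factors. This is only a small Python sketch of the arithmetic; the dictionary names are made up for illustration and the numbers are copied from the two `openssl speed` summaries above.

```python
# Throughput figures (operations per second) copied from the two runs above.
software = {"sign": 1085.8, "verify": 36120.7}   # no engine
qat = {"sign": 20542.5, "verify": 215027.0}      # -engine qat -async_jobs 32

# Speedup factor of the QAT run over the software-only run, per operation.
speedup = {op: qat[op] / software[op] for op in software}

for op, factor in speedup.items():
    print(f"rsa2048 {op}: {factor:.1f}x")
```

That works out to roughly 19x for RSA 2048 sign and roughly 6x for verify on this machine.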

So far so good. Unfortunately, HAProxy 1.8 with the QAT engine enabled periodically fails SSL health checks of backend servers. The simplest configuration with which I could reproduce it:

* /etc/haproxy/haproxy.cfg

global
    user lbengine
    group lbengine
    daemon
    ssl-mode-async
    ssl-engine qat
    ssl-server-verify none
    stats socket /run/lb_engine/process-1.sock user lbengine group lbengine mode 660 level admin expose-fd listeners process 1

defaults
    mode http
    timeout check 5s
    timeout connect 4s

backend pool_all
    default-server inter 5s

    server server1 ip1:443 check ssl
    server server2 ip2:443 check ssl
    ...
    server serverN ipN:443 check ssl

Without QAT enabled everything works just fine: health checks do not flap. With the QAT engine enabled, health checks of random servers flap: they fail and then recover shortly afterwards, e.g.:

2019-03-06T15:06:22+01:00 localhost hapee-lb[1832]: Server pool_all/server1 is DOWN, reason: Layer6 timeout, check duration: 4000ms. 110 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2019-03-06T15:06:32+01:00 localhost hapee-lb[1832]: Server pool_all/server1 is UP, reason: Layer6 check passed, check duration: 13ms. 117 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
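
To put numbers on the flapping, the DOWN/UP pairs can be extracted from the log. Below is a rough Python sketch, not a polished tool: the regular expression and the `flaps` helper are illustrative names of my own, and the pattern assumes the exact log format of the two lines above (other syslog layouts would need a different pattern).

```python
import re
from datetime import datetime

# The two log lines quoted above, used as sample input.
LOG = """\
2019-03-06T15:06:22+01:00 localhost hapee-lb[1832]: Server pool_all/server1 is DOWN, reason: Layer6 timeout, check duration: 4000ms. 110 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
2019-03-06T15:06:32+01:00 localhost hapee-lb[1832]: Server pool_all/server1 is UP, reason: Layer6 check passed, check duration: 13ms. 117 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
"""

# Assumed layout: timestamp, host, tag, then "Server <backend>/<server> is UP|DOWN".
EVENT = re.compile(
    r"^(\S+) \S+ \S+: Server (\S+) is (UP|DOWN), "
    r"reason: ([^,]+), check duration: (\d+)ms"
)

def flaps(lines):
    """Return (server, down_reason, seconds_down) for each DOWN -> UP pair."""
    down = {}      # server -> (time it went DOWN, reason)
    result = []
    for line in lines:
        m = EVENT.match(line)
        if not m:
            continue  # ignore lines that are not health-check state changes
        ts, server, state, reason, _duration_ms = m.groups()
        when = datetime.fromisoformat(ts)
        if state == "DOWN":
            down[server] = (when, reason)
        elif server in down:
            since, why = down.pop(server)
            result.append((server, why, (when - since).total_seconds()))
    return result

for server, why, seconds in flaps(LOG.splitlines()):
    print(f"{server}: down for {seconds:.0f}s ({why})")
```

Fed with a longer log, this gives one entry per flap, which makes it easier to see whether the failures cluster on particular servers or times.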

Increasing the check frequency (lowering the check interval) makes the problem occur more often. Does anybody have a clue why this is happening? Has anybody seen such behavior?

Regards,

Marcin Deranek
