On 5/7/19 3:35 PM, Marcin Deranek wrote:
> Hi Emeric,
> 
> On 5/7/19 1:53 PM, Emeric Brun wrote:
>> On 5/7/19 1:24 PM, Marcin Deranek wrote:
>>> Hi Emeric,
>>>
>>> On 5/7/19 11:44 AM, Emeric Brun wrote:
>>>> Hi Marcin,
>>>>
>>>>>> As I use HAProxy 1.8 I had to adjust the patch (see 
>>>>>> attachment for end result). Unfortunately after applying the patch there 
>>>>>> is no change in behavior: we still leak /dev/usdm_drv descriptors and have 
>>>>>> "stuck" HAProxy instances after reload.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>>
>>>>
>>>> Could you perform a test after recompiling the usdm_drv and the engine with 
>>>> this patch? It applies on QAT 1.7, but I have no hardware to test this 
>>>> version here.
>>>>
>>>> It should fix the fd leak.
>>>
>>> It did fix the fd leak:
>>>
>>> # ls -al /proc/2565/fd|fgrep dev
>>> lr-x------ 1 root root 64 May  7 13:15 0 -> /dev/null
>>> lrwx------ 1 root root 64 May  7 13:15 7 -> /dev/usdm_drv
>>>
>>> # systemctl reload haproxy.service
>>> # ls -al /proc/2565/fd|fgrep dev
>>> lr-x------ 1 root root 64 May  7 13:15 0 -> /dev/null
>>> lrwx------ 1 root root 64 May  7 13:15 8 -> /dev/usdm_drv
>>>
>>> # systemctl reload haproxy.service
>>> # ls -al /proc/2565/fd|fgrep dev
>>> lr-x------ 1 root root 64 May  7 13:15 0 -> /dev/null
>>> lrwx------ 1 root root 64 May  7 13:15 9 -> /dev/usdm_drv
>>>
>>> But there are still stuck processes :-( This is with both patches included: 
>>> for QAT and HAProxy.
>>> Regards,
>>>
>>> Marcin Deranek
>>
>> Thank you Marcin! In any case, it was also a bug.
>>
>> Could you run a 'show fd' command on a stuck process after applying the 
>> attached patch?
> 
> I did apply this patch and all previous patches (QAT + HAProxy 
> ssl_free_engine). This is what I got after 1st reload:
> 
> show proc
> #<PID>          <type>          <relative PID>  <reloads>       <uptime>
> 8025            master          0               1               0d 00h03m25s
> # workers
> 31269           worker          1               0               0d 00h00m39s
> 31270           worker          2               0               0d 00h00m39s
> 31271           worker          3               0               0d 00h00m39s
> 31272           worker          4               0               0d 00h00m39s
> # old workers
> 9286            worker          [was: 1]        1               0d 00h03m25s
> 9287            worker          [was: 2]        1               0d 00h03m25s
> 9288            worker          [was: 3]        1               0d 00h03m25s
> 9289            worker          [was: 4]        1               0d 00h03m25s
> 
> @!9286 show fd
>      13 : st=0x05(R:PrA W:pra) ev=0x01(heopI) [lc] cache=0 owner=0x23eaae0 
> iocb=0x4877c0(mworker_accept_wrapper) tmask=0x1 umask=0x0
>      16 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x4e1ab0 
> iocb=0x4e1ab0(thread_sync_io_handler) tmask=0xffffffffffffffff umask=0x0
>      20 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1601b840 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>      21 : st=0x22(R:pRa W:pRa) ev=0x00(heopi) [lc] cache=0 owner=0x1f0ec4f0 
> iocb=0x4ce6e0(conn_fd_handler) tmask=0x1 umask=0x0 cflg=0x00241300 fe=GLOBAL 
> mux=PASS mux_ctx=0x22ad8630
>    1412 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bab1f30 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1413 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x247e5bc0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1414 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x18883650 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1415 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x14476c10 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1416 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11a27850 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1418 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x12008230 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1419 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bb0a570 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1420 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11c94790 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1421 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1449e050 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1422 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1f00c150 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1423 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x15f40550 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1424 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x124b6340 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1425 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11fe4500 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1426 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11c70a60 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1427 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x12572540 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1428 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1249a420 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1430 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11b224a0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1431 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x14f668e0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1432 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1448a630 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1433 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x14f32010 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1434 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1588ed80 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1435 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1efb3e50 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1436 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x10f4cc40 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1437 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bac59b0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1439 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x144b1a70 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1440 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1170a380 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1441 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bad93f0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1442 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bb27ca0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1443 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x158233b0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1444 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x124ba940 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1445 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x15f65850 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1446 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1ab4c9e0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1447 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11e2a7b0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1448 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x16923e40 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1449 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x15e156c0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1450 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1585f040 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1451 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11d0c0f0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1452 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bb00860 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1454 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bb1df90 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1455 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11b16850 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1460 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x115ffe30 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1461 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x16936f10 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1462 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x15fbf350 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1463 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1efd1630 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1465 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bacf6d0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1467 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11079580 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1468 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11e425d0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1469 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x144a7d60 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1472 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x23e6c10 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1474 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x158beac0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1476 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1270e190 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1480 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11f10960 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1484 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x124a4b40 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1488 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11b461d0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1490 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x11643280 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1492 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x215945c0 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1499 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x14f68b30 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1500 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x19e59970 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1503 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1fc7b710 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
>    1508 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x16e7cc90 
> iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
> 
> Regards,
> 
> Marcin Deranek


Thank you Marcin. It shows that haproxy is waiting for an event on all those 
fds because crypto jobs were launched on the engine, 
and we can't free a session until its job has finished (doing so would result 
in a segfault).

So the processes are stuck, unable to free the sessions because the engine 
doesn't signal the end of those jobs via the async fd.
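
To quantify this from a 'show fd' dump like the one above, here is a minimal 
sketch in Python that tallies the fds by their iocb handler. The function 
names (count_handlers, fds_for_handler) and the regular expression are my own 
illustration, not part of haproxy. It assumes the entry layout shown in the 
dump above ("<fd> : st=0x.. ... iocb=0x..(<handler>)") and tolerates the line 
wrapping added by mail clients.

```python
import re
from collections import Counter

# Matches one "show fd" entry: the fd number, then the name of its I/O callback.
# re.S lets the match span a line break, since mail clients wrap long lines.
FD_ENTRY = re.compile(
    r"(\d+) : st=0x[0-9a-f]+.*?iocb=0x[0-9a-f]+\((\w+)\)", re.S
)


def count_handlers(show_fd_output):
    """Return a Counter mapping iocb handler name -> number of fds."""
    return Counter(handler for _fd, handler in FD_ENTRY.findall(show_fd_output))


def fds_for_handler(show_fd_output, handler):
    """Return the sorted fd numbers whose iocb is the given handler."""
    return sorted(
        int(fd) for fd, name in FD_ENTRY.findall(show_fd_output) if name == handler
    )


if __name__ == "__main__":
    # Three entries excerpted from the dump above, wrapped as in the mail.
    sample = """\
     13 : st=0x05(R:PrA W:pra) ev=0x01(heopI) [lc] cache=0 owner=0x23eaae0
iocb=0x4877c0(mworker_accept_wrapper) tmask=0x1 umask=0x0
     20 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1601b840
iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
   1412 : st=0x05(R:PrA W:pra) ev=0x00(heopi) [lc] cache=0 owner=0x1bab1f30
iocb=0x4f4d50(ssl_async_fd_free) tmask=0x1 umask=0x0
"""
    print(count_handlers(sample))
    print(fds_for_handler(sample, "ssl_async_fd_free"))
```

Every fd reported under ssl_async_fd_free corresponds to a session that is 
still waiting for the engine to signal the end of its job.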

I didn't reproduce this issue on QAT 1.5, so I will try to discuss it with the 
Intel guys to find out why this behavior changed in v1.7 
and what we can do.

R,
Emeric


