Hi Marcin,

Good, so we're making progress!

I have a test platform here, but it uses qat_contig_mem rather than usdm_drv, 
and I can't reproduce this issue on it (I'm using QAT 1.5, as the doc says to 
use with my chip).

Anyway, could you recompile an haproxy binary if I provide you with a test 
patch?

The idea is to perform a deinit in the master to force those '/dev' files to 
be closed at each reload. Perhaps it won't fix our issue, but this fd leak 
should not happen.
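
To compare the master's descriptors before and after a reload, a small helper 
along these lines may be handy. It is only a sketch, assuming Linux's /proc 
and a POSIX shell; the name count_dev_fds is made up for illustration and is 
not part of haproxy or QAT:

```shell
# Hypothetical helper: count how many times each /dev path is open in a process.
# Usage: count_dev_fds <pid>
count_dev_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip fds that vanished.
        readlink "$fd" 2>/dev/null
    done | grep '^/dev/' | sort | uniq -c
}
```

Running count_dev_fds "$(cat haproxy.pid)" after each reload should show 
whether the /dev/usdm_drv count keeps growing by one, as in your listing.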

R,
Emeric

On 5/3/19 4:21 PM, Marcin Deranek wrote:
> Hi Emeric,
> 
> It looks like on every reload the master leaks a /dev/usdm_drv descriptor:
> 
> # systemctl restart haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x------ 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx------ 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> 
> # systemctl reload haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x------ 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx------ 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> lrwx------ 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv
> 
> # systemctl reload haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x------ 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx------ 1 root root 64 May  3 15:40 10 -> /dev/usdm_drv
> lrwx------ 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> lrwx------ 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv
> 
> Obviously workers do inherit this from the master. Looking at workers I see 
> the following:
> 
> * 1st gen:
> 
> # ls -al /proc/36083/fd|awk '/dev/ {print $NF}'|sort
> /dev/null
> /dev/null
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_dev_processes
> /dev/uio19
> /dev/uio3
> /dev/uio35
> /dev/usdm_drv
> 
> * 2nd gen:
> 
> # ls -al /proc/41637/fd|awk '/dev/ {print $NF}'|sort
> /dev/null
> /dev/null
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_dev_processes
> /dev/uio23
> /dev/uio39
> /dev/uio7
> /dev/usdm_drv
> /dev/usdm_drv
> 
> Looks like only /dev/usdm_drv is leaked.
> 
> Cheers,
> 
> Marcin Deranek
> 
> On 5/3/19 2:22 PM, Emeric Brun wrote:
>> Hi Marcin,
>>
>> On 4/29/19 6:41 PM, Marcin Deranek wrote:
>>> Hi Emeric,
>>>
>>> On 4/29/19 3:42 PM, Emeric Brun wrote:
>>>> Hi Marcin,
>>>>
>>>>>
>>>>>> I also have a contact at Intel who told me to try this option on the qat 
>>>>>> engine:
>>>>>>
>>>>>>> --disable-qat_auto_engine_init_on_fork/--enable-qat_auto_engine_init_on_fork
>>>>>>>        Disable/Enable the engine from being initialized automatically
>>>>>>>        following a fork operation. This is useful in a situation where
>>>>>>>        you want to tightly control how many instances are being used for
>>>>>>>        processes. For instance if an application forks to start a process
>>>>>>>        that does not utilize QAT currently the default behaviour is for
>>>>>>>        the engine to still automatically get started in the child using up
>>>>>>>        an engine instance. After using this flag either the engine needs
>>>>>>>        to be initialized manually using the engine message: INIT_ENGINE
>>>>>>>        or will automatically get initialized on the first QAT crypto
>>>>>>>        operation. The initialization on fork is enabled by default.
>>>>>
>>>>> I tried to build QAT Engine with disabled auto init, but that did not 
>>>>> help. Now I get the following during startup:
>>>>>
>>>>> 2019-04-29T15:13:47.142297+02:00 host1 hapee-lb[16604]: qaeOpenFd:753 Unable to initialize memory file handle /dev/usdm_drv
>>>>> 2019-04-29T15:13:47+02:00 localhost hapee-lb[16611]: 127.0.0.1:60512 [29/Apr/2019:15:13:47.139] vip1/23: SSL handshake failure
>>>>
>>>> " INIT_ENGINE or will automatically get initialized on the first QAT 
>>>> crypto operation"
>>>>
>>>> Perhaps the init happens "with first qat crypto operation" and is delayed 
>>>> until after the fork, so if a chroot is configured, it doesn't allow some 
>>>> accesses to /dev. Could you perform a test in that case without chroot 
>>>> enabled in the haproxy config?
>>>
>>> Removed chroot and now it initializes properly. Unfortunately reload still 
>>> causes "stuck" HAProxy process :-(
>>>
>>> Marcin Deranek
>>
>> Could you check with "ls -l /proc/<masterpid>/fd" if the "/dev/<qatengine>" 
>> is open multiple times after a reload?
>>
>> Emeric
>>

