On 31 Oct 2024, at 12:51, Roi Dayan wrote:

> On 31/10/2024 13:31, Ilya Maximets wrote:
>> On 10/31/24 12:01, Eelco Chaudron wrote:
>>>
>>>
>>> On 30 Oct 2024, at 14:50, Ilya Maximets wrote:
>>>
>>>> Until now, functions that needed to call external programs like openssl
>>>> or ipsec commands were using subprocess commands directly.  Most of
>>>> these calls had no failure checks or any logging making it hard to
>>>> understand what is happening inside the daemon when something doesn't
>>>> work as intended.
>>>>
>>>> Some commands also had a chance to not read the command output in full.
>>>> That might sound like not a big problem, but in practice it causes
>>>> ovs-monitor-ipsec to deadlock pluto and itself with certain versions of
>>>> Libreswan (mainly Libreswan 5+).  The order of events is following:
>>>>
>>>>  1. ovs-monitor-ipsec calls ipsec status redirecting the output
>>>>     to a pipe.
>>>>  2. ipsec status calls ipsec whack.
>>>>  3. ipsec whack connects to pluto and asks for status.
>>>>  4. ovs-monitor-ipsec doesn't read the pipe in full.
>>>>  5. ipsec whack blocks on write to the other side of the pipe
>>>>     when it runs out of buffer space.
>>>>  6. pluto blocks on sendmsg to ipsec whack for the same reason.
>>>>  7. ovs-monitor-ipsec calls another ipsec command and blocks
>>>>     on connection to pluto.
>>>>
>>>> In this scenario the running process is at the mercy of garbage
>>>> collector and it doesn't run because we're blocked on calling another
>>>> ipsec command.  All the processes are completely blocked and will not
>>>> do any work until ipsec whack is killed.
>>>>
>>>> With this change we're introducing a new function that will be used
>>>> for all the external process execution commands and will read the full
>>>> output before returning, avoiding the deadlock.  It will also log all
>>>> the failures as warnings and the commands themselves at the debug level.
>>>>
>>>> We'll be adding more logic into this function in later commits as well,
>>>> so it will not stay that simple.
>>>>
>>>> Signed-off-by: Ilya Maximets <[email protected]>
>>>
>>> Thanks for the series, I did finish reviewing it and did some basic testing.
>>>
>>> Please find comments below and in the successive patches.
>>>
>>> //Eelco
>>>
>>>> ---
>>>>  ipsec/ovs-monitor-ipsec.in | 290 +++++++++++++++++--------------------
>>>>  1 file changed, 131 insertions(+), 159 deletions(-)
>>>>
>>>> diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
>>>> index 37c509ac6..4885e048f 100755
>>>> --- a/ipsec/ovs-monitor-ipsec.in
>>>> +++ b/ipsec/ovs-monitor-ipsec.in
>>>> @@ -84,6 +84,28 @@ monitor = None
>>>>  xfrm = None
>>>>
>>>>
>>>> +def run_command(args, description=None):
>>>> +    """ This function runs the process args[0] with args[1:] arguments
>>>> +    and returns a tuple: return-code, stdout, stderr. """
>>>> +
>>>> +    if not description:
>>>> +        description = args[-1]
>>>
>>> If the description is not given, it makes more sense to show the actual 
>>> command
>>> instead of the last argument.
>>> Or maybe replace the string with “run”, and in the timeout case just print 
>>> the
>>> entire command?
>>
>> We do already print the entire command (proc.args).  I guess what Roi
>> suggested before - printing the first argument - makes sense here, but
>> we'll need to adjust the text of the message.
>>
>> Maybe we can do:
>>
>>     if not description:
>>         description = "run %s command" % args[0]
>>
>> So it will turn into "Failed to run ipsec/openssl command; exit code: ..."
>> And then printing of the whole cmdline.
>>
>> What do you think?  Roi?
>
> yes this looks fine. thanks.

+1
>>
>>>
>>>> +
>>>> +    vlog.dbg("Running %s" % args)
>>>> +    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
>>>> +                            stderr=subprocess.PIPE)
>>>> +    pout, perr = proc.communicate()
>>>> +
>>>> +    if proc.returncode or perr:
>>>> +        vlog.warn("Failed to %s; exit code: %d"
>>>> +                  % (description, proc.returncode))
>>>> +        vlog.warn("cmdline: %s" % proc.args)
>>>> +        vlog.warn("stderr: %s" % perr)
>>>> +        vlog.warn("stdout: %s" % pout)
>>>> +
>>>> +    return proc.returncode, pout.decode(), perr.decode()
>>>> +
>>>> +
>>>
>>> <SNIP>
>>>
>>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to