On 31/10/2024 13:31, Ilya Maximets wrote:
> On 10/31/24 12:01, Eelco Chaudron wrote:
>>
>>
>> On 30 Oct 2024, at 14:50, Ilya Maximets wrote:
>>
>>> Until now, functions that needed to call external programs like openssl
>>> or ipsec commands were using subprocess commands directly.  Most of
>>> these calls had no failure checks or any logging making it hard to
>>> understand what is happening inside the daemon when something doesn't
>>> work as intended.
>>>
>>> Some commands also had a chance to not read the command output in full.
>>> That might sound like not a big problem, but in practice it causes
>>> ovs-monitor-ipsec to deadlock pluto and itself with certain versions of
>>> Libreswan (mainly Libreswan 5+).  The order of events is following:
>>>
>>>  1. ovs-monitor-ipsec calls ipsec status redirecting the output
>>>     to a pipe.
>>>  2. ipsec status calls ipsec whack.
>>>  3. ipsec whack connects to pluto and asks for status.
>>>  4. ovs-monitor-ipsec doesn't read the pipe in full.
>>>  5. ipsec whack blocks on write to the other side of the pipe
>>>     when it runs out of buffer space.
>>>  6. pluto blocks on sendmsg to ipsec whack for the same reason.
>>>  7. ovs-monitor-ipsec calls another ipsec command and blocks
>>>     on connection to pluto.
>>>
>>> In this scenario the running process is at the mercy of garbage
>>> collector and it doesn't run because we're blocked on calling another
>>> ipsec command.  All the processes are completely blocked and will not
>>> do any work until ipsec whack is killed.
>>>
>>> With this change we're introducing a new function that will be used
>>> for all the external process execution commands and will read the full
>>> output before returning, avoiding the deadlock.  It will also log all
>>> the failures as warnings and the commands themselves at the debug level.
>>>
>>> We'll be adding more logic into this function in later commits as well,
>>> so it will not stay that simple.
>>>
>>> Signed-off-by: Ilya Maximets <[email protected]>
>>
>> Thanks for the series, I did finish reviewing it and did some basic testing.
>>
>> Please find comments below and in the successive patches.
>>
>> //Eelco
>>
>>> ---
>>>  ipsec/ovs-monitor-ipsec.in | 290 +++++++++++++++++--------------------
>>>  1 file changed, 131 insertions(+), 159 deletions(-)
>>>
>>> diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
>>> index 37c509ac6..4885e048f 100755
>>> --- a/ipsec/ovs-monitor-ipsec.in
>>> +++ b/ipsec/ovs-monitor-ipsec.in
>>> @@ -84,6 +84,28 @@ monitor = None
>>>  xfrm = None
>>>
>>>
>>> +def run_command(args, description=None):
>>> +    """ This function runs the process args[0] with args[1:] arguments
>>> +    and returns a tuple: return-code, stdout, stderr. """
>>> +
>>> +    if not description:
>>> +        description = args[-1]
>>
>> If the description is not given, it makes more sense to show the actual 
>> command
>> instead of the last argument.
>> Or maybe replace the string with “run”, and in the timeout case just print 
>> the
>> entire command?
> 
> We do already print the entire command (proc.args).  I guess what Roi
> suggested before - printing the first argument - makes sense here, but
> we'll need to adjust the text of the message.
> 
> Maybe we can do:
> 
>     if not description:
>         description = "run %s command" % args[0]
> 
> So it will turn into "Failed to run ipsec/openssl command; exit code: ..."
> And then printing of the whole cmdline.
> 
> What do you think?  Roi?

yes this looks fine. thanks.

> 
>>
>>> +
>>> +    vlog.dbg("Running %s" % args)
>>> +    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
>>> +                            stderr=subprocess.PIPE)
>>> +    pout, perr = proc.communicate()
>>> +
>>> +    if proc.returncode or perr:
>>> +        vlog.warn("Failed to %s; exit code: %d"
>>> +                  % (description, proc.returncode))
>>> +        vlog.warn("cmdline: %s" % proc.args)
>>> +        vlog.warn("stderr: %s" % perr)
>>> +        vlog.warn("stdout: %s" % pout)
>>> +
>>> +    return proc.returncode, pout.decode(), perr.decode()
>>> +
>>> +
>>
>> <SNIP>
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to