On 4/26/23 10:13, Stefan Hoffmann wrote: > On Tue, 2023-04-25 at 18:13 +0200, Ilya Maximets wrote: >> On 4/25/23 15:08, Stefan Hoffmann wrote: >>> On Mon, 2023-04-24 at 16:42 -0500, Terry Wilson wrote: >>>> >>>> On Fri, Apr 21, 2023 at 3:29 AM Stefan Hoffmann >>>> <[email protected]> wrote: >>>>> In some cases ovsdb server or relay gets restarted, ovsdb python >>>>> clients >>>>> may keep the local socket open. Instead of reconnecting a lot of >>>>> failures >>>>> will be logged. >>>>> This can be reproduced with ssl connections to the server/relay and >>>>> restarting it, so it has the same IP after restart. >>>>> >>>>> This patch catches the Exceptions at do_handshake to recreate the >>>>> connection on the client side. >>>>> >>>> >>>> >>>> Is this something we can add some tests for? I know we have some >>>> other tests that involve restarts, but I don't think they involve >>>> python clients or SSL. >>> >>> I can try to find a proper place to add such an test. Can you point me >>> to the correct place to add python clients to that test? >>> But I'm not sure, if we run always into that issue at a restart. >> >> Usually, such tests live somewhere around tests/ovsdb-idl.at. >> There is a basic CHECK_STREAM_OPEN_BLOCK test there, but it >> doesn't check SSL connections, maybe it's worth it to extend >> the test to be able to connect via SSL. This will require >> changing the tests/test-stream.py test script. >> >> On the subject, I'm not sure if the issue can be reliably >> reproduced, especially because it requires OpenSSL 3.0, AFAICT. >> >> I have one question for the code below. >> >>>> >>>>> Reviewed-by: Simon Horman <[email protected]> >>>>> Signed-off-by: Stefan Hoffmann <[email protected]> >>>>> Signed-off-by: Luca Czesla <[email protected]> >>>>> Signed-off-by: Max Lamprecht <[email protected]> >>>>> Co-authored-by: Luca Czesla <[email protected]> >>>>> Co-authored-by: Max Lamprecht <[email protected]> >>>>> --- >>>>> python/ovs/stream.py | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/python/ovs/stream.py b/python/ovs/stream.py >>>>> index ac5b0fd0c..b32341076 100644 >>>>> --- a/python/ovs/stream.py >>>>> +++ b/python/ovs/stream.py >>>>> @@ -824,7 +824,8 @@ class SSLStream(Stream): >>>>> self.socket.do_handshake() >>>>> except ssl.SSLWantReadError: >>>>> return errno.EAGAIN >>>>> - except ssl.SSLSyscallError as e: >>>>> + except (ssl.SSLSyscallError, ssl.SSLZeroReturnError, >>>>> + ssl.SSLEOFError, OSError) as e: >> >> Do we actually need to catch the generic OSError here? >> It looks a bit odd to catch non-ssl exception. > > We got an "Transport endpoint is not connected" sometimes. > It tries getpeername() at a socket that is not connected anymore or > something like that. That results in the OSError. > We also saw, that the file descriptor was still there but not > connected. > > Here the traceback (I hope you can still read that after line wrap): > Traceback (most recent call last): > File "/usr/local/lib/python3.9/site- > packages/ovsdbapp/backend/ovs_idl/connection.py", line 107, in run > self.idl.run() > File "/usr/local/lib/python3.9/site-packages/ovs-3.1.0- > py3.9.egg/ovs/db/idl.py", line 433, in run > self._session.run() > File "/usr/local/lib/python3.9/site-packages/ovs-3.1.0- > py3.9.egg/ovs/jsonrpc.py", line 519, in run > error = self.stream.connect() > File "/usr/local/lib/python3.9/site-packages/ovs-3.1.0- > py3.9.egg/ovs/stream.py", line 824, in connect > self.socket.do_handshake() > File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", > line 312, in do_handshake > return self._call_trampolining( > File "/usr/local/lib/python3.9/site-packages/eventlet/green/ssl.py", > line 158, in _call_trampolining > return func(*a, **kw) > File "/usr/local/lib/python3.9/ssl.py", line 1305, in do_handshake > self._check_connected() > File "/usr/local/lib/python3.9/ssl.py", line 1089, in > _check_connected > self.getpeername() > > OSError: [Errno 107] Transport endpoint is not connected
OK. Yeah, looking through the ssl.py implementation, it can indeed throw a generic OSError. That's unfortunate, but OK, I guess. Thanks for clarification! >> >> Best regards, Ilya Maximets. > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
