Re: [PATCH 3/3] nbd: Use shutdown(SHUT_WR) after last item sent

Daniel P. Berrangé Fri, 27 Mar 2020 10:48:09 -0700

On Fri, Mar 27, 2020 at 12:42:21PM -0500, Eric Blake wrote:
> On 3/27/20 11:35 AM, Daniel P. Berrangé wrote:
> > On Fri, Mar 27, 2020 at 11:19:36AM -0500, Eric Blake wrote:
> > > Although the remote end should always be tolerant of a socket being
> > > arbitrarily closed, there are situations where it is a lot easier if
> > > the remote end can be guaranteed to read EOF even before the socket
> > > has closed.  In particular, when using gnutls, if we fail to inform
> > > the remote end about an impending teardown, the remote end cannot
> > > distinguish between our closing the socket as intended vs. a malicious
> > > intermediary interrupting things, and may result in spurious error
> > > messages.
> > 
> > Does this actually matter in the NBD case ?
> > 
> > It has an explicit NBD command for requesting shutdown, and once
> > that's processed, it is fine to just close the socket abruptly - I
> > don't see a benefit to a TLS shutdown sequence on top.
> 
> You're right that the NBD protocol has ways for the client to advertise it
> will be shutting down, AND documents that the server must be robust to
> clients that just abruptly disconnect after that point.  But we don't have
> control over all such servers, and there may very well be a server that logs
> an error on abrupt closure, where it would be silent if we did a proper
> gnutls_bye.  Which is more important: maximum speed in disconnecting after
> we expressed intent, or maximum attempt at catering to all sorts of remote
> implementations that might not be as tolerant as qemu is of an abrupt
> termination?


It is the cost / benefit tradeoff here that matters. Correctly using
gnutls_bye() in contexts which aren't expected to block is non-trivial,
bringing notable extra code complexity. It isn't an obvious win to me
for something that just changes an error message for a scenario that
can already be cleanly handled at the application protocol level.
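
As a rough sketch of what "handled at the application protocol level"
means here (Python for brevity; ConnectionTracker, note_request and
classify_eof are made-up names for illustration, not qemu or libnbd
code): once NBD_OPT_ABORT or NBD_CMD_DISC has been sent, an EOF is
treated as expected, and any earlier EOF is treated as unexpected,
without consulting the TLS layer at all.

```python
from enum import Enum


class Eof(Enum):
    EXPECTED = "expected"
    UNEXPECTED = "unexpected"


# Requests named in this thread after which the peer may simply
# drop the connection.
DISCONNECT_REQUESTS = {"NBD_OPT_ABORT", "NBD_CMD_DISC"}


class ConnectionTracker:
    """Hypothetical helper: remembers whether a disconnect was requested."""

    def __init__(self) -> None:
        self.disconnect_requested = False

    def note_request(self, name: str) -> None:
        # Record the request; only the disconnect requests change state.
        if name in DISCONNECT_REQUESTS:
            self.disconnect_requested = True

    def classify_eof(self) -> Eof:
        # The protocol state alone decides; no TLS close_notify is checked.
        if self.disconnect_requested:
            return Eof.EXPECTED
        return Eof.UNEXPECTED
```

With this, an EOF seen after note_request("NBD_CMD_DISC") classifies as
Eof.EXPECTED, while an EOF in the middle of ordinary I/O classifies as
Eof.UNEXPECTED and can be logged as an error.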

> 
> > AFAIK, the TLS level clean shutdown is only required if the
> > application protocol does not have any way to determine an
> > unexpected shutdown itself.
> 
> 'man gnutls_bye' states:
> 
>        Note that not all implementations will properly terminate a TLS
>        connection.  Some of them, usually for performance reasons, will
>        terminate only the underlying transport layer, and thus not
>        distinguishing between a malicious party prematurely terminating
>        the connection and normal termination.
> 
> You're right that because the protocol has an explicit message, we can
> reliably distinguish any early termination prior to
> NBD_OPT_ABORT/NBD_CMD_DISC as being malicious, so the only case where it
> matters is if we have a premature termination after we asked for clean
> shutdown, at which point a malicious termination didn't lose any data. So on
> that front, I guess you are right that not using gnutls_bye isn't going to
> have much impact.
> 
> > 
> > This is relevant for HTTP where the connection data stream may not
> > have a well defined end condition.
> > 
> > In the NBD case though, we have an explicit NBD_CMD_DISC to trigger
> > the disconnect. After processing that message, an EOF is acceptable
> > regardless of whether a TLS-level shutdown was performed;
> > before processing that message, any EOF is unexpected.
> > 
> > >            Or, we can end up with a deadlock where both ends are stuck
> > > on a read() from the other end but neither gets an EOF.
> > 
> > If the socket has been closed abruptly why would it get stuck in
> > read() - it should see EOF surely ?
> 
> That's what I'm trying to figure out: the nbdkit testsuite definitely hung
> even though 'qemu-nbd --list' exited, but I haven't yet figured out whether
> the bug lies in nbdkit proper or in libnbd, nor whether a cleaner tls
> shutdown would have prevented the hang in a more reliable manner.
> https://www.redhat.com/archives/libguestfs/2020-March/msg00191.html
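
For reference, the plain-socket behaviour being assumed in the question
about read() seeing EOF can be checked with a small sketch (Python for
brevity; socketpair() stands in for a real TCP connection and
half_close_demo is a made-up name). It only shows that
shutdown(SHUT_WR) delivers EOF to the peer while the other direction
stays usable; it says nothing about TLS or about where the nbdkit hang
comes from.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Collect data until recv() returns b"", i.e. the peer's EOF."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def half_close_demo():
    a, b = socket.socketpair()
    try:
        a.sendall(b"last item")
        a.shutdown(socket.SHUT_WR)   # a will send nothing more
        received = read_until_eof(b)  # b sees the data, then EOF

        # The b -> a direction is still open after a's half-close.
        b.sendall(b"reply")
        b.shutdown(socket.SHUT_WR)
        reply = read_until_eof(a)
        return received, reply
    finally:
        a.close()
        b.close()
```

Neither read_until_eof() call blocks forever here, because each side
gets an EOF as soon as the other side shuts down its write direction.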


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
