According to the NBD spec, the server should close the connection rather than replying to NBD_CMD_DISC (after flushing any pending replies to earlier commands), and a compliant client should not send any data after that command. However, when nbdkit is running multithreaded, we already have multiple threads competing for a read lock, and each of those threads would try to read() from the socket, which will never make progress unless the client hangs up so that the read fails with EOF.
But if the client waits to close its end until after seeing EOF from the server (as was the case with the nbd plugin as the client until the previous patch), then both sides are deadlocked on a read. We already short-circuit a read attempt when the socket is closed; we should likewise avoid a read attempt after the client has requested a disconnect, so that we aren't at the mercy of the client properly using shutdown(). Note that this patch in isolation without the previous patch to the nbd plugin is sufficient to make the testsuite deadlock go away. Signed-off-by: Eric Blake <[email protected]> --- src/connections.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/connections.c b/src/connections.c index 46f2cd4..6c1f8cd 100644 --- a/src/connections.c +++ b/src/connections.c @@ -1056,8 +1056,9 @@ recv_request_send_reply (struct connection *conn) /* Read the request packet. */ { ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->read_lock); - if (get_status (conn) < 0) - return -1; + r = get_status (conn); + if (r <= 0) + return r; r = conn->recv (conn, &request, sizeof request); if (r == -1) { nbdkit_error ("read request: %m"); -- 2.14.3 _______________________________________________ Libguestfs mailing list [email protected] https://www.redhat.com/mailman/listinfo/libguestfs
