25.03.2019 17:44, Eric Blake wrote: > On 3/25/19 5:12 AM, Vladimir Sementsov-Ogievskiy wrote: >> 23.03.2019 17:24, Eric Blake wrote: >>> The NBD spec is clear that when structured replies are active, a >>> simple error reply is acceptable to any command except for >>> NBD_CMD_READ. However, we were mistakenly requiring structured errors >>> for NBD_CMD_BLOCK_STATUS, and hanging up on a server that gave a >>> simple error (since qemu does not behave as such a server, we didn't >>> notice the problem until now). Broken since its introduction in >>> commit 78a33ab5 (v2.12). >>> >>> Howeve, even if we had gotten it correct to accept simple errors back >>> then, we run into another problem: the client treats the server's >>> reply as fatal and hangs up on the connection, instead of merely >>> failing the block status request but being ready for the next >>> command. Broken in commit 7f86068d (unreleased). >>> >>> Signed-off-by: Eric Blake <ebl...@redhat.com> >>> --- >>> > >>> +++ b/block/nbd-client.c >>> @@ -718,9 +718,7 @@ static int >>> nbd_co_receive_blockstatus_reply(NBDClientSession *s, >>> bool received = false; >>> >>> assert(!extent->length); >>> - NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply, >>> - NULL, &reply, &payload) >>> - { >>> + NBD_FOREACH_REPLY_CHUNK(s, iter, handle, false, NULL, &reply, >>> &payload) { > > This part changes things to permit a simple reply if that reply is an > error. In turn, if the server sends a simple reply error, > iter.request_ret is set to the error returned by the server while > leaving iter.ret and iter.err unset (because we successfully got the > server's reply and can keep the connection alive), with commit 7f86068d > in play (prior to that commit, iter.ret was left 0 because the server > replied successfully, while iter.err was set to the set to the server's > error, while leaving iter.fatal unset). > >>> int ret; >>> NBDStructuredReplyChunk *chunk = &reply.structured; >>> >>> @@ -758,9 +756,11 @@ static int >>> nbd_co_receive_blockstatus_reply(NBDClientSession *s, >>> payload = NULL; >>> } >>> >>> - if (!extent->length && !iter.err) { >>> - error_setg(&iter.err, >>> - "Server did not reply with any status extents"); >>> + if (!extent->length && !iter.request_ret) { >> >> Hmm, I don't see, what is changed. You just don't set ret and err if there >> already >> set request_ret.. So, finally it may lead to other return code in >> nbd_client_co_block_status >> and local_err will not be set and traced. But there is no connection close >> in this case >> in current code, s->quit is not set. > > And that's okay. The whole point of this is that if the server replied > with an error message (iter.request_ret), then it is okay that the > server did not give us any extent->length, and the caller will turn > iter.request_ret into the EINVAL reply (or whatever other errno value > the server sent) to .bdrv_co_block_status() while keeping the connection > alive. But if the server did NOT send an error reply (iter.request_ret > is 0), but also did not send us any status, then we need to turn that > into an error condition (because our caller expected us to make progress > on success, even though the server did not make progress).
I just say, that I don't see "the client treats the server's reply as fatal and hangs up on the connection" in this place before your change. Or what you mean? > >> >>> + if (!iter.err) { >>> + error_setg(&iter.err, >>> + "Server did not reply with any status extents"); >>> + } >>> if (!iter.ret) { >>> iter.ret = -EIO; >>> } >>> >> >> > -- Best regards, Vladimir