Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
On 03/12/2018 08:13 AM, Vladimir Sementsov-Ogievskiy wrote: We should probably validate that the length field is a multiple of min_block (if a server tells us that all operations must be 512-byte aligned, then reports an extent that is smaller than 512 bytes, we have no way to ask for the status of the second half of the sector). Probably also something that needs to be explicitly stated in the NBD spec. [1] related question: what server should reply on zero-length request? I'll add + if (!bytes) { + *pnum = 0; + return 0; + } to nbd_client_co_block_status, to prevent such situation, but looks like spec lacks the information. nbd.git commit ee926037 mentions that NBD_CMD_READ, _WRITE, _TRIM, and _WRITE_ZEROES have unspecified behavior for zero-length transactions; we should do the same for NBD_CMD_BLOCK_STATUS. But in the meantime, handling it gracefully with a no-op reply (the way qemu.git commit ef8c887e handles 0-length structured read) is fine. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
16.02.2018 23:40, Eric Blake wrote: On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- [...] Instead of doing a memcpy() and then in-place bit-swizzling, you could do the swapping as part of assignment, for one less function call (and make the code a bit easier to extend, if we later drop our REQ_ONE limitation on only having one extent, because you'll advance payload as needed): extent->length = payload_advance32(&payload); extent->flags = payload_advance32(&payload); We should probably validate that the length field is a multiple of min_block (if a server tells us that all operations must be 512-byte aligned, then reports an extent that is smaller than 512 bytes, we have no way to ask for the status of the second half of the sector). Probably also something that needs to be explicitly stated in the NBD spec. [1] related question: what server should reply on zero-length request? I'll add + if (!bytes) { + *pnum = 0; + return 0; + } to nbd_client_co_block_status, to prevent such situation, but looks like spec lacks the information. -- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
16.02.2018 23:40, Eric Blake wrote: On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- [...] void nbd_client_detach_aio_context(BlockDriverState *bs) { NBDClientSession *client = nbd_get_client_session(bs); @@ -828,6 +966,7 @@ int nbd_client_init(BlockDriverState *bs, client->info.request_sizes = true; client->info.structured_reply = true; + client->info.base_allocation = true; Hmm - the combination structured_reply = false and base_allocation = true is bogus. Then again, these two fields are in-out; left false when handing over to the kernel NBD transmission phase (since the kernel module does not yet support structured replies let alone block status), and true when requested with qemu as the transmission driver (since we want to use it if available). I don't know if having a single tri-state enum is any better than two bools (on input, it is either all-or-none; on output, it is either none (old server), structured reads only (qemu 2.11 server, for example), or all (this series' server). ohh, sorry for replying in so many emails. about this: I'd prefer to leave it as is, and rethink (if needed) when implementing dirty-bitmaps context, because it's now obvious now, how it will be refactored for dirty bitmaps. -- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
16.02.2018 23:40, Eric Blake wrote: On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- [...] +++ b/nbd/client.c @@ -594,6 +594,108 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, return QIO_CHANNEL(tioc); } +/* nbd_negotiate_simple_meta_context: + * Set one meta context. Simple means that reply must contain zero (not + * negotiated) or one (negotiated) contexts. More contexts would be considered + * as a protocol error. + * return 1 for successful negotiation, context_id is set + * 0 if operation is unsupported, + * -errno with errp set for any other error + */ Good enough for our first use. Will obviously need improvements if we support base:allocation AND dirty bitmap exposure at the same time, in future patches ;) +static int nbd_negotiate_simple_meta_context(QIOChannel *ioc, + const char *export, + const char *context, + uint32_t *context_id, + Error **errp) +{ + int ret; + NBDOptionReply reply; + uint32_t received_id; + bool received; + size_t export_len = strlen(export); + size_t context_len = strlen(context); + size_t data_len = 4 + export_len + 4 + 4 + context_len; [...] + + if (nbd_read(ioc, &received_id, sizeof(received_id), errp) < 0) { + return -EIO; + } + be32_to_cpus(&received_id); + + len = reply.length - sizeof(received_id); + name = g_malloc(len + 1); + if (nbd_read(ioc, name, len, errp) < 0) { + g_free(name); + return -EIO; + } + name[len] = '\0'; + if (strcmp(context, name)) { + error_setg(errp, "Failed to negotiate meta context '%s', server " + "answered with different context '%s'", context, + name); This check may not be valid for other context namespaces, but is correct for "base:allocation". so, it is negotiation of "simple meta context". I'll improve somehow comment about the functions... -- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
16.02.2018 23:40, Eric Blake wrote: On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- [...] + static int nbd_co_request(BlockDriverState *bs, NBDRequest *request, QEMUIOVector *write_qiov) { @@ -784,6 +878,50 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) return nbd_co_request(bs, &request, NULL); } +int64_t coroutine_fn nbd_client_co_get_block_status(BlockDriverState *bs, + int64_t sector_num, + int nb_sectors, int *pnum, + BlockDriverState **file) Needs rebasing on top of Kevin's block branch to use the byte-based interface. I also need to finish up my promised followups on that series, as NBD (and other protocol drivers) should have consistent behavior on what it means to report OFFSET_VALID (or whether that should be limited to just format/filter drivers). Looks like it's already in master, so I should just rebase on master.-- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
16.02.2018 23:40, Eric Blake wrote: On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. [...] + memcpy(extent, payload, sizeof(*extent)); + be32_to_cpus(&extent->length); + be32_to_cpus(&extent->flags); Instead of doing a memcpy() and then in-place bit-swizzling, you could do the swapping as part of assignment, for one less function call (and make the code a bit easier to extend, if we later drop our REQ_ONE limitation on only having one extent, because you'll advance payload as needed): extent->length = payload_advance32(&payload); extent->flags = payload_advance32(&payload); Aha, yes. The funny thing is that these are my own helpers. We should probably validate that the length field is a multiple of min_block (if a server tells us that all operations must be 512-byte aligned, then reports an extent that is smaller than 512 bytes, we have no way to ask for the status of the second half of the sector). Probably also something that needs to be explicitly stated in the NBD spec. [1] + + if (extent->length > orig_length) { + error_setg(errp, "Protocol error: server sent chunk exceeding requested" + " region"); + return -EINVAL; That matches the current spec wording, but I'm not sure I agree with it - what's wrong with a server providing a final extent that extends beyond the request, if the information was already available for free (the classic example: if the server never replies with HOLE or ZERO, then the entire file has the same status, so all requests could trivially be replied to by taking the starting offset to the end of the file as the returned length, rather than just clamping at the requested length). Maybe. But our already released clients not prepared to such change =( But, on the other hand, this gives us possibility to understand, the the whole target (for backup/mirror) is zero in one request, skipping target-zeroing loop by 4gb chunks.. What about adding such possibility with an additionally negotiated option or something like this? (and don't we now have same possibility with something like INFO?) + } + + return 0; +} + /* nbd_parse_error_payload * on success @errp contains message describing nbd error reply */ -- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
On 02/15/2018 07:51 AM, Vladimir Sementsov-Ogievskiy wrote: Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- +++ b/block/nbd-client.c @@ -228,6 +228,45 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk, return 0; } +/* nbd_parse_blockstatus_payload + * support only one extent in reply and only for + * base:allocation context Reasonable, since the server should not be sending us contexts we did not ask for during negotiation. + */ +static int nbd_parse_blockstatus_payload(NBDClientSession *client, + NBDStructuredReplyChunk *chunk, + uint8_t *payload, uint64_t orig_length, + NBDExtent *extent, Error **errp) +{ +uint32_t context_id; + +if (chunk->length != sizeof(context_id) + sizeof(extent)) { Okay as long as we use REQ_ONE to ensure exactly one extent per chunk. +error_setg(errp, "Protocol error: invalid payload for " + "NBD_REPLY_TYPE_BLOCK_STATUS"); +return -EINVAL; +} + +context_id = payload_advance32(&payload); +if (client->info.meta_base_allocation_id != context_id) { +error_setg(errp, "Protocol error: unexpected context id: %d for " + "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context " + "id is %d", context_id, + client->info.meta_base_allocation_id); +return -EINVAL; +} + +memcpy(extent, payload, sizeof(*extent)); +be32_to_cpus(&extent->length); +be32_to_cpus(&extent->flags); Instead of doing a memcpy() and then in-place bit-swizzling, you could do the swapping as part of assignment, for one less function call (and make the code a bit easier to extend, if we later drop our REQ_ONE limitation on only having one extent, because you'll advance payload as needed): extent->length = payload_advance32(&payload); extent->flags = payload_advance32(&payload); We should probably validate that the length field is a multiple of min_block (if a server tells us that all operations must be 512-byte aligned, then reports an extent that is smaller than 512 bytes, we have no way to ask for the status of the second half of the sector). Probably also something that needs to be explicitly stated in the NBD spec. [1] + +if (extent->length > orig_length) { +error_setg(errp, "Protocol error: server sent chunk exceeding requested" + " region"); +return -EINVAL; That matches the current spec wording, but I'm not sure I agree with it - what's wrong with a server providing a final extent that extends beyond the request, if the information was already available for free (the classic example: if the server never replies with HOLE or ZERO, then the entire file has the same status, so all requests could trivially be replied to by taking the starting offset to the end of the file as the returned length, rather than just clamping at the requested length). +} + +return 0; +} + /* nbd_parse_error_payload * on success @errp contains message describing nbd error reply */ @@ -642,6 +681,61 @@ static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle, return iter.ret; } +static int nbd_co_receive_blockstatus_reply(NBDClientSession *s, +uint64_t handle, uint64_t length, +NBDExtent *extent, Error **errp) +{ +NBDReplyChunkIter iter; +NBDReply reply; +void *payload = NULL; +Error *local_err = NULL; +bool received = false; + +NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply, +NULL, &reply, &payload) +{ +int ret; +NBDStructuredReplyChunk *chunk = &reply.structured; + +assert(nbd_reply_is_structured(&reply)); + +switch (chunk->type) { +case NBD_REPLY_TYPE_BLOCK_STATUS: Depending on the outcome of the discussion on the NBD list, here's where you could make a client easily listen to both your initial server (that sent 5) and a server compliant to the current spec wording (where this constant is 3); although it remains to be seen if that's the only difference between your initial implementation and the NBD spec wording that gets promoted to stable. +if (received) { +s->quit = true; +error_setg(&local_err, "Several BLOCK_STATUS chunks in reply"); +nbd_iter_error(&iter, true, -EINVAL, &local_err); +} We may change this in the future to ask for more than one
[Qemu-devel] [PATCH 6/9] nbd: BLOCK_STATUS for standard get_block_status function: client part
Minimal realization: only one extent in server answer is supported. Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior. Tests 140, 147 and 205 are fixed due to now server failed on searching export in context of NBD_OPT_SET_META_CONTEXT option negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.h | 5 ++ include/block/nbd.h| 3 + block/nbd-client.c | 139 + block/nbd.c| 3 + nbd/client.c | 114 + tests/qemu-iotests/140.out | 2 +- tests/qemu-iotests/143.out | 2 +- tests/qemu-iotests/205 | 3 +- 8 files changed, 268 insertions(+), 3 deletions(-) diff --git a/block/nbd-client.h b/block/nbd-client.h index 612c4c21a0..ca0cc141c0 100644 --- a/block/nbd-client.h +++ b/block/nbd-client.h @@ -61,4 +61,9 @@ void nbd_client_detach_aio_context(BlockDriverState *bs); void nbd_client_attach_aio_context(BlockDriverState *bs, AioContext *new_context); +int64_t coroutine_fn nbd_client_co_get_block_status(BlockDriverState *bs, +int64_t sector_num, +int nb_sectors, int *pnum, +BlockDriverState **file); + #endif /* NBD_CLIENT_H */ diff --git a/include/block/nbd.h b/include/block/nbd.h index b16215d17a..baf12e5428 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -262,6 +262,7 @@ struct NBDExportInfo { /* In-out fields, set by client before nbd_receive_negotiate() and * updated by server results during nbd_receive_negotiate() */ bool structured_reply; +bool base_allocation; /* base:allocation context for NBD_CMD_BLOCK_STATUS */ /* Set by server results during nbd_receive_negotiate() */ uint64_t size; @@ -269,6 +270,8 @@ struct NBDExportInfo { uint32_t min_block; uint32_t opt_block; uint32_t max_block; + +uint32_t meta_base_allocation_id; }; typedef struct NBDExportInfo NBDExportInfo; diff --git a/block/nbd-client.c b/block/nbd-client.c index b1cbe95b13..a80d69d3cd 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -228,6 +228,45 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk, return 0; } +/* nbd_parse_blockstatus_payload + * support only one extent in reply and only for + * base:allocation context + */ +static int nbd_parse_blockstatus_payload(NBDClientSession *client, + NBDStructuredReplyChunk *chunk, + uint8_t *payload, uint64_t orig_length, + NBDExtent *extent, Error **errp) +{ +uint32_t context_id; + +if (chunk->length != sizeof(context_id) + sizeof(extent)) { +error_setg(errp, "Protocol error: invalid payload for " + "NBD_REPLY_TYPE_BLOCK_STATUS"); +return -EINVAL; +} + +context_id = payload_advance32(&payload); +if (client->info.meta_base_allocation_id != context_id) { +error_setg(errp, "Protocol error: unexpected context id: %d for " + "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context " + "id is %d", context_id, + client->info.meta_base_allocation_id); +return -EINVAL; +} + +memcpy(extent, payload, sizeof(*extent)); +be32_to_cpus(&extent->length); +be32_to_cpus(&extent->flags); + +if (extent->length > orig_length) { +error_setg(errp, "Protocol error: server sent chunk exceeding requested" + " region"); +return -EINVAL; +} + +return 0; +} + /* nbd_parse_error_payload * on success @errp contains message describing nbd error reply */ @@ -642,6 +681,61 @@ static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle, return iter.ret; } +static int nbd_co_receive_blockstatus_reply(NBDClientSession *s, +uint64_t handle, uint64_t length, +NBDExtent *extent, Error **errp) +{ +NBDReplyChunkIter iter; +NBDReply reply; +void *payload = NULL; +Error *local_err = NULL; +bool received = false; + +NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply, +NULL, &reply, &payload) +{ +int ret; +NBDStructuredReplyChunk *chunk = &reply.structured; + +assert(nbd_reply_is_structured(&reply)); + +switch (chunk->type) { +case NBD_REPLY_TYPE_BLOCK_STATUS: +if (received) { +s->quit = true; +error_setg(&local_err, "Several BLOCK_STATUS chunks in reply"); +nbd_iter_error(&iter, true, -EINVAL, &local_err); +} +received =