Re: [Libguestfs] [PATCH libnbd 0/4] Miscellaneous Rust cleanups
On Tue, Oct 10, 2023 at 03:06:06PM +0100, Richard W.M. Jones wrote: > Add an overview libnbd-rust(3) man page pointing to the real > documentation. This is like OCaml & Golang. > > When reviewing the real rustdocs I noticed they basically converted > the man pages into plain text, resulting in lots of problems such as > internal links not working, no `code` annotations, etc. So I wrote a > simple POD to rustdoc translator. It is by no means perfect, but it > fixes many of the issues. > > Also build the examples, to make sure we don't get any regressions. For the series: Reviewed-by: Eric Blake There is an XXX about displying links rather than just italicized texts for pod L<> constructs linking to non-libnbd pages, which can be solved later, but your approach for now is definitely better than plain text. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH 2/2] info: Show human sizes for block_size values
On Sun, Oct 08, 2023 at 11:34:05AM +0200, Laszlo Ersek wrote: > On 10/7/23 11:27, Richard W.M. Jones wrote: > > On Fri, Oct 06, 2023 at 10:18:09AM -0500, Eric Blake wrote: > >> Adding a human-readable size for block constraints is useful. For: > >> > >> $ ./run nbdinfo -- [ nbdkit memory \ > >>--filter=blocksize-policy blocksize-preferred=32k 1M ] | grep pref > >> > >> this changes pre-patch: > >>block_size_preferred: 32768 > >> to post-patch: > >>block_size_preferred: 32768 (32K) > > > > I think info/nbdinfo.pod needs to be updated. > > Good catch! > > With that: > > series > Reviewed-by: Laszlo Ersek Fixed, and series pushed as 42a85181..d4e8bb2d -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [nbdkit] Two POSIX questions for you ...
On Mon, Oct 09, 2023 at 01:32:07PM +0200, Laszlo Ersek wrote: > >> Shell arrays are > >> a bash-specific feature, and the extent array is deeply ingrained. See > >> especially commit df63b23b6280 ("log: Use strict shell quoting for every > >> parameter displayed in the log file.", 2021-01-04). In particular the > >> assignment > >> > >> extents=(0x0 0x8000 "hole,zero" 0x8000 0x8000 "") > >> > >> turns "extents" into an array variable. > >> > >> In the bug tracker, comment > >> <https://aur.archlinux.org/packages/nbdkit#comment-937375> says, "Thus > >> this is an upstream issue; their scripts are calling sh when they should > >> be calling bash". I think that's correct; for logscript=..., we should > >> require /bin/bash in the manual, and execute the script with /bin/bash > >> explicitly, not just system(). > > > > This would create a runtime dependency from nbdkit to /bin/bash which > > I'd like to avoid. > > The runtime dependency is already there in our logscript interface; the > > name=(a b c ... z) > > syntax is already bash-only, for defining a shell array. The man page for nbdkit-log-filter does not mention extents=() anywhere, and exports=("") is only shown as a single sample log output, without mention in the LOG SCRIPT section. I checked both 1.34.4 I have pre-installed on Fedora 38, and filters/log/nbdkit-log-filter.pod at the top of development. Merely stating "Strings and lists are shell-quoted." or that "Other parameters like C are turned into shell variables C<$offset> etc." does not properly describe which items passed to the logscript will actually be lists under which exported names, whether or not we choose to stick to bash-only syntax. So we already have a bug that our documentation is incomplete; and therefore, if we change what we output, we can also use that as a chance to rectify our documentation bug to what we really want to support. > > So the question is basically how to best emulate an array in the POSIX > shell. Some (rough) options that occur to me: > > - Use named variables such as name_0, name_1, name_2, ... and so on. > Requires eval tricks, and if the array is large, it creates many > variables, which some shells (?) may have issues with. If we're going to run out of memory from having too many strings to pass through all extents information through the environ, we will do so whether we do it as a single array variable or as multiple individual variables. > > - Generate a shell function with a huge case statement; like "get_name > 0" should print "a", "get_name 1" should print "b", etc. The caller > would then do > > element=$(get_name $idx) We'd also need to define something like $extents_max=N so that the script can get correct bounds for the loop it writes around the $(get_name $idx) calls. > > - write the elements of the array to a text file (one, quoted, element > per line), and then use a combination of "tail" and "head" for fetching > the right line. Incredibly slow, of course. > > ... I'm sure stackoverflow has further / better ideas for emulating > arrays in the POSIX shell. > > And then, because keeping the current (fast, but nonportable) solution > would be nice, we should probably add a new argument for the log filter, > "compat" or "posix" or some such, which would select the more restricted > interface. Yes, having a way to fine-tune what we output for the scripts consumption would be nice, and perhaps extensible (JSON output, anyone?). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [nbdkit] Two POSIX questions for you ...
e: ./nbdkit -vf data data='1 2 3' data='--run' data='nbdinfo $uri' which is not subject to option reordering regardless of POSIXLY_CORRECT. > > That's annoying :-( Yes, if we want option reordering in spite of POSIXLY_CORRECT possibly leaking through the environment, we have quite a bit of work to do. Even if we don't want option reordering, we probably have quite a few test cases that need to be rewritten to list --run before the plugin name. I'd actually lean towards allowing option reordering in spite of POSIXLY_CORRECT (it makes for nicer command lines, and we already limit the set of non-options we accept because we will then hand them off to the .config plugin callback as name=value pairs, possibly with the magic name). But I'm not sure how invasive it will be to write such a patch. On the other hand, it should be easy enough to tweak our CI engine to have at least one platform with POSIXLY_CORRECT=1 set in the environment to make sure we still pass, if that's what we want. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 0/2] Improve nbdinfo display of block constraints
Based on Laszlo's approval of my idea here: https://listman.redhat.com/archives/libguestfs/2023-September/032661.html but as I would like to resync human-size.h back to nbdkit, I'm reluctant to apply patch 1 this until I get Rich's consent to relicensing (this email serves as my consent for my contribution here): https://listman.redhat.com/archives/libguestfs/2023-October/032755.html Eric Blake (2): utils: Slightly simplify human_size() info: Show human sizes for block_size values common/include/human-size.h | 14 ++ info/show.c | 26 +++--- 2 files changed, 25 insertions(+), 15 deletions(-) -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 1/2] utils: Slightly simplify human_size()
Use an array of characters instead of strings for less .data storage. Merge the loop conditional for fewer lines of code. Signed-off-by: Eric Blake --- common/include/human-size.h | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/common/include/human-size.h b/common/include/human-size.h index 47729c3c..8b1e0132 100644 --- a/common/include/human-size.h +++ b/common/include/human-size.h @@ -159,7 +159,7 @@ human_size_parse (const char *str, static inline char * human_size (char *buf, uint64_t bytes, bool *human) { - static const char ext[][2] = { "E", "P", "T", "G", "M", "K", "" }; + static const char ext[] = "EPTGMK"; size_t i; if (buf == NULL) { @@ -170,18 +170,16 @@ human_size (char *buf, uint64_t bytes, bool *human) /* Work out which extension to use, if any. */ i = 6; - if (bytes != 0) { -while ((bytes & 1023) == 0) { - bytes >>= 10; - i--; -} + while (bytes && (bytes & 1023) == 0) { +bytes >>= 10; +i--; } /* Set the flag to true if we're going to add a human-readable extension. */ if (human) -*human = ext[i][0] != '\0'; +*human = ext[i] != '\0'; - snprintf (buf, HUMAN_SIZE_LONGEST, "%" PRIu64 "%s", bytes, ext[i]); + snprintf (buf, HUMAN_SIZE_LONGEST, "%" PRIu64 "%.1s", bytes, [i]); return buf; } -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 2/2] info: Show human sizes for block_size values
Adding a human-readable size for block constraints is useful. For: $ ./run nbdinfo -- [ nbdkit memory \ --filter=blocksize-policy blocksize-preferred=32k 1M ] | grep pref this changes pre-patch: block_size_preferred: 32768 to post-patch: block_size_preferred: 32768 (32K) Signed-off-by: Eric Blake --- info/show.c | 26 +++--- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/info/show.c b/info/show.c index 6aeffb54..ac483f34 100644 --- a/info/show.c +++ b/info/show.c @@ -35,6 +35,7 @@ #include "nbdinfo.h" static void show_boolean (const char *name, bool cond); +static void show_size (const char *name, int64_t size); static int collect_context (void *opaque, const char *name); static char *get_content (struct nbd_handle *, int64_t size); @@ -181,13 +182,9 @@ show_one_export (struct nbd_handle *nbd, const char *desc, show_boolean ("can_trim", can_trim); if (can_zero >= 0) show_boolean ("can_zero", can_zero); -if (block_minimum > 0) - fprintf (fp, "\t%s: %" PRId64 "\n", "block_size_minimum", block_minimum); -if (block_preferred > 0) - fprintf (fp, "\t%s: %" PRId64 "\n", "block_size_preferred", - block_preferred); -if (block_maximum > 0) - fprintf (fp, "\t%s: %" PRId64 "\n", "block_size_maximum", block_maximum); +show_size ("block_size_minimum", block_minimum); +show_size ("block_size_preferred", block_preferred); +show_size ("block_size_maximum", block_maximum); } else { if (first) @@ -304,6 +301,21 @@ show_boolean (const char *name, bool cond) ansi_restore (fp); } +/* Used for displaying sizes in non-JSON output. */ +void show_size (const char *name, int64_t size) +{ + char size_str[HUMAN_SIZE_LONGEST]; + bool human_size_flag = false; + + if (size > 0) { +human_size (size_str, size, _size_flag); +if (human_size_flag) + fprintf (fp, "\t%s: %" PRId64 " (%s)\n", name, size, size_str); +else + fprintf (fp, "\t%s: %" PRId64 "\n", name, size); + } +} + static int collect_context (void *opaque, const char *name) { -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH libnbd 3/5] common: Combine human-size.h headers into one
On Sun, Sep 03, 2023 at 04:23:23PM +0100, Richard W.M. Jones wrote: > Copy the human_size() function from common/utils/ into the new > human-size.h header in common/include/. Remove human-size.c and > combine the tests into one. > --- > common/include/human-size.h | 51 ++ This file was created by inheriting nbdkit's weaker BSD licensing... > common/include/test-human-size.c | 79 +--- > common/utils/Makefile.am | 10 +--- > common/utils/human-size.c| 56 > common/utils/human-size.h| 49 -- ...while this file was originally created with libnbd's LGPLv2+ licensing. By merging LGPLv2+ code into a file containing only a BSD license header, you have created an ambiguity on what license should be assumed when using human_size(). Could you explicitly clarify that the relaxing of the license was intentional, so that it is safe to then merge libnbd's code into nbdkit without dragging in LGPLv2+ code? > +static inline char * > +human_size (char *buf, uint64_t bytes, bool *human) > +{ > + static const char ext[][2] = { "E", "P", "T", "G", "M", "K", "" }; > + size_t i; Code motion, so this is pre-existing, but this seems rather lengthy, compared to a more compact: static const char ext[] = "EPTGMK"; > + > + if (buf == NULL) { > +buf = malloc (HUMAN_SIZE_LONGEST); > +if (buf == NULL) > + return NULL; > + } > + > + /* Work out which extension to use, if any. */ > + i = 6; > + if (bytes != 0) { > +while ((bytes & 1023) == 0) { > + bytes >>= 10; > + i--; > +} > + } > + > + /* Set the flag to true if we're going to add a human-readable extension. > */ > + if (human) > +*human = ext[i][0] != '\0'; *human = ext[i] != '\0'; > + > + snprintf (buf, HUMAN_SIZE_LONGEST, "%" PRIu64 "%s", bytes, ext[i]); snprintf (buf, HUMAN_SIZE_LONGEST, "%" PRIu64 ".1s", bytes, [i]); > + return buf; > +} > + -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v7 01/12] nbd/server: Support a request payload
On Mon, Sep 25, 2023 at 02:22:31PM -0500, Eric Blake wrote: > Upcoming additions to support NBD 64-bit effect lengths allow for the > possibility to distinguish between payload length (capped at 32M) and > effect length (64 bits, although we generally assume 63 bits because > of off_t limitations). [...] > +++ b/nbd/server.c > @@ -2322,9 +2322,11 @@ static int coroutine_fn > nbd_co_receive_request(NBDRequestData *req, > Error **errp) > { > NBDClient *client = req->client; > +bool extended_with_payload; > bool check_length = false; > bool check_rofs = false; > bool allocate_buffer = false; > +bool payload_okay = false; > unsigned payload_len = 0; Pre-existing type mismatch caught as a result of Vladimir's review of 12/12, but: > int valid_flags = NBD_CMD_FLAG_FUA; > int ret; > @@ -2338,6 +2340,13 @@ static int coroutine_fn > nbd_co_receive_request(NBDRequestData *req, > > trace_nbd_co_receive_request_decode_type(request->cookie, request->type, > nbd_cmd_lookup(request->type)); > +extended_with_payload = client->mode >= NBD_MODE_EXTENDED && > +request->flags & NBD_CMD_FLAG_PAYLOAD_LEN; > +if (extended_with_payload) { > +payload_len = request->len; this can assign a 64-bit number into a 32-bit variable, which can truncate to 0,... > +check_length = true; > +} > + > switch (request->type) { > case NBD_CMD_DISC: > /* Special case: we're going to disconnect without a reply, > @@ -2354,6 +2363,15 @@ static int coroutine_fn > nbd_co_receive_request(NBDRequestData *req, > break; > > case NBD_CMD_WRITE: > +if (client->mode >= NBD_MODE_EXTENDED) { > +if (!extended_with_payload) { > +/* The client is noncompliant. Trace it, but proceed. */ > +trace_nbd_co_receive_ext_payload_compliance(request->from, > +request->len); > +} > +valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN; > +} > +payload_okay = true; > payload_len = request->len; ...the pre-existing code is safe only as long as request->len cannot exceed 32 bytes (which it can't do until later in this series actually enables extended requests). Switching the type now is prudent... > check_length = true; > allocate_buffer = true; > @@ -2395,6 +2413,14 @@ static int coroutine_fn > nbd_co_receive_request(NBDRequestData *req, > request->len, NBD_MAX_BUFFER_SIZE); > return -EINVAL; > } > +if (payload_len && !payload_okay) { > +/* > + * For now, we don't support payloads on other commands; but > + * we can keep the connection alive by ignoring the payload. > + */ > +assert(request->type != NBD_CMD_WRITE); > +request->len = 0; ...otherwise, this check is bypassed for a request size of exactly 4G if check_length is false and thus the previous conditional for request->len vs. NBD_MAX_BUFFER_SIZE didn't trigger (prior to this patch, payload_len was only set for CND_WRITE which also set check_length). Thus, I'm squashing in: diff --git i/nbd/server.c w/nbd/server.c index 5258064e5ac..1cb66e86a89 100644 --- i/nbd/server.c +++ w/nbd/server.c @@ -2327,7 +2327,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, bool check_rofs = false; bool allocate_buffer = false; bool payload_okay = false; -unsigned payload_len = 0; +uint64_t payload_len = 0; int valid_flags = NBD_CMD_FLAG_FUA; int ret; -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v7 12/12] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
On Thu, Oct 05, 2023 at 05:33:26PM +0300, Vladimir Sementsov-Ogievskiy wrote: > > +static int > > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request, > > + Error **errp) > > +{ > > +int payload_len = request->len; > > payload_len should be uint64_t > > > +g_autofree char *buf = NULL; > > +size_t count, i, nr_bitmaps; > > +uint32_t id; > > + > > otherwise, we may do something unexpected here, when reqeuest->len is too big > for int: > > > +if (payload_len > NBD_MAX_BUFFER_SIZE) { > > +error_setg(errp, "len (%" PRIu64 ") is larger than max len (%u)", > > + request->len, NBD_MAX_BUFFER_SIZE); > > +return -EINVAL; > > +} Oh, it looks like I introduced that same type mismatch in commit 8db7e2d6 as well, although it appears to have a latent effect until this series enables the ability for request->length to actually exceed 32 bits. I'll reply on 1/12 with another squash I'm making there. > > + > > +assert(client->contexts.exp == client->exp); > > +nr_bitmaps = client->exp->nr_export_bitmaps; > > +request->contexts = g_new0(NBDMetaContexts, 1); > > +request->contexts->exp = client->exp; > > + > > +if (payload_len % sizeof(uint32_t) || > > +payload_len < sizeof(NBDBlockStatusPayload) || > > +payload_len > (sizeof(NBDBlockStatusPayload) + > > + sizeof(id) * client->contexts.count)) { > > +goto skip; > > +} > > [..] > > >* connection right away, -EAGAIN to indicate we were interrupted and the > > @@ -2505,7 +2593,18 @@ static int coroutine_fn > > nbd_co_receive_request(NBDRequestData *req, > > break; > > > > case NBD_CMD_BLOCK_STATUS: > > -request->contexts = >contexts; > > +if (extended_with_payload) { > > +ret = nbd_co_block_status_payload_read(client, request, errp); > > +if (ret < 0) { > > +return ret; > > +} > > +/* payload now consumed */ > > +check_length = extended_with_payload = false; > > why set extended_with_payload to false? it's a bit misleading. And you don't > do this for WRITE request. Indeed; it doesn't make any different to later in the function. Will drop. > > > +payload_len = 0; > > +valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN; > > +} else { > > +request->contexts = >contexts; > > +} > > valid_flags |= NBD_CMD_FLAG_REQ_ONE; > > break; > > > > [..] > > > with payload_len changed to uint64_t, your squash-in applied and s/>/>=/ > fixed: > Reviewed-by: Vladimir Sementsov-Ogievskiy Thanks for the careful review. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v7 12/12] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
On Wed, Oct 04, 2023 at 04:55:02PM -0500, Eric Blake wrote: > > > +static int > > > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request, > > > + Error **errp) > > > > [..] > > > +for (i = 0; i < count; i++) { > > > +id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * > > > i); > > > +if (id == NBD_META_ID_BASE_ALLOCATION) { > > > +if (request->contexts->base_allocation) { > > > +goto skip; > > > +} > > > > should we also check that base_allocation is negotiated? > > Oh, good point. Without that check, the client can pass in random id > numbers that it never negotiated. I've queued 1-11 and will probably > send a pull request for those this week, while respinning this patch > to fix the remaining issues you pointed out. I'm squashing in the following. If you can review it today, I'll include it in my pull request this afternoon; if not, we still have time before soft freeze to get it in the next batch. diff --git i/nbd/server.c w/nbd/server.c index 30816b42386..62654579cbc 100644 --- i/nbd/server.c +++ w/nbd/server.c @@ -2478,19 +2478,22 @@ nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request, for (i = 0; i < count; i++) { id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * i); if (id == NBD_META_ID_BASE_ALLOCATION) { -if (request->contexts->base_allocation) { +if (!client->contexts.base_allocation || +request->contexts->base_allocation) { goto skip; } request->contexts->base_allocation = true; } else if (id == NBD_META_ID_ALLOCATION_DEPTH) { -if (request->contexts->allocation_depth) { +if (!client->contexts.allocation_depth || +request->contexts->allocation_depth) { goto skip; } request->contexts->allocation_depth = true; } else { -int idx = id - NBD_META_ID_DIRTY_BITMAP; +unsigned idx = id - NBD_META_ID_DIRTY_BITMAP; -if (idx > nr_bitmaps || request->contexts->bitmaps[idx]) { +if (idx > nr_bitmaps || !client->contexts.bitmaps[idx] || +request->contexts->bitmaps[idx]) { goto skip; } request->contexts->bitmaps[idx] = true; diff --git i/nbd/trace-events w/nbd/trace-events index 3cf2d00e458..00ae3216a11 100644 --- i/nbd/trace-events +++ w/nbd/trace-events @@ -70,7 +70,7 @@ nbd_co_send_chunk_read(uint64_t cookie, uint64_t offset, void *data, uint64_t si nbd_co_send_chunk_read_hole(uint64_t cookie, uint64_t offset, uint64_t size) "Send structured read hole reply: cookie = %" PRIu64 ", offset = %" PRIu64 ", len = %" PRIu64 nbd_co_send_extents(uint64_t cookie, unsigned int extents, uint32_t id, uint64_t length, int last) "Send block status reply: cookie = %" PRIu64 ", extents = %u, context = %d (extents cover %" PRIu64 " bytes, last chunk = %d)" nbd_co_send_chunk_error(uint64_t cookie, int err, const char *errname, const char *msg) "Send structured error reply: cookie = %" PRIu64 ", error = %d (%s), msg = '%s'" -nbd_co_receive_block_status_payload_compliance(uint64_t from, int len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%x" +nbd_co_receive_block_status_payload_compliance(uint64_t from, uint64_t len) "client sent unusable block status payload: from=0x%" PRIx64 ", len=0x%" PRIx64 nbd_co_receive_request_decode_type(uint64_t cookie, uint16_t type, const char *name) "Decoding type: cookie = %" PRIu64 ", type = %" PRIu16 " (%s)" nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload received: cookie = %" PRIu64 ", len = %" PRIu64 nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64 -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v7 12/12] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
On Sat, Sep 30, 2023 at 04:24:02PM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 25.09.23 22:22, Eric Blake wrote: > > Allow a client to request a subset of negotiated meta contexts. For > > example, a client may ask to use a single connection to learn about > > both block status and dirty bitmaps, but where the dirty bitmap > > queries only need to be performed on a subset of the disk; forcing the > > server to compute that information on block status queries in the rest > > of the disk is wasted effort (both at the server, and on the amount of > > traffic sent over the wire to be parsed and ignored by the client). > > > > Qemu as an NBD client never requests to use more than one meta > > context, so it has no need to use block status payloads. Testing this > > instead requires support from libnbd, which CAN access multiple meta > > contexts in parallel from a single NBD connection; an interop test > > submitted to the libnbd project at the same time as this patch > > demonstrates the feature working, as well as testing some corner cases > > (for example, when the payload length is longer than the export > > length), although other corner cases (like passing the same id > > duplicated) requires a protocol fuzzer because libnbd is not wired up > > to break the protocol that badly. > > > > This also includes tweaks to 'qemu-nbd --list' to show when a server > > is advertising the capability, and to the testsuite to reflect the > > addition to that output. > > > > Of note: qemu will always advertise the new feature bit during > > NBD_OPT_INFO if extended headers have alreay been negotiated > > (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has > > occurred); but for NBD_OPT_GO, qemu only advertises the feature if > > block status is also enabled (that is, if the client does not > > negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so > > the feature is not advertised). > > > > Signed-off-by: Eric Blake > > --- > > > > [..] > > > > > +/* > > + * nbd_co_block_status_payload_read > > + * Called when a client wants a subset of negotiated contexts via a > > + * BLOCK_STATUS payload. Check the payload for valid length and > > + * contents. On success, return 0 with request updated to effective > > + * length. If request was invalid but all payload consumed, return 0 > > + * with request->len and request->contexts->count set to 0 (which will > > + * trigger an appropriate NBD_EINVAL response later on). Return > > + * negative errno if the payload was not fully consumed. > > + */ > > +static int > > +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request, > > + Error **errp) > > [..] > > > +payload_len > (sizeof(NBDBlockStatusPayload) + > > + sizeof(id) * client->contexts.count)) { > > +goto skip; > > +} > > + > > +buf = g_malloc(payload_len); > > +if (nbd_read(client->ioc, buf, payload_len, > > + "CMD_BLOCK_STATUS data", errp) < 0) { > > +return -EIO; > > +} > > +trace_nbd_co_receive_request_payload_received(request->cookie, > > + payload_len); > > +request->contexts->bitmaps = g_new0(bool, nr_bitmaps); > > +count = (payload_len - sizeof(NBDBlockStatusPayload)) / sizeof(id); > > +payload_len = 0; > > + > > +for (i = 0; i < count; i++) { > > +id = ldl_be_p(buf + sizeof(NBDBlockStatusPayload) + sizeof(id) * > > i); > > +if (id == NBD_META_ID_BASE_ALLOCATION) { > > +if (request->contexts->base_allocation) { > > +goto skip; > > +} > > should we also check that base_allocation is negotiated? Oh, good point. Without that check, the client can pass in random id numbers that it never negotiated. I've queued 1-11 and will probably send a pull request for those this week, while respinning this patch to fix the remaining issues you pointed out. > > > +request->contexts->base_allocation = true; > > +} else if (id == NBD_META_ID_ALLOCATION_DEPTH) { > > +if (request->contexts->allocation_depth) { > > +goto skip; > > +} > > same here > > > +request->contexts->allocation_depth = true; > > +} else { > > +int idx = id - NBD_META_ID_
Re: [Libguestfs] [PATCH v7 01/12] nbd/server: Support a request payload
On Thu, Sep 28, 2023 at 12:09:51PM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 27.09.23 18:59, Eric Blake wrote: > > We could also try to be a bit more complicated by peeking at the next > > few bytes: if they look like a magic number of the next request, > > assume the client set the bit accidentally but didn't send a payload > > after all; for anything else, assume the client did pass a payload. > > But adding in machinery to peek at a prefix is more complex than > > either assuming a payload is always present (as done in this patch) or > > assuming the bit was in error (and dropping the connection > > unconditionally). Preferences? > > > Ohh, you are right, thanks for comprehensive explanation. I really missed > some things you are saying about. Yes, now I agree that "payload always exist > when flag is set" is the best effort. Finally, that was our aim of the > protocol design: make it more context independent. Probably, we may fix that > in specification as preferable or at least possible server behavior about > non-compliant client. One other possibility I just thought of: have a heuristic where the flag set with h->request_length less than 512 bytes is likely to indicate an intentional payload (even if for a command where we weren't expecting payload, so still a client error); while the flag set wtih h->request_length >= 512 bytes is likely to be a mistaken setting of the flag (but also still a client error). NBD_CMD_WRITE is probably the only command that will ever need to send a payload larger than one sector, but that command already has handling to accept payloads of all sizes because we know what to do with them and where the client is not in error. > > r-b coming soon, I just need to take another look with corrected picture in > mind. > > -- > Best regards, > Vladimir > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server
On Fri, Sep 22, 2023 at 04:03:19PM -0500, Eric Blake wrote: > On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote: > > On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote: > > > There are a number of ways in which gathering information can fail. > > > But when possible, it makes sense to let the server know that we are > > > done talking, as it minimizes the likelihood that nbdinfo's error > > > message will be obscured by an accompanying error message by the > > > server complaining about an unclean disconnect. > > > > > > For example, with a one-off qemu temporarily patched as mentioned in > > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off > > > 'qemu-nbd -f raw -r file &' produces: > > > > > > pre-patch: > > > > > > $ ./run nbdinfo --size nbd://localhost > > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size > > > 9223372036854781440 which does not fit in signed result: Value too large > > > for defined data type > > > qemu-nbd: option negotiation failed: Failed to read opts magic: > > > Unexpected end-of-file before all data were read > > > > This doesn't necessarily seem like a bug? > > It's a quality of service thing - qemu is just being noisy that the > client closed the socket abruptly without giving any reason why. It > doesn't change libnbd's behavior (nbdinfo has already reported its > error message), but may confuse people reading qemu-nbd logs. > > It may also be a case where we could patch qemu-nbd to not output a > complaint if the client exited at a clean point in the back-and-forth > transactions. We still want to be noisy if the socket closes after > the first byte has been read, but if there are zero bytes available, > announcing that the client no longer cares does not add much value. > > > > > This feels like quite a significant change in behaviour to me. In > > particular I'm worried if it interacts in some subtle way with the > > forking done by the "[ ... ]" syntax for running servers on the > > command line (or any of the other ways that libnbd might fork/exec). > > Interesting observation. atexit() handlers are not preserved across > exec, and I think all our fork() paths end in either exec or _exit > (also where atexit handlers are ignored), so I don't think we are > risking calling the handler twice. > > > > > Can we hold this patch until after 1.18 is released and then put it > > in? Should only be a week or two. > > Sure, being conservative on this one is fine by me. I'll delay it > until after 1.18. > > > > > Provisionally ACKed for post-1.18 Now in as commit fd4f3fea, so we can start getting some soak time under CI. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v7 01/12] nbd/server: Support a request payload
On Wed, Sep 27, 2023 at 11:55:41AM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 25.09.23 22:22, Eric Blake wrote: > > Upcoming additions to support NBD 64-bit effect lengths allow for the > > possibility to distinguish between payload length (capped at 32M) and > > effect length (64 bits, although we generally assume 63 bits because > > of off_t limitations). Without that extension, only the NBD_CMD_WRITE > > request has a payload; but with the extension, it makes sense to allow > > at least NBD_CMD_BLOCK_STATUS to have both a payload and effect length > > in a future patch (where the payload is a limited-size struct that in > > turn gives the real effect length as well as a subset of known ids for > > which status is requested). Other future NBD commands may also have a > > request payload, so the 64-bit extension introduces a new > > NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header > > length is a payload length or an effect length, rather than > > hard-coding the decision based on the command. > > > > According to the spec, a client should never send a command with a > > payload without the negotiation phase proving such extension is > > available. So in the unlikely event the bit is set or cleared > > incorrectly, the client is already at fault; if the client then > > provides the payload, we can gracefully consume it off the wire and > > fail the command with NBD_EINVAL (subsequent checks for magic numbers > > ensure we are still in sync), while if the client fails to send > > payload we block waiting for it (basically deadlocking our connection > > to the bad client, but not negatively impacting our ability to service > > other clients, so not a security risk). Note that we do not support > > the payload version of BLOCK_STATUS yet. > > > > Signed-off-by: Eric Blake > > --- > > > > v7: another try at better logic [Vladimir] > > > > v5: retitled from v4 13/24, rewrite on top of previous patch's switch > > statement [Vladimir] > > > > v4: less indentation on several 'if's [Vladimir] > > --- > > nbd/server.c | 37 + > > nbd/trace-events | 1 + > > 2 files changed, 34 insertions(+), 4 deletions(-) > > > > diff --git a/nbd/server.c b/nbd/server.c > > index 7a6f95071f8..1eabcfc908d 100644 > > --- a/nbd/server.c > > +++ b/nbd/server.c > > @@ -2322,9 +2322,11 @@ static int coroutine_fn > > nbd_co_receive_request(NBDRequestData *req, > > Error **errp) > > { > > NBDClient *client = req->client; > > +bool extended_with_payload; > > bool check_length = false; > > bool check_rofs = false; > > bool allocate_buffer = false; > > +bool payload_okay = false; > > unsigned payload_len = 0; > > int valid_flags = NBD_CMD_FLAG_FUA; > > int ret; > > @@ -2338,6 +2340,13 @@ static int coroutine_fn > > nbd_co_receive_request(NBDRequestData *req, > > > > trace_nbd_co_receive_request_decode_type(request->cookie, > > request->type, > > > > nbd_cmd_lookup(request->type)); > > +extended_with_payload = client->mode >= NBD_MODE_EXTENDED && > > +request->flags & NBD_CMD_FLAG_PAYLOAD_LEN; > > +if (extended_with_payload) { > > +payload_len = request->len; > > +check_length = true; > > +} > > + > > switch (request->type) { > > case NBD_CMD_DISC: > > /* Special case: we're going to disconnect without a reply, > > @@ -2354,6 +2363,15 @@ static int coroutine_fn > > nbd_co_receive_request(NBDRequestData *req, > > break; > > > > case NBD_CMD_WRITE: > > +if (client->mode >= NBD_MODE_EXTENDED) { > > +if (!extended_with_payload) { > > +/* The client is noncompliant. Trace it, but proceed. */ > > +trace_nbd_co_receive_ext_payload_compliance(request->from, > > +request->len); > > +} > > +valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN; > > +} > > +payload_okay = true; > > payload_len = request->len; > > check_length = true; > > allocate_buffer = true; > > @@ -2395,6 +2413,14 @@ static int coroutine_fn > > nbd_co_receive_request(NBDRequestData *req, >
Re: [Libguestfs] LIBNBD SECURITY: Negative results from nbd_get_size() - CVE-2023-5215
On Tue, Sep 26, 2023 at 02:12:27PM -0500, Eric Blake wrote: > We have discovered a security flaw with potential minor impact in > libnbd. > > Lifecycle > - > > Reported: 2023-09-17 Fixed: 2023-09-22 Published: 2023-09-26 > > At the time of this email, the Red Hat security team is analyzing > potential security impacts to determine if a CVE is warranted against > libnbd; if one is assigned, a followup email will announce that > identifier. However, even if a CVE is not assigned to libnbd, the > issues documented here warrant an audit of clients that utilize the > nbd_get_size() API from libnbd, to see if they might be subject to a > weakness when interpreting a large size as a negative value. The > libnbd developers felt it more important to issue this security notice > prior to the release of v1.18 than to hold up the release schedule > waiting for final analysis on whether libnbd needs a CVE. The Red Hat security team assigned this CVE-2023-5215 as a low-impact security vulnerability, with a rating of low impact severity. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] LIBNBD SECURITY: Negative results from nbd_get_size()
We have discovered a security flaw with potential minor impact in libnbd. Lifecycle - Reported: 2023-09-17 Fixed: 2023-09-22 Published: 2023-09-26 At the time of this email, the Red Hat security team is analyzing potential security impacts to determine if a CVE is warranted against libnbd; if one is assigned, a followup email will announce that identifier. However, even if a CVE is not assigned to libnbd, the issues documented here warrant an audit of clients that utilize the nbd_get_size() API from libnbd, to see if they might be subject to a weakness when interpreting a large size as a negative value. The libnbd developers felt it more important to issue this security notice prior to the release of v1.18 than to hold up the release schedule waiting for final analysis on whether libnbd needs a CVE. Credit -- Reported by Rich Jones (rjo...@redhat.com) during fuzz tests, patched by Eric Blake Description --- libnbd is a Network Block Device (NBD) client library. The NBD protocol states that a server advertises the size of an export using a 64-bit unsigned number. In turn, libnbd has an API nbd_get_size() that returns int64_t, documenting that a non-negative value is a successful return of the server's export size, and -1 is an error indicator with a corresponding error message in nbd_get_error(3). This leads to confusing results when talking to a server that advertises an export with an unsigned size of 2^63 bytes or larger: a client that tests 'result < 0' would treat the size as an error although libnbd did not provide an error message; while a client that tests 'result == -1' and stores size as a signed integer could end up seeing a negative value as success, and possibly go on to encounter subsequent logic issues due to integer overflow or iteration bounds that are not expecting comparison against a negative value. This confusing API result has been present in all stable releases of libnbd, since v1.0 in April 2019. In patched libnbd, the API now uniformly treats all large export sizes as an EOVERFLOW error with a return value of -1 and corresponding error message. In the efforts to patch the nbd_get_size() issue in stable libnbd, two additional related issues were identified that were introduced in the experimental v1.17.4 release: code added for 64-bit NBD extensions could hit libnbd assertion failures during the nbd_block_status() API based on actions taken by a malicious server (one by intentionally advertising a large export size regardless of whether extended headers are in use, the other by intentionally disobeying the NBD protocol for the contents of a 64-bit block status reply when extended headers are in use). Odd-numbered minor version releases have always been considered experimental; production systems should only be using even-numbered minor version releases, and thus are not vulnerable to a server-triggered assertion failure. Test if libnbd is vulnerable There are no direct methods for testing whether nbd_get_size() will convert a large export size into an EOVERFLOW error, when using a compliant server. However, the libnbd patches include instructions for making one-off modifications to qemu to create an intentionally non-compliant server for reproducing all scenarios described in this notice. In general, it is probably faster to upgrade to a known-good libnbd than to try and test if an installed libnbd is still vulnerable. Workarounds --- As a mitigating factor, no known production servers (such as nbd-server, qemu-nbd, nbdkit) will advertise an export size large enough to overflow the subset of non-negative int64_t values. An application that insists on using TLS to connect only to a known-good server will never encounter a size that would result in an undocumented result to nbd_get_size(). Without TLS, an application that wishes to deal solely with positive sizes for exports should check that nbd_get_size() is non-negative, rather than the weaker comparison of the result to just -1. Other libnbd API take offset parameters as a uint64_t; an audit of these APIs did not find any scenarios where a negative size value returned by nbd_get_size() and then converted to uint64_t could cause any further misbehavior in libnbd. Fixes - The original nbd_get_size(3) ambiguous return value flaw was introduced in libnbd v0.1 (commit 40881fce75), when the API was introduced. The server-triggered assertion failures on nbd_block_status(3) were introduced in libnbd v1.17.4 (commits e8d837d306 and ab992766cd); there are no plans to release a development v1.17.6 containing the fixes, since stable v1.18.0 is scheduled for 27 September 2023. As of this email, the following branches have been patched: * development branch (1.17) https://gitlab.com/nbdkit/libnbd/-/commit/0f8ee8c6bd6dd93de771e6d4da87ec5a59504aae https://gitlab.com/nbdkit/libnbd/-/commit/dbc08c932c17e72d42944a5df447d0ea714e896a https://gitlab.com/nbd
[Libguestfs] [PATCH v7 11/12] nbd/server: Prepare for per-request filtering of BLOCK_STATUS
The next commit will add support for the optional extension NBD_CMD_FLAG_PAYLOAD during NBD_CMD_BLOCK_STATUS, where the client can request that the server only return a subset of negotiated contexts, rather than all contexts. To make that task easier, this patch populates the list of contexts to return on a per-command basis (for now, identical to the full set of negotiated contexts). Signed-off-by: Eric Blake --- v5: fix null dereference on early error [Vladimir], hoist in assertion from v4 24/24 v4: split out NBDMetaContexts refactoring to its own patch, track NBDRequests.contexts as a pointer rather than inline --- include/block/nbd.h | 1 + nbd/server.c| 22 +- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 2006497f987..4e7bd6342f9 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -77,6 +77,7 @@ typedef struct NBDRequest { uint16_t flags; /* NBD_CMD_FLAG_* */ uint16_t type; /* NBD_CMD_* */ NBDMode mode; /* Determines which network representation to use */ +NBDMetaContexts *contexts; /* Used by NBD_CMD_BLOCK_STATUS */ } NBDRequest; typedef struct NBDSimpleReply { diff --git a/nbd/server.c b/nbd/server.c index 44ebbd139b2..2d4cec75a49 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -2505,6 +2505,7 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, break; case NBD_CMD_BLOCK_STATUS: +request->contexts = >contexts; valid_flags |= NBD_CMD_FLAG_REQ_ONE; break; @@ -2745,17 +2746,18 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, "discard failed", errp); case NBD_CMD_BLOCK_STATUS: +assert(request->contexts); if (!request->len) { return nbd_send_generic_reply(client, request, -EINVAL, "need non-zero length", errp); } assert(client->mode >= NBD_MODE_EXTENDED || request->len <= UINT32_MAX); -if (client->contexts.count) { +if (request->contexts->count) { bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE; -int contexts_remaining = client->contexts.count; +int contexts_remaining = request->contexts->count; -if (client->contexts.base_allocation) { +if (request->contexts->base_allocation) { ret = nbd_co_send_block_status(client, request, exp->common.blk, request->from, @@ -2768,7 +2770,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, } } -if (client->contexts.allocation_depth) { +if (request->contexts->allocation_depth) { ret = nbd_co_send_block_status(client, request, exp->common.blk, request->from, request->len, @@ -2781,8 +2783,9 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, } } +assert(request->contexts->exp == client->exp); for (i = 0; i < client->exp->nr_export_bitmaps; i++) { -if (!client->contexts.bitmaps[i]) { +if (!request->contexts->bitmaps[i]) { continue; } ret = nbd_co_send_bitmap(client, request, @@ -2798,6 +2801,10 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, assert(!contexts_remaining); return 0; +} else if (client->contexts.count) { +return nbd_send_generic_reply(client, request, -EINVAL, + "CMD_BLOCK_STATUS payload not valid", + errp); } else { return nbd_send_generic_reply(client, request, -EINVAL, "CMD_BLOCK_STATUS not negotiated", @@ -2876,6 +2883,11 @@ static coroutine_fn void nbd_trip(void *opaque) } else { ret = nbd_handle_request(client, , req->data, _err); } +if (request.contexts && request.contexts != >contexts) { +assert(request.type == NBD_CMD_BLOCK_STATUS); +g_free(request.contexts->bitmaps); +g_free(request.contexts); +} if (ret < 0) { error_prepend(_err, "Failed to send reply: "); goto disconnect; -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [PATCH v7 12/12] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS
Allow a client to request a subset of negotiated meta contexts. For example, a client may ask to use a single connection to learn about both block status and dirty bitmaps, but where the dirty bitmap queries only need to be performed on a subset of the disk; forcing the server to compute that information on block status queries in the rest of the disk is wasted effort (both at the server, and on the amount of traffic sent over the wire to be parsed and ignored by the client). Qemu as an NBD client never requests to use more than one meta context, so it has no need to use block status payloads. Testing this instead requires support from libnbd, which CAN access multiple meta contexts in parallel from a single NBD connection; an interop test submitted to the libnbd project at the same time as this patch demonstrates the feature working, as well as testing some corner cases (for example, when the payload length is longer than the export length), although other corner cases (like passing the same id duplicated) requires a protocol fuzzer because libnbd is not wired up to break the protocol that badly. This also includes tweaks to 'qemu-nbd --list' to show when a server is advertising the capability, and to the testsuite to reflect the addition to that output. Of note: qemu will always advertise the new feature bit during NBD_OPT_INFO if extended headers have alreay been negotiated (regardless of whether any NBD_OPT_SET_META_CONTEXT negotiation has occurred); but for NBD_OPT_GO, qemu only advertises the feature if block status is also enabled (that is, if the client does not negotiate any contexts, then NBD_CMD_BLOCK_STATUS cannot be used, so the feature is not advertised). Signed-off-by: Eric Blake --- v5: factor out 'id - NBD_MTA_ID_DIRTY_BITMAP' [Vladimir], rework logic on zero-length requests to be clearer [Vladimir], rebase to earlier changes --- docs/interop/nbd.txt | 2 +- nbd/server.c | 114 -- qemu-nbd.c| 1 + nbd/trace-events | 1 + tests/qemu-iotests/223.out| 12 +- tests/qemu-iotests/307.out| 10 +- .../tests/nbd-qemu-allocation.out | 2 +- 7 files changed, 122 insertions(+), 20 deletions(-) diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt index 9aae5e1f294..18efb251de9 100644 --- a/docs/interop/nbd.txt +++ b/docs/interop/nbd.txt @@ -69,4 +69,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE NBD_CMD_FLAG_FAST_ZERO * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth" * 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports -* 8.2: NBD_OPT_EXTENDED_HEADERS +* 8.2: NBD_OPT_EXTENDED_HEADERS, NBD_FLAG_BLOCK_STATUS_PAYLOAD diff --git a/nbd/server.c b/nbd/server.c index 2d4cec75a49..898580a9b0b 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -512,6 +512,9 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes, if (client->mode >= NBD_MODE_STRUCTURED) { myflags |= NBD_FLAG_SEND_DF; } +if (client->mode >= NBD_MODE_EXTENDED && client->contexts.count) { +myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD; +} trace_nbd_negotiate_new_style_size_flags(client->exp->size, myflags); stq_be_p(buf, client->exp->size); stw_be_p(buf + 8, myflags); @@ -699,6 +702,10 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) if (client->mode >= NBD_MODE_STRUCTURED) { myflags |= NBD_FLAG_SEND_DF; } +if (client->mode >= NBD_MODE_EXTENDED && +(client->contexts.count || client->opt == NBD_OPT_INFO)) { +myflags |= NBD_FLAG_BLOCK_STAT_PAYLOAD; +} trace_nbd_negotiate_new_style_size_flags(exp->size, myflags); stq_be_p(buf, exp->size); stw_be_p(buf + 8, myflags); @@ -2420,6 +2427,87 @@ static int coroutine_fn nbd_co_send_bitmap(NBDClient *client, return nbd_co_send_extents(client, request, ea, last, context_id, errp); } +/* + * nbd_co_block_status_payload_read + * Called when a client wants a subset of negotiated contexts via a + * BLOCK_STATUS payload. Check the payload for valid length and + * contents. On success, return 0 with request updated to effective + * length. If request was invalid but all payload consumed, return 0 + * with request->len and request->contexts->count set to 0 (which will + * trigger an appropriate NBD_EINVAL response later on). Return + * negative errno if the payload was not fully consumed. + */ +static int +nbd_co_block_status_payload_read(NBDClient *client, NBDRequest *request, + Error **errp) +{ +int payload_len = request->len; +g_autofree char *buf = NULL; +size_t count, i, nr_bitmaps; +uint32_t id; + +if (payload_len > NBD_MAX_BUFFER_SIZE) { +
[Libguestfs] [PATCH v7 08/12] nbd/client: Accept 64-bit block status chunks
Once extended mode is enabled, we need to accept 64-bit status replies (even for replies that don't exceed a 32-bit length). It is easier to normalize narrow replies into wide format so that the rest of our code only has to handle one width. Although a server is non-compliant if it sends a 64-bit reply in compact mode, or a 32-bit reply in extended mode, it is still easy enough to tolerate these mismatches. In normal execution, we are only requesting "base:allocation" which never exceeds 32 bits for flag values. But during testing with x-dirty-bitmap, we can force qemu to connect to some other context that might have 64-bit status bit; however, we ignore those upper bits (other than mapping qemu:allocation-depth into something that 'qemu-img map --output=json' can expose), and since that only affects testing, we really don't bother with checking whether more than the two least-significant bits are set. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: factor out duplicate length calculation [Vladimir], add R-b v4: tweak comments and error message about count mismatch, fix setting of wide in loop [Vladimir] --- block/nbd.c| 49 -- block/trace-events | 1 + 2 files changed, 35 insertions(+), 15 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 76461430411..52ebc8b2f53 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -615,13 +615,17 @@ static int nbd_parse_offset_hole_payload(BDRVNBDState *s, */ static int nbd_parse_blockstatus_payload(BDRVNBDState *s, NBDStructuredReplyChunk *chunk, - uint8_t *payload, uint64_t orig_length, - NBDExtent32 *extent, Error **errp) + uint8_t *payload, bool wide, + uint64_t orig_length, + NBDExtent64 *extent, Error **errp) { uint32_t context_id; +uint32_t count; +size_t ext_len = wide ? sizeof(*extent) : sizeof(NBDExtent32); +size_t pay_len = sizeof(context_id) + wide * sizeof(count) + ext_len; /* The server succeeded, so it must have sent [at least] one extent */ -if (chunk->length < sizeof(context_id) + sizeof(*extent)) { +if (chunk->length < pay_len) { error_setg(errp, "Protocol error: invalid payload for " "NBD_REPLY_TYPE_BLOCK_STATUS"); return -EINVAL; @@ -636,8 +640,15 @@ static int nbd_parse_blockstatus_payload(BDRVNBDState *s, return -EINVAL; } -extent->length = payload_advance32(); -extent->flags = payload_advance32(); +if (wide) { +count = payload_advance32(); +extent->length = payload_advance64(); +extent->flags = payload_advance64(); +} else { +count = 0; +extent->length = payload_advance32(); +extent->flags = payload_advance32(); +} if (extent->length == 0) { error_setg(errp, "Protocol error: server sent status chunk with " @@ -658,7 +669,7 @@ static int nbd_parse_blockstatus_payload(BDRVNBDState *s, * (always a safe status, even if it loses information). */ if (s->info.min_block && !QEMU_IS_ALIGNED(extent->length, - s->info.min_block)) { + s->info.min_block)) { trace_nbd_parse_blockstatus_compliance("extent length is unaligned"); if (extent->length > s->info.min_block) { extent->length = QEMU_ALIGN_DOWN(extent->length, @@ -672,13 +683,15 @@ static int nbd_parse_blockstatus_payload(BDRVNBDState *s, /* * We used NBD_CMD_FLAG_REQ_ONE, so the server should not have * sent us any more than one extent, nor should it have included - * status beyond our request in that extent. However, it's easy - * enough to ignore the server's noncompliance without killing the + * status beyond our request in that extent. Furthermore, a wide + * server should have replied with an accurate count (we left + * count at 0 for a narrow server). However, it's easy enough to + * ignore the server's noncompliance without killing the * connection; just ignore trailing extents, and clamp things to * the length of our request. */ -if (chunk->length > sizeof(context_id) + sizeof(*extent)) { -trace_nbd_parse_blockstatus_compliance("more than one extent"); +if (count != wide || chunk->length > pay_len) { +trace_nbd_parse_blockstatus_compliance("unexpected extent count"); } if (extent->length > orig_length) { extent->length = orig_length; @@ -1124,7 +1137,7 @@ nbd_co_receive_cmdread_reply(BDRVNBDState
[Libguestfs] [PATCH v7 10/12] nbd/server: Refactor list of negotiated meta contexts
Peform several minor refactorings of how the list of negotiated meta contexts is managed, to make upcoming patches easier: Promote the internal type NBDExportMetaContexts to the public opaque type NBDMetaContexts, and mark exp const. Use a shorter member name in NBDClient. Hoist calls to nbd_check_meta_context() earlier in their callers, as the number of negotiated contexts may impact the flags exposed in regards to an export, which in turn requires a new parameter. Drop a redundant parameter to nbd_negotiate_meta_queries. No semantic change intended on the success path; on the failure path, dropping context in nbd_check_meta_export even when reporting an error is safer. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: rebase to master, tweak commit message [Vladimir], R-b added v4: new patch split out from v3 13/14, with smaller impact (quit trying to separate exp outside of NBDMeataContexts) --- include/block/nbd.h | 1 + nbd/server.c| 55 - 2 files changed, 31 insertions(+), 25 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index ba8724f5336..2006497f987 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -29,6 +29,7 @@ typedef struct NBDExport NBDExport; typedef struct NBDClient NBDClient; typedef struct NBDClientConnection NBDClientConnection; +typedef struct NBDMetaContexts NBDMetaContexts; extern const BlockExportDriver blk_exp_nbd; diff --git a/nbd/server.c b/nbd/server.c index b09ee44c159..44ebbd139b2 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -105,11 +105,13 @@ struct NBDExport { static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports); -/* NBDExportMetaContexts represents a list of contexts to be exported, +/* + * NBDMetaContexts represents a list of meta contexts in use, * as selected by NBD_OPT_SET_META_CONTEXT. Also used for - * NBD_OPT_LIST_META_CONTEXT. */ -typedef struct NBDExportMetaContexts { -NBDExport *exp; + * NBD_OPT_LIST_META_CONTEXT. + */ +struct NBDMetaContexts { +const NBDExport *exp; /* associated export */ size_t count; /* number of negotiated contexts */ bool base_allocation; /* export base:allocation context (block status) */ bool allocation_depth; /* export qemu:allocation-depth */ @@ -117,7 +119,7 @@ typedef struct NBDExportMetaContexts { * export qemu:dirty-bitmap:, * sized by exp->nr_export_bitmaps */ -} NBDExportMetaContexts; +}; struct NBDClient { int refcount; @@ -144,7 +146,7 @@ struct NBDClient { uint32_t check_align; /* If non-zero, check for aligned client requests */ NBDMode mode; -NBDExportMetaContexts export_meta; +NBDMetaContexts contexts; /* Negotiated meta contexts */ uint32_t opt; /* Current option being negotiated */ uint32_t optlen; /* remaining length of data in ioc for the option being @@ -455,10 +457,10 @@ static int nbd_negotiate_handle_list(NBDClient *client, Error **errp) return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp); } -static void nbd_check_meta_export(NBDClient *client) +static void nbd_check_meta_export(NBDClient *client, NBDExport *exp) { -if (client->exp != client->export_meta.exp) { -client->export_meta.count = 0; +if (exp != client->contexts.exp) { +client->contexts.count = 0; } } @@ -504,6 +506,7 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes, error_setg(errp, "export not found"); return -EINVAL; } +nbd_check_meta_export(client, client->exp); myflags = client->exp->nbdflags; if (client->mode >= NBD_MODE_STRUCTURED) { @@ -521,7 +524,6 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes, QTAILQ_INSERT_TAIL(>exp->clients, client, next); blk_exp_ref(>exp->common); -nbd_check_meta_export(client); return 0; } @@ -641,6 +643,9 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) errp, "export '%s' not present", sane_name); } +if (client->opt == NBD_OPT_GO) { +nbd_check_meta_export(client, exp); +} /* Don't bother sending NBD_INFO_NAME unless client requested it */ if (sendname) { @@ -729,7 +734,6 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) client->check_align = check_align; QTAILQ_INSERT_TAIL(>exp->clients, client, next); blk_exp_ref(>exp->common); -nbd_check_meta_export(client); rc = 1; } return rc; @@ -852,7 +856,7 @@ static bool nbd_strshift(const char **str, const char *prefix) * Handle queries to 'base' namespace. For now, only the base:allocation * context is available. Return true
[Libguestfs] [PATCH v7 09/12] nbd/client: Request extended headers during negotiation
All the pieces are in place for a client to finally request extended headers. Note that we must not request extended headers when qemu-nbd is used to connect to the kernel module (as nbd.ko does not expect them, but expects us to do the negotiation in userspace before handing the socket over to the kernel), but there is no harm in all other clients requesting them. Extended headers are not essential to the information collected during 'qemu-nbd --list', but probing for it gives us one more piece of information in that output. Update the iotests affected by the new line of output. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: add R-b v4: rebase to earlier changes, tweak commit message for why qemu-nbd connection to /dev/nbd cannot use extended mode [Vladimir] --- nbd/client-connection.c | 2 +- nbd/client.c | 20 ++- qemu-nbd.c| 3 +++ tests/qemu-iotests/223.out| 6 ++ tests/qemu-iotests/233.out| 4 tests/qemu-iotests/241.out| 3 +++ tests/qemu-iotests/307.out| 5 + .../tests/nbd-qemu-allocation.out | 1 + 8 files changed, 38 insertions(+), 6 deletions(-) diff --git a/nbd/client-connection.c b/nbd/client-connection.c index aa0201b7107..f9da67c87e3 100644 --- a/nbd/client-connection.c +++ b/nbd/client-connection.c @@ -93,7 +93,7 @@ NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr, .do_negotiation = do_negotiation, .initial_info.request_sizes = true, -.initial_info.mode = NBD_MODE_STRUCTURED, +.initial_info.mode = NBD_MODE_EXTENDED, .initial_info.base_allocation = true, .initial_info.x_dirty_bitmap = g_strdup(x_dirty_bitmap), .initial_info.name = g_strdup(export_name ?: "") diff --git a/nbd/client.c b/nbd/client.c index a2f253062aa..29ffc609a4b 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -953,15 +953,23 @@ static int nbd_start_negotiate(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, if (fixedNewStyle) { int result = 0; +if (max_mode >= NBD_MODE_EXTENDED) { +result = nbd_request_simple_option(ioc, + NBD_OPT_EXTENDED_HEADERS, + false, errp); +if (result) { +return result < 0 ? -EINVAL : NBD_MODE_EXTENDED; +} +} if (max_mode >= NBD_MODE_STRUCTURED) { result = nbd_request_simple_option(ioc, NBD_OPT_STRUCTURED_REPLY, false, errp); -if (result < 0) { -return -EINVAL; +if (result) { +return result < 0 ? -EINVAL : NBD_MODE_STRUCTURED; } } -return result ? NBD_MODE_STRUCTURED : NBD_MODE_SIMPLE; +return NBD_MODE_SIMPLE; } else { return NBD_MODE_EXPORT_NAME; } @@ -1034,6 +1042,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, } switch (info->mode) { +case NBD_MODE_EXTENDED: case NBD_MODE_STRUCTURED: if (base_allocation) { result = nbd_negotiate_simple_meta_context(ioc, info, errp); @@ -1144,7 +1153,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, *info = NULL; result = nbd_start_negotiate(ioc, tlscreds, hostname, , - NBD_MODE_STRUCTURED, NULL, errp); + NBD_MODE_EXTENDED, NULL, errp); if (tlscreds && sioc) { ioc = sioc; } @@ -1155,6 +1164,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, switch ((NBDMode)result) { case NBD_MODE_SIMPLE: case NBD_MODE_STRUCTURED: +case NBD_MODE_EXTENDED: /* newstyle - use NBD_OPT_LIST to populate array, then try * NBD_OPT_INFO on each array member. If structured replies * are enabled, also try NBD_OPT_LIST_META_CONTEXT. */ @@ -1191,7 +1201,7 @@ int nbd_receive_export_list(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, break; } -if (result == NBD_MODE_STRUCTURED && +if (result >= NBD_MODE_STRUCTURED && nbd_list_meta_contexts(ioc, [i], errp) < 0) { goto out; } diff --git a/qemu-nbd.c b/qemu-nbd.c index 70aa3c487a0..e6681450cfe 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -235,6 +235,9 @@ static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls, printf(" opt block: %u\n", list[i].op
[Libguestfs] [PATCH v7 07/12] nbd/client: Initial support for extended headers
Update the client code to be able to send an extended request, and parse an extended header from the server. Note that since we reject any structured reply with a too-large payload, we can always normalize a valid header back into the compact form, so that the caller need not deal with two branches of a union. Still, until a later patch lets the client negotiate extended headers, the code added here should not be reached. Note that because of the different magic numbers, it is just as easy to trace and then tolerate a non-compliant server sending the wrong header reply as it would be to insist that the server is compliant. Signed-off-by: Eric Blake --- v5: fix logic bug on error reporting [Vladimir] v4: split off errp handling to separate patch [Vladimir], better function naming [Vladimir] --- include/block/nbd.h | 3 +- block/nbd.c | 2 +- nbd/client.c| 104 +--- nbd/trace-events| 3 +- 4 files changed, 74 insertions(+), 38 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 8a765e78dfb..ba8724f5336 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -389,7 +389,8 @@ int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info, Error **errp); int nbd_send_request(QIOChannel *ioc, NBDRequest *request); int coroutine_fn nbd_receive_reply(BlockDriverState *bs, QIOChannel *ioc, - NBDReply *reply, Error **errp); + NBDReply *reply, NBDMode mode, + Error **errp); int nbd_client(int fd); int nbd_disconnect(int fd); int nbd_errno_to_system_errno(int err); diff --git a/block/nbd.c b/block/nbd.c index 22d3cb11ac8..76461430411 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -458,7 +458,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie, /* We are under mutex and cookie is 0. We have to do the dirty work. */ assert(s->reply.cookie == 0); -ret = nbd_receive_reply(s->bs, s->ioc, >reply, errp); +ret = nbd_receive_reply(s->bs, s->ioc, >reply, s->info.mode, errp); if (ret == 0) { ret = -EIO; error_setg(errp, "server dropped connection"); diff --git a/nbd/client.c b/nbd/client.c index cecb0f04377..a2f253062aa 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -1346,22 +1346,29 @@ int nbd_disconnect(int fd) int nbd_send_request(QIOChannel *ioc, NBDRequest *request) { -uint8_t buf[NBD_REQUEST_SIZE]; +uint8_t buf[NBD_EXTENDED_REQUEST_SIZE]; +size_t len; -assert(request->mode <= NBD_MODE_STRUCTURED); /* TODO handle extended */ -assert(request->len <= UINT32_MAX); trace_nbd_send_request(request->from, request->len, request->cookie, request->flags, request->type, nbd_cmd_lookup(request->type)); -stl_be_p(buf, NBD_REQUEST_MAGIC); stw_be_p(buf + 4, request->flags); stw_be_p(buf + 6, request->type); stq_be_p(buf + 8, request->cookie); stq_be_p(buf + 16, request->from); -stl_be_p(buf + 24, request->len); +if (request->mode >= NBD_MODE_EXTENDED) { +stl_be_p(buf, NBD_EXTENDED_REQUEST_MAGIC); +stq_be_p(buf + 24, request->len); +len = NBD_EXTENDED_REQUEST_SIZE; +} else { +assert(request->len <= UINT32_MAX); +stl_be_p(buf, NBD_REQUEST_MAGIC); +stl_be_p(buf + 24, request->len); +len = NBD_REQUEST_SIZE; +} -return nbd_write(ioc, buf, sizeof(buf), NULL); +return nbd_write(ioc, buf, len, NULL); } /* nbd_receive_simple_reply @@ -1388,30 +1395,36 @@ static int nbd_receive_simple_reply(QIOChannel *ioc, NBDSimpleReply *reply, return 0; } -/* nbd_receive_structured_reply_chunk +/* nbd_receive_reply_chunk_header * Read structured reply chunk except magic field (which should be already - * read). + * read). Normalize into the compact form. * Payload is not read. */ -static int nbd_receive_structured_reply_chunk(QIOChannel *ioc, - NBDStructuredReplyChunk *chunk, - Error **errp) +static int nbd_receive_reply_chunk_header(QIOChannel *ioc, NBDReply *chunk, + Error **errp) { int ret; +size_t len; +uint64_t payload_len; -assert(chunk->magic == NBD_STRUCTURED_REPLY_MAGIC); +if (chunk->magic == NBD_STRUCTURED_REPLY_MAGIC) { +len = sizeof(chunk->structured); +} else { +assert(chunk->magic == NBD_EXTENDED_REPLY_MAGIC); +len = sizeof(chunk->extended); +} ret = nbd_read(ioc, (uint8_t *)chunk + sizeof(chunk->magic), - sizeof(*chunk) - sizeof(chunk->magic), "structured chunk", +
[Libguestfs] [PATCH v7 05/12] nbd/server: Enable initial support for extended headers
Time to start supporting clients that request extended headers. Now we can finally reach the code added across several previous patches. Even though the NBD spec has been altered to allow us to accept NBD_CMD_READ larger than the max payload size (provided our response is a hole or broken up over more than one data chunk), we are not planning to take advantage of that, and continue to cap NBD_CMD_READ to 32M regardless of header size. For NBD_CMD_WRITE_ZEROES and NBD_CMD_TRIM, the block layer already supports 64-bit operations without any effort on our part. For NBD_CMD_BLOCK_STATUS, the client's length is a hint, and the previous patch took care of implementing the required NBD_REPLY_TYPE_BLOCK_STATUS_EXT. We do not yet support clients that want to do request payload filtering of NBD_CMD_BLOCK_STATUS; that will be added in later patches, but is not essential for qemu as a client since qemu only requests the single context base:allocation. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: add R-b, s/8.1/8.2/ v4: split out parts into earlier patches, rebase to earlier changes, simplify handling of generic replies, retitle (compare to v3 9/14) --- docs/interop/nbd.txt | 1 + nbd/server.c | 21 + 2 files changed, 22 insertions(+) diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt index f5ca25174a6..9aae5e1f294 100644 --- a/docs/interop/nbd.txt +++ b/docs/interop/nbd.txt @@ -69,3 +69,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE NBD_CMD_FLAG_FAST_ZERO * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth" * 7.1: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports +* 8.2: NBD_OPT_EXTENDED_HEADERS diff --git a/nbd/server.c b/nbd/server.c index 8448167b12a..b09ee44c159 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -482,6 +482,10 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes, [10 .. 133] reserved (0) [unless no_zeroes] */ trace_nbd_negotiate_handle_export_name(); +if (client->mode >= NBD_MODE_EXTENDED) { +error_setg(errp, "Extended headers already negotiated"); +return -EINVAL; +} if (client->optlen > NBD_MAX_STRING_SIZE) { error_setg(errp, "Bad length received"); return -EINVAL; @@ -1264,6 +1268,10 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp) case NBD_OPT_STRUCTURED_REPLY: if (length) { ret = nbd_reject_length(client, false, errp); +} else if (client->mode >= NBD_MODE_EXTENDED) { +ret = nbd_negotiate_send_rep_err( +client, NBD_REP_ERR_EXT_HEADER_REQD, errp, +"extended headers already negotiated"); } else if (client->mode >= NBD_MODE_STRUCTURED) { ret = nbd_negotiate_send_rep_err( client, NBD_REP_ERR_INVALID, errp, @@ -1280,6 +1288,19 @@ static int nbd_negotiate_options(NBDClient *client, Error **errp) errp); break; +case NBD_OPT_EXTENDED_HEADERS: +if (length) { +ret = nbd_reject_length(client, false, errp); +} else if (client->mode >= NBD_MODE_EXTENDED) { +ret = nbd_negotiate_send_rep_err( +client, NBD_REP_ERR_INVALID, errp, +"extended headers already negotiated"); +} else { +ret = nbd_negotiate_send_rep(client, NBD_REP_ACK, errp); +client->mode = NBD_MODE_EXTENDED; +} +break; + default: ret = nbd_opt_drop(client, NBD_REP_ERR_UNSUP, errp, "Unsupported option %" PRIu32 " (%s)", -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [PATCH v7 04/12] nbd/server: Support 64-bit block status
The NBD spec states that if the client negotiates extended headers, the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if the reply does not need more than 32 bits. As of this patch, client->mode is still never NBD_MODE_EXTENDED, so the code added here does not take effect until the next patch enables negotiation. For now, all metacontexts that we know how to export never populate more than 32 bits of information, so we don't have to worry about NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we always send all zeroes for the upper 32 bits of status during NBD_CMD_BLOCK_STATUS. Note that we previously had some interesting size-juggling on call chains, such as: nbd_co_send_block_status(uint32_t length) -> blockstatus_to_extents(uint32_t bytes) -> bdrv_block_status_above(bytes, _t num) -> nbd_extent_array_add(uint64_t num) -> store num in 32-bit length But we were lucky that it never overflowed: bdrv_block_status_above never sets num larger than bytes, and we had previously been capping 'bytes' at 32 bits (since the protocol does not allow sending a larger request without extended headers). This patch adds some assertions that ensure we continue to avoid overflowing 32 bits for a narrow client, while fully utilizing 64-bits all the way through when the client understands that. Even in 64-bit math, overflow is not an issue, because all lengths are coming from the block layer, and we know that the block layer does not support images larger than off_t (if lengths were coming from the network, the story would be different). Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: stronger justification on assertion [Vladimir], add R-b v4: split conversion to big-endian across two helper functions rather than in-place union [Vladimir] --- nbd/server.c | 108 ++- 1 file changed, 82 insertions(+), 26 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 4cd061f9da4..8448167b12a 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -2103,20 +2103,24 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client, } typedef struct NBDExtentArray { -NBDExtent32 *extents; +NBDExtent64 *extents; unsigned int nb_alloc; unsigned int count; uint64_t total_length; +bool extended; bool can_add; bool converted_to_be; } NBDExtentArray; -static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc) +static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc, +NBDMode mode) { NBDExtentArray *ea = g_new0(NBDExtentArray, 1); +assert(mode >= NBD_MODE_STRUCTURED); ea->nb_alloc = nb_alloc; -ea->extents = g_new(NBDExtent32, nb_alloc); +ea->extents = g_new(NBDExtent64, nb_alloc); +ea->extended = mode >= NBD_MODE_EXTENDED; ea->can_add = true; return ea; @@ -2135,15 +2139,36 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea) int i; assert(!ea->converted_to_be); +assert(ea->extended); ea->can_add = false; ea->converted_to_be = true; for (i = 0; i < ea->count; i++) { -ea->extents[i].flags = cpu_to_be32(ea->extents[i].flags); -ea->extents[i].length = cpu_to_be32(ea->extents[i].length); +ea->extents[i].length = cpu_to_be64(ea->extents[i].length); +ea->extents[i].flags = cpu_to_be64(ea->extents[i].flags); } } +/* Further modifications of the array after conversion are abandoned */ +static NBDExtent32 *nbd_extent_array_convert_to_narrow(NBDExtentArray *ea) +{ +int i; +NBDExtent32 *extents = g_new(NBDExtent32, ea->count); + +assert(!ea->converted_to_be); +assert(!ea->extended); +ea->can_add = false; +ea->converted_to_be = true; + +for (i = 0; i < ea->count; i++) { +assert((ea->extents[i].length | ea->extents[i].flags) <= UINT32_MAX); +extents[i].length = cpu_to_be32(ea->extents[i].length); +extents[i].flags = cpu_to_be32(ea->extents[i].flags); +} + +return extents; +} + /* * Add extent to NBDExtentArray. If extent can't be added (no available space), * return -1. @@ -2154,19 +2179,27 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea) * would result in an incorrect range reported to the client) */ static int nbd_extent_array_add(NBDExtentArray *ea, -uint32_t length, uint32_t flags) +uint64_t length, uint32_t flags) { assert(ea->can_add); if (!length) { return 0; } +if (!ea->extended) { +assert(length <= UINT32_MAX); +} /* Extend previous extent if flags are the same */ if (ea->count > 0 && flags =
[Libguestfs] [PATCH v7 03/12] nbd/server: Prepare to send extended header replies
Although extended mode is not yet enabled, once we do turn it on, we need to reply with extended headers to all messages. Update the low level entry points necessary so that all other callers automatically get the right header based on the current mode. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v5: s/iov->iov_len/iov[0].iov_len/ [Vladimir], add R-b v4: new patch, split out from v3 9/14 --- nbd/server.c | 30 ++ 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index e227e470d41..4cd061f9da4 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1938,8 +1938,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov, size_t niov, uint16_t flags, uint16_t type, NBDRequest *request) { -/* TODO - handle structured vs. extended replies */ -NBDStructuredReplyChunk *chunk = iov->iov_base; size_t i, length = 0; for (i = 1; i < niov; i++) { @@ -1947,12 +1945,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov, } assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData)); -iov[0].iov_len = sizeof(*chunk); -stl_be_p(>magic, NBD_STRUCTURED_REPLY_MAGIC); -stw_be_p(>flags, flags); -stw_be_p(>type, type); -stq_be_p(>cookie, request->cookie); -stl_be_p(>length, length); +if (client->mode >= NBD_MODE_EXTENDED) { +NBDExtendedReplyChunk *chunk = iov->iov_base; + +iov[0].iov_len = sizeof(*chunk); +stl_be_p(>magic, NBD_EXTENDED_REPLY_MAGIC); +stw_be_p(>flags, flags); +stw_be_p(>type, type); +stq_be_p(>cookie, request->cookie); +stq_be_p(>offset, request->from); +stq_be_p(>length, length); +} else { +NBDStructuredReplyChunk *chunk = iov->iov_base; + +iov[0].iov_len = sizeof(*chunk); +stl_be_p(>magic, NBD_STRUCTURED_REPLY_MAGIC); +stw_be_p(>flags, flags); +stw_be_p(>type, type); +stq_be_p(>cookie, request->cookie); +stl_be_p(>length, length); +} } static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client, @@ -2509,6 +2521,8 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client, { if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) { return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp); +} else if (client->mode >= NBD_MODE_EXTENDED) { +return nbd_co_send_chunk_done(client, request, errp); } else { return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0, NULL, 0, errp); -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [PATCH v7 06/12] nbd/client: Plumb errp through nbd_receive_replies
Instead of ignoring the low-level error just to refabricate our own message to pass to the caller, we can just plumb the caller's errp down to the low level. Signed-off-by: Eric Blake --- v5: set errp on more failure cases [Vladimir], typo fix v4: new patch [Vladimir] --- block/nbd.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 4a7f37da1c6..22d3cb11ac8 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -416,7 +416,8 @@ static void coroutine_fn GRAPH_RDLOCK nbd_reconnect_attempt(BDRVNBDState *s) reconnect_delay_timer_del(s); } -static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie) +static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie, +Error **errp) { int ret; uint64_t ind = COOKIE_TO_INDEX(cookie), ind2; @@ -457,20 +458,25 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t cookie) /* We are under mutex and cookie is 0. We have to do the dirty work. */ assert(s->reply.cookie == 0); -ret = nbd_receive_reply(s->bs, s->ioc, >reply, NULL); -if (ret <= 0) { -ret = ret ? ret : -EIO; +ret = nbd_receive_reply(s->bs, s->ioc, >reply, errp); +if (ret == 0) { +ret = -EIO; +error_setg(errp, "server dropped connection"); +} +if (ret < 0) { nbd_channel_error(s, ret); return ret; } if (nbd_reply_is_structured(>reply) && s->info.mode < NBD_MODE_STRUCTURED) { nbd_channel_error(s, -EINVAL); +error_setg(errp, "unexpected structured reply"); return -EINVAL; } ind2 = COOKIE_TO_INDEX(s->reply.cookie); if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].coroutine) { nbd_channel_error(s, -EINVAL); +error_setg(errp, "unexpected cookie value"); return -EINVAL; } if (s->reply.cookie == cookie) { @@ -842,9 +848,9 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( } *request_ret = 0; -ret = nbd_receive_replies(s, cookie); +ret = nbd_receive_replies(s, cookie, errp); if (ret < 0) { -error_setg(errp, "Connection closed"); +error_prepend(errp, "Connection closed: "); return -EIO; } assert(s->ioc); -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [PATCH v7 01/12] nbd/server: Support a request payload
Upcoming additions to support NBD 64-bit effect lengths allow for the possibility to distinguish between payload length (capped at 32M) and effect length (64 bits, although we generally assume 63 bits because of off_t limitations). Without that extension, only the NBD_CMD_WRITE request has a payload; but with the extension, it makes sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload and effect length in a future patch (where the payload is a limited-size struct that in turn gives the real effect length as well as a subset of known ids for which status is requested). Other future NBD commands may also have a request payload, so the 64-bit extension introduces a new NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header length is a payload length or an effect length, rather than hard-coding the decision based on the command. According to the spec, a client should never send a command with a payload without the negotiation phase proving such extension is available. So in the unlikely event the bit is set or cleared incorrectly, the client is already at fault; if the client then provides the payload, we can gracefully consume it off the wire and fail the command with NBD_EINVAL (subsequent checks for magic numbers ensure we are still in sync), while if the client fails to send payload we block waiting for it (basically deadlocking our connection to the bad client, but not negatively impacting our ability to service other clients, so not a security risk). Note that we do not support the payload version of BLOCK_STATUS yet. Signed-off-by: Eric Blake --- v7: another try at better logic [Vladimir] v5: retitled from v4 13/24, rewrite on top of previous patch's switch statement [Vladimir] v4: less indentation on several 'if's [Vladimir] --- nbd/server.c | 37 + nbd/trace-events | 1 + 2 files changed, 34 insertions(+), 4 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 7a6f95071f8..1eabcfc908d 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -2322,9 +2322,11 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, Error **errp) { NBDClient *client = req->client; +bool extended_with_payload; bool check_length = false; bool check_rofs = false; bool allocate_buffer = false; +bool payload_okay = false; unsigned payload_len = 0; int valid_flags = NBD_CMD_FLAG_FUA; int ret; @@ -2338,6 +2340,13 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, trace_nbd_co_receive_request_decode_type(request->cookie, request->type, nbd_cmd_lookup(request->type)); +extended_with_payload = client->mode >= NBD_MODE_EXTENDED && +request->flags & NBD_CMD_FLAG_PAYLOAD_LEN; +if (extended_with_payload) { +payload_len = request->len; +check_length = true; +} + switch (request->type) { case NBD_CMD_DISC: /* Special case: we're going to disconnect without a reply, @@ -2354,6 +2363,15 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, break; case NBD_CMD_WRITE: +if (client->mode >= NBD_MODE_EXTENDED) { +if (!extended_with_payload) { +/* The client is noncompliant. Trace it, but proceed. */ +trace_nbd_co_receive_ext_payload_compliance(request->from, +request->len); +} +valid_flags |= NBD_CMD_FLAG_PAYLOAD_LEN; +} +payload_okay = true; payload_len = request->len; check_length = true; allocate_buffer = true; @@ -2395,6 +2413,14 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, request->len, NBD_MAX_BUFFER_SIZE); return -EINVAL; } +if (payload_len && !payload_okay) { +/* + * For now, we don't support payloads on other commands; but + * we can keep the connection alive by ignoring the payload. + */ +assert(request->type != NBD_CMD_WRITE); +request->len = 0; +} if (allocate_buffer) { /* READ, WRITE */ req->data = blk_try_blockalign(client->exp->common.blk, @@ -2405,10 +2431,13 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, } } if (payload_len) { -/* WRITE */ -assert(req->data); -ret = nbd_read(client->ioc, req->data, payload_len, - "CMD_WRITE data", errp); +if (payload_okay) { +assert(req->data); +ret = nbd_read(client->ioc, req->data, payload_len, + "CMD_WRITE data", errp); +} else { +ret = nbd_drop(cli
[Libguestfs] [PATCH v7 02/12] nbd/server: Prepare to receive extended header requests
Although extended mode is not yet enabled, once we do turn it on, we need to accept extended requests for all messages. Previous patches have already taken care of supporting 64-bit lengths, now we just need to read it off the wire. Note that this implementation will block indefinitely on a buggy client that sends a non-extended payload (that is, we try to read a full packet before we ever check the magic number, but a client that mistakenly sends a simple request after negotiating extended headers doesn't send us enough bytes), but it's no different from any other client that stops talking to us partway through a packet and thus not worth coding around. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy --- v6: fix sign extension bug v5: no change v4: new patch, split out from v3 9/14 --- nbd/nbd-internal.h | 5 - nbd/server.c | 43 ++- 2 files changed, 34 insertions(+), 14 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index 133b1d94b50..dfa02f77ee4 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -34,8 +34,11 @@ * https://github.com/yoe/nbd/blob/master/doc/proto.md */ -/* Size of all NBD_OPT_*, without payload */ +/* Size of all compact NBD_CMD_*, without payload */ #define NBD_REQUEST_SIZE(4 + 2 + 2 + 8 + 8 + 4) +/* Size of all extended NBD_CMD_*, without payload */ +#define NBD_EXTENDED_REQUEST_SIZE (4 + 2 + 2 + 8 + 8 + 8) + /* Size of all NBD_REP_* sent in answer to most NBD_OPT_*, without payload */ #define NBD_REPLY_SIZE (4 + 4 + 8) /* Size of reply to NBD_OPT_EXPORT_NAME */ diff --git a/nbd/server.c b/nbd/server.c index 1eabcfc908d..e227e470d41 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1411,11 +1411,13 @@ nbd_read_eof(NBDClient *client, void *buffer, size_t size, Error **errp) static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *request, Error **errp) { -uint8_t buf[NBD_REQUEST_SIZE]; -uint32_t magic; +uint8_t buf[NBD_EXTENDED_REQUEST_SIZE]; +uint32_t magic, expect; int ret; +size_t size = client->mode >= NBD_MODE_EXTENDED ? +NBD_EXTENDED_REQUEST_SIZE : NBD_REQUEST_SIZE; -ret = nbd_read_eof(client, buf, sizeof(buf), errp); +ret = nbd_read_eof(client, buf, size, errp); if (ret < 0) { return ret; } @@ -1423,13 +1425,21 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque return -EIO; } -/* Request - [ 0 .. 3] magic (NBD_REQUEST_MAGIC) - [ 4 .. 5] flags (NBD_CMD_FLAG_FUA, ...) - [ 6 .. 7] type(NBD_CMD_READ, ...) - [ 8 .. 15] cookie - [16 .. 23] from - [24 .. 27] len +/* + * Compact request + * [ 0 .. 3] magic (NBD_REQUEST_MAGIC) + * [ 4 .. 5] flags (NBD_CMD_FLAG_FUA, ...) + * [ 6 .. 7] type(NBD_CMD_READ, ...) + * [ 8 .. 15] cookie + * [16 .. 23] from + * [24 .. 27] len + * Extended request + * [ 0 .. 3] magic (NBD_EXTENDED_REQUEST_MAGIC) + * [ 4 .. 5] flags (NBD_CMD_FLAG_FUA, NBD_CMD_FLAG_PAYLOAD_LEN, ...) + * [ 6 .. 7] type(NBD_CMD_READ, ...) + * [ 8 .. 15] cookie + * [16 .. 23] from + * [24 .. 31] len */ magic = ldl_be_p(buf); @@ -1437,13 +1447,20 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque request->type = lduw_be_p(buf + 6); request->cookie = ldq_be_p(buf + 8); request->from = ldq_be_p(buf + 16); -request->len= (uint32_t)ldl_be_p(buf + 24); /* widen 32 to 64 bits */ +if (client->mode >= NBD_MODE_EXTENDED) { +request->len = ldq_be_p(buf + 24); +expect = NBD_EXTENDED_REQUEST_MAGIC; +} else { +request->len = (uint32_t)ldl_be_p(buf + 24); /* widen 32 to 64 bits */ +expect = NBD_REQUEST_MAGIC; +} trace_nbd_receive_request(magic, request->flags, request->type, request->from, request->len); -if (magic != NBD_REQUEST_MAGIC) { -error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic); +if (magic != expect) { +error_setg(errp, "invalid magic (got 0x%" PRIx32 ", expected 0x%" + PRIx32 ")", magic, expect); return -EINVAL; } return 0; -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [PATCH v7 00/12] NBD 64-bit extensions for qemu
v6 was here: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05231.html Since then: - patches v6 1-5 included in pull request - patch v6 6 logic improved, now v7 patch 1 - rebased to master Still needing review: - patch 1,6,7,11,12 Eric Blake (12): nbd/server: Support a request payload nbd/server: Prepare to receive extended header requests nbd/server: Prepare to send extended header replies nbd/server: Support 64-bit block status nbd/server: Enable initial support for extended headers nbd/client: Plumb errp through nbd_receive_replies nbd/client: Initial support for extended headers nbd/client: Accept 64-bit block status chunks nbd/client: Request extended headers during negotiation nbd/server: Refactor list of negotiated meta contexts nbd/server: Prepare for per-request filtering of BLOCK_STATUS nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS docs/interop/nbd.txt | 1 + include/block/nbd.h | 5 +- nbd/nbd-internal.h| 5 +- block/nbd.c | 67 ++- nbd/client-connection.c | 2 +- nbd/client.c | 124 -- nbd/server.c | 418 ++ qemu-nbd.c| 4 + block/trace-events| 1 + nbd/trace-events | 5 +- tests/qemu-iotests/223.out| 18 +- tests/qemu-iotests/233.out| 4 + tests/qemu-iotests/241.out| 3 + tests/qemu-iotests/307.out| 15 +- .../tests/nbd-qemu-allocation.out | 3 +- 15 files changed, 516 insertions(+), 159 deletions(-) -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [EXTERNAL] - Re: Libguestfs desired version installation
On Mon, Sep 25, 2023 at 10:16:10AM +0100, Richard W.M. Jones wrote: > On Mon, Sep 25, 2023 at 09:09:32AM +, Teja Konapalli wrote: > > Hi Richard, > > > > Yes, you are correct my ask is without upgrading OS version(from 8.2 > > to 9.0) is it possible to upgrade libguestfs alone to different > > versions. > > No, that is not supported. Worded differently: Libguestfs is open source. Therefore, you have the freedom to install whatever libguestfs version you want on whatever underlying system you want - and it may even work the way you want. However, if you choose to install a version of libguestfs on a distro that is different than the version that your distro already provides, then you will be stuck debugging any issues that arise on your own, and cannot reasonably expect either your distro or upstream to help you (your distro's answer will likely be "use the version we already debugged for you"). Upstream support is a limited resource - there are no contractual guarantees that anyone has to help you, and while it may be low cost when it works in your favor, you are far more likely to get responses based on the quality of effort you have put in to reporting problems accurately, rather than expecting others to do your homework for you. You mention using RHEL; since you are already paying to use a distro, you might as well make the most of your support contract. However, this is the upstream mailing list, and not the point of contact with your RHEL support contract. Then as Rich has already mentioned: > > > > Since you have (still) not described the exact environment you're using, or > > even accurately described the problem you're having, I don't know if this > > will make any difference. your distro support contract will likewise expect you to put in more effort of describing the actual scenario you are trying to solve, rather than the secondary problem of "can I upgrade libguestfs" that may or may not address the primary problem that you are facing. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server
On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote: > On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote: > > There are a number of ways in which gathering information can fail. > > But when possible, it makes sense to let the server know that we are > > done talking, as it minimizes the likelihood that nbdinfo's error > > message will be obscured by an accompanying error message by the > > server complaining about an unclean disconnect. > > > > For example, with a one-off qemu temporarily patched as mentioned in > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off > > 'qemu-nbd -f raw -r file &' produces: > > > > pre-patch: > > > > $ ./run nbdinfo --size nbd://localhost > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size > > 9223372036854781440 which does not fit in signed result: Value too large > > for defined data type > > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected > > end-of-file before all data were read > > This doesn't necessarily seem like a bug? It's a quality of service thing - qemu is just being noisy that the client closed the socket abruptly without giving any reason why. It doesn't change libnbd's behavior (nbdinfo has already reported its error message), but may confuse people reading qemu-nbd logs. It may also be a case where we could patch qemu-nbd to not output a complaint if the client exited at a clean point in the back-and-forth transactions. We still want to be noisy if the socket closes after the first byte has been read, but if there are zero bytes available, announcing that the client no longer cares does not add much value. > > This feels like quite a significant change in behaviour to me. In > particular I'm worried if it interacts in some subtle way with the > forking done by the "[ ... ]" syntax for running servers on the > command line (or any of the other ways that libnbd might fork/exec). Interesting observation. atexit() handlers are not preserved across exec, and I think all our fork() paths end in either exec or _exit (also where atexit handlers are ignored), so I don't think we are risking calling the handler twice. > > Can we hold this patch until after 1.18 is released and then put it > in? Should only be a week or two. Sure, being conservative on this one is fine by me. I'll delay it until after 1.18. > > Provisionally ACKed for post-1.18 > > Rich. > > > post-patch: > > $ ./run nbdinfo --size nbd://localhost > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size > > 9223372036854781440 which does not fit in signed result: Value too large > > for defined data type > > > > if nbdinfo is run in the same terminal as qemu-nbd. > > > > Signed-off-by: Eric Blake > > --- > > info/main.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/info/main.c b/info/main.c > > index c8248c61..05f19f43 100644 > > --- a/info/main.c > > +++ b/info/main.c > > @@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode) > >exit (exitcode); > > } > > > > +void > > +clean_shutdown (void) > > +{ > > + /* If we are connected but detect an error, try to give the server > > + * notice that we are done talking. Ignore failures, as this is > > + * only a courtesy measure. > > + */ > > + if (nbd) > > +nbd_shutdown (nbd, 0); > > +} > > + > > int > > main (int argc, char *argv[]) > > { > > @@ -277,6 +288,7 @@ main (int argc, char *argv[]) > > fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); > > exit (EXIT_FAILURE); > >} > > + atexit (clean_shutdown); > >nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */ > > > >/* Set optional modes in the handle. */ > > @@ -353,6 +365,7 @@ main (int argc, char *argv[]) > >free_exports (); > >nbd_shutdown (nbd, 0); > >nbd_close (nbd); > > + nbd = NULL; > > > > /* Close the output stream and copy it to the real stdout. */ > >if (fclose (fp) == EOF) { > > -- > > 2.41.0 > > > > > > -- > Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones > Read my programming and virtualization blog: http://rwmj.wordpress.com > virt-top is 'top' for virtual machines. Tiny program with many > powerful monitoring features, net stats, disk stats, logging, etc. > http://people.redhat.com/~rjones/virt-top > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server
There are a number of ways in which gathering information can fail. But when possible, it makes sense to let the server know that we are done talking, as it minimizes the likelihood that nbdinfo's error message will be obscured by an accompanying error message by the server complaining about an unclean disconnect. For example, with a one-off qemu temporarily patched as mentioned in commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off 'qemu-nbd -f raw -r file &' produces: pre-patch: $ ./run nbdinfo --size nbd://localhost /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected end-of-file before all data were read post-patch: $ ./run nbdinfo --size nbd://localhost /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type if nbdinfo is run in the same terminal as qemu-nbd. Signed-off-by: Eric Blake --- info/main.c | 13 + 1 file changed, 13 insertions(+) diff --git a/info/main.c b/info/main.c index c8248c61..05f19f43 100644 --- a/info/main.c +++ b/info/main.c @@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode) exit (exitcode); } +void +clean_shutdown (void) +{ + /* If we are connected but detect an error, try to give the server + * notice that we are done talking. Ignore failures, as this is + * only a courtesy measure. + */ + if (nbd) +nbd_shutdown (nbd, 0); +} + int main (int argc, char *argv[]) { @@ -277,6 +288,7 @@ main (int argc, char *argv[]) fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } + atexit (clean_shutdown); nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */ /* Set optional modes in the handle. */ @@ -353,6 +365,7 @@ main (int argc, char *argv[]) free_exports (); nbd_shutdown (nbd, 0); nbd_close (nbd); + nbd = NULL; /* Close the output stream and copy it to the real stdout. */ if (fclose (fp) == EOF) { -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v2 0/6] Fix fuzzer fallout
On Thu, Sep 21, 2023 at 03:57:59PM -0500, Eric Blake wrote: > Cat's out of the bag: Rich's fuzzer run found not one, but two > independent assertion failures that a malicious server could trigger > in my recent 64-bit extension code additions. What's more, in the > process of fixing them, we've discovered another long-standing issue > where nbd_get_size() returns confusing results compared to its > documentation, when talking to an odd server that reports a really > large export size. > > After off-list discussion between Rich, Laszlo, and myself, we didn't > think an embargoed CVE against libnbd is necessary (the assertion > failures only happen to unstable releases, and the nbd_get_size() > misbehavior does not happen with normal servers and has been in place > since v1.0, so it is nothing new), so I am posting the series now for > public review. But we will still be reaching out to secalert for > their opinion (it may be that they still see a low-priority exploit in > an app that gets confused when trying to use a negative size as a loop > bound, for example). Once they answer, and regardless of whether we > end up doing a libnbd CVE after all, I will follow up to the mailing > list with a security incident (client apps that demand a positive > export size should probably favor nbd_get_size()<0 over > nbd_get_size()==-1). > > Eric Blake (6): > states: Tweak comment in OPT_GO state handler > fuzzing: Disable client-side strictness checks > api: Sanitize sizes larger than INT64_MAX > block_status: Fix assertion with large server size > block_status: Fix assertion on bad 64-bit block status reply > info: Tolerate missing size After making a few edits based on the reviews, the series now in as adf32845..f8375d3c. I'll wait for an answer from secalert before posting a followup security advisory email. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] nbdinfo default output [was: [libnbd PATCH v2 6/6] info: Tolerate missing size]
On Fri, Sep 22, 2023 at 09:03:53AM -0500, Eric Blake wrote: > > > $ ./run nbdinfo nbd://localhost > > > protocol: newstyle-fixed without TLS, using extended packets > > > ... > > > block_size_maximum: 33554432 For the human-readable output, should we also be using human_size() on the block_size_* values? I'd find: block_size_minimum: 1 block_size_preferred: 4096 (4K) block_size_maximum: 33554432 (32M) somewhat easier to read. I don't know if the JSON version would need to output that extra information, though. But that's a question independent of this patch series. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v2 6/6] info: Tolerate missing size
On Fri, Sep 22, 2023 at 02:19:32PM +0200, Laszlo Ersek wrote: > On 9/21/23 22:58, Eric Blake wrote: > > As previous patches showed, the NBD spec does not yet forbid a server > > sending us a size that does not fit in int64_t. We should gracefully > > handle this during nbdinfo, rather than giving up early. > > > > With the same one-line hack to qemu to set the most significant bit of > > the export size, output changes from pre-patch: > > > > $ ./run nbdinfo nbd://localhost > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size > > 9223372036854781440 which does not fit in signed result: Value too large > > for defined data type > > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected > > end-of-file before all data were read > > > > to post-patch: > > > > $ ./run nbdinfo nbd://localhost > > protocol: newstyle-fixed without TLS, using extended packets > > ... > > block_size_maximum: 33554432 > > [1] > > > > Sadly, since writing a server with such large export sizes requires a > > one-off hack, I don't see the point in adding a unit test. > > > > Signed-off-by: Eric Blake > > --- > > info/show.c | 25 + > > 1 file changed, 13 insertions(+), 12 deletions(-) > > > > diff --git a/info/show.c b/info/show.c > > index a71d837e..3d80545e 100644 > > --- a/info/show.c > > +++ b/info/show.c > > @@ -46,7 +46,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, > > bool first, bool last) > > { > >int64_t i, size; > > - char size_str[HUMAN_SIZE_LONGEST]; > > + char size_str[HUMAN_SIZE_LONGEST] = "unavailable"; > >bool human_size_flag; Uninitialized... > >char *export_name = NULL; > >char *export_desc = NULL; > > @@ -89,13 +89,10 @@ show_one_export (struct nbd_handle *nbd, const char > > *desc, > > return false; > >} > >size = nbd_get_size (nbd); > > - if (size == -1) { > > -fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); > > -exit (EXIT_FAILURE); > > + if (size >= 0) { > > +human_size (size_str, size, _size_flag); > >} > > > > - human_size (size_str, size, _size_flag); ...and relies on human_size() to initialize it, but that is now conditionally skipped. Would matter if... > > - > >if (uri_is_meaningful ()) > > uri = nbd_get_uri (nbd); > > > > @@ -130,7 +127,8 @@ show_one_export (struct nbd_handle *nbd, const char > > *desc, > > show_context = true; > > > >/* Get content last, as it moves the connection out of negotiating */ > > - content = get_content (nbd, size); > > + if (size >= 0) > > +content = get_content (nbd, size); > > > >if (!json_output) { > > ansi_colour (ANSI_FG_BOLD_BLACK, fp); > > or > > > > $ ./run nbdinfo nbd://localhost --json > > { > > "protocol": "newstyle-fixed", > > ... > > "block_size_maximum": 33554432, > > "export-size-str": "unavailable" > > } ] > > } > > @@ -140,10 +138,12 @@ show_one_export (struct nbd_handle *nbd, const char > > *desc, > > fprintf (fp, ":\n"); > > if (desc && *desc) > >fprintf (fp, "\tdescription: %s\n", desc); > > -if (human_size_flag) > > - fprintf (fp, "\texport-size: %" PRIi64 " (%s)\n", size, size_str); > > -else > > - fprintf (fp, "\texport-size: %" PRIi64 "\n", size); > > +if (size >= 0) { > > + if (human_size_flag) > > +fprintf (fp, "\texport-size: %" PRIi64 " (%s)\n", size, size_str); ...branching on the variable is guarded by anything different than the condition that controlled whether the variable was set in the first place. Although it is not broken here, it is a trap for the unwary, so I'll be initializing human_size_flag to false at its declaration. > > + else > > +fprintf (fp, "\texport-size: %" PRIi64 "\n", size); > > +} [2] > > if (content) > >fprintf (fp, "\tcontent: %s\n", content); > > if (uri) > > @@ -273,7 +273,8 @@ show_one_export (struct nbd_handle *nbd, const char > > *desc, > > block_maximum); > > > > /* Put this one at the end because of the stupid comma thing in JSON. > > */ > &
[Libguestfs] [libnbd PATCH v2 3/6] api: Sanitize sizes larger than INT64_MAX
scope of this patch (and could surprise users who were depending on the current truncation semantics). GoLang.ml generates UInt64 via native Go 'uint64' passed through 'C.uint64_t()', and Rust.ml generates UInt64 via native Rust 'u64' interpreted as C uint64_t. In both cases, while I am unsure whether those languages (which have tighter type rules than C) let you get away with directly assigning a negative value to the native type when you really want a positive value over 2^63; but since it is a direct map of an unsigned 64-bit value between the native type and C, there should be no surprises to people fluent in those languages. OCaml.ml is a bit different; as OCaml lacks a native unsigned 64-bit type, it generates UInt64 as native 'int64' converted to C via 'Int64_val()'. Thus, an OCaml client MUST pass a negative value if they want to access offsets beyond 2^63. But again, someone familiar with the language should be familiar with the limitations. Finally to demonstrate the difference in this patch, I temporarily applied this patch to qemu (here, on top of qemu commit 49076448): | diff --git i/nbd/server.c w/nbd/server.c | index b5f93a20c9c..228ce66ed2b 100644 | --- i/nbd/server.c | +++ w/nbd/server.c | @@ -691,7 +691,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) | myflags |= NBD_FLAG_SEND_DF; | } | trace_nbd_negotiate_new_style_size_flags(exp->size, myflags); | -stq_be_p(buf, exp->size); | +stq_be_p(buf, exp->size | 0x8000ULL); | stw_be_p(buf + 8, myflags); | rc = nbd_negotiate_send_info(client, NBD_INFO_EXPORT, | sizeof(buf), buf, errp); then with a just-built 'qemu-nbd -f raw -r file -t &' running, we have pre-patch: $ nbdsh --base -u nbd://localhost -c - <<\EOF > try: > print(h.get_size()) > except nbd.Error as ex: > print(ex.string) > EOF -9223372036854770176 vs. post-patch: $ ./run nbdsh --base -u nbd://localhost -c - <<\EOF > try: > print(h.get_size()) > except nbd.Error as ex: > print(ex.string) > EOF nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type A more complex patch to qemu to mask that bit back off from the offset parameter to NBD_CMD_READ/WRITE is also possible to explore behavior of passing large offsets over the wire, although I don't show it here. All stable releases of NBD have had this return value issue. We cannot guarantee whether clients may have their own arithmetic bugs (such as treating the size as signed, then entering an infinite loop when using a negative size as a loop bound), so we will be issuing a security notice in case client apps need to file their own CVEs. However, since all known production servers do not produces sizes that large, and our audit shows that all stable releases of libnbd gracefully handle large offsets even when a client convers a negative int64_t result of nbd_get_size() back into large uint64_t offset values in subsequent API calls, we did not deem it high enough risk to issue a CVE against libnbd proper at this time, although we have reached out to Red Hat's secalert team to see if revisiting that decision might be warranted. Based on recent IRC chatter, there is also a slight possibility that some future extension to the NBD protocol could specifically allow clients to opt in to an extension where the server reports an export size of 2^64-1 (all ones) for a unidirectional connection where offsets are ignored (either a read-only export of indefinite length, or an append-only export data sink - either way, basically turning NBD into a cross-system FIFO rather than a seekable device); if such an extension materializes, we'd probably add a named constant negative sentinel value distinct from -1 for actual return from nbd_get_size() at that time. Fixes: 40881fce75 ("lib: Expose flags and export size through the API.", v0.1) Signed-off-by: Eric Blake --- generator/API.ml | 6 +- lib/flags.c | 6 ++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/generator/API.ml b/generator/API.ml index 14988403..c4547615 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -2492,7 +2492,11 @@ "get_size", { permitted_states = [ Negotiating; Connected; Closed ]; shortdesc = "return the export size"; longdesc = "\ -Returns the size in bytes of the NBD export." +Returns the size in bytes of the NBD export. + +Note that this call fails with C for an unlikely +server that advertises a size which cannot fit in a 64-bit +signed integer." ^ non_blocking_test_call_description; see_also = [SectionLink "Size of the export"; Link "opt_info"]; example = Some "examples/get-size.c"; diff --git a/lib/flags.c b/lib/flags.c index 7e6ddedd..394abe87 100644 --- a/lib/flags.c +++ b/lib/flags.c
[Libguestfs] [libnbd PATCH v2 0/6] Fix fuzzer fallout
Cat's out of the bag: Rich's fuzzer run found not one, but two independent assertion failures that a malicious server could trigger in my recent 64-bit extension code additions. What's more, in the process of fixing them, we've discovered another long-standing issue where nbd_get_size() returns confusing results compared to its documentation, when talking to an odd server that reports a really large export size. After off-list discussion between Rich, Laszlo, and myself, we didn't think an embargoed CVE against libnbd is necessary (the assertion failures only happen to unstable releases, and the nbd_get_size() misbehavior does not happen with normal servers and has been in place since v1.0, so it is nothing new), so I am posting the series now for public review. But we will still be reaching out to secalert for their opinion (it may be that they still see a low-priority exploit in an app that gets confused when trying to use a negative size as a loop bound, for example). Once they answer, and regardless of whether we end up doing a libnbd CVE after all, I will follow up to the mailing list with a security incident (client apps that demand a positive export size should probably favor nbd_get_size()<0 over nbd_get_size()==-1). Eric Blake (6): states: Tweak comment in OPT_GO state handler fuzzing: Disable client-side strictness checks api: Sanitize sizes larger than INT64_MAX block_status: Fix assertion with large server size block_status: Fix assertion on bad 64-bit block status reply info: Tolerate missing size generator/API.ml | 6 +++- generator/states-newstyle-opt-go.c | 5 ++- generator/states-reply-chunk.c | 50 -- generator/C.ml | 2 +- lib/flags.c| 6 fuzzing/libnbd-fuzz-wrapper.c | 5 ++- info/show.c| 25 --- 7 files changed, 60 insertions(+), 39 deletions(-) -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH v2 5/6] block_status: Fix assertion on bad 64-bit block status reply
If a server replies to a block status command with an invalid count in NBD_REPLY_TYPE_BLOCK_STATUS_EXT, we correctly detect the server's error, but fail to mark that we've consumed enough data off the wire to resync back to the server's next reply. Rich's fuzzing run initially found this, but I was able to quickly write a one-byte patch on top of my pending qemu patches [1] to reproduce it: [1] https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05231.html | diff --git i/nbd/server.c w/nbd/server.c | index 898580a9b0b..bd8d46ba3c4 100644 | --- i/nbd/server.c | +++ w/nbd/server.c | @@ -2326,7 +2326,7 @@ nbd_co_send_extents(NBDClient *client, NBDRequest *request, NBDExtentArray *ea, | iov[1].iov_base = _ext; | iov[1].iov_len = sizeof(meta_ext); | stl_be_p(_ext.context_id, context_id); | -stl_be_p(_ext.count, ea->count); | +stl_be_p(_ext.count, !ea->count); | | nbd_extent_array_convert_to_be(ea); | iov[2].iov_base = ea->extents; then with a just-built 'qemu-nbd -f raw -r file -t &' running, we have pre-patch: $ ./run nbdsh --base -u nbd://localhost -c - <<\EOF > def f(*k): > pass > try: > h.block_status(1,0,f) > except nbd.Error as ex: > print(ex.string) > h.shutdown() > EOF nbdsh: generator/states-reply-chunk.c:701: enter_STATE_REPLY_CHUNK_REPLY_FINISH: Assertion `h->payload_left == 0' failed. Aborted (core dumped) vs. post-patch: $ ./run nbdsh --base -u nbd://localhost -c - <<\EOF > def f(*k): > pass > try: > h.block_status(1,0,f) > except nbd.Error as ex: > print(ex.string) > h.shutdown() > EOF nbd_block_status: block-status: command failed: Protocol error Appears to be a casualty of rebasing: I added h->payload_left verification fairly late in the game, then floated it earlier in the series, and missed a spot where I added a state machine jump to RESYNC without having updated h->payload_left. An audit of h->hlen modification sites show that all other chunked reads updated h->payload_left appropriately (often in the next statement, but sometimes in a later state when that made logic easier). Requires a non-compliant server, and only possible when extended headers are negotiated, which does not affect any stable released libnbd. Thus, there is no reason to create a CVE, although since I will already be doing a security info email about previous patches also addressing fuzzer findings, I can mention this at the same time. Fixes: ab992766cd ("block_status: Accept 64-bit extents during block status") Thanks: Richard W.M. Jones Signed-off-by: Eric Blake --- generator/states-reply-chunk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/generator/states-reply-chunk.c b/generator/states-reply-chunk.c index 20407d91..5a31c192 100644 --- a/generator/states-reply-chunk.c +++ b/generator/states-reply-chunk.c @@ -476,6 +476,7 @@ REPLY.CHUNK_REPLY.RECV_BS_HEADER: if (h->bs_count != be32toh (h->sbuf.reply.payload.bs_hdr_64.count)) { h->rbuf = NULL; h->rlen = h->payload_left; +h->payload_left = 0; SET_NEXT_STATE (%RESYNC); return 0; } -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH v2 1/6] states: Tweak comment in OPT_GO state handler
While auditing code, I stumbled across a confusing comment which references a state name that does not exist. In addition to improving the comment, I added an assertion, since there is action at a distance (prepare_for_reply_payload is in states-newstyle.c) for ignoring the oversized payload. Signed-off-by: Eric Blake Reviewed-by: Richard W.M. Jones Reviewed-by: Laszlo Ersek --- generator/states-newstyle-opt-go.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/generator/states-newstyle-opt-go.c b/generator/states-newstyle-opt-go.c index 2ef440d4..5bc9a9ae 100644 --- a/generator/states-newstyle-opt-go.c +++ b/generator/states-newstyle-opt-go.c @@ -149,8 +149,11 @@ NEWSTYLE.OPT_GO.CHECK_REPLY: switch (reply) { case NBD_REP_INFO: -if (len > maxpayload /* see RECV_NEWSTYLE_OPT_GO_REPLY */) +if (len > maxpayload) { + /* See prepare_for_reply_payload, used in RECV_REPLY */ + assert (h->rbuf == NULL); debug (h, "skipping large NBD_REP_INFO"); +} else { uint16_t info; uint64_t exportsize; -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH v2 2/6] fuzzing: Disable client-side strictness checks
When fuzzing, it is more desirable to always provoke the server into sending a response, rather than sometimes accidentally skipping a wire call because a client-side strictness test failed. [Our fuzzer could probably be made even more powerful by changing the fuzzer input file to be a series of records, where each record is the API to call and then the server's response; right now, the sequence of APIs called is hard-coded, which is not as powerful at testing potential cross-command coupling. But that's a project for another day.] Signed-off-by: Eric Blake Reviewed-by: Richard W.M. Jones Reviewed-by: Laszlo Ersek --- fuzzing/libnbd-fuzz-wrapper.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fuzzing/libnbd-fuzz-wrapper.c b/fuzzing/libnbd-fuzz-wrapper.c index cbd55380..fcd1d04c 100644 --- a/fuzzing/libnbd-fuzz-wrapper.c +++ b/fuzzing/libnbd-fuzz-wrapper.c @@ -193,8 +193,11 @@ client (int sock) } /* Note we ignore errors in these calls because we are only - * interested in whether the process crashes. + * interested in whether the process crashes. Likewise, we don't + * want to accidentally avoid sending traffic to the server merely + * because client side strictness sees a problem. */ + nbd_set_strict_mode (nbd, 0); /* Enable a metadata context, for block status below. */ nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION); -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH v2 4/6] block_status: Fix assertion with large server size
As mentioned in the previous commit ("api: Sanitize sizes larger than INT64_MAX"), the NBD spec does not (yet) prohibit a server from advertising a size larger than INT64_MAX. While we can't report such size to the user, v1.16 was at least internally consistent with the server's size everywhere else. But when adding code to implement 64-bit block status, I intentionally wanted to guarantee that the callback sees a positive int64_t length even when the server's wire value can be 64 bits, for ease in writing language bindings (OCaml in particular benefitted from that guarantee), even though I didn't document that in the API until now. That was because I had blindly assumed that the server's exportsize fit in 63 bits, and therefore I didn't have to worry about arithmetic overflow once I capped the extent length to exportsize. But the fuzzer quickly proved me wrong. What's more, with the same one-line hack to qemu as shown in the previous commit to advertise a size with the high-order bit set, $ ./run nbdsh --base -u nbd://localhost -c - <<\EOF > def f(*k): > pass > h.block_status(1,0,f) > EOF nbdsh: generator/states-reply-chunk.c:554: enter_STATE_REPLY_CHUNK_REPLY_RECV_BS_ENTRIES: Assertion `h->exportsize <= INT64_MAX' failed. Aborted (core dumped) even though it did not dump core in v1.16. Since my assumption was bad, rewrite the logic to increment total after bounds-checking rather than before, and to bounds-check based on INT64_MAX+1-64M rather than on the export size. As before, we never report a zero-length extent to the callback. Whether or not secalert advises us to create a CVE for the previous patch, this bug does not deserve its own CVE as it was only introduced in recent unstable releases. Fixes: e8d837d306 ("block_status: Add some sanity checking of server lengths", v1.17.4) Thanks: Richard W.M. Jones Signed-off-by: Eric Blake --- generator/states-reply-chunk.c | 49 ++ generator/C.ml | 2 +- 2 files changed, 27 insertions(+), 24 deletions(-) diff --git a/generator/states-reply-chunk.c b/generator/states-reply-chunk.c index 2cebe456..20407d91 100644 --- a/generator/states-reply-chunk.c +++ b/generator/states-reply-chunk.c @@ -547,11 +547,16 @@ REPLY.CHUNK_REPLY.RECV_BS_ENTRIES: break; } +/* Be careful to avoid arithmetic overflow, even when the user + * disabled LIBNBD_STRICT_BOUNDS to pass a suspect offset, or the + * server returns suspect lengths or advertised exportsize larger + * than 63 bits. We guarantee that callbacks will not see a + * length exceeding INT64_MAX or the advertised h->exportsize. + */ name = h->meta_contexts.ptr[i].name; -total = 0; -cap = h->exportsize - cmd->offset; -assert (cap <= h->exportsize); -assert (h->exportsize <= INT64_MAX); +total = cap = 0; +if (cmd->offset <= h->exportsize) + cap = h->exportsize - cmd->offset; /* Need to byte-swap the entries returned into the callback size * requested by the caller. The NBD protocol allows truncation as @@ -560,10 +565,11 @@ REPLY.CHUNK_REPLY.RECV_BS_ENTRIES: * don't like. We stop iterating on a zero-length extent (error * only if it is the first extent), on an extent beyond the * exportsize (unconditional error after truncating to - * exportsize), and on an extent exceeding a 32-bit callback (no - * error, and to simplify alignment, we truncate to 4G-64M); but - * do not diagnose issues with the server's length alignments, - * flag values, nor compliance with the REQ_ONE command flag. + * exportsize), and on an extent exceeding a callback length limit + * (no error, and to simplify alignment, we truncate to 64M before + * the limit); but we do not diagnose issues with the server's + * length alignments, flag values, nor compliance with the REQ_ONE + * command flag. */ for (i = 0, stop = false; i < h->bs_count && !stop; ++i) { if (type == NBD_REPLY_TYPE_BLOCK_STATUS) { @@ -572,16 +578,12 @@ REPLY.CHUNK_REPLY.RECV_BS_ENTRIES: } else { orig_len = len = be64toh (h->bs_raw.wide[i].length); -if (len > h->exportsize) { - /* Since we already asserted exportsize is at most 63 bits, - * this ensures the extent length will appear positive even - * if treated as signed; treat this as an error now, rather - * than waiting for the comparison to cap later, to avoid - * arithmetic overflow. +if (len > INT64_MAX) { + /* Pick an aligned value rather than overflowing 64-bit + * callback; this does not require an error. */ stop = true; - cmd->error = cmd->error ? : EPROTO; - len = h->exportsize; + len = INT64_MAX + 1ULL - MAX_REQUEST_SIZE;
[Libguestfs] [libnbd PATCH v2 6/6] info: Tolerate missing size
As previous patches showed, the NBD spec does not yet forbid a server sending us a size that does not fit in int64_t. We should gracefully handle this during nbdinfo, rather than giving up early. With the same one-line hack to qemu to set the most significant bit of the export size, output changes from pre-patch: $ ./run nbdinfo nbd://localhost /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected end-of-file before all data were read to post-patch: $ ./run nbdinfo nbd://localhost protocol: newstyle-fixed without TLS, using extended packets ... block_size_maximum: 33554432 or $ ./run nbdinfo nbd://localhost --json { "protocol": "newstyle-fixed", ... "block_size_maximum": 33554432, "export-size-str": "unavailable" } ] } Sadly, since writing a server with such large export sizes requires a one-off hack, I don't see the point in adding a unit test. Signed-off-by: Eric Blake --- info/show.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/info/show.c b/info/show.c index a71d837e..3d80545e 100644 --- a/info/show.c +++ b/info/show.c @@ -46,7 +46,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, bool first, bool last) { int64_t i, size; - char size_str[HUMAN_SIZE_LONGEST]; + char size_str[HUMAN_SIZE_LONGEST] = "unavailable"; bool human_size_flag; char *export_name = NULL; char *export_desc = NULL; @@ -89,13 +89,10 @@ show_one_export (struct nbd_handle *nbd, const char *desc, return false; } size = nbd_get_size (nbd); - if (size == -1) { -fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); -exit (EXIT_FAILURE); + if (size >= 0) { +human_size (size_str, size, _size_flag); } - human_size (size_str, size, _size_flag); - if (uri_is_meaningful ()) uri = nbd_get_uri (nbd); @@ -130,7 +127,8 @@ show_one_export (struct nbd_handle *nbd, const char *desc, show_context = true; /* Get content last, as it moves the connection out of negotiating */ - content = get_content (nbd, size); + if (size >= 0) +content = get_content (nbd, size); if (!json_output) { ansi_colour (ANSI_FG_BOLD_BLACK, fp); @@ -140,10 +138,12 @@ show_one_export (struct nbd_handle *nbd, const char *desc, fprintf (fp, ":\n"); if (desc && *desc) fprintf (fp, "\tdescription: %s\n", desc); -if (human_size_flag) - fprintf (fp, "\texport-size: %" PRIi64 " (%s)\n", size, size_str); -else - fprintf (fp, "\texport-size: %" PRIi64 "\n", size); +if (size >= 0) { + if (human_size_flag) +fprintf (fp, "\texport-size: %" PRIi64 " (%s)\n", size, size_str); + else +fprintf (fp, "\texport-size: %" PRIi64 "\n", size); +} if (content) fprintf (fp, "\tcontent: %s\n", content); if (uri) @@ -273,7 +273,8 @@ show_one_export (struct nbd_handle *nbd, const char *desc, block_maximum); /* Put this one at the end because of the stupid comma thing in JSON. */ -fprintf (fp, "\t\"export-size\": %" PRIi64 ",\n", size); +if (size >= 0) + fprintf (fp, "\t\"export-size\": %" PRIi64 ",\n", size); fprintf (fp, "\t\"export-size-str\": \"%s\"\n", size_str); if (last) -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH] generator: Fix assertion with ill-formed 64-bit block status reply
If a server replies to a block status command with an invalid length in NBD_REPLY_TYPE_BLOCK_STATUS_EXT, we correctly detect the server's error, but fail to mark that we've consumed enough data off the wire to resync back to the server's next reply. Symptoms seen during a fuzzing run: │ $ ./fuzzing/libnbd-fuzz-wrapper │+id\:01\,sig\:06\,src\:000396\,time\:16888628\,execs\:54159419\,op\:havoc\,rep\:4 │ libnbd-fuzz-wrapper: generator/states-reply-chunk.c:698: enter_STATE_REPLY_CHUNK_REPLY_FINISH: │+Assertion `h->payload_left == 0' failed. │ Aborted (core dumped) │ read: Connection reset by peer Appears to be a casualty of rebasing: I added h->payload_left verification fairly late in the game, then floated it earlier in the series, and missed a spot where I added a state machine jump to RESYNC without having updated h->payload_left. An audit of all other assignments to h->rlen in that file was able to find corresponding assignments to h->payload_left (often the next statement, but sometimes split across states based on what made the next state easier to code). Requires a non-compliant server, but I was able to come up with a one-line tweak to pending qemu patches that could trigger it. Not creating a CVE as it only appears in unstable releases. Fixes: ab992766cd ("block_status: Accept 64-bit extents during block status") Thanks: Richard W.M. Jones Signed-off-by: Eric Blake --- I'm investigating another crash that Rich sent me off-list, but I suspect it will be a similar non-CVE situation caused by my recent 64-bit extension patches. I'll wait to apply this one for just a bit more, in case I can get the corpus file or two from Rich's fuzzing run to add to fuzzing/testcase_dir. generator/states-reply-chunk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/generator/states-reply-chunk.c b/generator/states-reply-chunk.c index 2cebe456..a5d3aefe 100644 --- a/generator/states-reply-chunk.c +++ b/generator/states-reply-chunk.c @@ -476,6 +476,7 @@ REPLY.CHUNK_REPLY.RECV_BS_HEADER: if (h->bs_count != be32toh (h->sbuf.reply.payload.bs_hdr_64.count)) { h->rbuf = NULL; h->rlen = h->payload_left; +h->payload_left = 0; SET_NEXT_STATE (%RESYNC); return 0; } -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] Plans for nbdkit 1.36 & libnbd 1.18
On Tue, Sep 12, 2023 at 10:00:06PM +0100, Richard W.M. Jones wrote: > The current stable versions of nbdkit & libnbd were first released on: > > nbdkit 1.34.0 => 14 April 2023 > libnbd 1.16.0 => 18 April 2023 > > which is about 5 months ago. Since we tend to release these projects > approximately every 6 months, I'd like to aim for a release at the end > of September or beginning of October, about 3 weeks from now. Sounds good to me. > > If there are large features, then let's discuss whether to add them > now or else hold off. (IIUC all 64 bit NBD patches are now upstream? > Are there more?) 64-bit extensions are complete in libnbd; I don't have any major features for libnbd outstanding at the present. For nbdkit, any work to support 64-bit extensions is far enough out that it won't make this stable release. > > I'll prepare draft release notes in the coming weeks. > > Rich. > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 10/10] XXX docs: Remove references to -U - when it is implicit
On Sat, Sep 09, 2023 at 02:57:58PM +0100, Richard W.M. Jones wrote: > XXX NOTE XXX > > I would not apply this patch immediately, since online documentation > will get updated as soon as I do that. Best to wait until after 1.36 > is released at least. At the earliest, on the day that we are ready to cut 1.36 (I don't see a problem with the docs being updated once the tarball will be ready, rather than having to wait yet longer to 1.36.1). But I agree that we aren't quite ready for the stable release yet. > > XXX END NOTE XXX > --- > docs/nbdkit-captive.pod | 6 +++--- > filters/cacheextents/nbdkit-cacheextents-filter.pod | 2 +- > filters/checkwrite/nbdkit-checkwrite-filter.pod | 6 +++--- > filters/pause/nbdkit-pause-filter.pod | 2 +- > filters/retry/nbdkit-retry-filter.pod | 2 +- > plugins/linuxdisk/nbdkit-linuxdisk-plugin.pod | 4 ++-- > plugins/nbd/nbdkit-nbd-plugin.pod | 2 +- > plugins/random/nbdkit-random-plugin.pod | 2 +- > plugins/sparse-random/nbdkit-sparse-random-plugin.pod | 2 +- > plugins/torrent/nbdkit-torrent-plugin.pod | 6 +++--- > plugins/vddk/nbdkit-vddk-plugin.pod | 4 ++-- > BENCHMARKING | 4 ++-- Unless I'm mistaken, BENCHMARKING is the only file in this list that is not live on the website, and therefore which could be hoisted into 9/10 if desired. But I see no harm in leaving it here. Obviously, we'd better not forget to apply this at the correct later date, and you'll have to touch up the commit message at that point. But once we are ready, feel free to add Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 09/10] tests: Remove references to -U - when it is implicit
On Sat, Sep 09, 2023 at 02:57:57PM +0100, Richard W.M. Jones wrote: > In tests where we used 'nbdkit -U - ... --run', remove -U - as that is > now implicit. > --- > tests/Makefile.am| 2 +- Also big, but likewise useful. (We'll still have to use explicit -U- in libnbd for a while longer, but that shouldn't hold up this patch in nbdkit). > tests/test-zero.sh | 2 +- > 133 files changed, 232 insertions(+), 232 deletions(-) After doing git grep -e '-U -', I see the remaining files are in patch 10/10 (where changing them now before the stable release is imminent could cause confusion). Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 08/10] tests: Be punctilious about using requires_run in tests that use --run
On Sat, Sep 09, 2023 at 02:57:56PM +0100, Richard W.M. Jones wrote: > This requires that nbdkit is built with the --run feature, which > (currently) is not true for Windows. (In some tests we separately > checked for !Windows, but let's favour consistency.) > --- > plugins/rust/test-ramdisk.sh | 2 ++ > tests/test-S3.sh | 1 + > tests/test-vsock.sh | 1 + > 75 files changed, 76 insertions(+) Big, but mechanical and makes sense. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 07/10] tests/test-parallel-*.sh: Remove redundant comment
On Sat, Sep 09, 2023 at 02:57:55PM +0100, Richard W.M. Jones wrote: > --- > tests/test-parallel-file.sh | 1 - > tests/test-parallel-nbd.sh | 1 - > tests/test-parallel-sh.sh | 1 - > 3 files changed, 3 deletions(-) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 06/10] server: Make --run imply -U -
On Sat, Sep 09, 2023 at 02:57:54PM +0100, Richard W.M. Jones wrote: > Almost always when you used nbdkit --run you should also use -U - (to > use a private Unix domain socket). Otherwise nbdkit listened on TCP > port 10809, which had two bad side effects: It permitted other > processes to interfere with your --run command, and it reserved a > public TCP port which would stop two instances of nbdkit running at > the same time. This was a frequent cause of bugs in test cases. Indeed, I can easily find a number of commits where we added -U- after the fact because of testsuite failures ;) > > Switch the default so now --run implies -U - > > You can still get the old behaviour by using --port explicitly, but > that is almost certainly a bad idea. (Using --run and --vsock works > the same way as before too. It is also usually a bad idea, although > we use it in one test.) > --- > docs/nbdkit-captive.pod | 7 --- > docs/nbdkit.pod | 9 - > server/main.c | 9 + > 3 files changed, 17 insertions(+), 8 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 04/10] server: Calculate $uri in one place
On Mon, Sep 11, 2023 at 03:09:21PM +0100, Richard W.M. Jones wrote: > > > - case SERVICE_MODE_UNIXSOCKET: > > > -fprintf (fp, "nbd%s+unix://", tls == 2 ? "s" : ""); > > > -if (export_name && strcmp (export_name, "") != 0) { > > > - putc ('/', fp); > > > - uri_quote (export_name, fp); > > > -} > > > -fprintf (fp, "\\?socket="); > > > -uri_quote (unixsocket, fp); > > > > Beforehand, we were manually shell-quoting the ? in the Unix URI... > > The shell quoting here was only marginally useful before this change. > In theory there might be a file in a subdirectory called > 'nbd+unix:/Xsocket=' which would match :-) > > > + switch (service_mode) { > > > + case SERVICE_MODE_UNIXSOCKET: > > > +fprintf (fp, "nbd%s+unix://", tls == 2 ? "s" : ""); > > > +if (export_name && strcmp (export_name, "") != 0) { > > > + putc ('/', fp); > > > + uri_quote (export_name, fp); > > > +} > > > +fprintf (fp, "?socket="); > > > +uri_quote (unixsocket, fp); > > > > ...where the manual shell-quoting is no longer injected. Yes, this > > looks correct (the appearance of the quoting, using '' instead of \, > > may be different, but the resulting string as parsed by the shell is > > the same). My point was that pre-patch, we had: nbd+unix://\?socket=... and post-patch, we have: 'nbd+unix://?socket=...' but both forms are equally quoted; after the shell removes \ or '' quoting, we are left with: nbd+unix://?socket=... and neither form allowed the glob expansion of the (unlikely) file nbd+unix:/Xsocket=... My comment was more about why changing the fprintf("\\?socket=") pre-patch to fprintf("?socket=") post-patch was correct. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 05/10] server: Add the NBD URI to debug output
On Sat, Sep 09, 2023 at 02:57:53PM +0100, Richard W.M. Jones wrote: > Example after applying this patch: > > $ ./nbdkit -fv --port -e foo null 1M > /home/rjones/d/nbdkit/server/nbdkit -f -v --port= -e foo -- > /home/rjones/d/nbdkit/plugins/null/.libs/nbdkit-null-plugin.so 1M > nbdkit: debug: nbdkit 1.35.12 > nbdkit: debug: TLS disabled: could not load TLS certificates > nbdkit: debug: NBD URI: nbd://localhost:/foo > > An alternative I considered was adding a --print-uri option which > would print the URI on stdout. Maybe we could do this as an > alternative later. Normally the server does not print anything on > stdout, and it is problematic in some modes, like when using -s. Yeah, we'd have to print to stderr, even though it is informative. I don't think we'll need a --print-uri option in the short term, but as you mention, it's always something we can add later if need arises. > --- > server/main.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/server/main.c b/server/main.c > index 54eb348ba..0c9019d94 100644 > --- a/server/main.c > +++ b/server/main.c > @@ -637,6 +637,8 @@ main (int argc, char *argv[]) > * Note this may be NULL. > */ >uri = make_uri (); > + if (uri) > +debug ("NBD URI: %s", uri); Reviewed-by: Eric Blake Do we also want to output a debug statement when a URI is not possible, such as under -s? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 04/10] server: Calculate $uri in one place
On Sat, Sep 09, 2023 at 02:57:52PM +0100, Richard W.M. Jones wrote: > Move the calculation of $uri to the main function (well, inlined > there), and out of --run code. > > This is largely code motion. In theory it changes the content of $uri > since we now shell quote it after generating it, but this ought not to > have any practical effect. > --- > server/internal.h | 1 + > server/captive.c | 41 ++-- > server/main.c | 79 +++ > 3 files changed, 82 insertions(+), 39 deletions(-) > > +++ b/server/captive.c > @@ -75,45 +75,8 @@ run_command (void) > >/* Construct $uri. */ >fprintf (fp, "uri="); > - switch (service_mode) { > - case SERVICE_MODE_SOCKET_ACTIVATION: > - case SERVICE_MODE_LISTEN_STDIN: > -break; /* can't form a URI, leave it blank */ > - case SERVICE_MODE_UNIXSOCKET: > -fprintf (fp, "nbd%s+unix://", tls == 2 ? "s" : ""); > -if (export_name && strcmp (export_name, "") != 0) { > - putc ('/', fp); > - uri_quote (export_name, fp); > -} > -fprintf (fp, "\\?socket="); > -uri_quote (unixsocket, fp); Beforehand, we were manually shell-quoting the ? in the Unix URI... > -break; > - case SERVICE_MODE_VSOCK: > -/* 1 = VMADDR_CID_LOCAL */ > -fprintf (fp, "nbd%s+vsock://1", tls == 2 ? "s" : ""); > -if (port) { > - putc (':', fp); > - shell_quote (port, fp); > -} > -if (export_name && strcmp (export_name, "") != 0) { > - putc ('/', fp); > - uri_quote (export_name, fp); > -} > -break; > - case SERVICE_MODE_TCPIP: > -fprintf (fp, "nbd%s://localhost", tls == 2 ? "s" : ""); > -if (port) { > - putc (':', fp); > - shell_quote (port, fp); > -} > -if (export_name && strcmp (export_name, "") != 0) { > - putc ('/', fp); > - uri_quote (export_name, fp); > -} > -break; > - default: > -abort (); > - } > + if (uri) > +shell_quote (uri, fp); ...while here, we shell-quote the entire string... >putc ('\n', fp); > >/* Since nbdkit 1.24, $nbd is a synonym for $uri. */ > diff --git a/server/main.c b/server/main.c > index 2a332bfdd..54eb348ba 100644 > --- a/server/main.c > +static char * > +make_uri (void) > + > + switch (service_mode) { > + case SERVICE_MODE_UNIXSOCKET: > +fprintf (fp, "nbd%s+unix://", tls == 2 ? "s" : ""); > +if (export_name && strcmp (export_name, "") != 0) { > + putc ('/', fp); > + uri_quote (export_name, fp); > +} > +fprintf (fp, "?socket="); > +uri_quote (unixsocket, fp); ...where the manual shell-quoting is no longer injected. Yes, this looks correct (the appearance of the quoting, using '' instead of \, may be different, but the resulting string as parsed by the shell is the same). > +break; > + case SERVICE_MODE_VSOCK: > +/* 1 = VMADDR_CID_LOCAL */ > +fprintf (fp, "nbd%s+vsock://1", tls == 2 ? "s" : ""); > +if (port) { > + putc (':', fp); > + fputs (port, fp); > +} > +if (export_name && strcmp (export_name, "") != 0) { > + putc ('/', fp); > + uri_quote (export_name, fp); > +} > +break; > + case SERVICE_MODE_TCPIP: > +fprintf (fp, "nbd%s://localhost", tls == 2 ? "s" : ""); > +if (port) { > + putc (':', fp); > + fputs (port, fp); > +} > +if (export_name && strcmp (export_name, "") != 0) { > + putc ('/', fp); > + uri_quote (export_name, fp); > +} > +break; > + > + case SERVICE_MODE_SOCKET_ACTIVATION: > + case SERVICE_MODE_LISTEN_STDIN: > +abort (); /* see above */ > + default: > +abort (); Could consolidate these labels, although I don't know if a compiler would be picky about: case ...: /* comment */ default: abort (); so I'm also fine leaving it as-is. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 03/10] server: Don't set port as a side effect
On Sat, Sep 09, 2023 at 02:57:51PM +0100, Richard W.M. Jones wrote: > This removes another long-range code dependency. > --- > server/sockets.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 02/10] server: Don't set export_name as a side effect of using --run
On Sat, Sep 09, 2023 at 02:57:50PM +0100, Richard W.M. Jones wrote: > Reduce long-range code dependencies by not setting export_name in > run_command (--run functionality), especially as this function is not > always called. > > If the -e option was not used at all then export_name will be NULL. > --- > server/captive.c | 12 +--- > 1 file changed, 5 insertions(+), 7 deletions(-) Reviewed-by: Eric Blake For other reviewers, remember that -e only impacts $uri during --run (it is still up to the plugin whether the client must connect to a particular export name, and the client's choice is not forced by -e). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 01/10] server: Introduce service_mode concept
On Sat, Sep 09, 2023 at 02:57:49PM +0100, Richard W.M. Jones wrote: > Previously there were two places where similiar-ish logic was used to > decide if we are going to serve over a socket activation, -s, Unix > socket, AF_VSOCK or TCP/IP. Let's abstract that into a service_mode. > > One place where we did this was when calculating the $uri variable for > --run. This change adjusts and fixes this calculation (revealed as I > made the above change). In particular if --port was not set then the > $uri would contain fairly bogus values in some cases: > > $ nbdkit --vsock null 1M --run 'echo $uri ; nbdinfo $uri' > nbd > nbdinfo: nbd_connect_uri: NBD URI does not have a scheme: valid NBD URIs > should start with a scheme like nbd://, nbds:// or nbd+unix://: Invalid > argument > > (note uri='nbd') > > After this commit: > > $ nbdkit --vsock null 1M --run 'echo $uri ; nbdinfo $uri' > nbd+vsock://1 > protocol: newstyle-fixed without TLS, using structured packets > export="": > export-size: 1048576 (1M) > content: data > uri: nbd+vsock://1:10809/ > ... Nice. > --- > server/internal.h | 11 +++ > server/captive.c | 56 --- > server/main.c | 75 ++----- > 3 files changed, 91 insertions(+), 51 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 00/10] Make --run imply -U -
On Mon, Sep 11, 2023 at 01:11:46PM +0100, Richard W.M. Jones wrote: > On Mon, Sep 11, 2023 at 07:08:48AM -0500, Eric Blake wrote: > > On Sat, Sep 09, 2023 at 02:57:48PM +0100, Richard W.M. Jones wrote: > > > Should have done this a long time ago. I feel it is about time we > > > change the default of nbdkit --run to imply -U -, rather than opening > > > a public port. > > > > > > Patch series turned out to be a little bit more complicated than I > > > anticipated, but it contains some nice clean ups. > > > > Indeed, just from the summary it sounds like a good idea. nbdkit can > > take advantage of it immediately, but libnbd will need to continue to > > explicitly use -U - for a while longer (as long as older nbdkit tends > > to be the version still installed on various CI platforms). > > Although it might not be useful for libnbd, we could also had a > --dump-config sentinel, something like: > > $ nbdkit --dump-config > ... > run_default_socket=Unix > > (and add run_default_socket=TCP to older branches). If you can think > of a better name for this ... :-) That actually sounds good as-is. I don't know if backporting a patch to expose run_default_socket=TCP makes a difference (after all, the absence of run_default_socket= in the config output serves the same purpose), but I'll leave that up to your call. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit 00/10] Make --run imply -U -
On Sat, Sep 09, 2023 at 02:57:48PM +0100, Richard W.M. Jones wrote: > Should have done this a long time ago. I feel it is about time we > change the default of nbdkit --run to imply -U -, rather than opening > a public port. > > Patch series turned out to be a little bit more complicated than I > anticipated, but it contains some nice clean ups. Indeed, just from the summary it sounds like a good idea. nbdkit can take advantage of it immediately, but libnbd will need to continue to explicitly use -U - for a while longer (as long as older nbdkit tends to be the version still installed on various CI platforms). > > Last patch updating the documentation wouldn't be applied any time > soon, so that the old docs stay around on the website. Also a good idea. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v4 25/25] api: Add LIBNBD_STRICT_AUTO_FLAG control to nbd_set_strict
On Wed, Aug 02, 2023 at 08:50:45PM -0500, Eric Blake wrote: > We've previously documented that nbd_set_strict() can add new bits, > defaulting to on to provide client-side safety, but which can be > overridden when performing specific integration tests against a > server. Prior to the introduction of libnbd support for extended > headers, a user could manually disable STRICT_FLAGS and > STRICT_COMMANDS to test the response of an older server receiving an > unexpected NBD_CMD_FLAG_PAYLOAD_LEN; but for convenience sake, we > prefer our default for that flag to now track extended headers and > ignore what the user passes in, which removes the ability we had for > that integration test. Adding a new strictness knob lets the user be > in charge of that bit's contents once again. The caveat remains that > disabling the strictness bits makes it possible for a client to > violate the NBD protocol which may have undefined results (the > connection may drop, libnbd may hang, ...), so most clients will never > call nbd_set_strict() in the first place. > > Signed-off-by: Eric Blake > --- > I've pushed the remainder of this series (patches 21-25) as commits 6076a806..5c1dae92. 64-bit extension support should now be fully functional in libnbd; although we may still push followup patches if anything turns up during integration testing. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] libnbd | Failed pipeline for master | 0b587f8a
On Thu, Sep 07, 2023 at 04:49:53PM +, GitLab wrote: > > > Pipeline #996353812 has failed! > > Project: libnbd ( https://gitlab.com/nbdkit/libnbd ) > Branch: master ( https://gitlab.com/nbdkit/libnbd/-/commits/master ) > > Commit: 0b587f8a ( > https://gitlab.com/nbdkit/libnbd/-/commit/0b587f8a4f3e61e27985b8a3f66a3d7da45c00ef > ) > Commit Message: info: Prefer NBD_OPT_INFO when possible > > Now th... > Commit Author: Eric Blake ( https://gitlab.com/ebblake ) > > > Pipeline #996353812 ( https://gitlab.com/nbdkit/libnbd/-/pipelines/996353812 > ) triggered by Eric Blake ( https://gitlab.com/ebblake ) > had 2 failed jobs. > > Job #5039934820 ( https://gitlab.com/nbdkit/libnbd/-/jobs/5039934820/raw ) > > Stage: builds > Name: x86_64-fedora-rawhide-clang-prebuilt-env > Job #5039934815 ( https://gitlab.com/nbdkit/libnbd/-/jobs/5039934815/raw ) > > Stage: builds > Name: x86_64-fedora-rawhide-prebuilt-env Expected transient failures; we haven't quite finished packaging nbdkit 1.35.12 and pushing it out to rawhide. Should I do another patch that widens the version check to cover 1.35.11 in addition to 1.34.[012]? Or can we just live with this on the grounds that rawhide suffers from being bleeding edge, and it will clear up soon? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] info: Prefer NBD_OPT_INFO when possible
On Tue, Sep 05, 2023 at 03:23:58PM +0100, Richard W.M. Jones wrote: > On Tue, Sep 05, 2023 at 08:47:34AM -0500, Eric Blake wrote: > > diff --git a/info/info-list-uris.sh b/info/info-list-uris.sh > > index 0d6a16a0..d8ea9108 100755 > > --- a/info/info-list-uris.sh > > +++ b/info/info-list-uris.sh > > @@ -22,7 +22,14 @@ set -e > > set -x > > > > requires nbdkit --version > > -requires nbdkit file --version > > +# Avoid tickling a double-free bug in nbdkit 1.35.11 > > +requires nbdkit -U- -r file dir=. --run 'nbdsh --opt-mode -u "$uri" -c " > > +try: > > + h.opt_info() > > +except nbd.Error: > > + pass > > +h.opt_abort() > > +"' > > > > # This test requires nbdkit >= 1.22. > > I guess the double-free bug is actually in any (or many versions of) > nbdkit between 1.22 and 1.35.11, which is why the requires test is > needed here? The message seems to imply it only affects 1.35.11. Hmm. There is indeed a range of affected versions, but it is not as bad as you make out. The bug was introduced in 185e7d (v1.33.1) and was not backported to stable-1.32; so if we assume distros only use stable versions, then rejecting 1.34.[012] is sufficient. I've already backported the fix to stable-1.34, but don't know if you would prefer to cut the 1.34.3 release or let me do it. > > It's a shame that the requires test is so long, but I couldn't see any > way to shorten it. And as you pointed out in the next message, triggering the double-free can still have the side effect of a core file. You've convinced me that a version check is the best here. > > @@ -398,6 +395,13 @@ do_connect (struct nbd_handle *nbd) > > fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); > > exit (EXIT_FAILURE); > >} > > + > > + /* If we are in opt mode, request info on the original export name. > > + * However, ignoring failure at this time is okay, as later code > > + * may want to try an alternate export name. > > + */ > > + if (nbd_aio_is_negotiating (nbd)) > > +nbd_opt_info (nbd); > > I maybe don't fully understand when nbd_aio_is_negotiating would not > be true. Aren't we always in opt mode now? We now unconditionally request it, but it is up to the server whether we actually got it. An oldstyle server does not support opt mode. And our unit tests do have coverage of connecting to an oldstyle server. At any rate, with the revisions to patch 2 and 3 (nbd_shutdown now automatically covers opt_abort without needing a new flag), and with this patch updated to just do a version range exclusion, this series is now in as db9f6bf6..0b587f8a -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
On Wed, Sep 06, 2023 at 05:02:06PM -0500, Eric Blake wrote: > > Okay, I'll merge in that aspect of my original along with your other > > improvements, and push something soon. Fingers crossed that we'll > > finally get a green CI run today. > > Now in as 75979812. Still not a green CI run; but the BSD builds now > work. The only remaining holdout is OpenSUSE Leap 15.5, where OCaml is > stuck at 4.06 and therefore lacks List.to_seq |> Map.of_seq. I'm not > sure if it is worth trying to copy the 4.07 implementations just so we > can keep the two sorted name maps (one of the reasons you WANT to use > a standard library implementation of Map is that writing a correct > self-balancing binary tree implementation is not trivial). But do we > need the resulting sets of APIs to be sorted by name? If we just > stick to constructs in 4.06, we may be hit with slower O(n^2) list > filtering when comparing which APIs go in which of the two lists (sync > vs async), but is our list of APIs so large that it actually makes a > noticeable difference to compile time? > > Rich, do you have any more ideas beyond Erik's recommendation [1] of > just dropping OpenSUSE 15.5 and focusing only on Tumbleweed? > > [1] https://gitlab.com/nbdkit/libnbd/-/merge_requests/5#note_1535510343 Thanks to stackoverflow [2], I came up with an alternative (using List.fold_left to build up from NameMap.empty through repeated NameMap.add, without any intermediate representation in Seq). As of commit db9f6bf6, we finally have green CI in libnbd again! [2] https://stackoverflow.com/questions/61220655/how-to-initialize-map-from-list-in-ocaml -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
On Tue, Sep 05, 2023 at 09:45:44AM -0500, Eric Blake wrote: > On Mon, Sep 04, 2023 at 06:59:15PM +, Tage Johansson wrote: > > > > > > +// can be both readable and writable, but we survive just fine > > > > +// if we only see one direction even when both are available. > > > > +poll.registry().register( > > > > + SourceFd(), > > > > +Token(0), > > > > +MioInterest::READABLE | MioInterest::WRITABLE, > > > > +)?; > > > > +poll.poll( events, Some(Duration::ZERO))?; > > > Why do we want 'poll.poll()?;', that is, to fail this function if the > > > poll returns an error? We _expect_ poll to sometimes return an error > > > (namely, the fact that it timed out) if there is nothing pending on > > > the fd, at which point we WANT to successfully clear the ready_guard > > > for both read and write, rather than to error out of this function. > > > > > > You are right. I thought that the poll() call would return Ok(()) upon > > timeout, but according to the documentation: > > > > > Currently if the timeout elapses without any readiness events triggering > > > this will return Ok(()). However we’re not guaranteeing this behaviour > > > as this depends on the OS. > > > > So I guess it is best to ignore any errors from the poll call as in your > > patch. > > Okay, I'll merge in that aspect of my original along with your other > improvements, and push something soon. Fingers crossed that we'll > finally get a green CI run today. Now in as 75979812. Still not a green CI run; but the BSD builds now work. The only remaining holdout is OpenSUSE Leap 15.5, where OCaml is stuck at 4.06 and therefore lacks List.to_seq |> Map.of_seq. I'm not sure if it is worth trying to copy the 4.07 implementations just so we can keep the two sorted name maps (one of the reasons you WANT to use a standard library implementation of Map is that writing a correct self-balancing binary tree implementation is not trivial). But do we need the resulting sets of APIs to be sorted by name? If we just stick to constructs in 4.06, we may be hit with slower O(n^2) list filtering when comparing which APIs go in which of the two lists (sync vs async), but is our list of APIs so large that it actually makes a noticeable difference to compile time? Rich, do you have any more ideas beyond Erik's recommendation [1] of just dropping OpenSUSE 15.5 and focusing only on Tumbleweed? [1] https://gitlab.com/nbdkit/libnbd/-/merge_requests/5#note_1535510343 -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit] server: Move size parsing code (nbdkit_parse_size) to common/include
On Tue, Sep 05, 2023 at 11:09:02AM +0100, Richard W.M. Jones wrote: > > > +static inline int64_t > > > +human_size_parse (const char *str, > > > + const char **error, const char **pstr) > > > +{ > > > + int64_t size; > > > + char *end; > > > + uint64_t scale = 1; > > > + > > > + /* XXX Should we also parse things like '1.5M'? */ > > > + /* XXX Should we allow hex? If so, hex cannot use scaling suffixes, > > > + * because some of them are valid hex digits. > > > + */ > > > + errno = 0; > > > + size = strtoimax (str, , 10); > > > > (1) A further improvement here (likely best done in a separate patch) > > could be to change the type of "size" to "intmax_t", from "int64_t". > > That way, the assignment will be safe even theoretically, *and* the > > overflow check at the bottom of the function (with the division & > > comparison of the quotient against INT_MAX) will work just the same. > > I'm always very unsure how this all works. In particular I seem to > recall that intmax_t is no longer really the maximum possible int > (because of int128) and so it's always 64 bit on platforms we care > about. Can Eric comment? intmax_t was supposed to be whatever the compiler supports as its largest integer type; but you are right that when gcc added __int128_t, they did NOT change intmax_t at the time (arguably by using weasel-words that as an implementation-defined type, it was not an integer type merely because you can't write integer literals of that type, even though it behaves integral in every other aspect). We're kind of in a frozen state where making intmax_t larger than 64 bits will break more programs than expected because it has ABI implications: https://stackoverflow.com/questions/21265462/why-in-g-stdintmax-t-is-not-a-int128-t My personal preference is to avoid intmax_t, as it has too much baggage (the risk of future widening, vs. NOT being the widest type after all), similar to how size_t already has baggage. In short, having something that is not platform specific is easier to reason about (for the same way that using size_t gives me more grief than directly using int32_t or int64_t; even though size_t is such a naturally occuring type, the fact that it is not uniform width makes it trickier to work with). > > > + > > > + if (INT64_MAX / scale < size) { > > > +*error = "could not parse size: size * scale overflows"; > > > +*pstr = str; > > > +return -1; > > > + } And thus I prefer that this comparison stay pegged to INT64_MAX, and not INT_MAX. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
On Mon, Sep 04, 2023 at 06:59:15PM +, Tage Johansson wrote: > > > > +// can be both readable and writable, but we survive just fine > > > +// if we only see one direction even when both are available. > > > +poll.registry().register( > > > + SourceFd(), > > > +Token(0), > > > +MioInterest::READABLE | MioInterest::WRITABLE, > > > +)?; > > > +poll.poll( events, Some(Duration::ZERO))?; > > Why do we want 'poll.poll()?;', that is, to fail this function if the > > poll returns an error? We _expect_ poll to sometimes return an error > > (namely, the fact that it timed out) if there is nothing pending on > > the fd, at which point we WANT to successfully clear the ready_guard > > for both read and write, rather than to error out of this function. > > > You are right. I thought that the poll() call would return Ok(()) upon > timeout, but according to the documentation: > > > Currently if the timeout elapses without any readiness events triggering > > this will return Ok(()). However we’re not guaranteeing this behaviour > > as this depends on the OS. > > So I guess it is best to ignore any errors from the poll call as in your > patch. Okay, I'll merge in that aspect of my original along with your other improvements, and push something soon. Fingers crossed that we'll finally get a green CI run today. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 6/7] rust: async: Add an example
On Mon, Sep 04, 2023 at 04:56:07PM +, Tage Johansson wrote: > > On 9/1/2023 10:41 PM, Eric Blake wrote: > > On Sat, Aug 26, 2023 at 11:29:59AM +, Tage Johansson wrote: > > > This patch adds an example using the asynchronous Rust bindings. > > > --- > > > rust/Cargo.toml| 1 + > > > rust/examples/concurrent-read-write.rs | 149 + > > > rust/run-tests.sh.in | 2 + > > > 3 files changed, 152 insertions(+) > > > create mode 100644 rust/examples/concurrent-read-write.rs > > > > > > diff --git a/rust/Cargo.toml b/rust/Cargo.toml > > > index c49f9f2..0879b34 100644 > > > --- a/rust/Cargo.toml > > > +++ b/rust/Cargo.toml > > > @@ -58,5 +58,6 @@ default = ["log", "tokio"] > > > anyhow = "1.0.72" > > > once_cell = "1.18.0" > > > pretty-hex = "0.3.0" > > > +rand = { version = "0.8.5", default-features = false, features = > > > ["small_rng", "min_const_gen"] } > > > tempfile = "3.6.0" > > > tokio = { version = "1.29.1", default-features = false, features = > > > ["rt-multi-thread", "macros"] } > > > diff --git a/rust/examples/concurrent-read-write.rs > > > b/rust/examples/concurrent-read-write.rs > > > new file mode 100644 > > > index 000..4858f76 > > > --- /dev/null > > > +++ b/rust/examples/concurrent-read-write.rs > > > @@ -0,0 +1,149 @@ > > > +//! Example usage with nbdkit: > > > +//! nbdkit -U - memory 100M \ > > > +//! --run 'cargo run --example concurrent-read-write -- > > > $unixsocket' > > > +//! Or connect over a URI: > > > +//! nbdkit -U - memory 100M \ > > > +//! --run 'cargo run --example concurrent-read-write -- $uri' > > Should be "$uri" here (to avoid accidental shell globbing surprises). > > > > > +//! > > > +//! This will read and write randomly over the first megabyte of the > > This says first megabyte...[1] > > > If I understand my code correctly, it is actually reading and writing > randomly over the entire memory. Yes (aka, the comment is wrong in the source file, we should clean it up there as well as here). > > > > +//! plugin using multi-conn, multiple threads and multiple requests in > > > +//! flight on each thread. > > > + > > > +#![deny(warnings)] > > > +use rand::prelude::*; > > > +use std::env; > > > +use std::sync::Arc; > > > +use tokio::task::JoinSet; > > > + > > > +/// Number of simultaneous connections to the NBD server. > > > +/// > > > +/// Note that some servers only support a limited number of > > > +/// simultaneous connections, and/or have a configurable thread pool > > > +/// internally, and if you exceed those limits then something will break. > > > +const NR_MULTI_CONN: usize = 8; > > > + > > > +/// Number of commands that can be "in flight" at the same time on each > > > +/// connection. (Therefore the total number of requests in flight may > > > +/// be up to NR_MULTI_CONN * MAX_IN_FLIGHT). > > > +const MAX_IN_FLIGHT: usize = 16; > > > + > > > +/// The size of large reads and writes, must be > 512. > > > +const BUFFER_SIZE: usize = 1024; > > 1024 isn't much larger than 512. It looks like you borrowed heavily > > from examples/threaded-reads-and-writes.c, but that used 1M as the > > large buffer. > > > The reason for this is that we can't read and write to a shared buffer > simultaneously in safe Rust. So the buffer gets created on the fly for each > read/write operation. And creating a 1M buffer so many times seemd like a > bit too much memory comsumtion. Agreed that lots of memory use is undesirable. The original C test exploits and documents that it is doing an unsafe optimization of reusing a single buffer across simultaneous operations merely to demonstrate speed of the API; so it is interesting proof that Rust has succeeded in this instance at its goal of letting the compiler tell us that our (known) unsafe buffer sharing is not kosher. Still, that's WHY Rust has the 'unsafe' keyword, to let us explicitly document when we know we are doing something that is unusual. It seems to me that this example would be truer to the original example if we are able to add in 'unsafe' and intentionally share a buffer across multiple threads (despite access to the contents of that buffer no longer being well-defined) than to avoid the 'unsafe' keyword. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH 0/3] Simplify nbd_shutdown vs. opt mode
On Fri, Sep 01, 2023 at 10:28:52PM +0100, Richard W.M. Jones wrote: > On Tue, Aug 29, 2023 at 05:20:40PM -0500, Eric Blake wrote: > > While working on a larger set of patches to make nbdinfo favor > > NBD_OPT_INFO over NBD_OPT_GO where possible (which requires use of > > nbd_set_opt_mode(,true) in more cases), I noticed that it got unwieldy > > to have to pick the correct shutdown function in all code paths. So I > > propose making the API smarter, by adding an opt-in flag that does the > > right thing on my behalf. > > > > If you have an idea for a better name for the flag, or think this > > functionality should be enabled by default, let me know. Part of the > > reason for choosing a new flag is that it becomes a compile-time > > witness of whether nbd_shutdown has the desired capability (if we > > allow it to auto-opt_abort without a flag, it's harder to tell whether > > we are running against an older libnbd where it errors out instead). > > My feeling is this should be enabled by default, as that does the > right thing by default. Makes sense. Certainly easier to write nbd_shutdown(nbd, 0) than nbd_shutdown(nbd, LIBNBD_SHUTDOWN_COVER_OPT_MODE). > > Whether or not we need to have a flag to disable it (ie the opposite > sense to the proposed flag) is up to you. At this point, there are so few affected callers (most programs don't use nbd_set_opt_mode, and nbdinfo is patched by this series), so for now I won't bother to add a witness flag; but I will go ahead and backport the fix to stable branches. Adding a flag in a followup patch (with the opposite sense of NOT shutting down if still in opt mode - mainly as the witness of the feature existing) is still an option. > > For the series: > Reviewed-by: Richard W.M. Jones I'll reply again with commit ids once I've amended patch 2 and 3; I've also just replied with a separate mail (oops, I didn't title it 4/3) with the reasons behind this series in the first place, before I can finish checking in the tail end of the 64-bit extension series. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH] info: Prefer NBD_OPT_INFO when possible
Now that libnbd supports extended headers, it is worth observing that the qemu implementation for NBD_CMD_BLOCK_STATUS that supports an extended header for filtering the server's response, as advertised by NBD_FLAG_BLOCK_STAT_PAYLOAD, is conditionally advertised. When using NBD_OPT_GO, the flag is only advertised if the client requested more than one meta context (as otherwise, there is no use filtering). But for NBD_OPT_INFO, the flag is advertised if extended headers are enabled, regardless of whether meta contexts are negotiated yet. In order for upcoming libnbd patches to add support for probing and using this bit reliably, we therefore need 'nbdinfo --can ...' to favor the information learned by NBD_OPT_INFO instead of NBD_OPT_GO, because that is lighter weight than figuring out whether an export supports at least two meta contexts that can then be negotiated. Do this by changing nbdinfo to prefer opt mode always, then touching up the few places (--map, --content) that need to ensure NBD_OPT_GO. This is made easier by the previous patches that make nbd_shutdown() work sanely regardless of whether we are still in opt mode. In turn, this patch flushed out a double-free bug in 'nbdkit file dir=...' when querying opt_info on the default name, so a couple of unit tests need to avoid false negatives on platforms where nbdkit commit 39d62de9 is not yet backported. Signed-off-by: Eric Blake --- info/info-list-uris.sh | 9 - info/info-uri.sh | 9 - info/main.c| 14 +- info/map.c | 7 +++ 4 files changed, 32 insertions(+), 7 deletions(-) diff --git a/info/info-list-uris.sh b/info/info-list-uris.sh index 0d6a16a0..d8ea9108 100755 --- a/info/info-list-uris.sh +++ b/info/info-list-uris.sh @@ -22,7 +22,14 @@ set -e set -x requires nbdkit --version -requires nbdkit file --version +# Avoid tickling a double-free bug in nbdkit 1.35.11 +requires nbdkit -U- -r file dir=. --run 'nbdsh --opt-mode -u "$uri" -c " +try: + h.opt_info() +except nbd.Error: + pass +h.opt_abort() +"' # This test requires nbdkit >= 1.22. minor=$( nbdkit --dump-config | grep ^version_minor | cut -d= -f2 ) diff --git a/info/info-uri.sh b/info/info-uri.sh index d491e904..ba0c36d2 100755 --- a/info/info-uri.sh +++ b/info/info-uri.sh @@ -24,7 +24,14 @@ set -e set -x requires nbdkit --version -requires nbdkit file --version +# Avoid tickling a double-free bug in nbdkit 1.35.11 +requires nbdkit -U- -r file dir=. --run 'nbdsh --opt-mode -u "$uri" -c " +try: + h.opt_info() +except nbd.Error: + pass +h.opt_abort() +"' requires nbdkit -U - null --run 'test "$uri" != ""' requires jq --version diff --git a/info/main.c b/info/main.c index 204290a5..80d38224 100644 --- a/info/main.c +++ b/info/main.c @@ -138,7 +138,6 @@ main (int argc, char *argv[]) size_t output_len = 0; bool content_flag = false, no_content_flag = false; bool list_okay = true; - bool opt_mode = false; progname = argv[0]; colour = isatty (STDOUT_FILENO); @@ -281,11 +280,9 @@ main (int argc, char *argv[]) nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */ /* Set optional modes in the handle. */ - opt_mode = !can && !map && !size_only; - if (opt_mode) { -nbd_set_opt_mode (nbd, true); + nbd_set_opt_mode (nbd, true); + if (!can && !map && !size_only) nbd_set_full_info (nbd, true); - } if (map) nbd_add_meta_context (nbd, map); @@ -398,6 +395,13 @@ do_connect (struct nbd_handle *nbd) fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } + + /* If we are in opt mode, request info on the original export name. + * However, ignoring failure at this time is okay, as later code + * may want to try an alternate export name. + */ + if (nbd_aio_is_negotiating (nbd)) +nbd_opt_info (nbd); } /* The URI field in output is not meaningful unless there's a diff --git a/info/map.c b/info/map.c index 594f24ff..399e0355 100644 --- a/info/map.c +++ b/info/map.c @@ -54,6 +54,13 @@ do_map (void) uint64_t offset, align, max_len; size_t prev_entries_size; + /* Map mode requires switching over to transmission phase. */ + if (nbd_aio_is_negotiating (nbd) && + nbd_opt_go (nbd) == -1) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + /* Did we get the requested map? */ if (!nbd_can_meta_context (nbd, map)) { fprintf (stderr, -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
On Mon, Sep 04, 2023 at 04:07:54PM +, Tage Johansson wrote: > From: Eric Blake > > CI shows our async handle fails to build on FreeBSD and MacOS (where > epoll() is not available as a syscall, and therefore not available as > a Rust crate). We can instead accomplish the same level-probing > effects by doing a zero-timeout poll with mio (which defers under the > hood to epoll on Linux, and kqueue on BSD). > > Fixes: 223a9965 ("rust: async: Create an async friendly handle type") > CC: Tage Johansson > Signed-off-by: Eric Blake > --- > rust/Cargo.toml | 3 ++- > rust/src/async_handle.rs | 40 +++- > 2 files changed, 25 insertions(+), 18 deletions(-) > > -// Use epoll to check the current read/write availability on the fd. > +// Use mio poll to check the current read/write availability on the > fd. > // This is needed because Tokio supports only edge-triggered > // notifications but Libnbd requires level-triggered notifications. > -let mut revent = epoll::Event { data: 0, events: 0 }; > // Setting timeout to 0 means that it will return immediately. > -epoll::wait(epfd, 0, std::slice::from_mut( revent))?; > -let revents = Events::from_bits(revent.events).unwrap(); > -if !revents.contains(Events::EPOLLIN) { > -ready_guard.clear_ready_matching(IoReady::READABLE); > -} > -if !revents.contains(Events::EPOLLOUT) { > -ready_guard.clear_ready_matching(IoReady::WRITABLE); > +// mio states that it is OS-dependent on whether a single event > +// can be both readable and writable, but we survive just fine > +// if we only see one direction even when both are available. > +poll.registry().register( > + SourceFd(), > +Token(0), > +MioInterest::READABLE | MioInterest::WRITABLE, > +)?; > +poll.poll( events, Some(Duration::ZERO))?; Why do we want 'poll.poll()?;', that is, to fail this function if the poll returns an error? We _expect_ poll to sometimes return an error (namely, the fact that it timed out) if there is nothing pending on the fd, at which point we WANT to successfully clear the ready_guard for both read and write, rather than to error out of this function. > +for event in { > +if !event.is_readable() { > +ready_guard.clear_ready_matching(IoReady::READABLE); > +} > +if !event.is_writable() { > +ready_guard.clear_ready_matching(IoReady::WRITABLE); > +} > } > ready_guard.retain_ready(); > +poll.registry().deregister( SourceFd())?; Furthermore, if the regsiter()/deregister() should always occur in pairs, the early poll()? exit breaks that assumption (I don't know if Rust has enough smarts to automatically clean up the unpaired registration). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 6/7] rust: async: Add an example
it.unwrap().unwrap()?; > +stats.requests += this_stats.requests; > +} > + > +// Make sure the number of requests that were required matches what > +// we expect. > +assert_eq!(stats.requests, NR_MULTI_CONN * NR_CYCLES); > + > +Ok(()) > +} > + > +async fn run_thread( > +task_idx: usize, > +socket_or_uri: String, > +export_size: u64, > +) -> anyhow::Result { > +// Start a new connection to the server. > +// We shall spawn many commands concurrently on different tasks and those > +// futures must be `'static`, hence we wrap the handle in an [Arc]. > +let nbd = Arc::new(libnbd::AsyncHandle::new()?); > + > +// Check if the user provided a URI or a unix socket. > +if socket_or_uri.contains("://") { > +nbd.connect_uri(socket_or_uri).await?; > +} else { > +nbd.connect_unix(socket_or_uri).await?; > +} > + > +let mut rng = SmallRng::seed_from_u64(44 as u64); Why are you hard-coding this to the same fixed seed, for every worker thread? > + > +// Issue commands. > +let mut stats = Stats::default(); > +let mut join_set = JoinSet::new(); > +//tokio::time::sleep(std::time::Duration::from_secs(1)).await; Does this comment need to remain? > +while stats.requests < NR_CYCLES || !join_set.is_empty() { > +while stats.requests < NR_CYCLES && join_set.len() < MAX_IN_FLIGHT { > +// If we want to issue another request, do so. Note that we > reuse > +// the same buffer for multiple in-flight requests. It doesn't > +// matter here because we're just trying to write random stuff, > +// but that would be Very Bad in a real application. > +// Simulate a mix of large and small requests. > +let size = if rng.gen() { BUFFER_SIZE } else { 512 }; Is this missing a math operation to check only one bit of the generated value? /me goes and reads more Rust docs Oh, so rng.gen() is shorthand for rng.gen::() when used in a boolean context, which really does have a 50/50 split in returned values? Cool shorthand! > +let offset = rng.gen_range(0..export_size - size as u64); [1]...but you are actually targetting anywhere in the image. It looks like that was a copy-paste error from a comment that has gone stale after refactoring in the original C code. > + > +let mut buf = [0u8; BUFFER_SIZE]; > +let nbd = nbd.clone(); > +if rng.gen() { > +join_set.spawn(async move { > +nbd.pread( buf, offset, None).await > +}); > +} else { > +// Fill the buf with random data. > +rng.fill( buf); > +join_set > +.spawn(async move { nbd.pwrite(, offset, None).await > }); I'm impressed: the async handle makes for functions that use a lot less boilerplate than the comparable C nbd_aio_* counterparts. > +} > +stats.requests += 1; > +} > +join_set.join_next().await.unwrap().unwrap()?; > +} > + > +if task_idx == 0 {} > +Ok(stats) > +} > diff --git a/rust/run-tests.sh.in b/rust/run-tests.sh.in > index 3ebf9a1..661c018 100755 -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll
On Fri, Sep 01, 2023 at 08:35:58AM +, Tage Johansson wrote: > > On 8/30/2023 10:11 PM, Eric Blake wrote: > > CI shows our async handle fails to build on FreeBSD and MacOS (where > > epoll() is not available as a syscall, and therefore not available as > > a Rust crate). We can instead accomplish the same level-probing > > effects by doing a zero-timeout poll with mio (which defers under the > > hood to epoll on Linux, and kqueue on BSD). > > > The problem with mio is that it uses edge triggered notifications. According > to this page <https://docs.rs/mio/latest/mio/struct.Poll.html>: > > > Draining readiness > > Once a readiness event is received, the corresponding operation must be > > performed repeatedly until it returns variant > > std::io::error::ErrorKind::WouldBlock. Unless this is done, there is no > > guarantee that another readiness event will be delivered, even if > > further data is received for the event source. Would that still be true even if we deregister the interest after our one-shot poll, and reregister it immediately before-hand? In other words, even if registration sets up edge triggers for future notifications, the fact that we are only doing a one-shot query seems like it would be sufficient to use our one-shot as a level probe (assuming that the registration starts by learning the current state of the fd before waiting for any future edges). I agree that once we get a notification after a .poll(), the normal assumption is that we must loop and consume all data before calling .poll again (because the .poll would block until the next edge - but if we didn't consume to the blocking point, there is no next edge). But since we are NOT looping, nor doing any consumption work, our timeout of 0 is trying to behave as an instant level check, and if adding a reregister/deregister wrapper around the .poll makes it more reliable (by wiping out all prior events and re-priming the registration to start with the current level), that may be all the more we need. > > The motivation behind using epoll in addition to tokio in the first place > was to check if fd is readable/writable without necessarily knowing if the > last read/write operation returned [ErrorKind::WouldBlock]. And it doesn't > seems that mio full fills that requirenment. Do we need to expand the libnbd API itself to make it more obvious whether our state machine completed because it hit a EWOULDBLOCK scenario? In general, once you kick the state machine (by sending a new command, or calling one of the two aio_notify calls), the state machine tries to then process as much data as possible until it blocks again, rather than stopping after just processing a single transaction. That is, if libnbd isn't already designed to work with edge triggers, how much harder would it be to modify it (perhaps by adding a new API) so that it DOES work nicely with edge triggers? If we can solve _that_ problem, then the Rust async handle wouldn't need to use epoll, mio, or anything else; tokio's edge trigger behavior would already be sufficient on its own. > > > I tried using the polling <https://docs.rs/polling/latest/polling/> crate, > which uses epoll and kqueue under the hood. but for some reason the > disconnect call failes, at least on my Linux machine. I have know idea > what's the problem. > > > Another idea would be that we create our own abstraction upon epoll and > kqueue. But I am unable to test any kqueue implementation, I think that I > can not even compile it on my machine. If there is already a solution out there that works, let's favor that instead of reimplementing a portability wrapper ourselves. > > > It would of course also be possible to disable the Rust bindings (or at > least the asynchronous API) on BSD and MacOS, but that would not be a good > solution I think. Yeah, crippling the BSD build is less desirable, even if doing it in the short term to get green CI is tempting. You mentioned that the disconnect API was hanging. Is it because we have not set up tokio to inform us when it detects that a socket has closed? Admittedly, I haven't reproduced the hang you mentioned yet; is it something that was easy to reproduce (for example, just remove all the epoll code without replacement, then run 'make check')? If I can reproduce the hang, maybe I can learn from strace where things are getting stuck? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data
On Thu, Aug 31, 2023 at 10:52:59AM +0100, Richard W.M. Jones wrote: > > >Other plugins (eg written in C) ignore or > > > only partially consume the buffer in pwrite all the time, and that's > > > never a problem. > > > > I don't understand. > > > > https://libguestfs.org/nbdkit-plugin.3.html#pwrite > > > > """ > > The callback must write the whole count bytes if it can. The NBD > > protocol doesn't allow partial writes (instead, these would be errors). > > If the whole count bytes was written successfully, the callback should > > return 0 to indicate there was no error. > > > > If there is an error (including a short write which couldn't be > > recovered from), .pwrite should call nbdkit_error with an error message, > > and nbdkit_set_error to record an appropriate error (unless errno is > > sufficient), then return -1. > > """ > > > > How is it safe for a plugin (written in C) to *silently* throw away part > > of the data that the client wants written? > > Plugins can do what they want and we have plugins which do discard > data, silently or otherwise: > > https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/full/full.c#L128 > https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/null/null.c#L125 > https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/ones/ones.c#L133 > > > I agree consistency between sh and other plugins is good, but the > > "baseline" seems unfathomable to me. I probably don't understand it. > > sh plugin is often used for testing where the data written doesn't > matter but we're interested in other metadata like what write calls > were issued, eg: > > https://gitlab.com/nbdkit/libnbd/-/blob/c713529e9fd0641b2d73f764517b5f9c21a767fd/copy/copy-sparse-no-extents.sh#L49 I look at it as: the plugin decides HOW to process the bytes that the client requested to write. A traditional plugin will store the data so it can be read back later (in which case, failing to read all of stdin better result in the plugin returning failure). But during testing, a plugin can declare "I am a data sink; I don't care what bytes you write, my behavior is to ignore them and focus on something else" (for example, it can be convenient to throw out a quick sh plugin that has a side effect of counting .pwrite calls). Such a plugin is going to confuse any client that expects to read back what it just wrote, but when you are testing the protocol with a client that only sends writes (and doesn't even attempt to send read commands, because it knows the writes aren't persistent), that is just fine. In other words, as long as your plugin documents that writes are not persistent, then returning success without draining stdin should be acceptable; just the same as a C plugin .pwrite that ignores the input buffer can still return success. Of course, until newer nbdkit has propagated to systems we care about, any unit tests using sh or eval plugins as a data sink still have to inject 'cat >/dev/null' to drain stdin, in order to pass under heavy load when EPIPE even happens in the first place (I was surprised at how hard it is to lose the race and actually get an EPIPE message from nbdkit talking to the sh plugin script on my machine). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data
On Thu, Aug 31, 2023 at 11:12:59AM +0200, Laszlo Ersek wrote: > On 8/31/23 10:02, Richard W.M. Jones wrote: > > > > On Wed, Aug 30, 2023 at 05:21:19PM -0500, Eric Blake wrote: > >> I hit another transient failure in libnbd CI when a poorly-written > >> eval script did not consume all of stdin during .pwrite. As behaving > >> as a data sink can be a somewhat reasonable feature of a > >> quickly-written sh or eval plugin, we should not be so insistent as > >> treating an EPIPE failure as an immediate return of EIO to the client. > > > > I was thinking about this over night, and came to the conclusion that > > it's always fine to ignore EPIPE errors. > > Interesting; I formed the opposite impression! It took me a couple of tries to realize the subtle distinction. If the child process (aka the plugin script) dies with SIGPIPE (and we intentionally do signal(SIGPIPE, SIG_DFL) before exec'ing the child process, as it is notoriously hard to undo an inherited ignored SIGPIPE in shell), the parent process will see WIFSIGNALED and treat that as an EIO error to the NBD client. If the child process chooses to ignore SIGPIPE, but then sees its own EPIPE failure and exits with non-zero status as a result, the parent process will see the non-zero exit status and report an appropriate error to the NBD client (if it can parse an error name out of the plugin's stderr output, it uses that; otherwise it uses EIO). But if the parent process sees EPIPE, that merely means that the plugin script doesn't care to finish consuming stdin. It is indeterminate at that point whether the child has a reason for ignoring the pipe, so it is inappropriate to blindly treat failure to write to the child as a reason to claim the child would have failed with EIO, without giving the child a chance to exit() (or die by signal) first. > > > > > So my counter-proposal (coming soon) is going to simply turn the EPIPE > > error into a debug message and discard the rest of the write buffer. Yes, I liked your counter-proposal better than my attempt. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH nbdkit] sh: In pwrite, allow scripts to ignore stdin
On Thu, Aug 31, 2023 at 09:50:22AM +0100, Richard W.M. Jones wrote: > See comment for explanation and > https://listman.redhat.com/archives/libguestfs/2023-August/032468.html > --- > tests/Makefile.am| 2 + > plugins/sh/call.c| 33 +++- > tests/test-sh-pwrite-ignore-stdin.sh | 77 > 3 files changed, 100 insertions(+), 12 deletions(-) Having read the rest of the conversation, I now agree with you: no matter when EPIPE happens, it is more important that we honor the exit status rather than blindly turning the child process's refusal to empty the pipe as a fatal error. > +++ b/plugins/sh/call.c > @@ -275,22 +275,31 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to > stdin (can be NULL) */ >r = write (pfds[0].fd, wbuf, wbuflen); >if (r == -1) { > if (errno == EPIPE) { > - /* We tried to write to the script but it didn't consume > - * the data. Probably the script exited without reading > - * from stdin. This is an error in the script. > + /* In nbdkit <= 1.35.11 we gave an error here, arguing that > + * scripts must always consume or discard their full input > + * when 'pwrite' is called. Previously we had many cases > + * where scripts forgot to discard the data on a path out of > + * pwrite (such as an error or where the script is not > + * interested in the data being written), resulting in > + * intermittent test failures. > + * > + * It is valid for a script to ignore the written data > + * (plenty of non-sh plugins do this), or for a script to be > + * gradually processing the data, encounter an error and > + * wish to exit immediately. Therefore ignore this error. I agree with Laszlo's observation that we may want to enhance this comment to mention that we still obey the exit status (a plugin that quit reading stdin because it was killed by a signal will show up as an error and not a successful .pwrite). > */ > - nbdkit_error ("%s: write to script failed because of a broken > pipe: " > -"this can happen if the script exits without " > -"consuming stdin, which usually indicates a bug " > -"in the script", > -argv0); > + nbdkit_debug ("%s: write: %m (ignored)", argv0); > + wbuflen = 0; /* discard the rest */ > } > -else > +else { >nbdkit_error ("%s: write: %m", argv0); > -goto error; > + goto error; > +} Yes, moving the 'goto error' into the else branch is something I ended up realizing I missed in my version of the patch, when I tried writing a unit test. > + } > + else { > +wbuf += r; > +wbuflen -= r; >} > - wbuf += r; > - wbuflen -= r; >/* After writing all the data we close the pipe so that > * the reader on the other end doesn't wait for more. > */ This code change does what we need, and is less complex than my attempt. > diff --git a/tests/test-sh-pwrite-ignore-stdin.sh > b/tests/test-sh-pwrite-ignore-stdin.sh > new file mode 100755 > index 0..3448eca17 > --- /dev/null > +++ b/tests/test-sh-pwrite-ignore-stdin.sh > + > +start_nbdkit -P $pid -U $sock sh - <<'EOF' > +case "$1" in > +can_write) echo 0 ;; > +pwrite) > +# Always ignore the input. If the offset >= 32M return an error. > +if [ $4 -ge 33554432 ]; then > +echo 'ENOSPC Out of space' >&2 > +exit 1 > +fi > +;; In my testing, I could not reliably reproduce an EPIPE error, either with your patch or mine, and whether or not I used your unit test or my (unpublished) attempt at one (I had tried to 'exec 0https://eklitzke.org/blocking-io-nonblocking-io-and-epoll was a nice read] > +get_size) > +echo 64M > +;; > +pread) > +;; > +*) exit 2 ;; > +esac > +EOF > + > +nbdsh -u "nbd+unix://?socket=$sock" -c ' > +buf = bytearray(16*1024*1024) > +h.pwrite(buf, 0) > + > +try: > +h.pwrite(buf, 32*1024*1024) > +assert False > +except nbd.Error: > +# Expect an error here. > + pass I would prefer if we go one step further, and import errno ... except nbd.Error as ex: assert ex.errnum == errno.ENOSPC to prove that we got the client's intended ENOSPC, rather than an accidental nbdkit treating EPIPE writing to the plugin as an instant EIO to the plugin (the pre-patch behavior). With that change, Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data
I hit another transient failure in libnbd CI when a poorly-written eval script did not consume all of stdin during .pwrite. As behaving as a data sink can be a somewhat reasonable feature of a quickly-written sh or eval plugin, we should not be so insistent as treating an EPIPE failure as an immediate return of EIO to the client. Signed-off-by: Eric Blake --- I probably need to add unit test coverage of this before committing (although proving that I win the data race on a client process exiting faster than the parent can write enough data to hit EPIPE is hard). plugins/sh/nbdkit-sh-plugin.pod | 8 +++ plugins/sh/call.c | 38 ++--- 2 files changed, 34 insertions(+), 12 deletions(-) diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index b2c946a0..8b83a5b3 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -505,6 +505,14 @@ Unlike in other languages, if you provide a C method you B also provide a C method which exits with code C<0> (true). +With nbdkit E 1.36, this method may return C<0> without consuming +any data from stdin, and without producing any output, in order to +behave as an intentional data sink. But in older versions, nbdkit +would treat any C failure in writing to your script as an error +condition even if your script returns success; to avoid unintended +failures, you may want to include C<"cat >/dev/null"> in a script +intending to ignore the client's write requests. + =item C /path/to/script flush diff --git a/plugins/sh/call.c b/plugins/sh/call.c index 888c6459..79c67a04 100644 --- a/plugins/sh/call.c +++ b/plugins/sh/call.c @@ -34,6 +34,7 @@ #include #include +#include #include #include #include @@ -130,6 +131,7 @@ debug_call (const char **argv) */ static int call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be NULL) */ + bool *pipe_full, /* set if wbuf not fully written */ string *rbuf, /* read from stdout */ string *ebuf, /* read from stderr */ const char **argv)/* script + parameters */ @@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be NULL) */ r = write (pfds[0].fd, wbuf, wbuflen); if (r == -1) { if (errno == EPIPE) { - /* We tried to write to the script but it didn't consume - * the data. Probably the script exited without reading - * from stdin. This is an error in the script. - */ - nbdkit_error ("%s: write to script failed because of a broken pipe: " -"this can happen if the script exits without " -"consuming stdin, which usually indicates a bug " -"in the script", -argv0); + *pipe_full = true; + r = wbuflen; } else nbdkit_error ("%s: write: %m", argv0); @@ -555,7 +550,7 @@ call (const char **argv) CLEANUP_FREE_STRING string rbuf = empty_vector; CLEANUP_FREE_STRING string ebuf = empty_vector; - r = call3 (NULL, 0, , , argv); + r = call3 (NULL, 0, NULL, , , argv); return handle_script_error (argv[0], , r); } @@ -568,7 +563,7 @@ call_read (string *rbuf, const char **argv) int r; CLEANUP_FREE_STRING string ebuf = empty_vector; - r = call3 (NULL, 0, rbuf, , argv); + r = call3 (NULL, 0, NULL, rbuf, , argv); r = handle_script_error (argv[0], , r); if (r == ERROR) string_reset (rbuf); @@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t wbuflen, const char **argv) int r; CLEANUP_FREE_STRING string rbuf = empty_vector; CLEANUP_FREE_STRING string ebuf = empty_vector; + bool pipe_full = false; - r = call3 (wbuf, wbuflen, , , argv); + r = call3 (wbuf, wbuflen, _full, , , argv); + if (pipe_full && r == OK) { +/* We allow scripts to intentionally ignore data, but they must + * have no output when doing so. + */ +if (rbuf.len > 0 || ebuf.len > 0) { + nbdkit_error ("%s: write to script failed because of a broken pipe: " +"this can happen if the script exits without " +"consuming stdin, which usually indicates a bug " +"in the script", +argv[0]); + r = ERROR; +} +else + nbdkit_debug ("%s: write to script failed because of a broken pipe; " +"assuming this was an intentional data sink, although it " +"may indicate a bug in the script", +argv[0]); + } return handle_script_error (argv[0], , r); } -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 5/7] rust: async: Add a couple of integration tests
On Wed, Aug 30, 2023 at 12:16:07PM -0500, Eric Blake wrote: > > error[E0425]: cannot find value `EPOLL_CTL_ADD` in crate `libc` > > https://www.reddit.com/r/rust/comments/65kflg/does_rust_have_native_epoll_support/ > mentions how epoll is Linux-specific, and has comments about tokio > trying to build on top of mio in order to be platform-independent (mio > uses epoll on Linux where it is available, but does not require epoll > on other platforms). So I spent a couple hours experimenting with whether I could use mio::poll instead of epoll, and at least got things compiling on FreeBSD with the tests still passing on Linux (I'm not quite set up to actually prove that the tests work on FreeBSD). Review appreciated: https://listman.redhat.com/archives/libguestfs/2023-August/032464.html -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 1/2] maint: Favor 4-space indent in .rs files
Since rustfmt favors 4-space indent on .rs files, we should do likewise to reduce the churn when running rustfmt after editing in an emacs session set to honor editorconfig. Signed-off-by: Eric Blake --- .editorconfig | 4 1 file changed, 4 insertions(+) diff --git a/.editorconfig b/.editorconfig index 86ef47d0..f5bc5604 100644 --- a/.editorconfig +++ b/.editorconfig @@ -21,6 +21,10 @@ indent_size = 4 indent_style = tab indent_size = 4 +[*.rs] +# Match rustfmt style +indent_size = 4 + [{Makefile,Makefile.in,Makefile.am,*.mk}] # Make requires tabs. indent_style = tab -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll
CI shows our async handle fails to build on FreeBSD and MacOS (where epoll() is not available as a syscall, and therefore not available as a Rust crate). We can instead accomplish the same level-probing effects by doing a zero-timeout poll with mio (which defers under the hood to epoll on Linux, and kqueue on BSD). Fixes: 223a9965 ("rust: async: Create an async friendly handle type") CC: Tage Johansson Signed-off-by: Eric Blake --- rust/Cargo.toml | 2 +- rust/src/async_handle.rs | 46 +--- 2 files changed, 30 insertions(+), 18 deletions(-) diff --git a/rust/Cargo.toml b/rust/Cargo.toml index 0879b34c..391c80e9 100644 --- a/rust/Cargo.toml +++ b/rust/Cargo.toml @@ -49,7 +49,7 @@ thiserror = "1.0.40" log = { version = "0.4.19", optional = true } libc = "0.2.147" tokio = { optional = true, version = "1.29.1", default-features = false, features = ["rt", "sync", "net"] } -epoll = "4.3.3" +mio = "0.8.0" [features] default = ["log", "tokio"] diff --git a/rust/src/async_handle.rs b/rust/src/async_handle.rs index ab28b910..35b1c0d2 100644 --- a/rust/src/async_handle.rs +++ b/rust/src/async_handle.rs @@ -35,9 +35,11 @@ use crate::sys; use crate::Handle; use crate::{Error, FatalErrorKind, Result}; use crate::{AIO_DIRECTION_BOTH, AIO_DIRECTION_READ, AIO_DIRECTION_WRITE}; -use epoll::Events; +use mio::unix::SourceFd; +use mio::{Events, Interest as MioInterest, Poll, Token}; use std::sync::Arc; use std::sync::Mutex; +use std::time::Duration; use tokio::io::{unix::AsyncFd, Interest, Ready as IoReady}; use tokio::sync::Notify; use tokio::task; @@ -176,12 +178,12 @@ async fn polling_task(handle_data: ) -> Result<(), FatalErrorKind> { } = handle_data; let fd = handle.aio_get_fd().map_err(Error::to_fatal)?; let tokio_fd = AsyncFd::new(fd)?; -let epfd = epoll::create(false)?; -epoll::ctl( -epfd, -epoll::ControlOptions::EPOLL_CTL_ADD, -fd, -epoll::Event::new(Events::EPOLLIN | Events::EPOLLOUT, 42), +let mut events = Events::with_capacity(1); +let mut poll = Poll::new()?; +poll.registry().register( + SourceFd(), +Token(0), +MioInterest::READABLE | MioInterest::WRITABLE, )?; // The following loop does approximately the following things: @@ -248,19 +250,29 @@ async fn polling_task(handle_data: ) -> Result<(), FatalErrorKind> { } drop(pending_cmds_lock); -// Use epoll to check the current read/write availability on the fd. +// Use mio poll to check the current read/write availability on the fd. // This is needed because Tokio supports only edge-triggered // notifications but Libnbd requires level-triggered notifications. -let mut revent = epoll::Event { data: 0, events: 0 }; // Setting timeout to 0 means that it will return immediately. -epoll::wait(epfd, 0, std::slice::from_mut( revent))?; -let revents = Events::from_bits(revent.events).unwrap(); -if !revents.contains(Events::EPOLLIN) { -ready_guard.clear_ready_matching(IoReady::READABLE); -} -if !revents.contains(Events::EPOLLOUT) { -ready_guard.clear_ready_matching(IoReady::WRITABLE); -} +// mio states that it is OS-dependent on whether a single event +// can be both readable and writable, but we survive just fine +// if we only see one direction even when both are available. +match poll.poll( events, Some(Duration::ZERO)) { +Ok(_) => { +for event in { +if !event.is_readable() { +ready_guard.clear_ready_matching(IoReady::READABLE); +} +if !event.is_writable() { +ready_guard.clear_ready_matching(IoReady::WRITABLE); +} +} +} +Err(_) => { +ready_guard.clear_ready_matching(IoReady::READABLE); +ready_guard.clear_ready_matching(IoReady::WRITABLE); +} +}; ready_guard.retain_ready(); } } -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 0/2] (Attempt to) fix Rust on BSD-based builds
I managed to get a build of the async Rust handle compiling on FreeBSD (although the cirrus CI appears to not actually run 'make check' on non-Linux machines, at least when run on my fork): https://gitlab.com/ebblake/libnbd/-/jobs/4985192286 However, I'd really like Tage's review on patch 2 to see if my Rust makes sense. Eric Blake (2): maint: Favor 4-space indent in .rs files rust: Use mio::poll instead of requiring epoll rust/Cargo.toml | 2 +- rust/src/async_handle.rs | 46 +--- .editorconfig| 4 3 files changed, 34 insertions(+), 18 deletions(-) -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 5/7] rust: async: Add a couple of integration tests
On Wed, Aug 30, 2023 at 08:42:00AM -0500, Eric Blake wrote: > On Tue, Aug 29, 2023 at 05:17:19PM -0500, Eric Blake wrote: > > On Sat, Aug 26, 2023 at 11:29:58AM +, Tage Johansson wrote: > > > Add a couple of integration tests as rust/tests/test_async_*.rs. They > > > are very similar to the tests for the synchronous API. > > > --- > > > > > rust/tests/test_async_460_block_status.rs | 98 > > > rust/tests/test_async_620_stats.rs| 69 > > > > Our series crossed paths. My commit 5aec7d3b add > > test_465_block_status_64.rs, but this commit of yours (at 0ed92592) > > did not add a counterpart async test, because you started your series > > before I added the 64-bit block status API. Since I heavily > > copy/pasted the synchronous 460 into 465, I'll probably end up doing > > the same for the async version, unless you beat me to it. > > I've done that in commit ac633f84. It missed Rich's cut of release > 1.17.4, and CI is now picking up a different set of failures, but it's > progress. > > x86_64-freebsd-current, aarch64-macos-13: >Compiling epoll v4.3.3 > error[E0425]: cannot find value `EPOLL_CTL_ADD` in crate `libc` > --> > /.cargo/registry/src/index.crates.io-6f17d22bba15001f/epoll-4.3.3/src/lib.rs:19:27 >| > 19 | EPOLL_CTL_ADD = libc::EPOLL_CTL_ADD, >| ^ not found in `libc` >| > help: consider importing this unit variant >| > 11 + use ControlOptions::EPOLL_CTL_ADD; Note that epoll() is a Linux-specific syscall, with no direct BSD counterpart, explaining the failures on FreeBSD and MacOS with its BSD heritage. https://www.reddit.com/r/rust/comments/65kflg/does_rust_have_native_epoll_support/ mentions how epoll is Linux-specific, and has comments about tokio trying to build on top of mio in order to be platform-independent (mio uses epoll on Linux where it is available, but does not require epoll on other platforms). Right now, I see rust/Cargo.tml is explicitly requiring epoll, which means it can only be built on Linux. Is there a way to drop the explicit dependency on epoll, where we instead configure epoll to use whatever it prefers (deferring to mio and eventually epoll on Linux, but to other code on BSD)? Or at a bare minimum, can we make it so that compiling the Rust async handler be limited to Linux-only, while building the synchronous code still works on BSD? [If it doesn't show by the content of my questions, I'm still quite new to Rust, and trying to figure out how crate dependencies work in cross-platform environments] -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 5/7] rust: async: Add a couple of integration tests
On Tue, Aug 29, 2023 at 05:17:19PM -0500, Eric Blake wrote: > On Sat, Aug 26, 2023 at 11:29:58AM +, Tage Johansson wrote: > > Add a couple of integration tests as rust/tests/test_async_*.rs. They > > are very similar to the tests for the synchronous API. > > --- > > > rust/tests/test_async_460_block_status.rs | 98 > > rust/tests/test_async_620_stats.rs| 69 > > Our series crossed paths. My commit 5aec7d3b add > test_465_block_status_64.rs, but this commit of yours (at 0ed92592) > did not add a counterpart async test, because you started your series > before I added the 64-bit block status API. Since I heavily > copy/pasted the synchronous 460 into 465, I'll probably end up doing > the same for the async version, unless you beat me to it. I've done that in commit ac633f84. It missed Rich's cut of release 1.17.4, and CI is now picking up a different set of failures, but it's progress. x86_64-freebsd-current, aarch64-macos-13: Compiling epoll v4.3.3 error[E0425]: cannot find value `EPOLL_CTL_ADD` in crate `libc` --> /.cargo/registry/src/index.crates.io-6f17d22bba15001f/epoll-4.3.3/src/lib.rs:19:27 | 19 | EPOLL_CTL_ADD = libc::EPOLL_CTL_ADD, | ^ not found in `libc` | help: consider importing this unit variant | 11 + use ControlOptions::EPOLL_CTL_ADD; -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 3/3] info: Simplify shutdown calls
Depending on command line options and server capabilities, nbdinfo has legitimate reasons to either be in opt mode or fully connected at the time a handle is ready to close. Previously, to avoid unwanted error messages, we had to manually check state to call the correct function out of nbd_opt_abort or nbd_shutdown that would not fail; but now that nbd_shutdown has a new flag, it is simpler to not have to worry about it. Signed-off-by: Eric Blake --- info/list.c | 8 +--- info/main.c | 4 +--- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/info/list.c b/info/list.c index 7848923f..d39e26e6 100644 --- a/info/list.c +++ b/info/list.c @@ -62,12 +62,6 @@ collect_exports (void) fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } - if (probe_content) -/* Disconnect from the server to move the handle into a closed - * state, in case the server serializes further connections. - * But we can ignore errors in this case. - */ -nbd_opt_abort (nbd); } void @@ -127,7 +121,7 @@ list_all_exports (void) list_okay = false; if (probe_content) { - nbd_shutdown (nbd2, 0); + nbd_shutdown (nbd2, LIBNBD_SHUTDOWN_COVER_OPT_MODE); nbd_close (nbd2); } } diff --git a/info/main.c b/info/main.c index 572dd536..204290a5 100644 --- a/info/main.c +++ b/info/main.c @@ -354,9 +354,7 @@ main (int argc, char *argv[]) } free_exports (); - if (opt_mode) -nbd_opt_abort (nbd); - nbd_shutdown (nbd, 0); + nbd_shutdown (nbd, LIBNBD_SHUTDOWN_COVER_OPT_MODE); nbd_close (nbd); /* Close the output stream and copy it to the real stdout. */ -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [libnbd PATCH 2/3] api: Add new COVER_OPT_MODE flag to nbd_shutdown
When nbd_set_opt_mode() was introduced, we did not adjust the behavior of nbd_shutdown(), so as a result it fails if attempted after the initial connection but before nbd_opt_abort() or nbd_opt_go(). To avoid spurious error messages, at least nbdinfo has to perform special-casing on whether it is in opt mode (call nbd_opt_abort) or connected (call nbd_shutdown) prior to calling nbd_close. It would be nicer to just have a single API perform whatever is needed to gracefully shut the connection, without having to hard-code the special casing into each application. The approach taken here is to add a new flag; if the flag is absent (behavior of an older application), then nbd_opt_abort must still be invoked separately before attempting nbd_shutdown. But if the flag is present, then whether or not the server supports opt mode, the shutdown will be graceful. This patch updates the unit test to show the effect of the new flag, and the next commit will simplify nbdinfo by utilizing it. Signed-off-by: Eric Blake --- generator/API.ml | 21 - lib/disconnect.c | 15 +++ tests/shutdown-opt-mode.c | 27 ++- 3 files changed, 57 insertions(+), 6 deletions(-) diff --git a/generator/API.ml b/generator/API.ml index 1a9c8141..0b837259 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -255,6 +255,7 @@ let shutdown_flags = flag_prefix = "SHUTDOWN"; flags = [ "ABANDON_PENDING", 1 lsl 16; +"COVER_OPT_MODE", 1 lsl 17; ] } let all_flags = [ cmd_flags; handshake_flags; strict_flags; @@ -1231,7 +1232,8 @@ "set_opt_mode", { starting the connection. To leave the mode and proceed on to the ready state, you must use L successfully; a failed L returns to the negotiating state to allow a change of -export name before trying again. You may also use L +export name before trying again. You may also use L, +or the C flag of L, to end the connection without finishing negotiation."; example = Some "examples/list-exports.c"; see_also = [Link "get_opt_mode"; Link "aio_is_negotiating"; @@ -1240,7 +1242,7 @@ "set_opt_mode", { Link "opt_set_meta_context"; Link "opt_starttls"; Link "opt_structured_reply"; Link "set_tls"; Link "set_request_structured_replies"; -Link "aio_connect"]; +Link "aio_connect"; Link "shutdown"]; }; "get_opt_mode", { @@ -2667,7 +2669,7 @@ "shutdown", { default_call with args = []; optargs = [ OFlags ("flags", shutdown_flags, None) ]; ret = RErr; -permitted_states = [ Connected ]; +permitted_states = [ Negotiating; Connected ]; modifies_fd = true; shortdesc = "disconnect from the NBD server"; longdesc = "\ @@ -2678,7 +2680,7 @@ "shutdown", { This function works whether or not the handle is ready for transmission of commands. If more fine-grained control is -needed, see L. +needed, see L and L. The C argument is a bitmask, including zero or more of the following shutdown flags: @@ -2693,12 +2695,21 @@ "shutdown", { issuing those commands before informing the server of the intent to disconnect. +=item C = 0x2 + +If the server is still in option negotiation mode +(see L), gracefully abandon the +connection as if by L, instead of +complaining that the server is not yet fully connected. +If option negotiation mode was not in use or was +completed by L, this flag has no effect. + =back For convenience, the constant C is available to describe all shutdown flags recognized by this build of libnbd. A future version of the library may add new flags."; -see_also = [Link "close"; Link "aio_disconnect"]; +see_also = [Link "close"; Link "aio_disconnect"; Link "aio_opt_abort"]; example = Some "examples/reads-and-writes.c"; }; diff --git a/lib/disconnect.c b/lib/disconnect.c index 083d6cd3..d250e739 100644 --- a/lib/disconnect.c +++ b/lib/disconnect.c @@ -30,6 +30,20 @@ int nbd_unlocked_shutdown (struct nbd_handle *h, uint32_t flags) { + /* If still negotiating, return an error unless COVER_OPT_ABORT lets + * us trigger opt_abort instead. + */ + if (nbd_internal_is_state_negotiating (get_next_state (h))) { +if (flags & LIBNBD_SHUTDOWN_COVER_OPT_MODE) { + if (nbd_unlocked_aio_opt_abort (h) == -1) +return -1; + goto wait; +} +set_error (EINVAL, "invalid state: COVER_OPT_ABORT not requested, but " + "the handle is still negotiating options with the server"); +return -1; + } + /* If ABANDON_PENDING, abort any commands that have not yet had any * bytes sent to the server, so
[Libguestfs] [libnbd PATCH 1/3] tests: Test behavior of nbd_shutdown during opt mode
When we added nbd_set_opt_mode (v1.4), we did not do anything special to nbd_shutdown(). As a result, clients that use opt mode when available, but which want to gracefully close a socket rather than just forcefully disconnect via nbd_close(), have to take care to check the current state and then decide between nbd_opt_abort or nbd_shutdown (if they call both, one of the two will give an error message about being used from the wrong state). Add some unit test coverage of this prior to enhancing the API to make a clean shutdown easier for clients. Signed-off-by: Eric Blake --- tests/Makefile.am | 5 ++ tests/shutdown-opt-mode.c | 124 ++ .gitignore| 1 + 3 files changed, 130 insertions(+) create mode 100644 tests/shutdown-opt-mode.c diff --git a/tests/Makefile.am b/tests/Makefile.am index 52fadd9c..d2e9baee 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -189,6 +189,7 @@ check_PROGRAMS += \ errors-server-zerosize \ server-death \ shutdown-flags \ + shutdown-opt-mode \ get-size \ read-only-flag \ read-write-flag \ @@ -263,6 +264,7 @@ TESTS += \ errors-server-zerosize \ server-death \ shutdown-flags \ + shutdown-opt-mode \ get-size \ read-only-flag \ read-write-flag \ @@ -401,6 +403,9 @@ server_death_LDADD = $(top_builddir)/lib/libnbd.la shutdown_flags_SOURCES = shutdown-flags.c shutdown_flags_LDADD = $(top_builddir)/lib/libnbd.la +shutdown_opt_mode_SOURCES = shutdown-opt-mode.c +shutdown_opt_mode_LDADD = $(top_builddir)/lib/libnbd.la + get_size_SOURCES = get-size.c get_size_LDADD = $(top_builddir)/lib/libnbd.la diff --git a/tests/shutdown-opt-mode.c b/tests/shutdown-opt-mode.c new file mode 100644 index ..34386220 --- /dev/null +++ b/tests/shutdown-opt-mode.c @@ -0,0 +1,124 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Test shutdown in relation to opt mode. + */ + +#include + +#include +#include +#include +#include +#include + +#include + +static const char *progname; + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + const char *cmd_old[] = { "nbdkit", "--oldstyle", "-s", "--exit-with-parent", +"memory", "size=2m", NULL }; + const char *cmd_new[] = { "nbdkit", "-s", "--exit-with-parent", +"memory", "size=2m", NULL }; + + progname = argv[0]; + + /* Part 1: Request opt mode. With oldstyle, it is not possible. */ + nbd = nbd_create (); + if (nbd == NULL) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + if (nbd_set_opt_mode (nbd, true) == -1) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + if (nbd_connect_command (nbd, (char **)cmd_old) == -1) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + if (nbd_aio_is_ready (nbd) != 1) { +fprintf (stderr, "%s: unexpected state\n", progname); +exit (EXIT_FAILURE); + } + + /* opt_abort fails, because we aren't in option negotiation. */ + if (nbd_opt_abort (nbd) != -1) { +fprintf (stderr, "%s: unexpected success of nbd_opt_abort\n", progname); +exit (EXIT_FAILURE); + } + if (nbd_get_errno () != EINVAL) { +fprintf (stderr, "%s: test failed: unexpected errno: %s\n", + progname, strerror (nbd_get_errno ())); +exit (EXIT_FAILURE); + } + /* Shutdown will succeed, since opt mode was not possible. */ + if (nbd_shutdown (nbd, 0) == -1) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + nbd_close (nbd); + + /* Part 2: Request opt mode. With newstyle, it succeeds. */ + nbd = nbd_create (); + if (nbd == NULL) { +fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); +exit (EXIT_FAILURE); + } + if (nbd_set_opt_mode (nbd, true) == -1) { +fprintf (stderr, "%s: %s\n", progna
[Libguestfs] [libnbd PATCH 0/3] Simplify nbd_shutdown vs. opt mode
While working on a larger set of patches to make nbdinfo favor NBD_OPT_INFO over NBD_OPT_GO where possible (which requires use of nbd_set_opt_mode(,true) in more cases), I noticed that it got unwieldy to have to pick the correct shutdown function in all code paths. So I propose making the API smarter, by adding an opt-in flag that does the right thing on my behalf. If you have an idea for a better name for the flag, or think this functionality should be enabled by default, let me know. Part of the reason for choosing a new flag is that it becomes a compile-time witness of whether nbd_shutdown has the desired capability (if we allow it to auto-opt_abort without a flag, it's harder to tell whether we are running against an older libnbd where it errors out instead). Eric Blake (3): tests: Test behavior of nbd_shutdown during opt mode api: Add new COVER_OPT_MODE flag to nbd_shutdown info: Simplify shutdown calls generator/API.ml | 21 -- lib/disconnect.c | 15 tests/Makefile.am | 5 ++ tests/shutdown-opt-mode.c | 149 ++ .gitignore| 1 + info/list.c | 8 +- info/main.c | 4 +- 7 files changed, 188 insertions(+), 15 deletions(-) create mode 100644 tests/shutdown-opt-mode.c -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v9 5/7] rust: async: Add a couple of integration tests
On Sat, Aug 26, 2023 at 11:29:58AM +, Tage Johansson wrote: > Add a couple of integration tests as rust/tests/test_async_*.rs. They > are very similar to the tests for the synchronous API. > --- > rust/tests/test_async_460_block_status.rs | 98 > rust/tests/test_async_620_stats.rs| 69 Our series crossed paths. My commit 5aec7d3b add test_465_block_status_64.rs, but this commit of yours (at 0ed92592) did not add a counterpart async test, because you started your series before I added the 64-bit block status API. Since I heavily copy/pasted the synchronous 460 into 465, I'll probably end up doing the same for the async version, unless you beat me to it. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH 07/10] rust: Add a couple of integration tests
On Wed, Jul 19, 2023 at 09:09:51AM +, Tage Johansson wrote: > A couple of integration tests are added in rust/tests. They are mostly > ported from the OCaml tests. > --- > rust/tests/test_210_opt_abort.rs | 39 + > rust/tests/test_220_opt_list.rs | 85 +++ I noticed something today while working on issues with nbd_opt_abort vs. nbd_shutdown: > diff --git a/rust/tests/test_220_opt_list.rs b/rust/tests/test_220_opt_list.rs > new file mode 100644 > index 000..5abec5f > --- /dev/null > +++ b/rust/tests/test_220_opt_list.rs > @@ -0,0 +1,85 @@ > + > +impl ConnTester { > +fn new() -> Self { > +let srcdir = env::var("srcdir").unwrap(); > +let srcdir = Path::new(); > +let script_path = srcdir.join("../tests/opt-list.sh"); > +let script_path = > +CString::new(script_path.into_os_string().into_vec()).unwrap(); > +Self { script_path } > +} > + > +fn connect( This function is modeled after the 'let conn' function in test_220_opt_list.ml; however,... > +, > +mode: u8, > +expected_exports: &[], > +) -> libnbd::Result<()> { > +let nbd = libnbd::Handle::new().unwrap(); > +nbd.set_opt_mode(true).unwrap(); > +nbd.connect_command(&[ > +c_str!("nbdkit"), > +c_str!("-s"), > +c_str!("--exit-with-parent"), > +c_str!("-v"), > +c_str!("sh"), > +_path, > +::new(format!("mode={mode}")).unwrap(), > +]) > +.unwrap(); > + > +// Collect all exports in this list. > +let exports = Arc::new(Mutex::new(Vec::new())); > +let exports_clone = exports.clone(); > +let count = nbd.opt_list(move |name, _| { > +exports_clone.lock().unwrap().push(name.to_owned()); > +0 > +})?; > +let exports = > Arc::into_inner(exports).unwrap().into_inner().unwrap(); > +assert_eq!(exports.len(), count as usize); > +assert_eq!(exports.len(), expected_exports.len()); > +for (export, ) in exports.iter().zip(expected_exports) { > +assert_eq!(export.as_c_str(), expected); > +} > +Ok(()) ...the OCaml version calls 'NBD.opt_abort nbd' after verifying that exports match the expected list, while the Rust code does not. You can see the same thing in the Python tests (calling opt_abort before closing the handle). This means that for the Rust tests, the server is more likely to issue an error message about the client abruptly disconnecting, which doesn't really affect the test passing or failing, but adds spurious noise to the log files if we are ever trying to decipher whether two language bindings are doing the same thing. Also affected: 240, 245, and their async counterpart tests. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v4 20/25] generator: Actually request extended headers
On Wed, Aug 02, 2023 at 08:50:40PM -0500, Eric Blake wrote: > This is the culmination of the previous patches' preparation work for > using extended headers when possible. The new states in the state > machine are copied extensively from our handling of > OPT_STRUCTURED_REPLY. The next patch will then expose a new API > nbd_opt_extended_headers() for manual control. > > > Signed-off-by: Eric Blake > Reviewed-by: Richard W.M. Jones > --- > > v4: expand commit message to document other alternatives I did not > code up I have now pushed up through this point into the tree as commits d668062c..099aa257; I'm still polishing the remaining few patches to work cleanly with qemu patches at any point along either patch series (nbdinfo first needs to be taught to favor NBD_OPT_INFO mode where possible; as qemu reports support for the optional extension of block status filtering during INFO but not during GO when requesting only one meta-context). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v8 06/10] rust: async: Create an async friendly handle type
s. > +let mut revent = epoll::Event { data: 0, events: 0 }; > +// Setting timeout to 0 means that it will return immediately. > +epoll::wait(epfd, 0, std::slice::from_mut( revent))?; > +let revents = Events::from_bits(revent.events).unwrap(); > +if !revents.contains(Events::EPOLLIN) { > +ready_guard.clear_ready_matching(IoReady::READABLE); > +} > +if !revents.contains(Events::EPOLLOUT) { > + ready_guard.clear_ready_matching(IoReady::WRITABLE); > +} > +ready_guard.retain_ready(); > +} > +} > diff --git a/rust/src/lib.rs b/rust/src/lib.rs > index a6f3131..56316b4 100644 > --- a/rust/src/lib.rs > +++ b/rust/src/lib.rs > @@ -17,11 +17,19 @@ > > #![deny(warnings)] > > +#[cfg(feature = "tokio")] > +mod async_bindings; > +#[cfg(feature = "tokio")] > +mod async_handle; > mod bindings; > mod error; > mod handle; > pub mod types; > mod utils; > +#[cfg(feature = "tokio")] > +pub use async_bindings::*; > +#[cfg(feature = "tokio")] > +pub use async_handle::{AsyncHandle, SharedResult}; > pub use bindings::*; > pub use error::{Error, ErrorKind, FatalErrorKind, Result}; > pub use handle::Handle; > diff --git a/rust/src/utils.rs b/rust/src/utils.rs > index b8200c1..8984ebb 100644 > --- a/rust/src/utils.rs > +++ b/rust/src/utils.rs > @@ -21,3 +21,12 @@ use std::ffi::c_void; > pub unsafe extern "C" fn drop_data(data: *mut c_void) { > drop(Box::from_raw(data as *mut T)) > } > + > +/// Turn a [FnOnce] (with a single `` argument) to a [FnMut] > +/// which panics on the second invocation. > +pub fn fn_once_to_fn_mut( > +f: impl FnOnce( T) -> U, > +) -> impl FnMut( T) -> U { > +let mut f = Some(f); > +move |x| (f.take().unwrap())(x) > +} > diff --git a/scripts/git.orderfile b/scripts/git.orderfile > index b988d87..60ec56d 100644 > --- a/scripts/git.orderfile > +++ b/scripts/git.orderfile > @@ -69,6 +69,7 @@ rust/src/types.rs > rust/src/utils.rs > rust/src/lib.rs > rust/src/handle.rs > +rust/src/async_handle.rs > rust/libnbd-sys/* > rust/examples/* > rust/tests/* > -- > 2.41.0 > > ___ > Libguestfs mailing list > Libguestfs@redhat.com > https://listman.redhat.com/mailman/listinfo/libguestfs > I'm still learning Rust, so a lot of this I just have to trust, but overall the patch seems like a good framework. While I definitely found some typos to fix, I'm less certain on whethere there are any major implementation flaws. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v8 05/10] generator: Add information about asynchronous handle calls
On Sun, Aug 20, 2023 at 02:16:24PM +, Tage Johansson wrote: > A new field (async_kind) is added to the call data type in > generator/API.ml*. The purpose is to tell if a certain handle call is > an asynchronous command and if so how one can know when it is > completed. The motivation for this is that all asynchronous commands > on the AsyncHandle in the Rust bindings makes use of Rust's s/makes/make/ > [`async fn`s](https://doc.rust-lang.org/std/keyword.async.html). > But to implement such an async fn, the API needs to know when the > command completed, either by a completion callback or via a change > of state. > --- > generator/API.ml | 32 > generator/API.mli | 11 +++ > 2 files changed, 43 insertions(+) > > +++ b/generator/API.mli > @@ -36,6 +36,11 @@ type call = { >{b guaranteed} never to do that we can save a bit of time by >setting this to false. *) >may_set_error : bool; > + (** There are two types of asynchronous functions, those with a completion > + callback and those which changes state when completed. This field tells s/changes/change/ > + if the function is asynchronous and in that case how one can check if > + it has completed. *) > + async_kind : async_kind option; >(** The first stable version that the symbol appeared in, for >example (1, 2) if the symbol was added in development cycle >1.1.x and thus the first stable version was 1.2. This is > @@ -117,6 +122,12 @@ and permitted_state = > not including CLOSED or DEAD *) > | Closed | Dead(** can be called when the handle is > CLOSED or DEAD *) > +and async_kind = > +(** The asynchronous call has a completion callback. *) > +| WithCompletionCallback > +(** The asynchronous call is completed when the given handle call returns the > +given boolean value. Might for instance be ("aio_is_connected", false). > *) > +| ChangesState of string * bool > and link = > | Link of string (** link to L *) > | SectionLink of string(** link to L *) > -- > 2.41.0 Grammar changes can be made while merging; no need to respin unless later patches require more substantive updates. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v8 04/10] rust: Make it possible to run examples with a URI
On Tue, Aug 22, 2023 at 04:45:40PM -0500, Eric Blake wrote: > For 1-4, > Reviewed-by: Eric Blake > > and I've gone ahead and pushed them (55fec56c..e5563c0d), in order to > see the effect on CI. Sorry, wrong commit ids (those were on a local tree before rebasing); actual commits upstream are 49f98edd..49a97e3a -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v4 01/25] block_status: Add some sanity checking of server lengths
On Fri, Aug 04, 2023 at 11:49:18AM +0100, Richard W.M. Jones wrote: > On Wed, Aug 02, 2023 at 08:50:21PM -0500, Eric Blake wrote: > > Previously, we had not been doing any validation of server extent > > responses, which means a client query at an offset near the end of the > > export can result in a buggy server sending a response longer than the > > export length and potentially confusing the client. The NBD spec also > > says that an extent length should be non-zero so that a successful > > block status call makes progress. It is easy enough to track that the > > server has not overflowed the export size, and that we ensure an error > > on no progress even when the buggy server claims success. Since the > > spec says a client should be prepared for a block status result to be > > truncated, the client should not care whether the truncation happened > > at the server or at libnbd after validating the server's response. > > > > In the process, this patch reorganizes some of the code so that early > > exits are obvious, leading for less indentation in the success path. > > > > Adding this sanity checking now makes it easier for future patches to > > do orthogonal support for a server's 32- or 64-bit reply, vs. a > > client's 32- or 64-bit API call. Once 64-bit replies are in play, we > > will additionally have to worry about a 64-bit reply that overflows a > > 32-bit API callback without exceeding the exportsize. Similarly, > > since nbd_get_size() already caps export size at 63 bits (based on > > off_t limitations), we have guaranteed that a 64-bit API callback will > > never see an extent length that could appear negative in a 64-bit > > signed type (at least OCaml benefits from that guarantee, since its > > only native 64-bit integer type is signed). > > > > Signed-off-by: Eric Blake > > --- > > Acked-by: Richard W.M. Jones This one is now in (e8d837d3), then I'm trying to get CI to a good shape on Rust before proceeding with the rest of my patches (so I can feel more confident I'm not causing Rust regressions). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v8 04/10] rust: Make it possible to run examples with a URI
On Sun, Aug 20, 2023 at 02:16:23PM +, Tage Johansson wrote: > Previously, the examples fetch-first-sector and get-size in > rust/examples only accepted a unix socket as argument. This commit makes > it possible to provide a URI as well. > --- > rust/examples/fetch-first-sector.rs | 15 +++ > rust/examples/get-size.rs | 20 +++- > 2 files changed, 26 insertions(+), 9 deletions(-) > +++ b/rust/examples/get-size.rs > @@ -5,6 +5,12 @@ > //! > //! nbdkit -U - memory 1M \ > //! --run 'cargo run --example get-size -- $unixsocket' > +//! Or with a URI: > +//! nbdkit -U - memory 1M \ > +//! --run 'cargo run --example get-size -- $uri' As $uri is likely to contain the shell glob character '?', we are best writing this as 'cargo ... -- "$uri"' to set a good example about how to avoid rare file name expansion interference. I made that change in place, as well as tweaking ocaml/examples/get_size.ml where you copied from. For 1-4, Reviewed-by: Eric Blake and I've gone ahead and pushed them (55fec56c..e5563c0d), in order to see the effect on CI. For the rest of the series, I would like to spend a bit more time reviewing. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [nbdkit PATCH] cc: Allow configuration without absolute paths
On Fri, Aug 18, 2023 at 08:27:45AM -0500, Eric Blake wrote: > In https://gitlab.com/nbdkit/nbdkit/-/merge_requests/30, Khem reports > that in a cross-compilation environment, nbdkit embeds the absolute > name of the cross-compiler into the resulting cc plugin, even though > running the plugin should be favoring the bare name 'cc'. This in > turn leads to non-reproducible builds. As the goal of cross-compiling > nbdkit is to produce a binary that behaves identically regardless of > the build environment used, this means we need to give the user > control over the defaults for CC and CFLAGS embedded into the cc > plugin. > > However, instead of trying to munge the build environment variable as > suggested in that merge request, I found it cleaner to just add > additional precious variables to be set at configure time, as in: > > ./configure CC=/path/to/cross-compiler CC_PLUGIN_CC='ccache gcc' ... > > Reported-by: Khem Raj > Signed-off-by: Eric Blake > --- > > gitlab doesn't let me see the right email address to cc; if I can > figure that out, I'll tweak the Reported-by line as appropriate before > committing... > --- > plugins/cc/nbdkit-cc-plugin.pod | 9 ++--- > configure.ac| 11 +++ > plugins/cc/Makefile.am | 4 ++-- > 3 files changed, 19 insertions(+), 5 deletions(-) > > diff --git a/plugins/cc/nbdkit-cc-plugin.pod b/plugins/cc/nbdkit-cc-plugin.pod > index 2974890c..2bc3cfb8 100644 > --- a/plugins/cc/nbdkit-cc-plugin.pod > +++ b/plugins/cc/nbdkit-cc-plugin.pod > @@ -45,9 +45,12 @@ To replace the compiler flags: > > The plugin parameters C, C and C (written in > uppercase) can be used to control which C compiler and C compiler > -flags are used. If not set, the default compiler and flags from when > -nbdkit was itself compiled from source are used. To see what those > -were you can do: > +flags are used. If not set, you can hardcode the defaults for C > +and C at the time nbdkit is compiled from source by > +configuring with C and C, > +otherwise, the configuration for compiling nbdkit itself is used > +(C can only be set from the command line when starting > +the cc plugin). To see what those were you can do: > > $ nbdkit cc --dump-plugin > ... Widening the context, ... CC=gcc CFLAGS=-g -O2 -fPIC -shared The C parameter overrides the built-in flags completely. The C parameter adds extra flags to the built-in flags. Since we already mention EXTRA_CFLAGS below the example, I'm not sure if my addition of a parenthetical about EXTRA_CFLAGS above is worthwhile, or just adds noise. > diff --git a/configure.ac b/configure.ac > index afc5ddab..e5e261c8 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -820,6 +820,15 @@ AC_ARG_ENABLE([plugins], > [disable all bundled plugins and filters])]) > AM_CONDITIONAL([HAVE_PLUGINS], [test "x$enable_plugins" != "xno"]) > > +dnl For the cc plugin, let the user hard-code their preferred compiler setup > +dnl Default to the settings used for nbdkit itself > +AC_ARG_VAR([CC_PLUGIN_CC], > + [Value to use for CC when building the cc plugin, default $CC]) I'm wondering if there is a better way to word this (it shows up in './configure --help' output). Maybe: [Value to hard-code into the cc plugin's default for CC, instead of $CC] > +: "${CC_PLUGIN_CC:=$CC}" > +AC_ARG_VAR([CC_PLUGIN_CFLAGS], > + [Value to use for CFLAGS when building the cc plugin, default $CFLAGS]) > +: "${CC_PLUGIN_CFLAGS:=$CFLAGS}" and similar -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
[Libguestfs] [nbdkit PATCH] cc: Allow configuration without absolute paths
In https://gitlab.com/nbdkit/nbdkit/-/merge_requests/30, Khem reports that in a cross-compilation environment, nbdkit embeds the absolute name of the cross-compiler into the resulting cc plugin, even though running the plugin should be favoring the bare name 'cc'. This in turn leads to non-reproducible builds. As the goal of cross-compiling nbdkit is to produce a binary that behaves identically regardless of the build environment used, this means we need to give the user control over the defaults for CC and CFLAGS embedded into the cc plugin. However, instead of trying to munge the build environment variable as suggested in that merge request, I found it cleaner to just add additional precious variables to be set at configure time, as in: ./configure CC=/path/to/cross-compiler CC_PLUGIN_CC='ccache gcc' ... Reported-by: Khem Raj Signed-off-by: Eric Blake --- gitlab doesn't let me see the right email address to cc; if I can figure that out, I'll tweak the Reported-by line as appropriate before committing... --- plugins/cc/nbdkit-cc-plugin.pod | 9 ++--- configure.ac| 11 +++ plugins/cc/Makefile.am | 4 ++-- 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/plugins/cc/nbdkit-cc-plugin.pod b/plugins/cc/nbdkit-cc-plugin.pod index 2974890c..2bc3cfb8 100644 --- a/plugins/cc/nbdkit-cc-plugin.pod +++ b/plugins/cc/nbdkit-cc-plugin.pod @@ -45,9 +45,12 @@ To replace the compiler flags: The plugin parameters C, C and C (written in uppercase) can be used to control which C compiler and C compiler -flags are used. If not set, the default compiler and flags from when -nbdkit was itself compiled from source are used. To see what those -were you can do: +flags are used. If not set, you can hardcode the defaults for C +and C at the time nbdkit is compiled from source by +configuring with C and C, +otherwise, the configuration for compiling nbdkit itself is used +(C can only be set from the command line when starting +the cc plugin). To see what those were you can do: $ nbdkit cc --dump-plugin ... diff --git a/configure.ac b/configure.ac index afc5ddab..e5e261c8 100644 --- a/configure.ac +++ b/configure.ac @@ -820,6 +820,15 @@ AC_ARG_ENABLE([plugins], [disable all bundled plugins and filters])]) AM_CONDITIONAL([HAVE_PLUGINS], [test "x$enable_plugins" != "xno"]) +dnl For the cc plugin, let the user hard-code their preferred compiler setup +dnl Default to the settings used for nbdkit itself +AC_ARG_VAR([CC_PLUGIN_CC], + [Value to use for CC when building the cc plugin, default $CC]) +: "${CC_PLUGIN_CC:=$CC}" +AC_ARG_VAR([CC_PLUGIN_CFLAGS], + [Value to use for CFLAGS when building the cc plugin, default $CFLAGS]) +: "${CC_PLUGIN_CFLAGS:=$CFLAGS}" + dnl Check for Perl, for embedding in the perl plugin. dnl Note that the perl binary is checked above. AC_ARG_ENABLE([perl], @@ -1716,6 +1725,8 @@ feature "tests using libguestfs" \ test "x$HAVE_LIBGUESTFS_TRUE" = "x" && \ test "x$USE_LIBGUESTFS_FOR_TESTS_TRUE" = "x" feature "zlib-ng" test "x$ZLIB_NG_LIBS" != "x" +print cc-plugin-CC"$CC_PLUGIN_CC" +print cc-plugin-CFLAGS"$CC_PLUGIN_CFLAGS" echo echo "If any optional component is configured ânoâ when you expected âyesâ" diff --git a/plugins/cc/Makefile.am b/plugins/cc/Makefile.am index 935d125f..088b5ff3 100644 --- a/plugins/cc/Makefile.am +++ b/plugins/cc/Makefile.am @@ -45,8 +45,8 @@ nbdkit_cc_plugin_la_SOURCES = \ $(NULL) nbdkit_cc_plugin_la_CPPFLAGS = \ - -DCC="\"$(CC)\"" \ - -DCFLAGS="\"$(CFLAGS)\"" \ + -DCC="\"$(CC_PLUGIN_CC)\"" \ + -DCFLAGS="\"$(CC_PLUGIN_CFLAGS)\"" \ -I$(top_srcdir)/include \ -I$(top_builddir)/include \ -I$(top_srcdir)/common/include \ -- 2.41.0 ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] golang: Bump minimum Go version to 1.17
On Thu, Aug 17, 2023 at 08:38:48PM +0300, Nir Soffer wrote: > > > > I'm not sure what is the purpose of this test - requiring the Go > > version is > > > > good > > > > enough since the code will not compile with an older version. EVen if > > it > > > > would, > > > > it will not compile without unsafe.Slice so no special check is needed. > > > > Turns out it does matter. On our CI system, Ubuntu 20.04 has Go > > 1.13.8 installed, and without this feature test, it compiled just fine > > (it wasn't until later versions of Go that go.mod's version request > > causes a compile failure if not satisfied). > > > > How does it compile when unsafe.Slice is undefined? > > Quick test with unrelated test app: > > $ go build; echo $? > # cobra-test > ./main.go:10:6: undefined: cmd.NoSuchMethod > 1 > > Or you mean the compile test for configure works and we want to make > the configure test fail to compile? It turns out the real problem was a missing && in the configure script (see commit b089d3f7). It doesn't matter if 'go run .' fails if 'go mod tidy' is still allowed to succeed right after. With that fixed, I got a few more green lines in the CI (before turning back to a bunch of red now that I added Rust into the CI...) -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH] ci: Update lcitool to request cargo during CI
On Tue, Aug 15, 2023 at 04:33:38PM -0500, Eric Blake wrote: > Regenerate the CI build scripts with: > > ../libvirt-ci/bin/lcitool manifest ci/manifest.yml > > using libvirt-ci bumped with a pending patch: > https://gitlab.com/libvirt/libvirt-ci/-/merge_requests/424 > > Signed-off-by: Eric Blake > --- > > CI is still not green, but I've been working on it today. I'm > currently waiting for a merge request to lcitool to land, if that > happens quickly, I can rework this commit message. I'm also playing > with pushing this patch to a forked repo, to see how it fares, before > sending it to the main repo. I'm not sure why Alpine builds are now hanging during interop/ (happened in the previous commit, not new to this one). There's another thread discussing how to fix the failures now showing up from too-old Rust. But libvirt-ci accepted my merge request, so I tweaked this, and it is now in as 49f98edd; and I will be doing a similar update to nbdkit shortly. Regarding awk failures seen in the previous commit for opensuse-tumbleweed, the OpenSUSE folks answered at https://bugzilla.suse.com/show_bug.cgi?id=1214365 that their change to drop gawk from their bare-bones repository for minimal CI images was intentional; but they didn't realize how many configure scripts depend on at least some form of awk, so they may still be fixing that in some other manner. In the meantime, they were in agreement with my resolution of explicitly requesting awk in any container with a configure script where it is not already pulled in from something else. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v7 0/9] Rust Bindings for Libnbd
On Thu, Aug 17, 2023 at 04:13:36AM +, Tage Johansson wrote: > > On 8/16/2023 9:11 PM, Eric Blake wrote: > > On Thu, Aug 10, 2023 at 11:24:27AM +, Tage Johansson wrote: > > > This is the 7th version of the Rust bindings for Libnbd. It is more or > > > less identical to the 6th version without the already merged patches. > > > > > > Best regards, > > > Tage > > > > > I still hope to do more review of both merged patches and this one (in > > part, to learn more about Rust myself). But my first observation is > > that you currently have several build failures on different platforms. > > Here's some CI links summarizing the types of failures I'm seeing, > > when I try to turn on --enable-rust as part of our CI coverage: > > > All of these errors are due to the Rustc version is too old. At least rustc > 1.70.0 is required. Is it possible to update rust in the CI machines, > perhaps with rustup? There are several options. A good thing to remember is that both qemu and libvirt have a written policy on supported development platforms. For example: https://www.qemu.org/docs/master/about/build-platforms.html Since libnbd aims to be interoperable with qemu, it is generally advisable that we should strive to compile as much as possible on the same set of platforms as qemu has chosen to be viable for development. The libvirt-ci project provides 'lcitool' which makes it easy to create CI runners for all of these supported systems: https://gitlab.com/libvirt/libvirt-ci which is how I found the problems in our CI - I'm trying to push my patch to bump our CI engine to use the latest definitions, including adding cargo as a build dependency. I'm fine with saying 'to build this aspect, you need to install extra software from your distro', but I'm reluctant to say 'to build this aspect, you need to pull in extra software from a third-party source not already shipped by your distro'. So, as I see it, that leaves us with these options (or some hybrid combination of them): 1. Declare that we really do require 1.70.0 as our minimum version. CI should be tweaked to not even attempt Rust bindings on platforms that don't supply that. This has some suboptions: 1a. tweak ci/manifest.yml to not pull in cargo on affected systems (nbdkit already has examples of avoiding Rust bindings on known-problematic platforms; it would be easy enough to copy that logic over to libnbd) 1b. tweak configure.ac to probe that rustc is new enough to provide all features we depend on. If the Rust environment is too old, refuse to build Rust bindings on that platform, even if cargo is available as a program to call. Note that 1b is better than 1a: both can get the CI green, but only the latter plays nicely to all other developers and not just the CI system. 2. Decide that it is worth trying harder to support Rust bindings back to an older release, based on whatever the CI systems already have. I don't know if Debian 12's 0.66.0 is reasonable, but CentOS 8's 1.69.0 seems fairly close to 1.70.0. But I don't know how easy or hard it would be to get the same functionality using only the older features. Note that the CI system also tries on Debian 11, with an even older cargo 0.47.0, but it passed because configure noted: checking for cargo... cargo checking for rustfmt... no checking if cargo is usable... configure: WARNING: Rust (cargo) is installed but not usable no That's more of a 1b approach, but obviously not quite strong enough to filter out the other old versions on Debian 12. > > It is possible to add a minimum version requirement in Cargo.toml, I guess I > should do that to make the errors a bit more clear. It's okay if older platforms can't build Rust bindings, but anything we can do to make it so that configure detects that gracefully up front (and the rest of the build still succeeds), rather than falling apart later during make when the missing feature turns into a compiler error, is a good thing. And whatever we do, it doesn't hurt if we amend README to call out the minimum Rust version we are willing to support. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v7 3/9] generator: Add information about the lifetime of closures
On Thu, Aug 10, 2023 at 11:24:30AM +, Tage Johansson wrote: > Add two new fields, cblifetime and cbcount, to the `closure` type > in generator/API.ml*. cblifetime tells if the closure may only be used > for as long as the command is in flight or if the closure may be used > until the handle is destructed. cbcount tells whether the closure may > be called many times or just once. > > This information is needed in the Rust bindings for: > a) Knowing if the closure trait should be FnMut or FnOnce >(see <https://doc.rust-lang.org/std/ops/trait.FnOnce.html>). > b) Knowing for what lifetime the closure should be valid. A closure that >may be called after the function invokation has returned must live invocation >for the `'static` lietime. But static closures are inconveniant for lifetime >the user since they can't effectively borrow any local data. So it is >good if this restriction is relaxed when it is not needed. > --- > generator/API.ml | 20 > generator/API.mli | 17 + > 2 files changed, 37 insertions(+) > > +++ b/generator/API.mli > @@ -94,6 +94,12 @@ and ret = > and closure = { >cbname : string; (** name of callback function *) >cbargs : cbarg list; (** all closures return int for now *) > + (** An upper bound of the lifetime of the closure. Either it will be used > for > + as long as the command is in flight or it may be used until the handle > + is destructed. *) > + cblifetime : cblifetime; > + (** Whether the callback may only be called once or many times. *) > + cbcount : cbcount; > } > and cbarg = > | CBArrayAndLen of arg * string (** array + number of entries *) > @@ -104,6 +110,17 @@ and cbarg = > | CBString of string (** like String *) > | CBUInt of string (** like UInt *) > | CBUInt64 of string (** like UInt64 *) > +and cblifetime = > +| CBCommand (** The closure may only be used until the command is retired. > +(E.G., completion callback or list callback.) *) > +| CBHandle (** The closure might be used until the handle is descructed. > +(E.G., debug callback.) *) > +and cbcount = > +| CBOnce (** The closure will be used 0 or 1 time if the aio_* call returned > an > + error and exactly once if the call succeeded. > + (E.g., completion callback.) *) Rather, the closure will not be used if the aio_* returned error, and used exactly once if the aio_* call succeeded. > +| CBMany (** The closure may be used any number of times. > + (E.g., list callback.) *) > and enum = { >enum_prefix : string;(** prefix of each enum variant *) >enums : (string * int) list (** enum names and their values in C *) > -- > 2.41.0 > > ___ > Libguestfs mailing list > Libguestfs@redhat.com > https://listman.redhat.com/mailman/listinfo/libguestfs > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [libnbd PATCH v5 02/12] rust: Add a couple of integration tests
On Thu, Aug 03, 2023 at 03:36:06PM +, Tage Johansson wrote: > A couple of integration tests are added in rust/tests. They are mostly > ported from the OCaml tests. > --- Overall, this looked like a nice counterpart to the OCaml unit tests, and I was able to easily figure out how to amend my own tests in for the unit tests I added in my 64-bit extension work. Keeping similar unit test numbers across language bindings has been a big boon :-) Style question: > +++ b/rust/tests/test_250_opt_set_meta.rs > + > +/// A struct with information about set meta contexts. > +#[derive(Debug, Clone, PartialEq, Eq)] > +struct CtxInfo { > +/// Whether the meta context "base:allocation" is set. > +has_alloc: bool, > +/// The number of set meta contexts. > +count: u32, > +} Here, you use a trailing comma, > + > +fn set_meta_ctxs(nbd: ::Handle) -> libnbd::Result { > +let info = Arc::new(Mutex::new(CtxInfo { > +has_alloc: false, > +count: 0, > +})); > +let info_clone = info.clone(); > +let replies = nbd.opt_set_meta_context(move |ctx| { > +let mut info = info_clone.lock().unwrap(); > +info.count += 1; > +if ctx == CONTEXT_BASE_ALLOCATION { > +info.has_alloc = true; > +} > +0 > +})?; > +let info = Arc::into_inner(info).unwrap().into_inner().unwrap(); > +assert_eq!(info.count, replies); > +Ok(info) > +} > + ... > + > +// nbdkit does not match wildcard for SET, even though it does for LIST > +nbd.clear_meta_contexts().unwrap(); > +nbd.add_meta_context("base:").unwrap(); > +assert_eq!( > +set_meta_ctxs().unwrap(), > +CtxInfo { > +count: 0, > +has_alloc: false > +} whereas here, you did not. Does it make a difference? 'make check' still passes, and rustfmt doesn't complain, if I temporarily add a trailing comma here. I also note that 'rustfmt --check rust/tests/*.rs' flags a few style suggestions. I had started work on automating gofmt for all checked-in *.go files; maybe I should revive that patch and extend it to also automate rustfmt on all checked-in *.rs files. Here's what rustfmt suggested to me (rustfmt 1.5.2-stable ( ) on Fedora 38): Diff in /home/eblake/libnbd/rust/tests/test_200_connect_command.rs at line 17: #![deny(warnings)] - #[test] fn test_connect_command() { let nbd = libnbd::Handle::new().unwrap(); Diff in /home/eblake/libnbd/rust/tests/test_200_connect_command.rs at line 24: -nbd.connect_command(&[ -"nbdkit", -"-s", -"--exit-with-parent", -"-v", -"null", -]) -.unwrap(); +nbd.connect_command(&["nbdkit", "-s", "--exit-with-parent", "-v", "null"]) +.unwrap(); } Diff in /home/eblake/libnbd/rust/tests/test_300_get_size.rs at line 17: #![deny(warnings)] - #[test] fn test_get_size() { let nbd = libnbd::Handle::new().unwrap(); Diff in /home/eblake/libnbd/rust/tests/test_620_stats.rs at line 17: #![deny(warnings)] - #[test] fn test_stats() { let nbd = libnbd::Handle::new().unwrap(); Diff in /home/eblake/libnbd/rust/tests/test_620_stats.rs at line 31: // Connection performs handshaking, which increments stats. // The number of bytes/chunks here may grow over time as more features get // automatically negotiated, so merely check that they are non-zero. -nbd.connect_command(&[ -"nbdkit", -"-s", -"--exit-with-parent", -"-v", -"null", -]) -.unwrap(); +nbd.connect_command(&["nbdkit", "-s", "--exit-with-parent", "-v", "null"]) +.unwrap(); let bs1 = nbd.stats_bytes_sent(); let cs1 = nbd.stats_chunks_sent(); -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org ___ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs