Re: [Libguestfs] [PATCH v4 15/24] nbd/server: Prepare to send extended header replies
On 08.06.23 16:56, Eric Blake wrote:
> Although extended mode is not yet enabled, once we do turn it on, we
> need to reply with extended headers to all messages.  Update the low
> level entry points necessary so that all other callers automatically
> get the right header based on the current mode.
>
> Signed-off-by: Eric Blake
> ---
> v4: new patch, split out from v3 9/14
> ---
>  nbd/server.c | 30 ++++++++++++++++++++++--------
>  1 file changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/nbd/server.c b/nbd/server.c
> index 119ac765f09..84c848a31d3 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>                                  size_t niov, uint16_t flags, uint16_t type,
>                                  NBDRequest *request)
>  {
> -    /* TODO - handle structured vs. extended replies */
> -    NBDStructuredReplyChunk *chunk = iov->iov_base;
>      size_t i, length = 0;
>
>      for (i = 1; i < niov; i++) {
> @@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
>      }
>      assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
>
> -    iov[0].iov_len = sizeof(*chunk);
> -    stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> -    stw_be_p(&chunk->flags, flags);
> -    stw_be_p(&chunk->type, type);
> -    stq_be_p(&chunk->cookie, request->cookie);
> -    stl_be_p(&chunk->length, length);
> +    if (client->mode >= NBD_MODE_EXTENDED) {
> +        NBDExtendedReplyChunk *chunk = iov->iov_base;
> +
> +        iov->iov_len = sizeof(*chunk);

I'd prefer to keep the iov[0].iov_len notation, to stress that iov is an
array.  Anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy

> +        stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
> +        stw_be_p(&chunk->flags, flags);
> +        stw_be_p(&chunk->type, type);
> +        stq_be_p(&chunk->cookie, request->cookie);

Hm.  Not about this patch: we have now moved to simple cookies, and it
seems that 64 bits is actually too much for a request counter.

> +        stq_be_p(&chunk->offset, request->from);
> +        stq_be_p(&chunk->length, length);
> +    } else {
> +        NBDStructuredReplyChunk *chunk = iov->iov_base;
> +
> +        iov->iov_len = sizeof(*chunk);
> +        stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> +        stw_be_p(&chunk->flags, flags);
> +        stw_be_p(&chunk->type, type);
> +        stq_be_p(&chunk->cookie, request->cookie);
> +        stl_be_p(&chunk->length, length);
> +    }
>  }
>
>  static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
> @@ -2478,6 +2490,8 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
>  {
>      if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) {
>          return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
> +    } else if (client->mode >= NBD_MODE_EXTENDED) {
> +        return nbd_co_send_chunk_done(client, request, errp);
>      } else {
>          return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
>                                          NULL, 0, errp);

--
Best regards,
Vladimir

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs
Re: [Libguestfs] [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests
On 08.06.23 16:56, Eric Blake wrote:
> Although extended mode is not yet enabled, once we do turn it on, we
> need to accept extended requests for all messages.  Previous patches
> have already taken care of supporting 64-bit lengths, now we just need
> to read it off the wire.
>
> Note that this implementation will block indefinitely on a buggy
> client that sends a non-extended payload (that is, we try to read a
> full packet before we ever check the magic number, but a client that
> mistakenly sends a simple request after negotiating extended headers
> doesn't send us enough bytes), but it's no different from any other
> client that stops talking to us partway through a packet and thus not
> worth coding around.
>
> Signed-off-by: Eric Blake

Reviewed-by: Vladimir Sementsov-Ogievskiy

--
Best regards,
Vladimir
Re: [Libguestfs] [PATCH v4 13/24] nbd/server: Refactor handling of request payload
On 08.06.23 21:29, Eric Blake wrote:
> On Thu, Jun 08, 2023 at 08:56:42AM -0500, Eric Blake wrote:
>> Upcoming additions to support NBD 64-bit effect lengths allow for the
>> possibility to distinguish between payload length (capped at 32M) and
>> effect length (up to 63 bits).  Without that extension, only the
>
> Technically, the NBD spec does not (yet) have a documented cap of a
> 63-bit size limit; although I should probably propose a patch upstream
> that does that (I had one in my draft work, but it hasn't yet made it
> upstream).
>
>> NBD_CMD_WRITE request has a payload; but with the extension, it makes
>> sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload
>> and effect length (where the payload is a limited-size struct that in
>> turn gives the real effect length as well as a subset of known ids
>> for which status is requested).  Other future NBD commands may also
>> have a request payload, so the 64-bit extension introduces a new
>> NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the
>> header length is a payload length or an effect length, rather than
>> hard-coding the decision based on the command.  Note that we do not
>> support the payload version of BLOCK_STATUS yet.
>>
>> For this patch, no semantic change is intended for a compliant
>> client.  For a non-compliant client, it is possible that the error
>> behavior changes (a different message, a change on whether the
>> connection is killed or remains alive for the next command, or so
>> forth), in part because req->complete is set later on some paths, but
>> all errors should still be handled gracefully.
>> Signed-off-by: Eric Blake
>> ---
>> v4: less indentation on several 'if's [Vladimir]
>> ---
>>  nbd/server.c     | 76 ++++++++++++++++++++++++++++-----------------
>>  nbd/trace-events |  1 +
>>  2 files changed, 49 insertions(+), 28 deletions(-)
>>
>> diff --git a/nbd/server.c b/nbd/server.c
>> index 4ac05d0cd7b..d7dc29f0445 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -2329,6 +2329,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>>                                                 Error **errp)
>>  {
>>      NBDClient *client = req->client;
>> +    bool extended_with_payload;
>> +    unsigned payload_len = 0;
>>      int valid_flags;
>>      int ret;
>>
>> @@ -2342,48 +2344,63 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
>>      trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
>>                                               nbd_cmd_lookup(request->type));
>>
>> -    if (request->type != NBD_CMD_WRITE) {
>> -        /* No payload, we are ready to read the next request. */
>> -        req->complete = true;
>> -    }
>> -
>>      if (request->type == NBD_CMD_DISC) {
>>          /* Special case: we're going to disconnect without a reply,
>>           * whether or not flags, from, or len are bogus */
>> +        req->complete = true;
>>          return -EIO;
>>      }
>>
>> -    if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
>> -        request->type == NBD_CMD_CACHE)
>> -    {
>> -        if (request->len > NBD_MAX_BUFFER_SIZE) {
>> -            error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
>> -                       request->len, NBD_MAX_BUFFER_SIZE);
>> -            return -EINVAL;
>> +    /* Payload and buffer handling. */
>> +    extended_with_payload = client->mode >= NBD_MODE_EXTENDED &&
>> +        (request->flags & NBD_CMD_FLAG_PAYLOAD_LEN);
>> +    if ((request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
>> +         request->type == NBD_CMD_CACHE || extended_with_payload) &&
>> +        request->len > NBD_MAX_BUFFER_SIZE) {
>
> I'm still debating about rewriting this series of slightly-different
> 'if' into a single switch (request->type) block; the benefit is that
> all actions for one command will be localized instead of split over
> multiple lines of if, the drawback is that some code will now have to
> be duplicated across commands (such as the buffer allocation code for
> CMD_READ and CMD_WRITE).

I vote for switch! :)  Really, it seems it would be a lot simpler to
read.

--
Best regards,
Vladimir
Re: [Libguestfs] [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths
On 08.06.23 16:56, Eric Blake wrote:
> Widen the length field of NBDRequest to 64-bits, although we can
> assert that all current uses are still under 32 bits, because nothing
> ever puts us into NBD_MODE_EXTENDED yet.  Thus no semantic change.

No semantic change yet.

> Signed-off-by: Eric Blake

Are the assertions about NBD_MAX_BUFFER_SIZE worth a note in the
commit message?

In trace-events, nbd_co_request_fail missed an update to 64 bit.

Also, some functions use size_t.  Should we move them to uint64_t for
consistency?
- nbd_co_send_sparse_read
- nbd_co_send_chunk_read
- nbd_co_send_simple_reply

But that may be another story.

--
Best regards,
Vladimir
Re: [Libguestfs] libldm crashes in a linux-sandbox context
On Fri, Jun 16, 2023 at 11:17:21AM +0900, Vincent MAILHOL wrote:
> Hi Richard,
>
> On Fri. 16 Jun. 2023 à 03:08, Richard W.M. Jones wrote:
> > On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> > > Hello,
> > >
> > > I am using libguestfs in Bazel's linux-sandbox environment [1].
> > >
> > > When executing in that sandbox environment, I got frequent crashes.
> > >
> > > Please find attached below the results of libguestfs-test-tool
> > > when run in that linux-sandbox environment.  The most relevant
> > > part seems to be:
> > >
> > > [0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 sp 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> > > [0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> > > /init: line 154: 164 Segmentation fault  ldmtool create all
> > >
> > > So the root cause seems to be around libldm.  This mailing list
> > > seems to cover both libguestfs and libldm, so hopefully, I am at
> > > the right place to ask :)
> > >
> > > Needless to say, when run outside of the sandbox environment, no
> > > crashes were observed.
> > >
> > > [1] linux-sandbox.cc
> > > Link: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> >
> > ...
> > > supermin: picked /sys/block/sdb/dev (8:16) as root device
> > > supermin: creating /dev/root as block special 8:16
> > > supermin: mounting new root on /root
> > > [0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 subsystem
> > > [0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . Quota mode: none.
> > > supermin: deleting initramfs files
> > > supermin: chroot
> > > Starting /init script ...
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > /init: line 38: /proc/cmdline: No such file or directory
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> >
> > It really goes wrong from here, where apparently it's not running as
> > root (instead UID 65534), even though we're supposed to be running
> > inside a Linux appliance virtual machine.
> >
> > Any idea why that would be?
> >
> > I looked at the sandbox and that would run the qemu process as UID
> > "nobody" (which might be 65534).  However I don't understand why
> > that would affect anything running on the new kernel inside the
> > appliance.
>
> And you were right.  It was a fact that I got a crash in the sandbox
> but not outside of it, and I jumped to the conclusion that the root
> cause was linked to the sandbox.
>
> I continued the analysis and looked at all the differences between a
> successful libguestfs-test-tool log and the failed one.  It turned
> out that the sandbox was not the cause.  The culprit turns out to be
> the first line of the log: TMPDIR=/tmp.
>
> If I force TMPDIR=/var/tmp, the problem disappears!!
>
> This gave me a minimal reproducer:
>
>   TMPDIR=/tmp/ libguestfs-test-tool
>
> That one crashed outside the sandbox.  Next, my attention went to
> this line:
>
>   libguestfs: checking for previously cached test results of
>   /usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001
>
> I did a:
>
>   rm -rf /tmp/.guestfs-1001
>
> and that solved my issue \o/
>
> I still do not understand how I could get the issue of running as UID
> 65534 instead of root in the first place.
> I did other qemu experimentation, so not sure how, but I somehow got
> a corrupted environment under /tmp/.guestfs-1001.

We will cache the appliance under $TMPDIR/.guestfs-$UID/ (e.g. have a
look at appliance/root in that directory).  We rebuild it if the
distro changes, so most of the time we don't have to rebuild it when
launching libguestfs (although there was a long-standing bug which I
fixed recently:
https://github.com/libguestfs/supermin/commit/8c38641042e274a713a18daf7fc85584ca0fc9bb).

> Last thing, the segfault on ldmtool [1] still seems a valid issue.
> Even if I now do have a workaround for my problem, that segfault
> might be worth a bit more investigation.

Yes, that does look like a real problem.  Does it crash if you just
run ldmtool as a normal command, nothing to do with libguestfs?  Might
be a good idea to try to get a stack trace of the crash.

Rich.

> Regardless, thanks a lot for your quick answer, that helped me to
> continue the troubleshooting.
>
> [1] ldmtool line 164
> Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
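(One possible way to follow up on that suggestion, assuming ldmtool and gdb are installed on the affected machine; the exact commands are a sketch of common debugging practice, not a tested recipe for this particular crash.)

```shell
# First reproduce outside libguestfs, as suggested:
ldmtool create all

# If it segfaults, run it directly under gdb to catch the crash live:
gdb --args ldmtool create all
# then inside gdb:  run  ...and after the SIGSEGV:  bt full

# Alternatively, enable core dumps and inspect one after the fact:
ulimit -c unlimited
ldmtool create all
gdb ldmtool core -batch -ex 'bt full'
```

Either backtrace would show whether the NULL dereference reported at ldmtool.c line 164 is reached with or without the appliance environment involved.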