Re: [Libguestfs] [PATCH v4 15/24] nbd/server: Prepare to send extended header replies

2023-06-16 Thread Vladimir Sementsov-Ogievskiy

On 08.06.23 16:56, Eric Blake wrote:

Although extended mode is not yet enabled, once we do turn it on, we
need to reply with extended headers to all messages.  Update the low
level entry points necessary so that all other callers automatically
get the right header based on the current mode.

Signed-off-by: Eric Blake 
---

v4: new patch, split out from v3 9/14
---
  nbd/server.c | 30 ++
  1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 119ac765f09..84c848a31d3 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
  size_t niov, uint16_t flags, uint16_t type,
  NBDRequest *request)
  {
-/* TODO - handle structured vs. extended replies */
-NBDStructuredReplyChunk *chunk = iov->iov_base;
  size_t i, length = 0;

  for (i = 1; i < niov; i++) {
@@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
  }
  assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));

-iov[0].iov_len = sizeof(*chunk);
-stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
-stw_be_p(&chunk->flags, flags);
-stw_be_p(&chunk->type, type);
-stq_be_p(&chunk->cookie, request->cookie);
-stl_be_p(&chunk->length, length);
+if (client->mode >= NBD_MODE_EXTENDED) {
+NBDExtendedReplyChunk *chunk = iov->iov_base;
+
+iov->iov_len = sizeof(*chunk);


I'd prefer to keep the iov[0].iov_len notation, to stress that iov is an array.

Anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 


+stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
+stw_be_p(&chunk->flags, flags);
+stw_be_p(&chunk->type, type);
+stq_be_p(&chunk->cookie, request->cookie);


Hm. Not about this patch:

We have now moved to simple cookies, and it seems that 64 bits is actually more
than we need for a request counter.


+stq_be_p(&chunk->offset, request->from);
+stq_be_p(&chunk->length, length);
+} else {
+NBDStructuredReplyChunk *chunk = iov->iov_base;
+
+iov->iov_len = sizeof(*chunk);
+stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
+stw_be_p(&chunk->flags, flags);
+stw_be_p(&chunk->type, type);
+stq_be_p(&chunk->cookie, request->cookie);
+stl_be_p(&chunk->length, length);
+}
  }

  static int coroutine_fn nbd_co_send_chunk_done(NBDClient *client,
@@ -2478,6 +2490,8 @@ static coroutine_fn int nbd_send_generic_reply(NBDClient *client,
  {
  if (client->mode >= NBD_MODE_STRUCTURED && ret < 0) {
return nbd_co_send_chunk_error(client, request, -ret, error_msg, errp);
+} else if (client->mode >= NBD_MODE_EXTENDED) {
+return nbd_co_send_chunk_done(client, request, errp);
  } else {
  return nbd_co_send_simple_reply(client, request, ret < 0 ? -ret : 0,
  NULL, 0, errp);


--
Best regards,
Vladimir

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [PATCH v4 14/24] nbd/server: Prepare to receive extended header requests

2023-06-16 Thread Vladimir Sementsov-Ogievskiy

On 08.06.23 16:56, Eric Blake wrote:

Although extended mode is not yet enabled, once we do turn it on, we
need to accept extended requests for all messages.  Previous patches
have already taken care of supporting 64-bit lengths; now we just need
to read it off the wire.

Note that this implementation will block indefinitely on a buggy
client that sends a non-extended payload (that is, we try to read a
full packet before we ever check the magic number, but a client that
mistakenly sends a simple request after negotiating extended headers
doesn't send us enough bytes), but it's no different from any other
client that stops talking to us partway through a packet and thus not
worth coding around.

Signed-off-by: Eric Blake



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir




Re: [Libguestfs] [PATCH v4 13/24] nbd/server: Refactor handling of request payload

2023-06-16 Thread Vladimir Sementsov-Ogievskiy

On 08.06.23 21:29, Eric Blake wrote:

On Thu, Jun 08, 2023 at 08:56:42AM -0500, Eric Blake wrote:

Upcoming additions to support NBD 64-bit effect lengths allow for the
possibility to distinguish between payload length (capped at 32M) and
effect length (up to 63 bits).  Without that extension, only the


Technically, the NBD spec does not (yet) document a 63-bit cap on sizes,
although I should probably propose a patch upstream that adds one (I had
one in my draft work, but it hasn't yet made it upstream).


NBD_CMD_WRITE request has a payload; but with the extension, it makes
sense to allow at least NBD_CMD_BLOCK_STATUS to have both a payload
and effect length (where the payload is a limited-size struct that in
turn gives the real effect length as well as a subset of known ids
for which status is requested).  Other future NBD commands may also
have a request payload, so the 64-bit extension introduces a new
NBD_CMD_FLAG_PAYLOAD_LEN that distinguishes between whether the header
length is a payload length or an effect length, rather than
hard-coding the decision based on the command.  Note that we do not
support the payload version of BLOCK_STATUS yet.

For this patch, no semantic change is intended for a compliant client.
For a non-compliant client, it is possible that the error behavior
changes (a different message, a change on whether the connection is
killed or remains alive for the next command, or so forth), in part
because req->complete is set later on some paths, but all errors
should still be handled gracefully.

Signed-off-by: Eric Blake 
---

v4: less indentation on several 'if's [Vladimir]
---
  nbd/server.c | 76 ++--
  nbd/trace-events |  1 +
  2 files changed, 49 insertions(+), 28 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 4ac05d0cd7b..d7dc29f0445 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2329,6 +2329,8 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
 Error **errp)
  {
  NBDClient *client = req->client;
+bool extended_with_payload;
+unsigned payload_len = 0;
  int valid_flags;
  int ret;

@@ -2342,48 +2344,63 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req, NBDRequest *
  trace_nbd_co_receive_request_decode_type(request->cookie, request->type,
   nbd_cmd_lookup(request->type));

-if (request->type != NBD_CMD_WRITE) {
-/* No payload, we are ready to read the next request.  */
-req->complete = true;
-}
-
  if (request->type == NBD_CMD_DISC) {
  /* Special case: we're going to disconnect without a reply,
   * whether or not flags, from, or len are bogus */
+req->complete = true;
  return -EIO;
  }

-if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
-request->type == NBD_CMD_CACHE)
-{
-if (request->len > NBD_MAX_BUFFER_SIZE) {
-error_setg(errp, "len (%" PRIu64" ) is larger than max len (%u)",
-   request->len, NBD_MAX_BUFFER_SIZE);
-return -EINVAL;
+/* Payload and buffer handling. */
+extended_with_payload = client->mode >= NBD_MODE_EXTENDED &&
+(request->flags & NBD_CMD_FLAG_PAYLOAD_LEN);
+if ((request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE ||
+ request->type == NBD_CMD_CACHE || extended_with_payload) &&
+request->len > NBD_MAX_BUFFER_SIZE) {


I'm still debating about rewriting this series of slightly-different
'if' into a single switch (request->type) block; the benefit is that


I vote for the switch! :) Really, it seems it would be a lot simpler to read.


all actions for one command will be localized instead of split over
multiple lines of if, the drawback is that some code will now have to
be duplicated across commands (such as the buffer allocation code for
CMD_READ and CMD_WRITE).



--
Best regards,
Vladimir




Re: [Libguestfs] [PATCH v4 12/24] nbd: Prepare for 64-bit request effect lengths

2023-06-16 Thread Vladimir Sementsov-Ogievskiy

On 08.06.23 16:56, Eric Blake wrote:

Widen the length field of NBDRequest to 64-bits, although we can
assert that all current uses are still under 32 bits, because nothing
ever puts us into NBD_MODE_EXTENDED yet.  Thus no semantic change yet.

Signed-off-by: Eric Blake 


Are the assertions about NBD_MAX_BUFFER_SIZE worth a note in the commit message?

The trace events (nbd_co_request_fail at least) missed an update to 64-bit.

Also, some functions use size_t. Should we move them to uint64_t for
consistency?

 - nbd_co_send_sparse_read
 - nbd_co_send_chunk_read
 - nbd_co_send_simple_reply

but that may be another story



--
Best regards,
Vladimir




Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-16 Thread Richard W.M. Jones
On Fri, Jun 16, 2023 at 11:17:21AM +0900, Vincent MAILHOL wrote:
> Hi Richard,
> 
> On Fri. 16 Jun. 2023 at 03:08, Richard W.M. Jones  wrote:
> > On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> > > Hello,
> > >
> > > I am using libguestfs in a Bazel's linux-sandbox environment[1].
> > >
> > > When executing in that sandbox environment, I got frequent crashes.
> > >
> > > Please find attached below the results of libguestfs-test-tool when
> > > run into that linux-sandbox environment. The most relevant part seems
> > > to be:
> > >
> > >   [0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 sp 
> > > 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> > >   [0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 
> > > 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff 
> > > ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> > >   /init: line 154:   164 Segmentation fault  ldmtool create all
> > >
> > > So the root cause seems to be around libldm. This mailing list seems
> > > to cover both libguestfs and libldm, so hopefully, I am at the right
> > > place to ask :)
> > >
> > > Needless to say, when run outside of the sandbox environment, no crashes
> > > were observed.
> > >
> > > [1] linux-sandbox.cc
> > > Link: 
> > > https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> > >
> > > ---
> > ...
> > > supermin: picked /sys/block/sdb/dev (8:16) as root device
> > > supermin: creating /dev/root as block special 8:16
> > > supermin: mounting new root on /root
> > > [0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 
> > > subsystem
> > > [0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . 
> > > Quota mode: none.
> > > supermin: deleting initramfs files
> > > supermin: chroot
> > > Starting /init script ...
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > /init: line 38: /proc/cmdline: No such file or directory
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> >
> > It really goes wrong from here, where apparently it's not running as
> > root (instead UID 65534), even though we're supposed to be running
> > inside a Linux appliance virtual machine.
> >
> > Any idea why that would be?
> >
> > I looked at the sandbox and that would run the qemu process as UID
> > "nobody" (which might be 65534).  However I don't understand why that
> > would affect anything running on the new kernel inside the appliance.
> 
> And you were right. I did get a crash in the sandbox but not outside of
> it, and I jumped to the conclusion that the root cause was linked to the
> sandbox.
> 
> I continued the analysis and looked at all the differences between a
> successful libguestfs-test-tool log and the failed one. It turned out
> that the sandbox was not the cause. The culprit turns out to be the
> first line of the log: TMPDIR=/tmp.
> 
> If I force TMPDIR=/var/tmp, the problem disappears !!
> 
> This gave me a minimal reproducer:
> 
>   TMPDIR=/tmp/ libguestfs-test-tool
> 
> That one crashed outside the sandbox. Next, my attention went to this line:
> 
>   libguestfs: checking for previously cached test results of
> /usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001
> 
> I did a:
> 
>   rm -rf /tmp/.guestfs-1001
> 
> and that solved my issue \o/
> 
> I still do not understand how I ended up running as UID
> 65534 instead of root in the first place. I did other qemu
> experimentation, so not sure how, but I somehow got a corrupted
> environment under /tmp/.guestfs-1001.

We will cache the appliance under $TMPDIR/.guestfs-$UID/ (e.g. have a
look at appliance/root in that directory).

We rebuild it if the distro changes, so most of the time we don't have
to rebuild it when launching libguestfs (although there was a
long-standing bug which I fixed recently:
https://github.com/libguestfs/supermin/commit/8c38641042e274a713a18daf7fc85584ca0fc9bb).

> Last thing, the segfault on ldmtool [1] still seems a valid issue.
> Even if I now do have a workaround for my problem, that segfault might
> be worth a bit more investigation.

Yes that does look like a real problem.  Does it crash if you just run
ldmtool as a normal command, nothing to do with libguestfs?  Might be
a good idea to try to get a stack trace of the crash.

Rich.

> Regardless, thanks a lot for your quick answer, that helped me to
> continue the troubleshooting.
> 
> [1] ldmtool line 164
> Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones