Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server

2023-09-28 Thread Richard W.M. Jones
On Wed, Sep 27, 2023 at 04:22:48PM -0500, Eric Blake wrote:
> On Fri, Sep 22, 2023 at 04:03:19PM -0500, Eric Blake wrote:
> > On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote:
> > > On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote:
> > > > There are a number of ways in which gathering information can fail.
> > > > But when possible, it makes sense to let the server know that we are
> > > > done talking, as it minimizes the likelihood that nbdinfo's error
> > > > message will be obscured by an accompanying error message by the
> > > > server complaining about an unclean disconnect.
> > > > 
> > > > For example, with a one-off qemu temporarily patched as mentioned in
> > > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
> > > > 'qemu-nbd -f raw -r file &' produces:
> > > > 
> > > > pre-patch:
> > > > 
> > > > $ ./run nbdinfo --size nbd://localhost
> > > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims 
> > > > size 9223372036854781440 which does not fit in signed result: Value too 
> > > > large for defined data type
> > > > qemu-nbd: option negotiation failed: Failed to read opts magic: 
> > > > Unexpected end-of-file before all data were read
> > > 
> > > This doesn't necessarily seem like a bug?
> > 
> > It's a quality of service thing - qemu is just being noisy that the
> > client closed the socket abruptly without giving any reason why.  It
> > doesn't change libnbd's behavior (nbdinfo has already reported its
> > error message), but may confuse people reading qemu-nbd logs.
> > 
> > It may also be a case where we could patch qemu-nbd to not output a
> > complaint if the client exited at a clean point in the back-and-forth
> > transactions.  We still want to be noisy if the socket closes after
> > the first byte has been read, but if there are zero bytes available,
> > announcing that the client no longer cares does not add much value.
> > 
> > > 
> > > This feels like quite a significant change in behaviour to me.  In
> > > particular I'm worried if it interacts in some subtle way with the
> > > forking done by the "[ ... ]" syntax for running servers on the
> > > command line (or any of the other ways that libnbd might fork/exec).
> > 
> > Interesting observation.  atexit() handlers are not preserved across
> > exec, and I think all our fork() paths end in either exec or _exit
> > (also where atexit handlers are ignored), so I don't think we are
> > risking calling the handler twice.

So there's a couple of other things I don't like about this patch
after further reflection:

(1) It causes nbdinfo to do extra I/O on the error path.  While I
think we're only writing (NBD_CMD_DISC?) and not expecting a reply,
there might be some situation where a server (especially a local one)
is able to block progress by refusing to read, causing nbdinfo to
hang on exit.  I'm not sure how realistic that is.

(2) It hints that the best way to use the API is to always issue a
shutdown before close, but in reality that's not required.

I think it's also trying to solve something which isn't a problem.

Rich.

> > > 
> > > Can we hold this patch until after 1.18 is released and then put it
> > > in?  Should only be a week or two.
> > 
> > Sure, being conservative on this one is fine by me.  I'll delay it
> > until after 1.18.
> > 
> > > 
> > > Provisionally ACKed for post-1.18
> 
> Now in as commit fd4f3fea, so we can start getting some soak time under CI.
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization:  qemu.org | libguestfs.org

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server

2023-09-27 Thread Eric Blake
On Fri, Sep 22, 2023 at 04:03:19PM -0500, Eric Blake wrote:
> On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote:
> > On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote:
> > > There are a number of ways in which gathering information can fail.
> > > But when possible, it makes sense to let the server know that we are
> > > done talking, as it minimizes the likelihood that nbdinfo's error
> > > message will be obscured by an accompanying error message by the
> > > server complaining about an unclean disconnect.
> > > 
> > > For example, with a one-off qemu temporarily patched as mentioned in
> > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
> > > 'qemu-nbd -f raw -r file &' produces:
> > > 
> > > pre-patch:
> > > 
> > > $ ./run nbdinfo --size nbd://localhost
> > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
> > > 9223372036854781440 which does not fit in signed result: Value too large 
> > > for defined data type
> > > qemu-nbd: option negotiation failed: Failed to read opts magic: 
> > > Unexpected end-of-file before all data were read
> > 
> > This doesn't necessarily seem like a bug?
> 
> It's a quality of service thing - qemu is just being noisy that the
> client closed the socket abruptly without giving any reason why.  It
> doesn't change libnbd's behavior (nbdinfo has already reported its
> error message), but may confuse people reading qemu-nbd logs.
> 
> It may also be a case where we could patch qemu-nbd to not output a
> complaint if the client exited at a clean point in the back-and-forth
> transactions.  We still want to be noisy if the socket closes after
> the first byte has been read, but if there are zero bytes available,
> announcing that the client no longer cares does not add much value.
> 
> > 
> > This feels like quite a significant change in behaviour to me.  In
> > particular I'm worried if it interacts in some subtle way with the
> > forking done by the "[ ... ]" syntax for running servers on the
> > command line (or any of the other ways that libnbd might fork/exec).
> 
> Interesting observation.  atexit() handlers are not preserved across
> exec, and I think all our fork() paths end in either exec or _exit
> (also where atexit handlers are ignored), so I don't think we are
> risking calling the handler twice.
> 
> > 
> > Can we hold this patch until after 1.18 is released and then put it
> > in?  Should only be a week or two.
> 
> Sure, being conservative on this one is fine by me.  I'll delay it
> until after 1.18.
> 
> > 
> > Provisionally ACKed for post-1.18

Now in as commit fd4f3fea, so we can start getting some soak time under CI.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server

2023-09-22 Thread Eric Blake
On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote:
> On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote:
> > There are a number of ways in which gathering information can fail.
> > But when possible, it makes sense to let the server know that we are
> > done talking, as it minimizes the likelihood that nbdinfo's error
> > message will be obscured by an accompanying error message by the
> > server complaining about an unclean disconnect.
> > 
> > For example, with a one-off qemu temporarily patched as mentioned in
> > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
> > 'qemu-nbd -f raw -r file &' produces:
> > 
> > pre-patch:
> > 
> > $ ./run nbdinfo --size nbd://localhost
> > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
> > 9223372036854781440 which does not fit in signed result: Value too large 
> > for defined data type
> > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected 
> > end-of-file before all data were read
> 
> This doesn't necessarily seem like a bug?

It's a quality of service thing - qemu is just being noisy that the
client closed the socket abruptly without giving any reason why.  It
doesn't change libnbd's behavior (nbdinfo has already reported its
error message), but may confuse people reading qemu-nbd logs.

It may also be a case where we could patch qemu-nbd to not output a
complaint if the client exited at a clean point in the back-and-forth
transactions.  We still want to be noisy if the socket closes after
the first byte has been read, but if there are zero bytes available,
announcing that the client no longer cares does not add much value.

> 
> This feels like quite a significant change in behaviour to me.  In
> particular I'm worried if it interacts in some subtle way with the
> forking done by the "[ ... ]" syntax for running servers on the
> command line (or any of the other ways that libnbd might fork/exec).

Interesting observation.  atexit() handlers are not preserved across
exec, and I think all our fork() paths end in either exec or _exit
(also where atexit handlers are ignored), so I don't think we are
risking calling the handler twice.

> 
> Can we hold this patch until after 1.18 is released and then put it
> in?  Should only be a week or two.

Sure, being conservative on this one is fine by me.  I'll delay it
until after 1.18.

> 
> Provisionally ACKed for post-1.18
> 
> Rich.
> 
> > post-patch:
> > $ ./run nbdinfo --size nbd://localhost
> > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
> > 9223372036854781440 which does not fit in signed result: Value too large 
> > for defined data type
> > 
> > if nbdinfo is run in the same terminal as qemu-nbd.
> > 
> > Signed-off-by: Eric Blake 
> > ---
> >  info/main.c | 13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/info/main.c b/info/main.c
> > index c8248c61..05f19f43 100644
> > --- a/info/main.c
> > +++ b/info/main.c
> > @@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode)
> >exit (exitcode);
> >  }
> > 
> > +void
> > +clean_shutdown (void)
> > +{
> > +  /* If we are connected but detect an error, try to give the server
> > +   * notice that we are done talking.  Ignore failures, as this is
> > +   * only a courtesy measure.
> > +   */
> > +  if (nbd)
> > +nbd_shutdown (nbd, 0);
> > +}
> > +
> >  int
> >  main (int argc, char *argv[])
> >  {
> > @@ -277,6 +288,7 @@ main (int argc, char *argv[])
> >  fprintf (stderr, "%s: %s\n", progname, nbd_get_error ());
> >  exit (EXIT_FAILURE);
> >}
> > +  atexit (clean_shutdown);
> >nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */
> > 
> >/* Set optional modes in the handle. */
> > @@ -353,6 +365,7 @@ main (int argc, char *argv[])
> >free_exports ();
> >nbd_shutdown (nbd, 0);
> >nbd_close (nbd);
> > +  nbd = NULL;
> > 
> >/* Close the output stream and copy it to the real stdout. */
> >if (fclose (fp) == EOF) {
> > -- 
> > 2.41.0
> > 
> 
> 
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-top is 'top' for virtual machines.  Tiny program with many
> powerful monitoring features, net stats, disk stats, logging, etc.
> http://people.redhat.com/~rjones/virt-top
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server

2023-09-22 Thread Richard W.M. Jones
On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote:
> There are a number of ways in which gathering information can fail.
> But when possible, it makes sense to let the server know that we are
> done talking, as it minimizes the likelihood that nbdinfo's error
> message will be obscured by an accompanying error message by the
> server complaining about an unclean disconnect.
> 
> For example, with a one-off qemu temporarily patched as mentioned in
> commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
> 'qemu-nbd -f raw -r file &' produces:
> 
> pre-patch:
> 
> $ ./run nbdinfo --size nbd://localhost
> /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
> 9223372036854781440 which does not fit in signed result: Value too large for 
> defined data type
> qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected 
> end-of-file before all data were read

This doesn't necessarily seem like a bug?

This feels like quite a significant change in behaviour to me.  In
particular I'm worried if it interacts in some subtle way with the
forking done by the "[ ... ]" syntax for running servers on the
command line (or any of the other ways that libnbd might fork/exec).

Can we hold this patch until after 1.18 is released and then put it
in?  Should only be a week or two.

Provisionally ACKed for post-1.18

Rich.

> post-patch:
> $ ./run nbdinfo --size nbd://localhost
> /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
> 9223372036854781440 which does not fit in signed result: Value too large for 
> defined data type
> 
> if nbdinfo is run in the same terminal as qemu-nbd.
> 
> Signed-off-by: Eric Blake 
> ---
>  info/main.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/info/main.c b/info/main.c
> index c8248c61..05f19f43 100644
> --- a/info/main.c
> +++ b/info/main.c
> @@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode)
>exit (exitcode);
>  }
> 
> +void
> +clean_shutdown (void)
> +{
> +  /* If we are connected but detect an error, try to give the server
> +   * notice that we are done talking.  Ignore failures, as this is
> +   * only a courtesy measure.
> +   */
> +  if (nbd)
> +nbd_shutdown (nbd, 0);
> +}
> +
>  int
>  main (int argc, char *argv[])
>  {
> @@ -277,6 +288,7 @@ main (int argc, char *argv[])
>  fprintf (stderr, "%s: %s\n", progname, nbd_get_error ());
>  exit (EXIT_FAILURE);
>}
> +  atexit (clean_shutdown);
>nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */
> 
>/* Set optional modes in the handle. */
> @@ -353,6 +365,7 @@ main (int argc, char *argv[])
>free_exports ();
>nbd_shutdown (nbd, 0);
>nbd_close (nbd);
> +  nbd = NULL;
> 
>/* Close the output stream and copy it to the real stdout. */
>if (fclose (fp) == EOF) {
> -- 
> 2.41.0
> 



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



[Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server

2023-09-22 Thread Eric Blake
There are a number of ways in which gathering information can fail.
But when possible, it makes sense to let the server know that we are
done talking, as it minimizes the likelihood that nbdinfo's error
message will be obscured by an accompanying error message by the
server complaining about an unclean disconnect.

For example, with a one-off qemu temporarily patched as mentioned in
commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off
'qemu-nbd -f raw -r file &' produces:

pre-patch:

$ ./run nbdinfo --size nbd://localhost
/home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
9223372036854781440 which does not fit in signed result: Value too large for 
defined data type
qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected 
end-of-file before all data were read

post-patch:
$ ./run nbdinfo --size nbd://localhost
/home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 
9223372036854781440 which does not fit in signed result: Value too large for 
defined data type

if nbdinfo is run in the same terminal as qemu-nbd.

Signed-off-by: Eric Blake 
---
 info/main.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/info/main.c b/info/main.c
index c8248c61..05f19f43 100644
--- a/info/main.c
+++ b/info/main.c
@@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode)
   exit (exitcode);
 }

+void
+clean_shutdown (void)
+{
+  /* If we are connected but detect an error, try to give the server
+   * notice that we are done talking.  Ignore failures, as this is
+   * only a courtesy measure.
+   */
+  if (nbd)
+nbd_shutdown (nbd, 0);
+}
+
 int
 main (int argc, char *argv[])
 {
@@ -277,6 +288,7 @@ main (int argc, char *argv[])
 fprintf (stderr, "%s: %s\n", progname, nbd_get_error ());
 exit (EXIT_FAILURE);
   }
+  atexit (clean_shutdown);
   nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */

   /* Set optional modes in the handle. */
@@ -353,6 +365,7 @@ main (int argc, char *argv[])
   free_exports ();
   nbd_shutdown (nbd, 0);
   nbd_close (nbd);
+  nbd = NULL;

   /* Close the output stream and copy it to the real stdout. */
   if (fclose (fp) == EOF) {
-- 
2.41.0

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs