Re: [Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

Richard W.M. Jones Thu, 31 Aug 2023 02:55:36 -0700

On Thu, Aug 31, 2023 at 11:18:27AM +0200, Laszlo Ersek wrote:
> On 8/31/23 10:55, Richard W.M. Jones wrote:
> > On Thu, Aug 31, 2023 at 10:40:53AM +0200, Laszlo Ersek wrote:
> >> On 8/31/23 00:21, Eric Blake wrote:
> >>> I hit another transient failure in libnbd CI when a poorly-written
> >>> eval script did not consume all of stdin during .pwrite.  As behaving
> >>> as a data sink can be a somewhat reasonable feature of a
> >>> quickly-written sh or eval plugin, we should not be so insistent as
> >>> treating an EPIPE failure as an immediate return of EIO to the client.
> >>>
> >>> Signed-off-by: Eric Blake <ebl...@redhat.com>
> >>> ---
> >>>
> >>> I probably need to add unit test coverage of this before committing
> >>> (although proving that I win the data race on a client process exiting
> >>> faster than the parent can write enough data to hit EPIPE is hard).
> >>>
> >>>  plugins/sh/nbdkit-sh-plugin.pod |  8 +++++++
> >>>  plugins/sh/call.c               | 38 ++++++++++++++++++++++-----------
> >>>  2 files changed, 34 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/plugins/sh/nbdkit-sh-plugin.pod 
> >>> b/plugins/sh/nbdkit-sh-plugin.pod
> >>> index b2c946a0..8b83a5b3 100644
> >>> --- a/plugins/sh/nbdkit-sh-plugin.pod
> >>> +++ b/plugins/sh/nbdkit-sh-plugin.pod
> >>> @@ -505,6 +505,14 @@ Unlike in other languages, if you provide a 
> >>> C<pwrite> method you
> >>>  B<must> also provide a C<can_write> method which exits with code C<0>
> >>>  (true).
> >>>
> >>> +With nbdkit E<ge> 1.36, this method may return C<0> without consuming
> >>> +any data from stdin, and without producing any output, in order to
> >>> +behave as an intentional data sink.  But in older versions, nbdkit
> >>> +would treat any C<EPIPE> failure in writing to your script as an error
> >>> +condition even if your script returns success; to avoid unintended
> >>> +failures, you may want to include C<"cat >/dev/null"> in a script
> >>> +intending to ignore the client's write requests.
> >>> +
> >>>  =item C<flush>
> >>>
> >>>   /path/to/script flush <handle>
> >>> diff --git a/plugins/sh/call.c b/plugins/sh/call.c
> >>> index 888c6459..79c67a04 100644
> >>> --- a/plugins/sh/call.c
> >>> +++ b/plugins/sh/call.c
> >>> @@ -34,6 +34,7 @@
> >>>
> >>>  #include <assert.h>
> >>>  #include <fcntl.h>
> >>> +#include <stdbool.h>
> >>>  #include <stdio.h>
> >>>  #include <stdlib.h>
> >>>  #include <inttypes.h>
> >>> @@ -130,6 +131,7 @@ debug_call (const char **argv)
> >>>   */
> >>>  static int
> >>>  call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be NULL) 
> >>> */
> >>> +       bool *pipe_full,                  /* set if wbuf not fully 
> >>> written */
> >>>         string *rbuf,                     /* read from stdout */
> >>>         string *ebuf,                     /* read from stderr */
> >>>         const char **argv)                /* script + parameters */
> >>> @@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to 
> >>> stdin (can be NULL) */
> >>>        r = write (pfds[0].fd, wbuf, wbuflen);
> >>>        if (r == -1) {
> >>>          if (errno == EPIPE) {
> >>> -          /* We tried to write to the script but it didn't consume
> >>> -           * the data.  Probably the script exited without reading
> >>> -           * from stdin.  This is an error in the script.
> >>> -           */
> >>> -          nbdkit_error ("%s: write to script failed because of a broken 
> >>> pipe: "
> >>> -                        "this can happen if the script exits without "
> >>> -                        "consuming stdin, which usually indicates a bug "
> >>> -                        "in the script",
> >>> -                        argv0);
> >>> +          *pipe_full = true;
> >>> +          r = wbuflen;
> >>>          }
> >>>          else
> >>>            nbdkit_error ("%s: write: %m", argv0);
> >>> @@ -555,7 +550,7 @@ call (const char **argv)
> >>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>>
> >>> -  r = call3 (NULL, 0, &rbuf, &ebuf, argv);
> >>> +  r = call3 (NULL, 0, NULL, &rbuf, &ebuf, argv);
> >>>    return handle_script_error (argv[0], &ebuf, r);
> >>>  }
> >>>
> >>> @@ -568,7 +563,7 @@ call_read (string *rbuf, const char **argv)
> >>>    int r;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>>
> >>> -  r = call3 (NULL, 0, rbuf, &ebuf, argv);
> >>> +  r = call3 (NULL, 0, NULL, rbuf, &ebuf, argv);
> >>>    r = handle_script_error (argv[0], &ebuf, r);
> >>>    if (r == ERROR)
> >>>      string_reset (rbuf);
> >>> @@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t wbuflen, const 
> >>> char **argv)
> >>>    int r;
> >>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>> +  bool pipe_full = false;
> >>>
> >>> -  r = call3 (wbuf, wbuflen, &rbuf, &ebuf, argv);
> >>> +  r = call3 (wbuf, wbuflen, &pipe_full, &rbuf, &ebuf, argv);
> >>> +  if (pipe_full && r == OK) {
> >>> +    /* We allow scripts to intentionally ignore data, but they must
> >>> +     * have no output when doing so.
> >>> +     */
> >>> +    if (rbuf.len > 0 || ebuf.len > 0) {
> >>> +      nbdkit_error ("%s: write to script failed because of a broken 
> >>> pipe: "
> >>> +                    "this can happen if the script exits without "
> >>> +                    "consuming stdin, which usually indicates a bug "
> >>> +                    "in the script",
> >>> +                    argv[0]);
> >>> +      r = ERROR;
> >>> +    }
> >>> +    else
> >>> +      nbdkit_debug ("%s: write to script failed because of a broken 
> >>> pipe; "
> >>> +                    "assuming this was an intentional data sink, 
> >>> although it "
> >>> +                    "may indicate a bug in the script",
> >>> +                    argv[0]);
> >>> +  }
> >>>    return handle_script_error (argv[0], &ebuf, r);
> >>>  }
> >>
> >> The patch appears to do what it says on the tin, but I'm unconvinced we
> >> should relax the requirement. The "may indicate a bug" debug message is
> >> easy to miss. A hard failure upon EPIPE is noticeable, and as you write
> >> the script can just drain unwanted input with "cat >/dev/null".
> >>
> >> I think I may not be appreciating this patch due to not understanding
> >> how difficult it could be to update such a script. Can you elaborate on
> >> that? Can you provide an example where adding "cat >/dev/null" to the
> >> pwrite script is a disproportionate chore?
> > 
> > This is related to the following commit which Eric pushed yesterday:
> > 
> > https://gitlab.com/nbdkit/libnbd/-/commit/4df70870420d1be9348ac45f4aa467501eca5089
> > 
> > The commit is correct.  But it's an ongoing problem that we forget to
> > discard all the input in pwrite.  Note you have to do this on every
> > path out of the pwrite method, including errors or (like this case)
> > where the test just isn't interested in the data.  The error is also
> > intermittent because it is a race between two possible paths through
> > the poll(2) loop in plugins/sh/call.c:call3() Also this case only
> > applies to the sh plugin.  Other plugins (eg written in C) ignore or
> > only partially consume the buffer in pwrite all the time, and that's
> > never a problem.
> 
> I don't understand.
> 
> https://libguestfs.org/nbdkit-plugin.3.html#pwrite
> 
> """
> The callback must write the whole count bytes if it can. The NBD
> protocol doesn't allow partial writes (instead, these would be errors).
> If the whole count bytes was written successfully, the callback should
> return 0 to indicate there was no error.
> 
> If there is an error (including a short write which couldn't be
> recovered from), .pwrite should call nbdkit_error with an error message,
> and nbdkit_set_error to record an appropriate error (unless errno is
> sufficient), then return -1.
> """
> 
> How is it safe for a plugin (written in C) to *silently* throw away part
> of the data that the client wants written?


Plugins can do what they want and we have plugins which do discard
data, silently or otherwise:

https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/full/full.c#L128
https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/null/null.c#L125
https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/ones/ones.c#L133

> I agree consistency between sh and other plugins is good, but the
> "baseline" seems unfathomable to me. I probably don't understand it.

sh plugin is often used for testing where the data written doesn't
matter but we're interested in other metadata like what write calls
were issued, eg:

https://gitlab.com/nbdkit/libnbd/-/blob/c713529e9fd0641b2d73f764517b5f9c21a767fd/copy/copy-sparse-no-extents.sh#L49

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Re: [Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

Reply via email to