On Thu, Aug 31, 2023 at 09:50:22AM +0100, Richard W.M. Jones wrote:
> See comment for explanation and
> https://listman.redhat.com/archives/libguestfs/2023-August/032468.html
> ---
>  tests/Makefile.am                    |  2 +
>  plugins/sh/call.c                    | 33 +++++++-----
>  tests/test-sh-pwrite-ignore-stdin.sh | 77 ++++++++++++++++++++++++++++
>  3 files changed, 100 insertions(+), 12 deletions(-)

Having read the rest of the conversation, I now agree with you: no
matter when EPIPE happens, it is more important that we honor the exit
status rather than blindly turning the child process's refusal to
empty the pipe as a fatal error.

> +++ b/plugins/sh/call.c
> @@ -275,22 +275,31 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to 
> stdin (can be NULL) */
>        r = write (pfds[0].fd, wbuf, wbuflen);
>        if (r == -1) {
>          if (errno == EPIPE) {
> -          /* We tried to write to the script but it didn't consume
> -           * the data.  Probably the script exited without reading
> -           * from stdin.  This is an error in the script.
> +          /* In nbdkit <= 1.35.11 we gave an error here, arguing that
> +           * scripts must always consume or discard their full input
> +           * when 'pwrite' is called.  Previously we had many cases
> +           * where scripts forgot to discard the data on a path out of
> +           * pwrite (such as an error or where the script is not
> +           * interested in the data being written), resulting in
> +           * intermittent test failures.
> +           *
> +           * It is valid for a script to ignore the written data
> +           * (plenty of non-sh plugins do this), or for a script to be
> +           * gradually processing the data, encounter an error and
> +           * wish to exit immediately.  Therefore ignore this error.

I agree with Laszlo's observation that we may want to enhance this
comment to mention that we still obey the exit status (a plugin that
quit reading stdin because it was killed by a signal will show up as
an error and not a successful .pwrite).

>             */
> -          nbdkit_error ("%s: write to script failed because of a broken 
> pipe: "
> -                        "this can happen if the script exits without "
> -                        "consuming stdin, which usually indicates a bug "
> -                        "in the script",
> -                        argv0);
> +          nbdkit_debug ("%s: write: %m (ignored)", argv0);
> +          wbuflen = 0;          /* discard the rest */
>          }
> -        else
> +        else {
>            nbdkit_error ("%s: write: %m", argv0);
> -        goto error;
> +          goto error;
> +        }

Yes, moving the 'goto error' into the else branch is something I ended
up realizing I missed in my version of the patch, when I tried writing
a unit test.

> +      }
> +      else {
> +        wbuf += r;
> +        wbuflen -= r;
>        }
> -      wbuf += r;
> -      wbuflen -= r;
>        /* After writing all the data we close the pipe so that
>         * the reader on the other end doesn't wait for more.
>         */

This code change does what we need, and is less complex than my attempt.

> diff --git a/tests/test-sh-pwrite-ignore-stdin.sh 
> b/tests/test-sh-pwrite-ignore-stdin.sh
> new file mode 100755
> index 000000000..3448eca17
> --- /dev/null
> +++ b/tests/test-sh-pwrite-ignore-stdin.sh

> +
> +start_nbdkit -P $pid -U $sock sh - <<'EOF'
> +case "$1" in
> +    can_write) echo 0 ;;
> +    pwrite)
> +        # Always ignore the input.  If the offset >= 32M return an error.
> +        if [ $4 -ge 33554432 ]; then
> +            echo 'ENOSPC Out of space' >&2
> +            exit 1
> +        fi
> +        ;;

In my testing, I could not reliably reproduce an EPIPE error, either
with your patch or mine, and whether or not I used your unit test or
my (unpublished) attempt at one (I had tried to 'exec 0</dev/null' to
forcefully close the pipe before the child process exits, but even
that still doesn't necessarily result in the parent process seeing
EPIPE - it really boils down to kernel timing of the window in the
parent between poll() saying readable vs. write() seeing a closed
pipe(); and the child does not have easy control over widening that
window).  With just the unit test but not the code change, I tried:

$ for i in `seq 100`; do make -C tests check 
TESTS=test-sh-pwrite-ignore-stdin.sh || break; done

while running a heavy compile load on another project in parallel, and
the loop ended at i == 100 (meaning I did not hit EPIPE at all during
my 100 tests).  I do think that on a system under heavy enough load
(our CI system tends to be such a setup), you will eventually see the
EPIPE failures with enough iterations.  So the unit test is
worthwhile, even if it is not deterministic at demonstrating the
problem being solved.

Meanwhile, I like that your unit test covers the fact that we DO want
to allow the child process to conditionally return errors while not
fully reading stdin, which is somthing my atempt at a patch did not
do.

Overnight, I also thought about experimenting whether adding
O_NONBLOCK to the nbdkit's side of the pipe to the plugin would make
the EPIPE error more likely.  But as sh is not intended as performance
critical, and we aren't using epoll() with its edge-triggered
notifications, that was getting too invasive.

[https://eklitzke.org/blocking-io-nonblocking-io-and-epoll was a nice
read]

> +    get_size)
> +        echo 64M
> +        ;;
> +    pread)
> +        ;;
> +    *) exit 2 ;;
> +esac
> +EOF
> +
> +nbdsh -u "nbd+unix://?socket=$sock" -c '
> +buf = bytearray(16*1024*1024)
> +h.pwrite(buf, 0)
> +
> +try:
> +    h.pwrite(buf, 32*1024*1024)
> +    assert False
> +except nbd.Error:
> +    # Expect an error here.
> +    pass

I would prefer if we go one step further, and

import errno
...
except nbd.Error as ex:
    assert ex.errnum == errno.ENOSPC

to prove that we got the client's intended ENOSPC, rather than an
accidental nbdkit treating EPIPE writing to the plugin as an instant
EIO to the plugin (the pre-patch behavior).

With that change,

Reviewed-by: Eric Blake <ebl...@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Reply via email to