Hi Denys,

On Thu, 24 Aug 2017, Denys Vlasenko wrote:

> function                                             old     new   delta
> xargs_main                                           787    1174    +387
> packed_usage                                       31757   31769     +12
> ------------------------------------------------------------------------------
> (add/remove: 0/0 grow/shrink: 2/0 up/down: 399/0)             Total: 399 bytes
> 
> 
> How about this version:
> 
> function                                             old     new   delta
> xargs_exec                                             -     294    +294
> packed_usage                                       31757   31772     +15
> xargs_main                                           787     719     -68
> ------------------------------------------------------------------------------
> (add/remove: 1/0 grow/shrink: 1/1 up/down: 309/-68)           Total: 241 bytes
> 

I see, you did away with the `procs` array. I am a little worried that the
*wrong* children will be mistaken for the xargs-children, and that as a
consequence too many children will be running at the same time.

Also, I implemented it using the array because I need this feature to work
on Windows, after all, where we do not have an equivalent for the
waitpid(-1, ...) call to wait for any child process to exit.

A couple of comments on your patch:

> diff --git a/findutils/xargs.c b/findutils/xargs.c
> index b4e3db0..77e01ef 100644
> --- a/findutils/xargs.c
> +++ b/findutils/xargs.c
> [...]
> +/*
> + * Returns 0 if xargs should continue (but may set G.xargs_exitcode to 123).
> + * Else sets G.xargs_exitcode to error code and returns nonzero.
> + *
> + * If G.max_procs == 0, performs final waitpid() loop for all children.
> + */
>  static int xargs_exec(void)

Nice, this makes it a lot less magic.

>  {
>      int status;
> 
> +#if !ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
>      status = spawn_and_wait(G.args);
> +#else
> +    if (G.max_procs == 1) {
> +        status = spawn_and_wait(G.args);
> +    } else {
> +        pid_t pid;
> +        int wstat;
> + again:
> +        if (G.running_procs >= G.max_procs)
> +            pid = safe_waitpid(-1, &wstat, 0);
> +        else
> +            pid = wait_any_nohang(&wstat);

What if running_procs == max_procs == 0? In that case, we should avoid
calling either safe_waitpid() or wait_any_nohang(). Maybe instead
something like

+        if (G.running_procs < G.max_procs)
+            pid = wait_any_nohang(&wstat);
+        else if (G.running_procs > 0)
+            pid = safe_waitpid(-1, &wstat, 0);

But then, the wait_any_nohang() call is still unnecessary if running_procs
== 0. But you want to save space, not necessarily make the thing more
performant, so...

> +        if (pid > 0) {
> +            /* We may have children we don't know about:
> +             * sh -c 'sleep 1 & exec xargs ...'
> +             * Do not make G.running_procs go negative.
> +             */

This hints very eloquently at the problem I am concerned about. But I
think it is worse than just running_procs going negative. You could easily
mistake child processes for having exited when they have not, including
possibly failing children (and hence the exit status can be all wrong).

> +            if (G.running_procs != 0)
> +                G.running_procs--;
> +            status = WIFSIGNALED(wstat)
> +                ? 0x180 + WTERMSIG(wstat)
> +                : WEXITSTATUS(wstat);
> +            if (status > 0 && status < 255) {
> +                /* See below why 123 does not abort */
> +                G.xargs_exitcode = 123;
> +                status = 0;
> +            }
> +            if (status == 0)
> +                goto again; /* maybe we have more children? */

I think this logic still needs some love:

- if running_procs is 0, there is no need to go at it again

- if running_procs is greater than 0, we still have to wait for the
  remaining children to exit, even if one child exited due to a signal.

> +            /* else: "bad" status, will bail out */
> +        } else if (G.max_procs != 0) {
> +            /* Not in final waitpid() loop,
> +             * and G.running_procs < G.max_procs: start more procs
> +             */
> +            status = spawn(G.args);
> +            /* here "status" actually holds pid, or -1 */
> +            if (status > 0) {
> +                G.running_procs++;
> +                status = 0;
> +            }
> +            /* else: status == -1 (failed to fork or exec) */
> +        } else {
> +            /* final waitpid() loop: must be ECHILD "no more children" */
> +            status = 0;
> +        }
> +    }
> +#endif
> +    /* Manpage:
> +     * """xargs exits with the following status:
> +     * 0 if it succeeds
> +     * 123 if any invocation of the command exited with status 1-125
> +     * 124 if the command exited with status 255
> +     *     ("""If any invocation of the command exits with a status of 255,
> +     *     xargs will stop immediately without reading any further input.
> +     *     An error message is issued on stderr when this happens.""")
> +     * 125 if the command is killed by a signal
> +     * 126 if the command cannot be run
> +     * 127 if the command is not found
> +     * 1 if some other error occurred."""
> +     */
>      if (status < 0) {
>          bb_simple_perror_msg(G.args[0]);
> -        return errno == ENOENT ? 127 : 126;
> -    }
> -    if (status == 255) {
> -        bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
> -        return 124;
> +        status = (errno == ENOENT) ? 127 : 126;
>      }
> -    if (status >= 0x180) {
> +    else if (status >= 0x180) {
>          bb_error_msg("'%s' terminated by signal %d",
>              G.args[0], status - 0x180);
> -        return 125;
> +        status = 125;
>      }
> -    if (status)
> -        return 123;
> -    return 0;
> +    else if (status != 0) {
> +        if (status == 255) {
> +            bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
> +            return 124;
> +        }
> +        /* "123 if any invocation of the command exited with status 1-125"
> +         * This implies that nonzero exit code is remembered,
> +         * but does not cause xargs to stop: we return 0.
> +         */
> +        G.xargs_exitcode = 123;
> +        status = 0;
> +    }
> +
> +    if (status != 0)
> +        G.xargs_exitcode = status;

Is it always safe to override the exitcode, even if it has already been
set to something non-zero?

My patch override it only if the new status was larger than the previous
one, but now that I think about it, I would actually keep the first
non-zero exitcode, always. I.e.:

+    if (status != 0 && !G.xargs_exitcode)
+        G.xargs_exitcode = status;

> +    return status;
>  }
> 
>  /* In POSIX/C locale isspace is only these chars: "\t\n\v\f\r" and space.
> @@ -500,8 +591,14 @@ int xargs_main(int argc UNUSED_PARAM, char **argv)
>          "no-run-if-empty\0" No_argument "r",
>          &max_args, &max_chars, &G.eof_str, &G.eof_str
>          IF_FEATURE_XARGS_SUPPORT_REPL_STR(, &G.repl_str, &G.repl_str)
> +        IF_FEATURE_XARGS_SUPPORT_PARALLEL(, &G.max_procs)
>      );
> 
> +#if ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
> +    if (G.max_procs <= 0) /* -P0 means "run lots of them" */
> +        G.max_procs = 100; /* let's not go crazy high */

That is quite a bit arbitrary. I understand that you want to abuse
max_procs == 0 to imply the final wrap-up, but 100 is such a non-digital
number. If you must impose an artificial limit, why not 64, or 256, or
1024?

FWIW in my Windows-specific add-on patch (intended for the BusyBox-w32
fork) sets this to 64, for technical reasons.

Other than that, I think your patch is a fine improvement over my initial
stab at it.

Thanks,
Johannes
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to