Hi Denys,
On Thu, 24 Aug 2017, Denys Vlasenko wrote:
> function old new delta
> xargs_main 787 1174 +387
> packed_usage 31757 31769 +12
> ------------------------------------------------------------------------------
> (add/remove: 0/0 grow/shrink: 2/0 up/down: 399/0) Total: 399 bytes
>
>
> How about this version:
>
> function old new delta
> xargs_exec - 294 +294
> packed_usage 31757 31772 +15
> xargs_main 787 719 -68
> ------------------------------------------------------------------------------
> (add/remove: 1/0 grow/shrink: 1/1 up/down: 309/-68) Total: 241 bytes
>
I see, you did away with the `procs` array. I am a little worried that the
*wrong* children will be mistaken for the xargs-children, and that as a
consequence too many children will be running at the same time.
Also, I implemented it using the array because I need this feature to work
on Windows, after all, where we do not have an equivalent for the
waitpid(-1, ...) call to wait for any child process to exit.
A couple of comments on your patch:
> diff --git a/findutils/xargs.c b/findutils/xargs.c
> index b4e3db0..77e01ef 100644
> --- a/findutils/xargs.c
> +++ b/findutils/xargs.c
> [...]
> +/*
> + * Returns 0 if xargs should continue (but may set G.xargs_exitcode to 123).
> + * Else sets G.xargs_exitcode to error code and returns nonzero.
> + *
> + * If G.max_procs == 0, performs final waitpid() loop for all children.
> + */
> static int xargs_exec(void)
Nice, this makes it a lot less magic.
> {
> int status;
>
> +#if !ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
> status = spawn_and_wait(G.args);
> +#else
> + if (G.max_procs == 1) {
> + status = spawn_and_wait(G.args);
> + } else {
> + pid_t pid;
> + int wstat;
> + again:
> + if (G.running_procs >= G.max_procs)
> + pid = safe_waitpid(-1, &wstat, 0);
> + else
> + pid = wait_any_nohang(&wstat);
What if running_procs == max_procs == 0? In that case, we should avoid
calling either safe_waitpid() or wait_any_nohang(). Maybe instead
something like
+ if (G.running_procs < G.max_procs)
+ pid = wait_any_nohang(&wstat);
+ else if (G.running_procs > 0)
+ pid = safe_waitpid(-1, &wstat, 0);
But then, the wait_any_nohang() call is still unnecessary if running_procs
== 0. But you want to save space, not necessarily make the thing more
performant, so...
> + if (pid > 0) {
> + /* We may have children we don't know about:
> + * sh -c 'sleep 1 & exec xargs ...'
> + * Do not make G.running_procs go negative.
> + */
This hints very eloquently at the problem I am concerned about. But I
think it is worse than just running_procs going negative. You could easily
mistake child processes for having exited when they have not, including
possibly failing children (and hence the exit status can be all wrong).
> + if (G.running_procs != 0)
> + G.running_procs--;
> + status = WIFSIGNALED(wstat)
> + ? 0x180 + WTERMSIG(wstat)
> + : WEXITSTATUS(wstat);
> + if (status > 0 && status < 255) {
> + /* See below why 123 does not abort */
> + G.xargs_exitcode = 123;
> + status = 0;
> + }
> + if (status == 0)
> + goto again; /* maybe we have more children? */
I think this logic still needs some love:
- if running_procs is 0, there is no need to go at it again
- if running_procs is greater than 0, we still have to wait for the
remaining children to exit, even if one child exited due to a signal.
> + /* else: "bad" status, will bail out */
> + } else if (G.max_procs != 0) {
> + /* Not in final waitpid() loop,
> + * and G.running_procs < G.max_procs: start more procs
> + */
> + status = spawn(G.args);
> + /* here "status" actually holds pid, or -1 */
> + if (status > 0) {
> + G.running_procs++;
> + status = 0;
> + }
> + /* else: status == -1 (failed to fork or exec) */
> + } else {
> + /* final waitpid() loop: must be ECHILD "no more children" */
> + status = 0;
> + }
> + }
> +#endif
> + /* Manpage:
> + * """xargs exits with the following status:
> + * 0 if it succeeds
> + * 123 if any invocation of the command exited with status 1-125
> + * 124 if the command exited with status 255
> + * ("""If any invocation of the command exits with a status of 255,
> + * xargs will stop immediately without reading any further input.
> + * An error message is issued on stderr when this happens.""")
> + * 125 if the command is killed by a signal
> + * 126 if the command cannot be run
> + * 127 if the command is not found
> + * 1 if some other error occurred."""
> + */
> if (status < 0) {
> bb_simple_perror_msg(G.args[0]);
> - return errno == ENOENT ? 127 : 126;
> - }
> - if (status == 255) {
> - bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
> - return 124;
> + status = (errno == ENOENT) ? 127 : 126;
> }
> - if (status >= 0x180) {
> + else if (status >= 0x180) {
> bb_error_msg("'%s' terminated by signal %d",
> G.args[0], status - 0x180);
> - return 125;
> + status = 125;
> }
> - if (status)
> - return 123;
> - return 0;
> + else if (status != 0) {
> + if (status == 255) {
> + bb_error_msg("%s: exited with status 255; aborting", G.args[0]);
> + return 124;
> + }
> + /* "123 if any invocation of the command exited with status 1-125"
> + * This implies that nonzero exit code is remembered,
> + * but does not cause xargs to stop: we return 0.
> + */
> + G.xargs_exitcode = 123;
> + status = 0;
> + }
> +
> + if (status != 0)
> + G.xargs_exitcode = status;
Is it always safe to override the exitcode, even if it has already been
set to something non-zero?
My patch override it only if the new status was larger than the previous
one, but now that I think about it, I would actually keep the first
non-zero exitcode, always. I.e.:
+ if (status != 0 && !G.xargs_exitcode)
+ G.xargs_exitcode = status;
> + return status;
> }
>
> /* In POSIX/C locale isspace is only these chars: "\t\n\v\f\r" and space.
> @@ -500,8 +591,14 @@ int xargs_main(int argc UNUSED_PARAM, char **argv)
> "no-run-if-empty\0" No_argument "r",
> &max_args, &max_chars, &G.eof_str, &G.eof_str
> IF_FEATURE_XARGS_SUPPORT_REPL_STR(, &G.repl_str, &G.repl_str)
> + IF_FEATURE_XARGS_SUPPORT_PARALLEL(, &G.max_procs)
> );
>
> +#if ENABLE_FEATURE_XARGS_SUPPORT_PARALLEL
> + if (G.max_procs <= 0) /* -P0 means "run lots of them" */
> + G.max_procs = 100; /* let's not go crazy high */
That is quite a bit arbitrary. I understand that you want to abuse
max_procs == 0 to imply the final wrap-up, but 100 is such a non-digital
number. If you must impose an artificial limit, why not 64, or 256, or
1024?
FWIW in my Windows-specific add-on patch (intended for the BusyBox-w32
fork) sets this to 64, for technical reasons.
Other than that, I think your patch is a fine improvement over my initial
stab at it.
Thanks,
Johannes
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox