URL: <https://savannah.gnu.org/bugs/?64451>
Summary: Unexpected behaviour of xargs when multiple children exit with 255 in parallel
Group: findutils
Submitter: None
Submitted: Thu 20 Jul 2023 07:46:06 AM UTC
Category: xargs
Severity: 3 - Normal
Item Group: Wrong result
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Release: 4.9.0
Discussion Lock: Any
Fixed Release: None

_______________________________________________________

Follow-up Comments:

-------------------------------------------------------
Date: Thu 20 Jul 2023 07:46:06 AM UTC By: Anonymous

= Synopsis =

When xargs is invoked with the *-P* option set to a value other than 1 or 2 and more than one child exits with 255, xargs fails to wait on all children before exiting. *-P0* breaks in the same manner on just one exit 255.

*-P2* is a degenerate case: the second child exiting with 255 would trigger this bug, but as it is the last running child, no misbehaviour results.

= Examples =

    # One exit 255
    seq 10 | xargs -n1 -P4 sh -c 'for i do echo start $i; sleep $i; if [ $i -eq 3 ]; then echo exit $i; exit 255; fi; echo stop $i; done' _; echo xargs exited here with $?
    start 1
    start 2
    start 3
    start 4
    stop 1
    start 5
    stop 2
    start 6
    exit 3
    xargs: sh: exited with status 255; aborting
    stop 4
    stop 5
    stop 6
    xargs exited here with 124

After job 3 exited with 255, xargs correctly waits for already started jobs to finish.

    # Two exit 255s
    seq 10 | xargs -n1 -P4 sh -c 'for i do echo start $i; sleep $i; if [ $i -ge 3 ] && [ $i -le 4 ]; then echo exit $i; exit 255; fi; echo stop $i; done' _; echo xargs exited here with $?
    start 1
    start 2
    start 3
    start 4
    stop 1
    start 5
    stop 2
    start 6
    exit 3
    xargs: sh: exited with status 255; aborting
    exit 4
    xargs: sh: exited with status 255; aborting
    xargs exited here with 124
    stop 5
    stop 6

xargs exits after job 4. Jobs 5 and 6 continue to run in the background. A script that runs further processing after xargs can therefore no longer rely on jobs 5 and 6 having completed.

= POSIX compliance =

This breaks a POSIX requirement. From xargs(1p):

    Implementations wanting to provide parallel operation of the invoked
    utilities are encouraged to add an option enabling parallel invocation,
    but should still wait for termination of all of the children before
    _xargs_ terminates normally.

= Quick debugging =

After adding unconditional error messages to the _wait_for_proc_all_() function

    diff --git a/xargs/xargs.c b/xargs/xargs.c
    index fdede10..9bc7aa7 100644
    --- a/xargs/xargs.c
    +++ b/xargs/xargs.c
    @@ -1597,6 +1597,7 @@ wait_for_proc (bool all, unsigned int minreap)
     static void
     wait_for_proc_all (void)
     {
    +  error (0, 0, _("wait_for_proc_all enter"));
       static bool waiting = false;

       /* This function was registered by the original, parent, process.
    @@ -1614,6 +1615,7 @@ wait_for_proc_all (void)
       wait_for_proc (true, 0u);
       waiting = false;

    +  error (0, 0, _("wait_for_proc_all before child_error test"));
       if (original_exit_value != child_error)
         {
           /* wait_for_proc () changed the value of child_error ().  This

and testing xargs again, the following is observed:

    # One exit 255
    seq 10 | ./xargs -n1 -P4 sh -c 'for i do echo start $i; sleep $i; if [ $i -ge 3 ] && [ $i -le 3 ]; then echo exit $i; exit 255; fi; echo stop $i; done' _; echo xargs exited here with $?
    start 1
    start 2
    start 3
    start 4
    stop 1
    start 5
    stop 2
    start 6
    exit 3
    ./xargs: sh: exited with status 255; aborting
    ./xargs: wait_for_proc_all enter
    stop 4
    stop 5
    stop 6
    ./xargs: wait_for_proc_all before child_error test
    xargs exited here with 124

    # Two exit 255s
    seq 10 | ./xargs -n1 -P4 sh -c 'for i do echo start $i; sleep $i; if [ $i -ge 3 ] && [ $i -le 4 ]; then echo exit $i; exit 255; fi; echo stop $i; done' _; echo xargs exited here with $?
    start 1
    start 2
    start 3
    start 4
    stop 1
    start 5
    stop 2
    start 6
    exit 3
    ./xargs: sh: exited with status 255; aborting
    ./xargs: wait_for_proc_all enter
    exit 4
    ./xargs: sh: exited with status 255; aborting
    xargs exited here with 124
    stop 5
    stop 6

It can be seen that in the second case _wait_for_proc_all_() did not reach the _before child_error test_ line.

= My hypothesis =

As _wait_for_proc_all_() is registered via _atexit_(), it is called when the first exit 255 is handled with _error_() (which itself calls _exit_()). _wait_for_proc_all_() then calls _wait_for_proc_(), inside of which another _exit_() is called when the second child exits with 255. According to atexit(3):

    POSIX.1 says that the result of calling *exit*(3) more than once
    (i.e., calling *exit*(3) within a function registered using
    *atexit*()) is undefined. On some systems (but not Linux), this can
    result in an infinite recursion; portable programs should not invoke
    *exit*(3) inside a function registered using *atexit*().

= In the case of -P0 =

If the same command line is run with *-P0*, something else happens:

    # One exit 255 is now enough to exit xargs
    seq 10 | ./xargs -n1 -P0 sh -c 'for i do echo start $i; sleep $i; if [ $i -ge 3 ] && [ $i -le 4 ]; then echo exit $i; exit 255; fi; echo stop $i; done' _; echo xargs exited here with $?
    start 1
    start 2
    start 3
    start 4
    start 5
    start 7
    start 6
    start 8
    start 9
    ./xargs: wait_for_proc_all enter
    start 10
    stop 1
    stop 2
    exit 3
    ./xargs: sh: exited with status 255; aborting
    xargs exited here with 124
    exit 4
    stop 5
    stop 6
    stop 7
    stop 8
    stop 9
    stop 10

As xargs has read in all the input, it has already entered _wait_for_proc_all_() by the time the first child fails, so a single _exit 255_ is enough to make xargs exit.