On 02/26/2016 12:31 AM, Nicolas Boichat wrote:
> On Fri, Feb 26, 2016 at 1:53 PM, Rob Landley <[email protected]> wrote:
> On 02/25/2016 01:31 AM, drinkcat wrote:
> > We use toybox-0.7.0 as part of the Chromium OS project,
P.S. Yay!

> > and sometimes hit an issue when building it on our automated builders
> > (see this issue
> > <https://bugs.chromium.org/p/chromium/issues/detail?id=584542>):
> >
> > |toybox-0.7.0: armv7a-cros-linux-gnueabi-gcc -O2 -O2 -pipe -march=armv7-a
> > -mtune=cortex-a15 -mfpu=neon -mfloat-abi=hard -g -fno-exceptions
> > -fno-unwind-tables -fno-asynchronous-unwind-tables -clang-syntax
> > -funsigned-char -Wno-string-plus-int -I . -Os -ffunction-sections
> > -fdata-sections -fno-asynchronous-unwind-tables -fno-strict-aliasing -c
> > toys/posix/tail.c -o generated/obj/tail.o
> > toybox-0.7.0: scripts/make.sh: line 270: wait: pid 8477 is not a child of this shell
> > toybox-0.7.0:
>
> Hmmm... PID wrap, maybe?
>
> That's what we were wondering about... The builder is building a lot of
> other packages at the same time, including Chromium, so it's not
> unlikely that the PID space is saturated... Also, the builder retries
> after the first failure, and the second try always works (probably when
> the builder is less busy...)

Possibly the OS is killing zombies if it wants to reuse that PID before the
zombie is reaped? (Which would be a horrible heuristic because process exit
could happen after a long runtime but right before a new fork.) Or maybe
it's doing so if there _are_ no more free PIDs, instead of fork failing?

In either case, moving to $! wouldn't fix it. But that also wouldn't explain
why only bash was seeing the problem...

It's an interesting bug and I'd be interested in tracking it down if I was
willing to get sucked into debugging GPLv3 bash.
(GPLv2 bash I spent days tracking down weirdness, ala:

The initial problem:
  http://landley.net/notes-2011.html#24-08-2011

Mentioned in passing:
  http://landley.net/notes-2011.html#26-08-2011
  http://landley.net/notes-2011.html#28-08-2011

Deep dig:
  http://landley.net/notes-2011.html#02-09-2011
  http://landley.net/notes-2011.html#03-09-2011
  http://landley.net/notes-2011.html#04-09-2011

And finally finding it:
  http://landley.net/notes-2011.html#05-09-2011

Yes, that's me happily digging through libc, kernel, and back into a
userspace program to find a problem. But if a GPLv3 program is involved,
"it's broken, let's replace it".)

> > Looking at the code (|scripts/make.sh|), we are wondering about your use
> > of |$(jobs -rp)|. Wouldn't it be more correct to add jobs to PENDING
> > using |$!| right after you launch the job (|do_loudly|)?
>
> If you think that'll help, I'm happy to give it a try, sure.
>
> I have a commit ready here, that appears to fix the problem:
> https://github.com/drinkcat/toybox/commit/4c705620d73e3e9c12a3be54dc5d2efda939241a

I pushed a change last night based on your $! suggestion, did that fix it?

(Your patch is using ${%%} to filter, which is interesting. I couldn't make
${//} work right but maybe that could replace my sed invocation? Trying to
get the number of execs in the dispatch/monitoring cycle down as small as
possible. Then again once it can build under a toybox shell then it's just
a fork() and not an exec, which is cheaper. Eh, worry about it later...)

> It's a little less aggressive at parallelizing, as it always waits for
> the first PID if PENDING is full (instead of refreshing the PENDING list
> every time)...

So's the one I did last night. I should poke around on my 8-way machine and
see how it's doing keeping the cpus busy...

> I guess that you prefer I send the patch to the list? Or is a github PR
> fine too?
What would be _really_ nice is if github gave me a button to get the "git
format-patch" version of the patch at the above URL. But of course they
don't do that, why would they do that?

When github emails me a pull request I can wget and "git am" from there, so
it's usable. (It's then up to the submitter to _close_ said request, but
having a list of old irrelevant pull requests I've already dealt with one
way or another is github's problem, as far as I'm concerned.)

Posting them to the list gives other people the chance to chime in, but I
think we covered that here. :)

Thanks,

Rob
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
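
For reference, the $!-based bookkeeping discussed in this thread can be sketched roughly as below. This is an illustration only, not the code in scripts/make.sh or in the linked commit: the function names run_job, reap_oldest and finish_jobs, and the MAX_JOBS, COUNT and FAILED variables, are made up for the example (only PENDING and $! come from the discussion). Each PID is recorded right after the job is launched, and when the list is full the script waits on the oldest recorded PID instead of asking $(jobs -rp) which jobs are still running.

```shell
#!/bin/bash
# Hypothetical sketch: track background jobs by PID via $! rather than $(jobs -rp).

MAX_JOBS=${MAX_JOBS:-4}   # assumed limit on concurrent jobs (e.g. number of CPUs)
PENDING=""                # space-separated PIDs of launched jobs, oldest first
COUNT=0                   # number of PIDs currently in PENDING
FAILED=0                  # number of jobs that exited non-zero

# Wait for the oldest recorded job, then drop its PID from PENDING.
reap_oldest()
{
  local pid=${PENDING%% *}            # everything before the first space
  wait "$pid" || FAILED=$((FAILED+1)) # wait returns that job's exit status
  PENDING=${PENDING#"$pid"}           # strip the PID...
  PENDING=${PENDING# }                # ...and the separator after it, if any
  COUNT=$((COUNT-1))
}

# Launch "$@" in the background, freeing a slot first if PENDING is full.
run_job()
{
  while [ "$COUNT" -ge "$MAX_JOBS" ]; do reap_oldest; done
  "$@" &
  PENDING="${PENDING:+$PENDING }$!"   # record the PID immediately after launch
  COUNT=$((COUNT+1))
}

# Wait for everything still pending; return non-zero if any job failed.
finish_jobs()
{
  while [ "$COUNT" -gt 0 ]; do reap_oldest; done
  return $((FAILED != 0))
}
```

A caller would invoke run_job once per compile command and finish_jobs at the end. As noted in the thread, waiting on the oldest PID is a little less aggressive at parallelizing than refreshing the list every time, since a slot stays idle if a newer job finishes before the oldest one. The sketch also does nothing about the PID-reuse theory raised earlier; it only avoids depending on the shell's view of which jobs are running.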
