Re: GNU Make on Linux Feeding All Commands Through ksh

Steve Waltner Thu, 04 Dec 2008 10:09:12 -0800

On Dec 4, 2008, at 11:28 AM, David Boyce wrote:

On Thu, Dec 4, 2008 at 11:05 AM, Steve Waltner<[EMAIL PROTECTED]> wrote:
After two months, I'm finally looking into this issue again. Gotta get it
working by the end of the year since migrating builds to Linux (more
specifically the faster x86 hardware) is one of my business objectives
Somewhat off topic: Solaris is now FOSS and runs on the same X86
hardware as Linux. Thus there may be good reasons to convert to Linux
but access to faster X86 hardware is not a sufficient one.

I presume you know this and have additional reasons for the switch but
wanted to point it out for the record/archives.

You are correct that going to Solaris x86 would be the better solution
to get the performance gains of the x86 hardware and not deal with the
compatibility issues between Linux and Solaris that I'm seeing.
Unfortunately the toolset that we are using to build (VxWorks from
WindRiver) is only available on Solaris SPARC, Linux x86, and Windows.
Obviously, going to Windows would be a monumental undertaking with all
the unix based scripts that are used during the build, so that wasn't
considered. Going to Linux seemed like the easiest way to get the
speed boost, but is proving a little bit of a problem. I had asked
WindRiver about a Solaris x86 release of their software in the past.
Maybe it's time to ping them again about this. It would have been
better to ping them 6 weeks ago before we sent them a PO for licenses
for the next four years though. "Port your software, and get the
cash..." :-)

I do remember
the developer that did most of the work on the makefiles making the comment
about /bin/sh on Solaris being junk and switching to /bin/ksh.


That reasoning made sense on Solaris but may have a problem now, given
that you're moving to Linux, because /bin/ksh on Linux is *also* junk.
[snip] Fortunately Solaris has been bundling
bash for quite a long time, so perhaps the most robust and portable
arrangement for you would be to settle on SHELL=/bin/bash.

I'll investigate using bash (as well as CentOS and Ubuntu as mentioned
by Galen) to see if it behaves any differently.

The main question that remains would be: Is there a way to debug and
follow the token check-in/check-out process that is used internally in
GNU make to try and see what's going on here? I can work on trying to
track down what's going wrong, but without a way to get visibility into
the process, I'd just be making random changes to the makefiles, which
isn't going to be very productive.


Sorry, can't help directly with your main problem since I haven't
worked much with make -j. Since you're building your own make anyway
it shouldn't be too hard to insert some debugging printfs. Or if you
want to be really aggressive you could build a Solaris 10 machine and
install Linux in a "zone" (semi-virtualization concept), then use
dtrace to track what's happening with the job server. Possibly even
strace would help on native Linux.

I don't remember if this was mentioned upthread but presumably you've
read http://make.paulandlesley.org/jobserver.html for background? If
not, probably a good idea.

Hmm... as I think about it, the whole jobserver technique depends on
downstream processes to leave those file descriptors open. If anybody
messes with the FD_CLOEXEC flag or closes them explicitly, you might
see the behavior described. I've seen programs that do something like

 for (i = 3; i < maxfds; i++) close(i);

before an exec, just for the heck of it. I've already mentioned that
pdksh is crap; I wonder if it's doing something like that? Wait, no,
you said you took /bin/ksh out and it still broke ... anyway, I'd try
strace or similar to see if the jobserver pipe's file descriptors are
being closed. Note that this is all based on a memory of the jobserver
document; I have not read it closely, lately.

I had read through the jobserver web page two years ago when we
switched our builds from using "-j --max-load=4" to "-j 4" at the same
time we moved the builds from running on the servers that everyone
uses for their interactive jobs to a cluster of dedicated build
servers. We did have several issues in the makefiles originally that
needed to be fixed in regards to how make called itself recursively to
run the build.

I'll do some testing with strace and possibly re-compiling GNU make
with some printfs in there to see if that provides any insight. Your
comment about something (possibly ksh) closing file handles may be
exactly what's going on here. I let a "gmake -j 100" job run to
completion on the Linux server. It too eventually degraded to a
single-threaded build, but it took a lot longer than the "-j 32" builds
I would normally run. This build exited with the following warning:

gmake: INTERNAL: Exiting with 1 jobserver tokens available; should be 100!

So, something is definitely interfering with the jobserver when the
build is run on Linux and consuming tokens that should only be used by
GNU make.

Thank you everyone for the detailed responses. I will do some digging
and let you know what I find out.


Steve


_______________________________________________
Help-make mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/help-make
