Thoughts on limiting the parallel build load, suggestion for a new -j auto option

2012-02-22 Thread R. Diez
Hi all:

I recently came across a build test script that launched make -j with no
limits, which consumed all available RAM and brought my Linux system to a
halt. I had to power the computer off; there was nothing else I could do.

After this experience, I believe that the default limitless behaviour of -j is 
too dangerous. Even if your PC does not end up dying, GNU Make may 
inadvertently consume too many resources. An explicit -j infinite would be a 
better option, if anybody really needs something like that.

I recently came across the -l option, which limits the number of parallel
tasks based on the system's load average. However, this flag does not seem
safe enough: according to Wikipedia, not all Unix systems include processes
waiting for disk I/O in the load average. Besides, the maximum load average
one would almost certainly want to use depends on the number of CPUs, so the
calling script (or user) has to find out how many CPUs there are. How you
find that out may also depend on the operating system underneath, so
everybody gets a little extra work every time.
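
Just to illustrate that extra work, this is roughly the kind of (untested)
shell fragment one ends up writing to get the CPU count and derive the
limits from it; nproc, getconf, sysctl and bc may or may not exist on any
given system, which is exactly the problem:

  # Try the usual suspects in turn; fall back to 1 if none of them works.
  CPU_COUNT="$(nproc 2>/dev/null ||
               getconf _NPROCESSORS_ONLN 2>/dev/null ||
               sysctl -n hw.ncpu 2>/dev/null ||
               echo 1)"

  # Both limits still have to be derived from the count by hand.
  make -j "$CPU_COUNT" -l "$(echo "$CPU_COUNT + 0.5" | bc)"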

I am writing a build test framework myself, and I have been trying to 
coordinate all sub-makes from a main makefile. The top-level script decides how 
many parallel processes are allowed for the entire build and relies on 
MAKEFLAGS in order to let all sub-makes talk to each other so as to limit the 
overall load. Because different makefiles may require different GNU Make 
options, I am filtering out all others and leaving just the parallel build 
flags in place, like this:

  export MAKEFLAGS="$(filter --jobserver-fds=%,$(MAKEFLAGS)) $(filter -j,$(MAKEFLAGS))" && $(MAKE) ...etc...

By the way, option --jobserver-fds is not documented, but I think it should be. 
As you can see, the user may need to filter it out manually after all.
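
Filtering it out is at least straightforward with filter-out. In a recipe
that must hand a sub-make a MAKEFLAGS without the parallelism settings,
something along these lines should do; this is an untested sketch, and the
"some-serial-target" name is made up:

  # Strip the parallelism flags before running a sub-make serially.
  export MAKEFLAGS="$(filter-out --jobserver-fds=% -j,$(MAKEFLAGS))" && $(MAKE) some-serial-target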

The trouble with this MAKEFLAGS technique is that I often come across some 
third-party script which insists on calculating and setting its own -j value, 
rendering my coordination efforts useless. When this happens, I get warnings 
like this:

  warning: -jN forced in submake: disabling jobserver mode

Needless to say, most heuristics to calculate the -j value are as lame as mine 
(see below). When writing build scripts, nobody seems to have much time left 
for finding out how to retrieve the relevant system information
in bash/perl/whatever in a portable way and then calculate a good -j value out 
of it.

I have been thinking about the best way to overcome such parallel woes, and I 
wanted to share this thought with you all. How about adding to GNU Make a new 
-j parameter like this:

  make -j auto

The behaviour of -j auto could be as follows:

1) If MAKEFLAGS specifies -j and --jobserver-fds, then use those settings (no
warning required).

2) Otherwise, calculate the maximum number of parallel tasks with some trivial 
heuristic based on the number of CPUs and/or the system load. I'm using CPU 
count + 1 at the moment, but I'm sure there are better guesses.

I could think of several alternative heuristics:

  make -j auto-system-load      # Use -l (CPU count + 0.5)
  make -j auto-processor-count  # Use -j (CPU count + 1)
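
Until something like that exists, both variants can be faked with a small
wrapper script around make. A rough, untested sketch follows; the script
name is made up, and getconf and bc are assumed to be available:

  #!/bin/sh
  # parallel-make.sh: emulate the proposed "-j auto" variants.
  #   parallel-make.sh load  [make args]   roughly "-j auto-system-load"
  #   parallel-make.sh count [make args]   roughly "-j auto-processor-count"
  CPUS="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
  case "${1-}" in
    load)   # rely on the load-average limit: -l (CPU count + 0.5)
            shift
            exec make -j -l "$(echo "$CPUS + 0.5" | bc)" "$@" ;;
    count)  # cap the number of jobs: -j (CPU count + 1)
            shift
            exec make -j "$((CPUS + 1))" "$@" ;;
    *)      echo "usage: $0 load|count [make arguments]" >&2
            exit 1 ;;
  esac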

I guess most people would then end up using some -j auto variant, in order to 
avoid overloading or underloading the system without having to implement their 
own heuristics. That way, a top-level script will be much more likely to 
succeed at setting a global limit when launching third-party sub-makes in 
parallel.

Please copy me on the answers, as I'm not on this list.

Thanks,
  R. Diez



Re: Thoughts on limiting the parallel build load, suggestion for a new -j auto option

2012-02-22 Thread Howard Chu

R. Diez wrote:

> Hi all:
>
> I recently came across a build test script that launched make -j with no
> limits, which consumed all available RAM and brought my Linux system to a
> halt. I had to power the computer off; there was nothing else I could do.

Go and shoot the moron who wrote that script.


> After this experience, I believe that the default limitless behaviour of -j
> is too dangerous. Even if your PC does not end up dying, GNU Make may
> inadvertently consume too many resources. An explicit -j infinite would be
> a better option, if anybody really needs something like that.


> I recently came across the -l option, which limits the number of parallel
> tasks based on the system's load average. However, this flag does not seem
> safe enough: according to Wikipedia, not all Unix systems include processes
> waiting for disk I/O in the load average. Besides, the maximum load average
> one would almost certainly want to use depends on the number of CPUs, so
> the calling script (or user) has to find out how many CPUs there are. How
> you find that out may also depend on the operating system underneath, so
> everybody gets a little extra work every time.

-l is utterly useless. Load average is computed too slowly; by the time it 
passes any useful threshold the actual make load will have spiralled out of 
control.



> I am writing a build test framework myself, and I have been trying to
> coordinate all sub-makes from a main makefile. The top-level script decides
> how many parallel processes are allowed for the entire build and relies on
> MAKEFLAGS in order to let all sub-makes talk to each other so as to limit
> the overall load. Because different makefiles may require different GNU
> Make options, I am filtering out all others and leaving just the parallel
> build flags in place, like this:
>
>    export MAKEFLAGS="$(filter --jobserver-fds=%,$(MAKEFLAGS)) $(filter -j,$(MAKEFLAGS))" && $(MAKE) ...etc...

> By the way, option --jobserver-fds is not documented, but I think it should
> be. As you can see, the user may need to filter it out manually after all.


> The trouble with this MAKEFLAGS technique is that I often come across some
> third-party script which insists on calculating and setting its own -j
> value, rendering my coordination efforts useless. When this happens, I get
> warnings like this:

Go and shoot the morons who wrote those scripts. The make -j value should 
never be encoded in any file; it should only ever be set by the user invoking 
the actual top level make command.



>    warning: -jN forced in submake: disabling jobserver mode
>
> Needless to say, most heuristics to calculate the -j value are as lame as
> mine (see below). When writing build scripts, nobody seems to have much
> time left for finding out how to retrieve the relevant system information
> in bash/perl/whatever in a portable way and then calculate a good -j value
> out of it.

Nobody writing scripts should ever bother with such a thing. It's for the 
end-user to control; any other effort will be wrong 99.9% of the time.



> I have been thinking about the best way to overcome such parallel woes,
> and I wanted to share this thought with you all. How about adding to GNU
> Make a new -j parameter like this:
>
>    make -j auto
>
> The behaviour of -j auto could be as follows:
>
> 1) If MAKEFLAGS specifies -j and --jobserver-fds, then use those settings
> (no warning required).
>
> 2) Otherwise, calculate the maximum number of parallel tasks with some
> trivial heuristic based on the number of CPUs and/or the system load. I'm
> using CPU count + 1 at the moment, but I'm sure there are better guesses.
>
> I could think of several alternative heuristics:
>
>    make -j auto-system-load      # Use -l (CPU count + 0.5)
>    make -j auto-processor-count  # Use -j (CPU count + 1)
>
> I guess most people would then end up using some -j auto variant, in order
> to avoid overloading or underloading the system without having to
> implement their own heuristics. That way, a top-level script will be much
> more likely to succeed at setting a global limit when launching
> third-party sub-makes in parallel.

A good value depends entirely on the actual workload. For most compiles I find 
that CPU count * 1.5 works well. Some people use make for things other than 
invoking compilers though, and some compilers have different resource 
requirements than a typical gcc run, so even that isn't a widely applicable 
estimate. Only the end user can determine the best value.
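
(For what it's worth, a user who does want CPU count * 1.5 can usually get
it with one line of shell; this assumes getconf reports the online processor
count and simply rounds the result down:

  make -j "$(( $(getconf _NPROCESSORS_ONLN) * 3 / 2 ))"
)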



> Please copy me on the answers, as I'm not on this list.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/



Re: Thoughts on limiting the parallel build load, suggestion for a new -j auto option

2012-02-22 Thread R. Diez



> -l is utterly useless. Load average is computed too slowly; by the time
> it passes any useful threshold the actual make load will have spiralled
> out of control.


If that's the case, exactly that reasoning should be written in the 
documentation.




> Go and shoot the morons who wrote those scripts. The make -j value
> should never be encoded in any file; it should only ever be set by the
> user invoking the actual top level make command.


Again, such a recommendation should be in the manual.

However, for practical reasons, I still think GNU Make should offer some
simple heuristic for people without the necessary time or inclination to
calculate their own optimum value (that would be most humans).

That simple heuristic need not be totally fixed. The user could provide
their own factor for a simple "CPU count * factor" formula; something like
-jcpufactor 1.5 could do the trick. I just want to spare the user the effort
of retrieving the CPU count, which is always a small, system-dependent pain.
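
In the meantime, a crude approximation of such a cpufactor knob can already
be spelled out in an ordinary makefile. This is just an untested sketch:
CPUFACTOR, NCPUS and JOBS are made-up variable names, and getconf and awk
are assumed to be available:

  # User-tunable factor; the CPU count and job limit are derived from it.
  CPUFACTOR ?= 1.5
  NCPUS := $(shell getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  JOBS  := $(shell awk "BEGIN { printf \"%d\", $(NCPUS) * $(CPUFACTOR) }")
  # Sub-makes would then be invoked with: $(MAKE) -j $(JOBS) ...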


Please copy me on the answers, as I'm not on this list.

Regards,
  R. Diez
