Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
Hi,

I would suggest mentioning the current behavior in the manual, instead of making $(value ...) emit a warning. The main motivation is backward compatibility. In our project we make heavy use of $(value N) to read arguments of 'call' that are assumed to be optional, without raising a warning (I've just found about 100 lines where we use it to suppress the warning), and we also always run make with the '--warn-undefined-variables' flag. For example:

    # Args:
    #   1. A list to convert into a PATH variable.
    #   2. (Optional) path separator to use. When empty, defaults to a colon.
    PATHIFY = \
        $(subst $(space),$(or $(value 2),$(colon)),$(strip $1))

This allows one to use $(call PATHIFY,foo bar) as well as $(call PATHIFY,bar baz,;). Without using the 'value' function I would have to write something like:

    $(subst $(space),$(if $(and $(filter-out undefined,$(flavor 2)),$2),$2,$(colon)),$(strip $1))

The latter looks ugly, and it is also much slower if the function is called often. And all these drawbacks would be incurred only to suppress a warning.

In other words, I think of the 'value' function as a kind of introspection facility, much like the 'flavor' and 'origin' functions, which do not generate a warning for undefined variables either. So I wish to leave the current behavior as is and to describe it in the manual.

2012/3/14 R. Diez:

> Hi all:
>
> Writing makefiles is hard enough, so I always enable option
> --warn-undefined-variables whenever I can.
>
> I have recently realised that in GNU Make 3.81 the $(value ...) function
> does not warn as expected. That is, if I run this example makefile:
>
>     VAR_NAME := UNDEFINED_VARIABLE
>
>     all:
>         echo "$(UNDEFINED_VARIABLE)"
>         echo "$($(VAR_NAME))"
>         echo "$(value $(VAR_NAME))"
>
> I get the following output:
>
>     Makefile:24: warning: undefined variable 'UNDEFINED_VARIABLE'
>     Makefile:24: warning: undefined variable 'UNDEFINED_VARIABLE'
>     echo ""
>     echo ""
>     echo ""
>
> Is this an oversight? The manual does not mention any exceptions for
> --warn-undefined-variables, and I wish that $(value ...) warned too.
>
> Please copy me on the answers, as I'm not subscribed to this list.
>
> Thanks,
>   R. Diez

--
Best regards,
Eldar Sh. Abusalimov

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make
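[For readers following along, the PATHIFY idiom above can be tried out directly. This is a minimal sketch, assuming GNU make (3.81 or later, for $(or)) is on PATH; the temp-file name and the use of $(info) for output are incidental choices, not part of the original makefile.]

```shell
# Sketch of Eldar's PATHIFY function, driven from the shell. $(value 2)
# reads the optional second 'call' argument without triggering a
# --warn-undefined-variables warning when it is absent.
mk=$(mktemp)
cat > "$mk" <<'EOF'
empty :=
space := $(empty) $(empty)
colon := :
PATHIFY = $(subst $(space),$(or $(value 2),$(colon)),$(strip $1))
$(info $(call PATHIFY,foo bar))
$(info $(call PATHIFY,bar baz,;))
all: ;
EOF
out=$(make --warn-undefined-variables -f "$mk" 2>/dev/null)
echo "$out"
rm -f "$mk"
```

Running it prints the joined lists, with a colon by default and with the explicit ";" separator when one is passed.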
Fwd: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
Hi all:

Writing makefiles is hard enough, so I always enable option --warn-undefined-variables whenever I can.

I have recently realised that in GNU Make 3.81 the $(value ...) function does not warn as expected. That is, if I run this example makefile:

    VAR_NAME := UNDEFINED_VARIABLE

    all:
        echo "$(UNDEFINED_VARIABLE)"
        echo "$($(VAR_NAME))"
        echo "$(value $(VAR_NAME))"

I get the following output:

    Makefile:24: warning: undefined variable 'UNDEFINED_VARIABLE'
    Makefile:24: warning: undefined variable 'UNDEFINED_VARIABLE'
    echo ""
    echo ""
    echo ""

Is this an oversight? The manual does not mention any exceptions for --warn-undefined-variables, and I wish that $(value ...) warned too.

Please copy me on the answers, as I'm not subscribed to this list.

Thanks,
  R. Diez
Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
Edward Welbourne wrote:
>> Go and shoot the moron who wrote that script.
>
> Easy to say, but no use to someone who needs to deploy a package whose
> build infrastructure includes a badly-written script. If the author of
> that script has been shot, there's probably no-one alive who understands
> how to build the package. Quite possibly, the author only knocked the
> script together for personal use, but made it available to the project
> as a start-point for others to set up their own builds, only to see it
> chucked into the distribution because it was more use than nothing.

They could have committed the script with either totally benign settings, or obviously broken settings that force the user to substitute in valid ones. (Since otherwise most users will ignore any documentation.) E.g., changing it to "make -j PICK_A_NUMBER" would have been better.

>> The make -j value should never be encoded in any file; it should only
>> ever be set by the user invoking the actual top level make command.
>
> and therein lies the problem with -j and -l; they require the user to
> have a sophisticated understanding of the system parameters and build
> load profile in order to guess a half-way decent value for them. Such
> options are much less useful than ones that have sensible defaults.

Make *does* have sensible defaults. The default is to run serially. Sometimes computing is hard. Users *do* need a sophisticated understanding of their systems. That's life.

> For developers of the package being built, it's possible to learn - by
> trial and error - what settings tend to work out reasonably sensibly on
> specific machines. I have heard various ad hoc rules of thumb in use,
> typically number of CPUs plus either a fixed offset or some fraction of
> the total number of CPUs, as either -j or -l value. In the end, I know
> that my old machine grinds to a halt if I let make take load > about 4,
> and my shiny new machine accepts -j 8 -l 12 without breaking a sweat -
> and gets the build done in a few minutes, instead of half an hour.

Yes. If you know how to improve on trial and error, go ahead and post your code.

> For anyone who has to build a package that isn't the primary business of
> their work - for example, a distribution maintainer, for whom each
> package is just one in a horde of many - configuring the right -j and -l
> flags for each run of make is not practical. It would make sense to
> provide them with a sensible way to tell make to make what good use it
> can of the CPUs available, without bringing the system to its knees.

It might make sense, but it's not the developers' responsibility to know what is sensible for every machine configuration out there. It is the end user's responsibility to know something about the system they're using. You can't expect developers to even have exposure to all the possible parallel configurations on which someone will attempt to build their software.

A distro maintainer probably has a farm of machines to build on. That set of machines is already a known quantity to them, because presumably they've already done a lot of builds on those machines. They're the ones with the most knowledge of how the machines behave, not any particular developer.

You can't just encode "# CPUs x <factor>" into the Make source. The CPUs may be fake to begin with (e.g., SMT with Intel Hyperthreading or Sun Niagara). Putting a database of CPU types/knowledge into Make and maintaining it would be ludicrous. End users should know their own machines.

> The present unconstrained -j behaviour is, in any case, self-defeating.
> It starts so many processes that the time the machine spends swapping
> them all out and back in again swamps the time they spend actually doing
> any useful work. The build would complete sooner if fewer processes were
> started.

Pretty sure the documentation already says so too. Anyone using just "make -j" is a moron.

> I think R. Diez's suggestions are a constructive step towards designing
> some half-way sensible heuristics - those with better understanding of
> what make's internal decision-process looks like can doubtless improve
> on them, which I'm sure is R. Diez's hope and intent in starting this
> discussion. We don't need -j auto to achieve perfectly optimal tuning on
> every machine; it'll be good enough if it can do builds faster than
> make's default implicit -j 1, without (too often) loading the machine so
> heavily as to interfere with browsing, reading mail, playing nethack and
> other things developers like to still be able to do while waiting for a
> build.
>
> Eddy.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
> Go and shoot the moron who wrote that script.

Easy to say, but no use to someone who needs to deploy a package whose build infrastructure includes a badly-written script. If the author of that script has been shot, there's probably no-one alive who understands how to build the package. Quite possibly, the author only knocked the script together for personal use, but made it available to the project as a start-point for others to set up their own builds, only to see it chucked into the distribution because it was more use than nothing.

> -l is utterly useless.

Hyperbole.

> Load average is computed too slowly; by the time it passes any useful
> threshold the actual make load will have spiralled out of control.

The -l value is not as useful as we might like, for this and other reasons, but - in practice - in conjunction with -j, it makes it possible to tune make's behaviour so that I can make good use of my CPU without rendering the machine unusable while waiting for a build.

> The make -j value should never be encoded in any file; it should only
> ever be set by the user invoking the actual top level make command.

and therein lies the problem with -j and -l; they require the user to have a sophisticated understanding of the system parameters and build load profile in order to guess a half-way decent value for them. Such options are much less useful than ones that have sensible defaults.

For developers of the package being built, it's possible to learn - by trial and error - what settings tend to work out reasonably sensibly on specific machines. I have heard various ad hoc rules of thumb in use, typically number of CPUs plus either a fixed offset or some fraction of the total number of CPUs, as either -j or -l value. In the end, I know that my old machine grinds to a halt if I let make take load > about 4, and my shiny new machine accepts -j 8 -l 12 without breaking a sweat - and gets the build done in a few minutes, instead of half an hour.

For anyone who has to build a package that isn't the primary business of their work - for example, a distribution maintainer, for whom each package is just one in a horde of many - configuring the right -j and -l flags for each run of make is not practical. It would make sense to provide them with a sensible way to tell make to make what good use it can of the CPUs available, without bringing the system to its knees.

The present unconstrained -j behaviour is, in any case, self-defeating. It starts so many processes that the time the machine spends swapping them all out and back in again swamps the time they spend actually doing any useful work. The build would complete sooner if fewer processes were started.

I think R. Diez's suggestions are a constructive step towards designing some half-way sensible heuristics - those with better understanding of what make's internal decision-process looks like can doubtless improve on them, which I'm sure is R. Diez's hope and intent in starting this discussion. We don't need -j auto to achieve perfectly optimal tuning on every machine; it'll be good enough if it can do builds faster than make's default implicit -j 1, without (too often) loading the machine so heavily as to interfere with browsing, reading mail, playing nethack and other things developers like to still be able to do while waiting for a build.

Eddy.
Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
> -l is utterly useless. Load average is computed too slowly; by the time
> it passes any useful threshold the actual make load will have spiralled
> out of control.

If that's the case, exactly that reasoning should be written in the documentation.

> Go and shoot the morons who wrote those scripts. The make -j value
> should never be encoded in any file; it should only ever be set by the
> user invoking the actual top level make command.

Again, such a recommendation should be in the manual. However, for practical reasons, I still think GNU Make should offer some simple heuristic for people without the necessary time or inclination to calculate their own optimum value (that would be most humans).

That simple heuristic need not be totally fixed. The user could provide their own factor for a simple "CPUs * factor" formula; something like "-jcpufactor 1.5" could do the trick. I just want to spare the user the effort of retrieving the CPU count, which is always a small, system-dependent pain.

Please copy me on the answers, as I'm not on this list.

Regards,
  R. Diez
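[The "CPUs * factor" formula proposed above can be sketched in plain POSIX shell. Note that 'jobs_for' is an illustrative helper name invented here, not an existing make option; taking the factor in tenths is just a trick to avoid floating-point arithmetic in sh.]

```shell
#!/bin/sh
# Sketch of the "CPUs * factor" heuristic. The factor is given in
# tenths (15 means 1.5); the result is rounded up so a factor > 1
# never produces fewer jobs than CPUs.
jobs_for() {
    cpus=$1
    factor_tenths=$2
    echo $(( (cpus * factor_tenths + 9) / 10 ))
}

jobs_for 4 15   # 4 CPUs * 1.5 -> 6
jobs_for 8 15   # 8 CPUs * 1.5 -> 12
```

A hypothetical "-jcpufactor 1.5" could then behave like "make -j $(jobs_for "$ncpu" 15)", with make obtaining the CPU count itself.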
Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
R. Diez wrote:
> Hi all:
>
> I recently came across a build test script that launched "make -j" with
> no limits, which consumed all available RAM and ground my Linux system
> to a halt. I had to power the computer off; there was nothing else I
> could do.

Go and shoot the moron who wrote that script.

> After this experience, I believe that the default limitless behaviour of
> -j is too dangerous. Even if your PC does not end up dying, GNU Make may
> inadvertently consume too many resources. An explicit "-j infinite"
> would be a better option, if anybody really needs something like that.
>
> I recently came across option -l, which limits the number of parallel
> tasks based on the system's average load. However, this flag does not
> seem safe enough, as, according to Wikipedia, not all Unix systems
> include in the average load those processes currently waiting for disk
> I/O. Besides, the maximum average load one would almost certainly want
> to use depends on the number of CPUs, so the calling script (or user)
> has to find out how many CPUs there are. How you find out may also
> depend on the operating system underneath, so everybody gets a little
> extra work every time.

-l is utterly useless. Load average is computed too slowly; by the time it passes any useful threshold the actual make load will have spiralled out of control.

> I am writing a build test framework myself, and I have been trying to
> coordinate all sub-makes from a main makefile. The top-level script
> decides how many parallel processes are allowed for the entire build and
> relies on MAKEFLAGS in order to let all sub-makes talk to each other so
> as to limit the overall load. Because different makefiles may require
> different GNU Make options, I am filtering out all others and leaving
> just the parallel build flags in place, like this:
>
>     export MAKEFLAGS="$(filter --jobserver-fds=%,$(MAKEFLAGS)) $(filter -j,$(MAKEFLAGS))" && $(MAKE) ...etc...
>
> By the way, option --jobserver-fds is not documented, but I think it
> should be. As you can see, the user may need to filter it out manually
> after all.
>
> The trouble with this MAKEFLAGS technique is that I often come across
> some third-party script which insists on calculating and setting its own
> -j value, rendering my coordination efforts useless. When this happens,
> I get warnings like this:
>
>     warning: -jN forced in submake: disabling jobserver mode

Go and shoot the morons who wrote those scripts. The make -j value should never be encoded in any file; it should only ever be set by the user invoking the actual top level make command.

> Needless to say, most heuristics to calculate the -j value are as lame
> as mine (see below). When writing build scripts, nobody seems to have
> much time left for finding out how to retrieve the relevant system
> information in bash/perl/whatever in a portable way and then calculate a
> good -j value out of it.

Nobody writing scripts should ever bother with such a thing. It's for the end user to control; any other effort will be wrong 99.9% of the time.

> I have been thinking about the best way to overcome such parallel woes,
> and I wanted to share this thought with you all. How about adding to GNU
> Make a new -j parameter like this:
>
>     make -j auto
>
> The behaviour of "-j auto" could be as follows:
>
> 1) If MAKEFLAGS specifies -j and --jobserver-fds, then use those
>    settings (no warning required).
>
> 2) Otherwise, calculate the maximum number of parallel tasks with some
>    trivial heuristic based on the number of CPUs and/or the system load.
>    I'm using <CPU count> + 1 at the moment, but I'm sure there are
>    better guesses. I could think of several alternative heuristics:
>
>        make -j auto-system-load      # Use -l
>        make -j auto-processor-count  # Use -j
>
> I guess most people would then end up using some "-j auto" variant, in
> order to avoid overloading or underloading the system without having to
> implement their own heuristics. That way, a top-level script will be
> much more likely to succeed at setting a global limit when launching
> third-party sub-makes in parallel.

A good value depends entirely on the actual workload. For most compiles I find that CPU count * 1.5 works well. Some people use make for things other than invoking compilers, though, and some compilers have different resource requirements than a typical gcc run, so even that isn't a widely applicable estimate. Only the end user can determine the best value.

> Please copy me on the answers, as I'm not on this list.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
Hi all:

I recently came across a build test script that launched "make -j" with no limits, which consumed all available RAM and ground my Linux system to a halt. I had to power the computer off; there was nothing else I could do.

After this experience, I believe that the default limitless behaviour of -j is too dangerous. Even if your PC does not end up dying, GNU Make may inadvertently consume too many resources. An explicit "-j infinite" would be a better option, if anybody really needs something like that.

I recently came across option -l, which limits the number of parallel tasks based on the system's average load. However, this flag does not seem safe enough, as, according to Wikipedia, not all Unix systems include in the average load those processes currently waiting for disk I/O. Besides, the maximum average load one would almost certainly want to use depends on the number of CPUs, so the calling script (or user) has to find out how many CPUs there are. How you find out may also depend on the operating system underneath, so everybody gets a little extra work every time.

I am writing a build test framework myself, and I have been trying to coordinate all sub-makes from a main makefile. The top-level script decides how many parallel processes are allowed for the entire build and relies on MAKEFLAGS in order to let all sub-makes talk to each other so as to limit the overall load. Because different makefiles may require different GNU Make options, I am filtering out all others and leaving just the parallel build flags in place, like this:

    export MAKEFLAGS="$(filter --jobserver-fds=%,$(MAKEFLAGS)) $(filter -j,$(MAKEFLAGS))" && $(MAKE) ...etc...

By the way, option --jobserver-fds is not documented, but I think it should be. As you can see, the user may need to filter it out manually after all.
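[The make-level $(filter ...) trick above can be mimicked in plain shell, which may be easier to read. This is a sketch only; the sample MAKEFLAGS value below is made up for illustration (a real value is provided by make to its sub-processes).]

```shell
#!/bin/sh
# Keep only the parallelism-related flags from a MAKEFLAGS-style string,
# discarding everything else, as the makefile snippet above does with
# $(filter ...). The sample value here is invented for the demo.
MAKEFLAGS='-w --warn-undefined-variables --jobserver-fds=3,4 -j'

keep=''
for flag in $MAKEFLAGS; do          # unquoted on purpose: split on words
    case $flag in
        --jobserver-fds=*|-j|-j[0-9]*)
            keep="$keep $flag" ;;   # jobserver pipe fds and -j[N]
    esac
done
keep=${keep# }                      # trim the leading space
echo "$keep"
```

With the sample value this prints "--jobserver-fds=3,4 -j", i.e. only the flags the sub-makes need to coordinate their total job count.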
The trouble with this MAKEFLAGS technique is that I often come across some third-party script which insists on calculating and setting its own -j value, rendering my coordination efforts useless. When this happens, I get warnings like this:

    warning: -jN forced in submake: disabling jobserver mode

Needless to say, most heuristics to calculate the -j value are as lame as mine (see below). When writing build scripts, nobody seems to have much time left for finding out how to retrieve the relevant system information in bash/perl/whatever in a portable way and then calculate a good -j value out of it.

I have been thinking about the best way to overcome such parallel woes, and I wanted to share this thought with you all. How about adding to GNU Make a new -j parameter like this:

    make -j auto

The behaviour of "-j auto" could be as follows:

1) If MAKEFLAGS specifies -j and --jobserver-fds, then use those settings (no warning required).

2) Otherwise, calculate the maximum number of parallel tasks with some trivial heuristic based on the number of CPUs and/or the system load. I'm using <CPU count> + 1 at the moment, but I'm sure there are better guesses. I could think of several alternative heuristics:

    make -j auto-system-load      # Use -l
    make -j auto-processor-count  # Use -j

I guess most people would then end up using some "-j auto" variant, in order to avoid overloading or underloading the system without having to implement their own heuristics. That way, a top-level script will be much more likely to succeed at setting a global limit when launching third-party sub-makes in parallel.

Please copy me on the answers, as I'm not on this list.

Thanks,
  R. Diez
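[For reference, the "CPU count + 1" heuristic and the system-dependent CPU probe that "-j auto" would internalize can be approximated in shell today. nproc (GNU coreutils) and getconf _NPROCESSORS_ONLN are the usual spellings on Linux and POSIX-ish systems; the fallback to 1 CPU is a conservative assumption for platforms where both probes fail.]

```shell
#!/bin/sh
# Approximation of the "<CPU count> + 1" heuristic mentioned above.
# Probe the online CPU count, defaulting to 1 if no probe works.
ncpu=$( { nproc || getconf _NPROCESSORS_ONLN; } 2>/dev/null | head -n1 )
case $ncpu in
    ''|*[!0-9]*) ncpu=1 ;;   # empty or non-numeric: be conservative
esac

jobs=$(( ncpu + 1 ))
echo "would run: make -j $jobs"
```

This is exactly the "small, system-dependent pain" the thread complains about: every build script ends up re-deriving some variant of these few lines.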