Re: [RFC]serialize the output of parallel make?
Hambridge, Philip J (ODP) wrote:
> I've not been following this thread too closely, but this may be relevant:
> http://www.cmcrossroads.com/ask-mr-make/12909-descrambling-parallel-build-logs

I wrote the linked article; for those not interested in reading it, the strategy described there is the same as that used by the talon shell wrapper described elsewhere in this thread.

Best regards,
Eric Melski
Architect
Electric Cloud, Inc.
http://blog.electric-cloud.com/

___
Bug-make mailing list
Bug-make@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [RFC]serialize the output of parallel make?
Hi,

Since some things happen at the same time there is no single serial order. The semaphore mechanism forces one of the possible orders.

I forgot to say that for recipes with multiple commands you need to either use the new .ONESHELL target or do this kind of thing:

mytarget:
-command1 && \
-command2 && \
-command3

This causes them to be executed in a single shell invocation for which the output can be gathered together (I am using - to indicate TAB).

With .ONESHELL, as I understand it, you would not need the '\' characters to escape the end-of-line:

mytarget:
-command1 &&
-command2 &&
-command3

Note that I'm using bash syntax here. On Windows, if you want to use cmd.exe then good luck - I don't think it's really fit for purpose.

Regards,
Tim

On 3 August 2010 02:11, Chiheng Xu chiheng...@gmail.com wrote:
> On Mon, Aug 2, 2010 at 4:22 PM, Edward Welbourne e...@opera.com wrote:
>>> If my guess is not wrong, the semaphore safeguard the consistency of output of one command, not the order of commands.
>>
>> well, with -j, commands are being run concurrently, so there *isn't* a strict ordering of commands to safeguard, although output shall be delivered in roughly the order of completion of commands, with only minor disturbances.
>>
>> Still, if target A is a prerequisite of B, the recipe to make A is run, and must complete, before the recipe to make B will be initiated; since the recipe for A ends with whatever is ensuring its output comes out as an atom, A's output is produced before B's recipe is initiated, so you can be sure they appear in the right order. So the only ordering property among commands that actually matters *is* preserved.
>
> This is not my ideal solution. My idea is to preserve the order of output of parallel make as if it is a serial make.
>
> Modern CPU can issue multiple instructions simultaneously, but preserve the order of commit to program order. So the instruction level parallelism of CPU is transparent to programmer.
>
> What I want is transparent parallel make. Make can issue multiple shells simultaneously, but print their outputs in the same order as in a serial make.
>
> --
> Chiheng Xu
> Wuhan,China

--
You could help some brave and decent people to have access to uncensored news by making a donation at:
http://www.thezimbabwean.co.uk/
Re: Fwd: [RFC]serialize the output of parallel make?
On Tue, Aug 3, 2010 at 2:51 PM, Tim Murphy tnmur...@gmail.com wrote:
> Since some things happen at the same time there is no single serial order. The semaphore mechanism forces one of the possible orders.

I'm not familiar with the source code of make, but I believe the serial order of shells is determined by the dependence DAG; it may be unique for a given dependence DAG.

Shells can be issued and completed in any order (they only need to satisfy the dependence relation). But make can print their outputs strictly in their serial order.

--
Chiheng Xu
Wuhan,China
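A minimal Python sketch of the behaviour requested here: jobs run concurrently, but their captured output is released strictly in a fixed serial order. This is not how make works; the function names are hypothetical, "serial order" is assumed to be simple list order, and dependencies between jobs are ignored.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_captured(argv):
    """Run one command and return its combined stdout/stderr as text."""
    done = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return done.stdout


def run_in_serial_order(commands, jobs=4):
    """Run commands concurrently but yield their output in list order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() hands results back in submission order, so a slow early
        # job holds back the already-finished output of later jobs.
        yield from pool.map(run_captured, commands)


if __name__ == "__main__":
    py = sys.executable
    commands = [
        [py, "-c", "import time; time.sleep(0.3); print('A: from Earth')"],
        [py, "-c", "print('B: from the Moon')"],
        [py, "-c", "print('C: from Mars')"],
    ]
    for block in run_in_serial_order(commands):
        sys.stdout.write(block)
```

Because the first command sleeps, B and C finish first, yet their output stays queued in memory until A's output has been written.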
Re: Fwd: [RFC]serialize the output of parallel make?
Chiheng Xu wrote:
> On Tue, Aug 3, 2010 at 2:51 PM, Tim Murphy tnmur...@gmail.com wrote:
>> Since some things happen at the same time there is no single serial order. The semaphore mechanism, forces one of the possible orders.
>
> I'm not familiar with source code of make, but I believe the serial order of shells is determined by the dependence DAG, it may be unique for a given dependence DAG.
>
> Shells can be issued and completed at random order (only need satisfying the dependence relation). But make can print their outputs strictly in their serial order.

I'm trying very hard to only provide constructive comments in response to this thread, but frankly this is, in a word, stupid. If you want make's output to be in serial order, then don't use parallel make at all. The point to parallel make is that it allows jobs which have no ordering dependency to run in parallel. If you want their output to be fully serialized, then you will force make to wait for them to complete serially. Which automatically also means that make will have to maintain an arbitrarily large internal queue for all of the output, because given the unpredictable completion times of multiple jobs running concurrently, no output can be emitted until the slowest parallel job completes.

In particular, if you have recursive makefiles, no parent make process can output anything at all until all of its submakes have completed, because no individual make process has enough knowledge about what the actual serial order is.

Given that this discussion seems to have arisen due to the braindead stdio handling in Cygwin, it seems like any de-mangling of parallel make's output should be directed to the Cygwin libraries. In my experience Cygwin is too slow an environment to be useful anyway, which is why I use MSYS for Windows builds. But I have to admit, I only use it inside a single-core VirtualBox these days so I haven't looked at how parallel make behaves there.

But the fact is all I/O in Cygwin is funneled through the Cygwin DLL, so there's no reason that it can't be fixed to not mingle/mangle lines from different processes together. But again, that's not gnu-make's problem, that's a Cygwin issue.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Re: Fwd: [RFC]serialize the output of parallel make?
> Date: Tue, 3 Aug 2010 07:51:22 +0100
> From: Tim Murphy tnmur...@gmail.com
> Cc: e...@opera.com, bug-make@gnu.org
>
> mytarget:
> -command1 &&
> -command2 &&
> -command3
>
> Note that I'm using bash syntax here. On windows if you want to use cmd.exe then good luck - I don't think it's really fit for purpose.

cmd.exe supports the same `command1 && command2' semantics as does Bash, so there's no problem here and no need for any ``luck''.
Re: Fwd: [RFC]serialize the output of parallel make?
Chiheng Xu wrote:
> What I want is transparent parallel make. Make can issue multiple shells simultaneously, but print their outputs in the same order as in a serial make.

ElectricAccelerator is a gmake replacement that does exactly this. I wrote about this feature a while back:

http://blog.electric-cloud.com/2008/12/01/untangling-parallel-build-logs/

You can read more about Accelerator on the blog, or here:

http://www.electric-cloud.com/products/electricaccelerator.php

Eric Melski
Architect
Electric Cloud, Inc.
http://blog.electric-cloud.com/
Re: Fwd: [RFC]serialize the output of parallel make?
On Wed, Aug 4, 2010 at 10:41 AM, Eric Melski s...@melski.net wrote:
> ElectricAccelerator is a gmake replacement that does exactly this. I wrote about this feature a while back:
>
> http://blog.electric-cloud.com/2008/12/01/untangling-parallel-build-logs/
>
> You can read more about Accelerator on the blog, or here:
>
> http://www.electric-cloud.com/products/electricaccelerator.php
>
> Eric Melski
> Architect
> Electric Cloud, Inc.
> http://blog.electric-cloud.com/

Excellent!

--
Chiheng Xu
Wuhan,China
Re: Fwd: [RFC]serialize the output of parallel make?
Edward Welbourne wrote:
>> 2x is too much. 1.5x has been the best in my experience, any more than that and you're losing too much CPU to scheduling overhead instead of real work. Any less and you're giving up too much in idle or I/O time.
>
> This depends a bit on whether you're using icecc or some similar distributed compilation system. I believe a better approach is to set a generous -j, such as twice the count of CPUs, but impose a load limit using -l, tuned rather more carefully. Scheduling overhead contributes to load, so is taken into account this way.

Perhaps in a perfect world -l would be useful. In fact, since load averages are calculated so slowly, by the time your -l limit is reached the actual CPU load will have blown past it and your machine will be thrashing. That's the entire reason I came up with the -j implementation in the first place.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
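The lag described here can be put into rough numbers. The sketch below assumes a Linux-style 1-minute load average, refreshed every 5 seconds with exponential decay; the function name and the figures are illustrative only and are not taken from make.

```python
import math


def seconds_until_limit(limit, runnable, period=60.0, tick=5.0):
    """Seconds until a damped load average first reaches `limit`.

    Assumes the machine goes from idle to `runnable` busy processes at
    time zero and stays there. Returns math.inf if the limit can never
    be reached.
    """
    if runnable <= limit:
        return math.inf
    decay = math.exp(-tick / period)  # weight kept from the previous sample
    load, elapsed = 0.0, 0.0
    while load < limit:
        load = load * decay + runnable * (1.0 - decay)
        elapsed += tick
    return elapsed


if __name__ == "__main__":
    # 16 jobs start at once on an idle machine with a load limit of 8.
    print(seconds_until_limit(limit=8, runnable=16))
```

Under this model a limit of 8 is only reported 45 seconds after 16 processes have become runnable, so anything that throttles on the reported value reacts well after the real load has passed the limit.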
Re: Fwd: [RFC]serialize the output of parallel make?
> If my guess is not wrong, the semaphore safeguard the consistency of output of one command, not the order of commands.

well, with -j, commands are being run concurrently, so there *isn't* a strict ordering of commands to safeguard, although output shall be delivered in roughly the order of completion of commands, with only minor disturbances.

Still, if target A is a prerequisite of B, the recipe to make A is run, and must complete, before the recipe to make B will be initiated; since the recipe for A ends with whatever is ensuring its output comes out as an atom, A's output is produced before B's recipe is initiated, so you can be sure they appear in the right order. So the only ordering property among commands that actually matters *is* preserved.

	Eddy.
Re: Fwd: [RFC]serialize the output of parallel make?
> 2x is too much. 1.5x has been the best in my experience, any more than that and you're losing too much CPU to scheduling overhead instead of real work. Any less and you're giving up too much in idle or I/O time.

This depends a bit on whether you're using icecc or some similar distributed compilation system. I believe a better approach is to set a generous -j, such as twice the count of CPUs, but impose a load limit using -l, tuned rather more carefully. Scheduling overhead contributes to load, so is taken into account this way.

	Eddy.
Re: Fwd: [RFC]serialize the output of parallel make?
On Mon, Aug 2, 2010 at 4:22 PM, Edward Welbourne e...@opera.com wrote:
>> If my guess is not wrong, the semaphore safeguard the consistency of output of one command, not the order of commands.
>
> well, with -j, commands are being run concurrently, so there *isn't* a strict ordering of commands to safeguard, although output shall be delivered in roughly the order of completion of commands, with only minor disturbances.
>
> Still, if target A is a prerequisite of B, the recipe to make A is run, and must complete, before the recipe to make B will be initiated; since the recipe for A ends with whatever is ensuring its output comes out as an atom, A's output is produced before B's recipe is initiated, so you can be sure they appear in the right order. So the only ordering property among commands that actually matters *is* preserved.

This is not my ideal solution. My idea is to preserve the order of output of parallel make as if it is a serial make.

Modern CPU can issue multiple instructions simultaneously, but preserve the order of commit to program order. So the instruction level parallelism of CPU is transparent to programmer.

What I want is transparent parallel make. Make can issue multiple shells simultaneously, but print their outputs in the same order as in a serial make.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 1:26 PM, Paul Smith psm...@gnu.org wrote:
> On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote:
>> As parallel make are becoming more and more popular, can make serialize the output of parallel make?
>>
>> Make can redirect every parallelly issued shell's output to an temporary file, and output the stored output serially, as if in a serial make.
>
> This would be a good thing, but as always the details are not quite so trivial.
>
> We have to ensure that these temporary files are cleaned up properly, even in the face of users ^C'ing their make invocations. We also need to verify that whatever methods we use will work properly on Windows and VMS and other operating systems make supports (where are their /tmp equivalents?)

My suggestion is that you can implement it as an optional command line option (like -j), and on one or two primary platforms (Linux/Windows), instead of on all platforms.

> And, what about stdout vs. stderr? Should we write both to the same file? Then we lose the ability to do things like make -j4 2>/dev/null since all output will be written to stdout (presumably). Or should we keep two files per command, one for stdout and one for stderr? But that's even worse since then when we printed it we'd have to print all the stdout first then all the stderr, which could lose important context.

The scenario like make -j4 2>/dev/null may be very rare, but scenario like make -j4 2>&1 | tee output.txt may be common.

I think using make to build or test large software in parallel on a multicore system is by far the most normal situation. What users of make most need is:
1. to utilize the computing power of the multicore system using parallel make (-j);
2. to easily analyze the output of the build or test, to precisely find the problem.

When a user of make provides the optional output-serializing command line option, he knows what he needs, and he is responsible for any resulting mess.

> Then there's the possibility that some commands will behave differently if they are writing to a TTY, then they will if they're writing to a file. Do we not care about that, or do we try to do something crazy with PTYs or similar (ugh!)
>
> And of course we have to have a guaranteed unique naming strategy in case multiple instances of make are running on the same system at the same time, maybe running the same makefiles and even building the same targets. On POSIX systems we can use tmpfile() or mkstemp() or something but other systems might need other measures.
>
> These are just some things I thought of off the top of my head. It certainly does not mean that it would not be a good thing to have this ability though.
>
> --
> ---
>  Paul D. Smith psm...@gnu.org          Find some GNU make tips at:
>  http://www.gnu.org                      http://make.mad-scientist.net
>  Please remain calm...I may be mad, but I am a professional. --Mad Scientist

--
Chiheng Xu
Wuhan,China
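Two of the concerns quoted above, unique naming with automatic cleanup and the loss of the TTY, can be seen in a short Python sketch. It uses Python's `tempfile.TemporaryFile` in the role that tmpfile() would play in C; the function name is hypothetical and nothing here is make's own code.

```python
import subprocess
import sys
import tempfile


def run_with_captured_output(argv):
    """Run argv with stdout and stderr sent to one anonymous temp file.

    Returns (exit_status, captured_bytes).
    """
    # TemporaryFile chooses a unique name itself and, on POSIX, removes
    # the directory entry at once, so an interrupted build leaves no
    # stray file behind.
    with tempfile.TemporaryFile() as capture:
        status = subprocess.call(argv, stdout=capture, stderr=capture)
        capture.seek(0)
        return status, capture.read()


if __name__ == "__main__":
    # The child reports whether it believes it is writing to a terminal.
    child = "import sys; print('stdout is a tty:', sys.stdout.isatty())"
    status, output = run_with_captured_output([sys.executable, "-c", child])
    sys.stdout.write(output.decode())
```

The child always reports that its stdout is not a terminal, which is the behavioural difference raised above: tools that colour or paginate output only on a TTY will act differently once their output is captured.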
Re: [RFC]serialize the output of parallel make?
On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
...
> My suggestion is that you can implement it as an optional command line option (like -j), and on one or two primary platforms (Linux/Windows), instead of on all platforms.

So, the complexity of both possibilities. You're writing, debugging, and contributing this code yourself?

(I would hope that this wouldn't require any Linux-specific code; perhaps you meant POSIX & Windows?)

...
> The scenario like make -j4 2>/dev/null may be very rare, but scenario like make -j4 2>&1 | tee output.txt may be common.

And what, exactly, are you suggesting that make do to reflect that guess about usage patterns?

Philip Guenther
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 3:45 PM, Philip Guenther guent...@gmail.com wrote:
> On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
> ...
>> My suggestion is that you can implement it as an optional command line option (like -j), and on one or two primary platforms (Linux/Windows), instead of on all platforms.
>
> So, the complexity of both possibilities. You're writing, debugging, and contributing this code yourself?

Nop, I'm not maintainer of make, just a user :) .

> (I would hope that this wouldn't require any Linux-specific code; perhaps you meant POSIX & Windows?)

Yes.

> ...
>> The scenario like make -j4 2>/dev/null may be very rare, but scenario like make -j4 2>&1 | tee output.txt may be common.
>
> And what, exactly, are you suggesting that make do to reflect that guess about usage patterns?
>
> Philip Guenther

I mean, normal user of make does not differentiate stdout or stderr very seriously, they see them both as output. They want serialized output, whether or not it's stdout or stderr.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
Hi,

Serialisation can be done by external programs - no real need for make to do it. My project does it already with the talon shell wrapper that we use in our build system here:

http://developer.symbian.org/oss/MCL/sftools/dev/build/file/96fee2635b19/sbsv2/raptor/util/talon

You set talon as the shell for make and talon in turn runs whatever the actual shell is but adds the serialisation. There are a lot of other nice things you can do with this - e.g. measuring the execution time of every build step so that you can see what affects performance.

Regards,
Tim

On 30 July 2010 09:28, Chiheng Xu chiheng...@gmail.com wrote:
> On Fri, Jul 30, 2010 at 3:45 PM, Philip Guenther guent...@gmail.com wrote:
>> On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
>> ...
>>> My suggestion is that you can implement it as an optional command line option (like -j), and on one or two primary platforms (Linux/Windows), instead of on all platforms.
>>
>> So, the complexity of both possibilities. You're writing, debugging, and contributing this code yourself?
>
> Nop, I'm not maintainer of make, just a user :) .
>
>> (I would hope that this wouldn't require any Linux-specific code; perhaps you meant POSIX & Windows?)
>
> Yes.
>
>> ...
>>> The scenario like make -j4 2>/dev/null may be very rare, but scenario like make -j4 2>&1 | tee output.txt may be common.
>>
>> And what, exactly, are you suggesting that make do to reflect that guess about usage patterns?
>>
>> Philip Guenther
>
> I mean, normal user of make does not differentiate stdout or stderr very seriously, they see them both as output. They want serialized output, whether or not it's stdout or stderr.
>
> --
> Chiheng Xu
> Wuhan,China
Re: [RFC]serialize the output of parallel make?
> From: Paul Smith psm...@gnu.org
> Date: Fri, 30 Jul 2010 01:26:46 -0400
> Cc: bug-make@gnu.org
>
> We have to ensure that these temporary files are cleaned up properly, even in the face of users ^C'ing their make invocations. We also need to verify that whatever methods we use will work properly on Windows and VMS and other operating systems make supports (where are their /tmp equivalents?)

For the latter, we could use P_tmpdir, which is quite portable.

This is not to say that I think the original idea is easy to implement. Actually, it is not entirely clear to me what is meant by serialization in this context. Commands are invoked by Make in parallel, and the output is produced in the order of their execution (modulo buffering issues). What would serialization look like in this context? Can someone show a simple example?
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 4:40 PM, Tim Murphy tnmur...@gmail.com wrote:
> Hi,
>
> You set talon as the shell for make and talon in turn runs whatever the actual shell is but adds the serialisation. there are a lot of other nice things you can do with this - e.g. measuring the execution time of every build step so that you can see what affects performance.

Sorry, I don't understand this. You probably misunderstand the meaning of serialization of output and parallel make.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 4:53 PM, Eli Zaretskii e...@gnu.org wrote:
> This is not to say that I think the original idea is easy to implement. Actually, it is not entirely clear to me what is meant by serialization in this context. Commands are invoked by Make in parallel, and the output is produced in the order of their execution (modulo buffering issues). What would serialization look like in this context? Can someone show a simple example?

Shells invoked in parallel (and the commands those shells invoke) may print to the output in any order, rendering the output messy.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
On 30 July 2010 09:55, Chiheng Xu chiheng...@gmail.com wrote:
> On Fri, Jul 30, 2010 at 4:40 PM, Tim Murphy tnmur...@gmail.com wrote:
>> Hi,
>>
>> You set talon as the shell for make and talon in turn runs whatever the actual shell is but adds the serialisation. there are a lot of other nice things you can do with this - e.g. measuring the execution time of every build step so that you can see what affects performance.
>
> Sorry, I don't understand this. You probably misunderstand the meaning of serialization of output and parallel make.

No, I understand better than you think, as I do huge parallel builds every day.

The shell wrapper buffers the recipe output and then grabs a semaphore before writing the output to its stdout. If another recipe has completed and is in the process of outputting to the stdout then it has to wait a few microseconds.

This system works reliably and has given us de-scrambled output for the last 2 years.

Regards,
Tim
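The buffer-then-lock idea described in this message can be sketched in a few lines of Python. This is not talon's code, which is not shown in the thread: the function name and the lock file are hypothetical, an advisory flock() stands in for the semaphore, and the `fcntl` module makes the sketch POSIX-only.

```python
import fcntl
import subprocess
import sys


def run_and_emit_atomically(argv, lock_path, sink=None):
    """Run argv, hold back all of its output, then write it under a lock."""
    if sink is None:
        sink = sys.stdout.buffer
    # The job does its real work first; nothing is printed while it runs.
    done = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # wait here if another job is writing
        try:
            sink.write(done.stdout)
            sink.flush()
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return done.returncode


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: wrapper.py LOCKFILE command [args...]
    sys.exit(run_and_emit_atomically(sys.argv[2:], sys.argv[1]))
```

Each job only blocks after its own work is finished, and only for as long as another job needs to copy its buffered output.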
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 5:27 PM, Tim Murphy tnmur...@gmail.com wrote:
> No, I understand better than you think as I do huge parallel builds every day.
>
> The shell wrapper buffers the recipe output and then grabs a semaphore before writing the output to its stdout. If another recipe has completed and is in the process of outputting to the stdout then it has to wait a few microseconds.

The use of a semaphore may impair performance.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
> Make can redirect every parallelly issued shell's output to an temporary file, and output the stored output serially, as if in a serial make.

+1 for wanting this as a make feature. We have a hack in some of our makefiles to implement essentially exactly the above. For reference, here's the essence of a kludge known to work:

EACHOUT = $...@.out
ALLOUT = $(GENROOT)/build_log.out
LOGEACH = ; flock $(ALLOUT) \
	sh -c 'echo $1 $) | cat - $(EACHOUT) >> $(ALLOUT)'; \
	$(RM) $(EACHOUT)

Get every command whose output you care about to redirect to $(EACHOUT); at the end of each recipe that's doing any of that (in many cases this'll be right after the redirect, but some command-lines may be fiddlier) append $(LOGEACH). That's assuming you combine stdout and stderr; if not, you need to do similar for EACHERR and ALLERR. For bonus points, add an || failed=yes to each command in each affected recipe and a ; [ -z $$failed ] to the end of LOGEACH, so that make knows which recipes fail.

> And, what about stdout vs. stderr?

Yup, there are endless problems with this. On the other hand, there are situations that really do beg for it. In particular, automated build systems are an important part of many development methodologies; obviously, using -j in them is desirable; and getting coherent logs out of them, in which each compilation unit's errors and warnings appear as a single block, is crucial. These typically don't even bother with tee, they just > output.txt 2>&1 directly. The results need to be tidy, because they are typically parsed by something that converts them to HTML, needs to make the errors and warnings stand out and may also want to compare warnings against those from earlier builds in order to flag them as regressions. Such parsing gets severely messed up when output from several commands gets interleaved.

> The scenario like make -j4 2>/dev/null may be very rare, but scenario like make -j4 2>&1 | tee output.txt may be common.

I concur. Those discarding make's output probably discard all of it, not selectively stderr.

> When a user of make provides the optional output-serializing command line option, he knows what he needs, and he is responsible for any resulting mess.

Making this an optional feature makes sense, particularly given the myriad complications Paul points out. I suggest options along the following lines:

<sketch>
--demux-stdout[=filename]
--demux-stderr[=filename]
--demux-outerr[=filename]

  When using -j, capture each recipe's standard output, standard error or both (interleaved as by 2>&1) and emit it atomically (i.e. not interleaved with the corresponding output of any other recipe), optionally to filename. (Even when -j is not used, --demux-stdout and --demux-stderr can be used to separate the standard output and error streams of commands.)

  Note that some commands may behave differently if output is not to a terminal (as will be the case if these flags are used). Some commands may emit output that only makes sense when stdout and stderr are interleaved in the order in which they were output. Use at your own peril.
</sketch>

Quite how implementable that is, I leave Paul to decree,

	Eddy.
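For readers who find the make-variable kludge hard to follow, here is a rough Python equivalent of what its LOGEACH step appears to do: take a lock on the combined log, append a header line plus one recipe's private output file, then delete the private file. The function and parameter names are hypothetical, the header text is a placeholder, and `fcntl` makes this POSIX-only.

```python
import fcntl
import os


def log_each(header, each_out, all_out):
    """Append header and the contents of each_out to all_out, then
    remove each_out. An exclusive flock() on the combined log keeps
    concurrent recipes from interleaving their blocks."""
    with open(all_out, "ab") as combined:
        fcntl.flock(combined, fcntl.LOCK_EX)  # one recipe appends at a time
        try:
            combined.write(header.encode() + b"\n")
            with open(each_out, "rb") as single:
                combined.write(single.read())
            combined.flush()  # make sure the block lands before unlocking
        finally:
            fcntl.flock(combined, fcntl.LOCK_UN)
    os.remove(each_out)
```

As with the kludge, the lock is only taken after the recipe has finished its real work, so jobs queue for the log rather than for the CPU.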
Re: [RFC]serialize the output of parallel make?
> Date: Fri, 30 Jul 2010 17:08:55 +0800
> From: Chiheng Xu chiheng...@gmail.com
> Cc: psm...@gnu.org, bug-make@gnu.org
>
> On Fri, Jul 30, 2010 at 4:53 PM, Eli Zaretskii e...@gnu.org wrote:
>> This is not to say that I think the original idea is easy to implement. Actually, it is not entirely clear to me what is meant by serialization in this context. Commands are invoked by Make in parallel, and the output is produced in the order of their execution (modulo buffering issues). What would serialization look like in this context? Can someone show a simple example?
>
> Parallelly invoked shells (and commands the shells invoke) may print to output randomly, render the output messy.

I asked for an example. Could you please show a messy output and the output you'd like to have after serialization?

TIA
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskii e...@gnu.org wrote:
> I asked for an example. Could you please show a messy output and the output you'd like to have after serialization?
>
> TIA

Serial make: programs A, B and C are executed, and they print:

A: Hello, I'm A, I am from Earth.
B: The moon is my home.
C: Welcome to Mars, It's an amazing planet.

Parallel make: the output of programs A, B and C is interleaved:

C: Welcome to B: The moon is my A: Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
On Fri, Jul 30, 2010 at 6:01 PM, Howard Chu h...@highlandsun.com wrote:
> Chiheng Xu wrote:
>> On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskii e...@gnu.org wrote:
>>> I asked for an example. Could you please show a messy output and the output you'd like to have after serialization?
>>>
>>> TIA
>>
>> serially make : execute A, B, C programs, they print:
>>
>> A: Hello, I'm A, I am from Earth.
>> B: The moon is my home.
>> C: Welcome to Mars, It's an amazing planet.
>>
>> parallely make : the output of A, B, C programs interleave :
>>
>> C: Welcome to B: The moon is my A: Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.
>
> This seems like quite an extreme example. stdout is line buffered by default, so individual lines would get written atomically unless the programs you're running are doing weird things with their output. In the common case interleaving like this doesn't happen within lines, it only happens between lines of multi-line output. stderr may skew things since it's usually nonbuffered, but again, that's not the common case.

I use make -j 4 to build and test gcc, the situation above is very common.

--
Chiheng Xu
Wuhan,China
Re: [RFC]serialize the output of parallel make?
Chiheng Xu wrote:
> On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskii e...@gnu.org wrote:
>> I asked for an example. Could you please show a messy output and the output you'd like to have after serialization?
>>
>> TIA
>
> serially make : execute A, B, C programs, they print:
>
> A: Hello, I'm A, I am from Earth.
> B: The moon is my home.
> C: Welcome to Mars, It's an amazing planet.
>
> parallely make : the output of A, B, C programs interleave :
>
> C: Welcome to B: The moon is my A: Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.

This seems like quite an extreme example. stdout is line buffered by default, so individual lines would get written atomically unless the programs you're running are doing weird things with their output. In the common case interleaving like this doesn't happen within lines, it only happens between lines of multi-line output. stderr may skew things since it's usually nonbuffered, but again, that's not the common case.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Re: [RFC]serialize the output of parallel make?
> Date: Fri, 30 Jul 2010 12:10:36 +0100
> From: Tim Murphy tnmur...@gmail.com
> Cc: bug-make@gnu.org
>
> gcc -o fred fred.cpp
> perl makedef.pl -i something.def
> perl prepdef.pl -i otherthing.def
> error: fred.cpp: syntax error on line 345
> ERROR: File not Found
>
> Which file was missing? If you can't change the tool to print every detail then what do you do?

a) run make --debug=j and see which command failed.
b) run without -j and see which command failed.
Re: [RFC]serialize the output of parallel make?
>> The shell wrapper buffers the recipe output and then grabs a semaphore before writing the output to its stdout. If another recipe has completed and is in the process of outputting to the stdout then it has to wait a few microseconds.
>
> The use of semaphore may impair performance.
>
> And serialization you mean is not the same as I mean. I believe Paul and Edward fully understand what I mean.

The approach described above is - as far as I can tell - exactly what my kludge does, except done cleanly. In particular, my kludge has the same semaphore problem (it uses flock) and I'm quite sure this doesn't impair performance - because each recipe does the thing that actually matters (compiles the code) before getting stuck waiting for an opportunity to deliver its output (synchronously) to the common pool. So the only cost is that the process that ran the recipe is sitting idle for a while after it finishes its work, before it reports its results. This might lose you some performance if your -j count is too low, but I always use a generous -j limited by a suitable -l in any case (i.e. I let make run as many processes as it finds useful, so long as it doesn't take the load above a limit I set).

So I think Tim Murphy understands perfectly well what we mean by serialization - ensuring the output from each command comes out cleanly, separate from the output of any other command - and I'm suddenly interested in finding out more about talon ;-)

	Eddy.
Re: [RFC]serialize the output of parallel make?
On Thu, 2010-07-29 at 22:44 -0700, Howard Chu wrote: The scheme that I grew up with on Alliant Concentrix was just to prefix each output line with its job number |xx|blah blah blah. It obviously requires a pipe for each child process' output, so that lines can be read by the parent make and then the prefix attached. The resource issue is one thing for sure, but even more than that I'm not sure that would work with make's current, single-threaded design. Make doesn't really have any central loop where we could add a select() or whatever to check which children had output ready to be processed, so _where_ to add this is a big issue... if we don't read the pipe fast enough then jobs will slow down as they hang on the write(). I think asking make to do this work will simply cause your builds to slow down a lot, unless we introduced threads and had a separate thread doing that work. Or, we could implement the other idea you had for more reliable jobservers (avoiding the RESTART issue), which had make fork a process and then had that process fork the job: in that environment there's an extra process that can be used to manage each child's output. Of course this has its own drawbacks on systems with very high process creation overhead, like Windows. And the serialization you mean is not the same as what I mean. I believe Paul and Edward fully understand what I mean. I think Tim is saying the same thing: his solution will definitely work, at least as well as having make do it. If make did the work then it would invoke the command with stdout/stderr redirected to a temporary file, then when the job was complete make would read and print those files to stdout. In Tim's solution, the command that make invokes (really, the shell make invokes to run the command) saves its OWN output to a temporary file, then when the command is done it gets a semaphore (to ensure serialization) and dumps all that output.
Actually I suspect that Tim's solution would be MORE efficient, because if make is reading large output files and streaming them to stdout, that's time it DOESN'T spend doing other, make-like things. If you have the command itself doing it then you get the advantage of multi-processing involved. I certainly don't see how it could be SLOWER; if you want to enforce serialization then at some point, someone is going to have to wait--that's more or less the definition of serialization. I don't see how the command waiting is any less efficient than make doing basically the same thing. This is all assuming that by serialization you mean ONLY that the output from each command will be grouped together, without interspersing any other command's output. If you mean something more, such as that the output of the commands appears in some deterministic fashion (for example, given the rule a: b c d that the output of the command to build b would always come before c and that would always come before d) then that's much more difficult, and not what I was suggesting. -- Paul D. Smith psm...@gnu.org Find some GNU make tips at: http://www.gnu.org http://make.mad-scientist.net Please remain calm...I may be mad, but I am a professional. --Mad Scientist
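To make the distinction in the last paragraph concrete, here is a rough Python sketch of the stronger, deterministic variant: commands run concurrently, but their buffered output is returned in the order the commands were listed, not the order they finished. The name run_in_order and the jobs parameter are invented for this illustration, and it ignores make's dependency graph entirely; it only shows why the first command's output holds back everything behind it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_order(commands, jobs=4):
    """Run commands concurrently; return their outputs in listing order.

    Hypothetical helper: `commands` is a list of argv lists, `jobs` plays
    the role of make's -j.
    """
    def run(argv):
        done = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT)
        return done.stdout

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() yields results in the order of `commands`, not in
        # completion order, so a slow first job delays all later output.
        return list(pool.map(run, commands))
```

The cost is visible in the comment: nothing from c or d can be shown until b is done, even if they finished long before it, so the output stops being a live view of the build.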
Re: [RFC]serialize the output of parallel make?
This seems like quite an extreme example. stdout is line buffered by default, on half-way decent systems - and even then, I'm not sure, it might be limited to when writing to a TTY. I use make -j 4 to build and test gcc, the situation above is very common. Then it means you're getting a lot of diagnostics written to stderr, and you should probably look into why you're getting so many. While I can agree as concerns output while building gcc (at least after the first pass, so you're using a half-way decent compiler), I know we've had this problem with a certain compiler a certain consortium insists on using (so we have to use to get our product on their platform). That compiler produces megabytes of warnings per build, mostly spurious (including - joy - some warnings against code that isn't actually present in the source; it's actually warning against the code *it generated* to implement bit-field initializers in C++ constructors <slap forehead=own />) or, at the very least, uninteresting - but enough of them need to be checked up (and some of them are Important, albeit only on that platform, due to deficiencies of its tool-chain) that we need to have a parser (because we sure aren't going to read those megabytes ourselves) check the output and tell us the interesting bits. We're running GNU make under cygwin to drive this monstrosity (of course, it's a windows-only development environment - the sort of consortium that's going to impose such brain-dead-ness on ISVs isn't likely to believe in other platforms): and when we turned on -j2 (because we'd finally moved the painfully slow auto-builder to a machine with enough cores to make that useful; we can now use -j4 but our first test-builds were cautious) our parser promptly broke because we got interleaved output. My previously-posted hack was the work-around for that.
While large amounts of diagnostics from a sensible compiler like gcc are indeed a good reason to *fix your damn code* so that interleaving of diagnostics isn't an issue (and this is exactly what we've done on platforms where we use gcc), there are (lamentably) times when one is obliged to use less sensible software to earn an honest crust. If that spams you with diagnostics, you need to be able to at least ensure they reach you in a clean enough form to support automatic sorting of the wheat from the chaff. Of course, that doesn't mean that the serialization *has* to be done by make - I *can* hack round the problem and I'm pleased to hear there are other tools to help - but it remains that this problem is likely to be prevalent among users of make -j, which *is* a case for make to provide built-in support for it, if it's practical and someone is willing to do the work to implement it, Eddy.
Fwd: [RFC]serialize the output of parallel make?
-- Forwarded message -- From: Chiheng Xu chiheng...@gmail.com Date: Fri, Jul 30, 2010 at 6:02 PM Subject: Re: [RFC]serialize the output of parallel make? To: Tim Murphy tnmur...@gmail.com On Fri, Jul 30, 2010 at 5:54 PM, Tim Murphy tnmur...@gmail.com wrote: Hi, The cost to the system of even starting a shell to invoke a recipe is so huge compared to the time needed to reserve a semaphore that it is insignificant in comparison. The amount of contention is limited by -j, i.e. by how many processes there are (2 * CPUs is usually considered reasonable), and by how long the lock is held for, which is basically about how long GNU make takes to read the output from the process that currently has the lock. Since modern computers do not have 1000s of CPUs the degree of contention is not high, and most of the cost of contention is something you pay for no matter what method you use to descramble stuff. Our experience indicates that it performs very well. If my guess is not wrong, the semaphore safeguards the consistency of output of one command, not the order of commands. -- Chiheng Xu Wuhan,China
Re: Fwd: [RFC]serialize the output of parallel make?
Chiheng Xu wrote: -- Forwarded message -- From: Chiheng Xu chiheng...@gmail.com Date: Fri, Jul 30, 2010 at 6:02 PM Subject: Re: [RFC]serialize the output of parallel make? To: Tim Murphy tnmur...@gmail.com On Fri, Jul 30, 2010 at 5:54 PM, Tim Murphy tnmur...@gmail.com wrote: Hi, The cost to the system of even starting a shell to invoke a recipe is so huge compared to the time needed to reserve a semaphore that it is insignificant in comparison. The amount of contention is limited by -j i.e. by how many processes there are (2 * CPUs is usually considered reasonable) and by how long 2x is too much. 1.5x has been the best in my experience, any more than that and you're losing too much CPU to scheduling overhead instead of real work. Any less and you're giving up too much in idle or I/O time. -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
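The two rules of thumb quoted here (2 x CPUs versus 1.5 x CPUs) are easy to turn into a starting value for -j. The sketch below is only arithmetic; suggested_jobs is a made-up name, rounding up is my own choice, and neither factor is a measured optimum for any particular machine.

```python
import math
import os


def suggested_jobs(cpus=None, factor=1.5):
    """Return a starting -j value as factor * CPU count, rounded up.

    Hypothetical helper; factor=1.5 follows Howard's rule of thumb,
    factor=2.0 follows the figure Tim mentions.
    """
    cpus = cpus if cpus is not None else (os.cpu_count() or 1)
    return max(1, math.ceil(cpus * factor))
```

On a 4-core machine this gives 6 with the default factor and 8 with factor=2.0; either is a place to start measuring from, not an answer.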
[RFC]serialize the output of parallel make?
As parallel make is becoming more and more popular, can make serialize the output of parallel make? Make can redirect the output of every shell it issues in parallel to a temporary file, and then print the stored output serially, as if in a serial make. -- Chiheng Xu Wuhan,China
Re: [RFC]serialize the output of parallel make?
On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote: As parallel make are becoming more and more popular, can make serialize the output of parallel make? Make can redirect every parallelly issued shell's output to an temporary file, and output the stored output serially, as if in a serial make. This would be a good thing, but as always the details are not quite so trivial. We have to ensure that these temporary files are cleaned up properly, even in the face of users ^C'ing their make invocations. We also need to verify that whatever methods we use will work properly on Windows and VMS and other operating systems make supports (where are their /tmp equivalents?) And, what about stdout vs. stderr? Should we write both to the same file? Then we lose the ability to do things like make -j4 2>/dev/null since all output will be written to stdout (presumably). Or should we keep two files per command, one for stdout and one for stderr? But that's even worse since then when we printed it we'd have to print all the stdout first then all the stderr, which could lose important context. Then there's the possibility that some commands will behave differently if they are writing to a TTY than they will if they're writing to a file. Do we not care about that, or do we try to do something crazy with PTYs or similar (ugh!) And of course we have to have a guaranteed unique naming strategy in case multiple instances of make are running on the same system at the same time, maybe running the same makefiles and even building the same targets. On POSIX systems we can use tmpfile() or mkstemp() or something but other systems might need other measures. These are just some things I thought of off the top of my head. It certainly does not mean that it would not be a good thing to have this ability though. -- Paul D. Smith psm...@gnu.org Find some GNU make tips at: http://www.gnu.org http://make.mad-scientist.net Please remain calm...I may be mad, but I am a professional.
--Mad Scientist
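Some of the concerns listed above (cleanup after ^C, unique names, one file or two) can be illustrated with a small Python sketch. This is not how make does or would do it; run_buffered is a hypothetical name, and the sketch takes the "one file for both streams" option, so it has exactly the drawback mentioned: the caller can no longer discard stderr separately. It relies on tempfile.TemporaryFile, which on POSIX leaves no directory entry behind, so there is nothing to clean up or to collide with if the process is killed.

```python
import subprocess
import sys
import tempfile


def run_buffered(argv, out=None):
    """Run argv with stdout and stderr sent to one anonymous temp file,
    then copy the file to `out` once the command has finished.

    Hypothetical helper for illustration only.
    """
    out = out if out is not None else sys.stdout.buffer
    # The file has no visible name on POSIX and vanishes when closed,
    # which sidesteps both the naming and the ^C cleanup questions.
    with tempfile.TemporaryFile() as log:
        # One file for both streams keeps diagnostics in context, at the
        # price of losing the stdout/stderr distinction.
        rc = subprocess.call(argv, stdout=log, stderr=subprocess.STDOUT)
        log.seek(0)
        out.write(log.read())
        out.flush()
    return rc
```

The TTY question is untouched here: a command that checks whether its output is a terminal will see a regular file and may behave differently.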
Re: [RFC]serialize the output of parallel make?
Paul Smith wrote: On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote: As parallel make are becoming more and more popular, can make serialize the output of parallel make? Make can redirect every parallelly issued shell's output to an temporary file, and output the stored output serially, as if in a serial make. This would be a good thing, but as always the details are not quite so trivial. Aside from the difficulties outlined below, I just am not fond of having output batched up instead of appearing in realtime. It tends to complicate the polling logic too (though I guess in this case, you just have to cat the appropriate file[s] whenever a child process ends.) The scheme that I grew up with on Alliant Concentrix was just to prefix each output line with its job number |xx|blah blah blah. It obviously requires a pipe for each child process' output, so that lines can be read by the parent make and then the prefix attached. In the original jobserver prototype I used unique bytes for each job token so that the token == the job ID, with an eye toward adding this support later. But that obviously limited it to supporting only 256 concurrent jobs, and these days we already get complaints that it's limited to only 4096. Using a pipe per job would likewise cut make's maximum job count in half (or worse, if using a separate stderr pipe). I still favor this latter approach because it keeps the output flowing in realtime, and it's easy enough to use grep if you want to zero in on a single output stream. But the cost in resources will add up... We have to ensure that these temporary files are cleaned up properly, even in the face of users ^C'ing their make invocations. We also need to verify that whatever methods we use will work properly on Windows and VMS and other operating systems make supports (where are their /tmp equivalents?) And, what about stdout vs. stderr? Should we write both to the same file?
Then we lose the ability to do things like make -j4 2>/dev/null since all output will be written to stdout (presumably). Or should we keep two files per command, one for stdout and one for stderr? But that's even worse since then when we printed it we'd have to print all the stdout first then all the stderr, which could lose important context. Then there's the possibility that some commands will behave differently if they are writing to a TTY than they will if they're writing to a file. Do we not care about that, or do we try to do something crazy with PTYs or similar (ugh!) And of course we have to have a guaranteed unique naming strategy in case multiple instances of make are running on the same system at the same time, maybe running the same makefiles and even building the same targets. On POSIX systems we can use tmpfile() or mkstemp() or something but other systems might need other measures. These are just some things I thought of off the top of my head. It certainly does not mean that it would not be a good thing to have this ability though. -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
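The per-line prefix scheme Howard describes can be sketched outside make with one pipe per child and a select-style loop in the parent. The Python below is only an illustration under my own assumptions: run_prefixed is a made-up name, the |NN| format imitates the |xx| prefix from the message, and stderr is folded into the same pipe to keep it to one descriptor per job.

```python
import os
import selectors
import subprocess
import sys


def run_prefixed(commands, out=None):
    """Run all commands at once; write each output line as |NN|line.

    Hypothetical helper: job numbers are simply positions in `commands`.
    """
    out = out if out is not None else sys.stdout.buffer
    sel = selectors.DefaultSelector()
    procs, pending = [], {}
    for job, argv in enumerate(commands, start=1):
        p = subprocess.Popen(argv, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
        procs.append(p)
        pending[job] = b""  # partial line not yet terminated by a newline
        sel.register(p.stdout.fileno(), selectors.EVENT_READ, (job, p.stdout))
    while sel.get_map():
        for key, _ in sel.select():
            job, pipe = key.data
            chunk = os.read(key.fd, 4096)
            if chunk:
                # Keep any trailing partial line for the next read.
                *lines, pending[job] = (pending[job] + chunk).split(b"\n")
            else:
                # EOF: flush whatever is left and stop watching this pipe.
                sel.unregister(key.fd)
                pipe.close()
                lines = [pending[job]] if pending[job] else []
                pending[job] = b""
            for line in lines:
                out.write(b"|%02d|" % job + line + b"\n")
    sel.close()
    out.flush()
    return [p.wait() for p in procs]
```

Output stays in real time and a single stream can be picked out with something like grep '^|02|', but the resource cost raised in the message is visible: one descriptor per running job, plus a parent that must keep draining the pipes or the children stall in write().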