Re: [RFC]serialize the output of parallel make?

2010-08-04 Thread Eric Melski


Hambridge, Philip J (ODP) wrote:

I've not been following this thread too closely, but this may be
relevant:
http://www.cmcrossroads.com/ask-mr-make/12909-descrambling-parallel-build-logs



I wrote the linked article; for those not interested in reading it, the 
strategy described there is the same as that used by the talon shell 
wrapper described elsewhere in this thread.


Best regards,

Eric Melski
Architect
Electric Cloud, Inc.
http://blog.electric-cloud.com/




___
Bug-make mailing list
Bug-make@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Tim Murphy

Hi,

Since some things happen at the same time there is no single serial
order.  The semaphore mechanism forces one of the possible orders.

I forgot to say that for recipes with multiple commands you need to
either use the new .ONESHELL target or do this kind of thing:

mytarget:
-command1 && \
-command2 && \
-command3

This causes them to be executed in a single shell invocation for which
the output can be gathered together

(I am using - to indicate TAB)

With .ONESHELL, as I understand it, you would not need the '\'
characters to escape the end-of-line:

mytarget:
-command1 &&
-command2 &&
-command3


Note that I'm using bash syntax here.  On windows if you want to use
cmd.exe then good luck - I don't think it's really fit for purpose.

Regards,

Tim

On 3 August 2010 02:11, Chiheng Xu chiheng...@gmail.com wrote:
 On Mon, Aug 2, 2010 at 4:22 PM, Edward Welbourne e...@opera.com wrote:
 If my guess is not wrong, the semaphore safeguard the consistency of
 output of one command, not the order of commands.

 well, with -j, commands are being run concurrently, so there *isn't* a
 strict ordering of commands to safeguard, although output shall be
 delivered in roughly the order of completion of commands, with only
 minor disturbances.

 Still, if target A is a prerequisite of B, the recipe to make A is
 run, and must complete, before the recipe to make B will be initiated;
 since the recipe for A ends with whatever is ensuring its output comes
 out as an atom, A's output is produced before B's recipe is initiated,
 so you can be sure they appear in the right order.  So the only
 ordering property among commands that actually matters *is* preserved.


 This is not my ideal solution.

 My idea is to preserve the order of output of parallel make as if it
 is a serial make.

 Modern CPU can issue multiple instructions simultaneously, but
 preserve the order of commit to program order. So the instruction
 level parallelism of CPU is transparent to programmer.

 What I want is transparent parallel make.   Make can issue multiple
 shells simultaneously, but print their outputs in the same order as in
 a serial make.




 --
 Chiheng Xu
 Wuhan,China





-- 
You could help some brave and decent people to have access to
uncensored news by making a donation at:

http://www.thezimbabwean.co.uk/


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Chiheng Xu

On Tue, Aug 3, 2010 at 2:51 PM, Tim Murphy tnmur...@gmail.com wrote:

 Since some things happen at the same time there is no single serial
 order.  The semaphore mechanism, forces one of the possible orders.


I'm not familiar with source code of make, but I believe the serial
order of shells is determined by the dependence DAG,  it may be
unique for a given dependence DAG.

Shells can be issued and completed at random order(only need
satisfying the dependence relation). But make can print their outputs
strictly in their serial order.
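
To make the claim concrete, here is a minimal sketch in Python (not make's source code) of how a serial order could be derived from a dependency graph: a depth-first walk that visits prerequisites left to right and places each target after its prerequisites. The `deps` mapping and target names are hypothetical, and the sketch assumes a single make process with no recursive submakes.

```python
def serial_order(deps, goal):
    """Return targets in the order a serial, depth-first build would run them."""
    order, seen = [], set()

    def visit(target):
        if target in seen:
            return
        seen.add(target)
        for prereq in deps.get(target, []):  # prerequisites, left to right
            visit(prereq)
        order.append(target)  # a target's recipe runs after its prerequisites

    visit(goal)
    return order


# Hypothetical graph: app needs a.o and b.o; both need common.h
deps = {"app": ["a.o", "b.o"], "a.o": ["common.h"], "b.o": ["common.h"]}
print(serial_order(deps, "app"))
```

For a fixed graph and a fixed prerequisite order the walk is deterministic, which is the sense in which the serial order "may be unique".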


-- 
Chiheng Xu
Wuhan,China


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Howard Chu


Chiheng Xu wrote:

On Tue, Aug 3, 2010 at 2:51 PM, Tim Murphytnmur...@gmail.com  wrote:


Since some things happen at the same time there is no single serial
order.  The semaphore mechanism, forces one of the possible orders.



I'm not familiar with source code of make, but I believe the serial
order of shells is determined by the dependence DAG,  it may be
unique for a given dependence DAG.

Shells can be issued and completed at random order(only need
satisfying the dependence relation). But make can print their outputs
strictly in their serial order.


I'm trying very hard to only provide constructive comments in response to this 
thread, but frankly this is, in a word, stupid.


If you want make's output to be in serial order, then don't use parallel make 
at all. The point to parallel make is that it allows jobs which have no 
ordering dependency to run in parallel. If you want their output to be fully 
serialized, then you will force make to wait for them to complete serially. 
Which automatically also means that make will have to maintain an arbitrarily 
large internal queue for all of the output, because given the unpredictable 
completion times of multiple jobs running concurrently, no output can be 
emitted until the slowest parallel job completes. In particular, if you have 
recursive makefiles, no parent make process can output anything at all until 
all of its submakes have completed, because no individual make process has 
enough knowledge about what the actual serial order is.
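
The queueing cost described above can be sketched as follows. This is an illustrative Python model, not anything make implements: jobs finish in arbitrary order, and output is released only when every job earlier in the serial order has been released. The job names and outputs are made up for the example.

```python
def emit_in_serial_order(serial_order, completions):
    """Replay job completions, emitting output only in serial order.

    completions: list of (job, output) in the order the jobs actually finished.
    Returns (emitted, peak), where peak is the most outputs held at once.
    """
    pending = {}  # finished jobs whose output cannot be printed yet
    emitted = []
    next_index = 0
    peak = 0
    for job, output in completions:
        pending[job] = output
        peak = max(peak, len(pending))
        # Flush every job that is now next in the serial order.
        while next_index < len(serial_order) and serial_order[next_index] in pending:
            emitted.append(pending.pop(serial_order[next_index]))
            next_index += 1
    return emitted, peak


serial = ["a", "b", "c", "d"]
# Worst case: the job that is first in serial order finishes last.
finished = [("d", "out-d"), ("c", "out-c"), ("b", "out-b"), ("a", "out-a")]
emitted, peak = emit_in_serial_order(serial, finished)
```

In this worst case nothing can be printed until the first job in serial order completes, so all four outputs are held in memory at once.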


Given that this discussion seems to have arisen due to the braindead stdio 
handling in Cygwin, it seems like any de-mangling of parallel make's output 
should be directed to the Cygwin libraries. In my experience Cygwin is too 
slow an environment to be useful anyway, which is why I use MSYS for Windows 
builds. But I have to admit, I only use it inside a single-core VirtualBox 
these days so I haven't looked at how parallel make behaves there. But the 
fact is all I/O in Cygwin is funneled through the Cygwin DLL, so there's no 
reason that it can't be fixed to not mingle/mangle lines from different 
processes together. But again, that's not gnu-make's problem, that's a Cygwin 
issue.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Eli Zaretskii

 Date: Tue, 3 Aug 2010 07:51:22 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: e...@opera.com, bug-make@gnu.org

 mytarget:
 -command1 &&
 -command2 &&
 -command3

 Note that I'm using bash syntax here.  On windows if you want to use
 cmd.exe then good luck - I don't think it's really fit for purpose.

cmd.exe supports the same `command1 && command2' semantics as does
Bash, so there's no problem here and no need for any ``luck''.


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Eric Melski


Chiheng Xu wrote:

What I want is transparent parallel make.   Make can issue multiple
shells simultaneously, but print their outputs in the same order as in
a serial make.


ElectricAccelerator is a gmake replacement that does exactly this.  I 
wrote about this feature a while back:


http://blog.electric-cloud.com/2008/12/01/untangling-parallel-build-logs/

You can read more about Accelerator on the blog, or here:

http://www.electric-cloud.com/products/electricaccelerator.php

Eric Melski
Architect
Electric Cloud, Inc.
http://blog.electric-cloud.com/




Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-03 Thread Chiheng Xu

On Wed, Aug 4, 2010 at 10:41 AM, Eric Melski s...@melski.net wrote:
 ElectricAccelerator is a gmake replacement that does exactly this.  I wrote
 about this feature a while back:

 http://blog.electric-cloud.com/2008/12/01/untangling-parallel-build-logs/

 You can read more about Accelerator on the blog, or here:

 http://www.electric-cloud.com/products/electricaccelerator.php

 Eric Melski
 Architect
 Electric Cloud, Inc.
 http://blog.electric-cloud.com/


Excellent !


-- 
Chiheng Xu
Wuhan,China


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-02 Thread Howard Chu


Edward Welbourne wrote:

2x is too much. 1.5x has been the best in my experience, any more than that
and you're losing too much CPU to scheduling overhead instead of real work.
Any less and you're giving up too much in idle or I/O time.


This depends a bit on whether you're using icecc or some similar
distributed compilation system.  I believe a better approach is to set
a generous -j, such as twice the count of CPUs, but impose a load
limit using -l, tuned rather more carefully.  Scheduling overhead
contributes to load, so is taken into account this way.


Perhaps in a perfect world -l would be useful. In fact, since load averages 
are calculated so slowly, by the time your -l limit is reached the actual CPU 
load will have blown past it and your machine will be thrashing. That's the 
entire reason I came up with the -j implementation in the first place.
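
The lag can be illustrated with a small Python model. It assumes a Linux-style one-minute load average, that is, an exponentially damped average of the runnable-process count sampled every five seconds; other systems compute the figure differently, and the job count of 8 is just an example.

```python
import math


def load_average(samples, window=60.0, interval=5.0):
    """Exponentially damped average of runnable-process counts.

    samples: number of runnable processes at each sampling tick.
    Returns the load average after each tick.
    """
    decay = math.exp(-interval / window)
    load = 0.0
    history = []
    for runnable in samples:
        load = load * decay + runnable * (1.0 - decay)
        history.append(load)
    return history


# 8 jobs start at once on an idle machine and keep running for a minute.
history = load_average([8] * 12)
print(round(history[1], 2))   # reported load 10 seconds in
print(round(history[-1], 2))  # reported load 60 seconds in
```

Under these assumptions the reported load is still well below 2 ten seconds after eight jobs have started, so a limit such as -l 4 would keep admitting new jobs during that time.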


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-02 Thread Edward Welbourne

 If my guess is not wrong, the semaphore safeguard the consistency of
 output of one command, not the order of commands.

well, with -j, commands are being run concurrently, so there *isn't* a
strict ordering of commands to safeguard, although output shall be
delivered in roughly the order of completion of commands, with only
minor disturbances.

Still, if target A is a prerequisite of B, the recipe to make A is
run, and must complete, before the recipe to make B will be initiated;
since the recipe for A ends with whatever is ensuring its output comes
out as an atom, A's output is produced before B's recipe is initiated,
so you can be sure they appear in the right order.  So the only
ordering property among commands that actually matters *is* preserved.

Eddy.


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-02 Thread Edward Welbourne

 2x is too much. 1.5x has been the best in my experience, any more than that 
 and you're losing too much CPU to scheduling overhead instead of real work. 
 Any less and you're giving up too much in idle or I/O time.

This depends a bit on whether you're using icecc or some similar
distributed compilation system.  I believe a better approach is to set
a generous -j, such as twice the count of CPUs, but impose a load
limit using -l, tuned rather more carefully.  Scheduling overhead
contributes to load, so is taken into account this way.

Eddy.


Re: Fwd: [RFC]serialize the output of parallel make?

2010-08-02 Thread Chiheng Xu

On Mon, Aug 2, 2010 at 4:22 PM, Edward Welbourne e...@opera.com wrote:
 If my guess is not wrong, the semaphore safeguard the consistency of
 output of one command, not the order of commands.

 well, with -j, commands are being run concurrently, so there *isn't* a
 strict ordering of commands to safeguard, although output shall be
 delivered in roughly the order of completion of commands, with only
 minor disturbances.

 Still, if target A is a prerequisite of B, the recipe to make A is
 run, and must complete, before the recipe to make B will be initiated;
 since the recipe for A ends with whatever is ensuring its output comes
 out as an atom, A's output is produced before B's recipe is initiated,
 so you can be sure they appear in the right order.  So the only
 ordering property among commands that actually matters *is* preserved.


This is not my ideal solution.

My idea is to preserve the order of output of parallel make as if it
is a serial make.

Modern CPU can issue multiple instructions simultaneously, but
preserve the order of commit to program order. So the instruction
level parallelism of CPU is transparent to programmer.

What I want is transparent parallel make.   Make can issue multiple
shells simultaneously, but print their outputs in the same order as in
a serial make.




-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 1:26 PM, Paul Smith psm...@gnu.org wrote:
 On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote:
 As parallel make are becoming more and more popular,  can make
 serialize the output of parallel make?

 Make can redirect every parallelly issued shell's output to an
 temporary file,  and output the stored output serially, as if in a
 serial make.

 This would be a good thing, but as always the details are not quite so
 trivial.

 We have to ensure that these temporary files are cleaned up properly,
 even in the face of users ^C'ing their make invocations.  We also need
 to verify that whatever methods we use will work properly on Windows and
 VMS and other operating systems make supports (where are their /tmp
 equivalents?)


My suggestion is that you can implement it as an optional command line
option(like -j), and on one or two primary platforms(Linux/Windows),
instead of on all platforms.


 And, what about stdout vs. stderr?  Should we write both to the same
 file?  Then we lose the ability to do things like make -j4 2>/dev/null
 since all output will be written to stdout (presumably).  Or should we
 keep two files per command, one for stdout and one for stderr?  But
 that's even worse since then when we printed it we'd have to print all
 the stdout first then all the stderr, which could lose important
 context.



The scenario like make -j4 2>/dev/null may be very rare, but
scenario like make -j4 2>&1 | tee output.txt may be common.


I think using make to parallelly build or test large software on
multicore system is by far the most normal situation. User of make
most need is : 1. utilize the computing power of multicore system
using parallel make(-j) ; 2. easily analyze the output of the build or
test, to precisely find the problem.

When user of make provide the optional output-serializing command line
option, he known what he need, he is responsible for the mess
situation.




 Then there's the possibility that some commands will behave differently
 if they are writing to a TTY, then they will if they're writing to a
 file.  Do we not care about that, or do we try to do something crazy
 with PTYs or similar (ugh!)

 And of course we have to have a guaranteed unique naming strategy in
 case multiple instances of make are running on the same system at the
 same time, maybe running the same makefiles and even building the same
 targets.  On POSIX systems we can use tmpfile() or mkstemp() or
 something but other systems might need other measures.

 These are just some things I thought of off the top of my head.

 It certainly does not mean that it would not be a good thing to have
 this ability though.

 --
 ---
  Paul D. Smith psm...@gnu.org          Find some GNU make tips at:
  http://www.gnu.org                      http://make.mad-scientist.net
  Please remain calm...I may be mad, but I am a professional. --Mad Scientist





-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Philip Guenther

On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
...
 My suggestion is that you can implement it as an optional command line
 option(like -j), and on one or two primary platforms(Linux/Windows),
 instead of on all platforms.

So, the complexity of both possibilities.  You're writing, debugging,
and contributing this code yourself?

(I would hope that this wouldn't require any Linux-specific code;
perhaps you meant POSIX & Windows?)


...
 The scenario like make -j4 2>/dev/null may be very rare, but
 scenario like make -j4 2>&1 | tee output.txt may be common.

And what, exactly, are you suggesting that make do to reflect that
guess about usage patterns?


Philip Guenther


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 3:45 PM, Philip Guenther guent...@gmail.com wrote:
 On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
 ...
 My suggestion is that you can implement it as an optional command line
 option(like -j), and on one or two primary platforms(Linux/Windows),
 instead of on all platforms.

 So, the complexity of both possibilities.  You're writing, debugging,
 and contributing this code yourself?


Nop, I'm not maintainer of make,  just a user :)  .


 (I would hope that this wouldn't require any Linux-specific code;
 perhaps you meant POSIX & Windows?)


Yes.


 ...
 The scenario like make -j4 2>/dev/null may be very rare, but
 scenario like make -j4 2>&1 | tee output.txt may be common.

 And what, exactly, are you suggesting that make do to reflect that
 guess about usage patterns?


 Philip Guenther


I mean, normal user of make does not differentiate stdout or stderr
very seriously, they see them both as output. They want serialized
output,  whether or not it's stdout or stderr.



-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Tim Murphy

Hi,

Serialisation can be done by external programs - no real need for make to do it.

My project does it already with the talon shell wrapper that we use in
our build system here:

http://developer.symbian.org/oss/MCL/sftools/dev/build/file/96fee2635b19/sbsv2/raptor/util/talon

You set talon as the shell for make and talon in turn runs whatever
the actual shell is but adds the serialisation. there are a lot of
other nice things you can do with this - e.g. measuring the execution
time of every build step so that you can see what affects performance.

Regards,

Tim

On 30 July 2010 09:28, Chiheng Xu chiheng...@gmail.com wrote:
On Fri, Jul 30, 2010 at 3:45 PM, Philip Guenther guent...@gmail.com wrote:
On Thu, Jul 29, 2010 at 11:29 PM, Chiheng Xu chiheng...@gmail.com wrote:
...
My suggestion is that you can implement it as an optional command line
option(like -j), and on one or two primary platforms(Linux/Windows),
instead of on all platforms.

So, the complexity of both possibilities. You're writing, debugging,
and contributing this code yourself?

Nop, I'm not maintainer of make, just a user :) .

(I would hope that this wouldn't require any Linux-specific code;
perhaps you meant POSIX & Windows?)

Yes.

...
The scenario like make -j4 2>/dev/null may be very rare, but
scenario like make -j4 2>&1 | tee output.txt may be common.

And what, exactly, are you suggesting that make do to reflect that
guess about usage patterns?

Philip Guenther

I mean, normal user of make does not differentiate stdout or stderr
very seriously, they see them both as output. They want serialized
output, whether or not it's stdout or stderr.

--
Chiheng Xu
Wuhan,China


--
You could help some brave and decent people to have access to
uncensored news by making a donation at:

http://www.thezimbabwean.co.uk/


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Eli Zaretskii

 From: Paul Smith psm...@gnu.org
 Date: Fri, 30 Jul 2010 01:26:46 -0400
 Cc: bug-make@gnu.org

 We have to ensure that these temporary files are cleaned up properly,
 even in the face of users ^C'ing their make invocations.  We also need
 to verify that whatever methods we use will work properly on Windows and
 VMS and other operating systems make supports (where are their /tmp
 equivalents?)

For the latter, we could use P_tmpdir, which is quite portable.
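
As a rough sketch of the temporary-file approach under discussion (in Python rather than make's C, so the function names differ from anything make would use), the following captures a command's stdout and stderr together in a uniquely named file in the system temporary directory and removes the file afterwards. `run_captured` and the `make-out-` prefix are hypothetical names chosen for the example.

```python
import os
import subprocess
import sys
import tempfile


def run_captured(argv):
    """Run argv with stdout and stderr captured together in a temp file."""
    # mkstemp creates a uniquely named file atomically in the temp directory.
    fd, path = tempfile.mkstemp(prefix="make-out-")
    try:
        with os.fdopen(fd, "w+b") as capture:
            status = subprocess.call(argv, stdout=capture,
                                     stderr=subprocess.STDOUT)
            capture.seek(0)
            return status, capture.read()
    finally:
        # Runs on normal exit and on KeyboardInterrupt (^C).
        os.unlink(path)


status, output = run_captured([sys.executable, "-c", "print('hello')"])
```

The cleanup here covers ordinary exits and ^C, but not a process killed outright, which is part of why the cleanup question is less trivial than it first appears.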

This is not to say that I think the original idea is easy to
implement.

Actually, it is not entirely clear to me what is meant by
serialization in this context.  Commands are invoked by Make in
parallel, and the output is produced in the order of their execution
(modulo buffering issues).  What would serialization look like in this
context?  Can someone show a simple example?


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 4:40 PM, Tim Murphy tnmur...@gmail.com wrote:
 Hi,



 You set talon as the shell for make and talon in turn runs whatever
 the actual shell is but adds the serialisation.  there are a lot of
 other nice things you can do with this - e.g. measuring the execution
 time of every build step so that you can see what affects performance.

Sorry, I don't understand this.

You probably misunderstand the meaning of serialization of output
and parallel make.




-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 4:53 PM, Eli Zaretskii e...@gnu.org wrote:
 This is not to say that I think the original idea is easy to
 implement.

 Actually, it is not entirely clear to me what is meant by
 serialization in this context.  Commands are invoked by Make in
 parallel, and the output is produced in the order of their execution
 (modulo buffering issues).  What would serialization look like in this
 context?  Can someone show a simple example?


Parallelly invoked shells(and commands the shells invoke) may print to
output randomly, render the output messy.



-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Tim Murphy

On 30 July 2010 09:55, Chiheng Xu chiheng...@gmail.com wrote:
 On Fri, Jul 30, 2010 at 4:40 PM, Tim Murphy tnmur...@gmail.com wrote:
 Hi,



 You set talon as the shell for make and talon in turn runs whatever
 the actual shell is but adds the serialisation.  there are a lot of
 other nice things you can do with this - e.g. measuring the execution
 time of every build step so that you can see what affects performance.

 Sorry, I don't understand this.

 You probably misunderstand the meaning of serialization of output
 and parallel make.

No, I understand better than you think as I do huge parallel builds every day.

The shell wrapper buffers the recipe output and then grabs a semaphore
before writing the output to its stdout.  If another recipe has
completed and is in the process of outputting to the stdout then it
has to wait a few microseconds.
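
The buffer-then-lock pattern can be sketched in a few lines of Python. This is not talon's code: it substitutes an advisory file lock (`fcntl.flock`, POSIX only) for the semaphore, and `run_and_emit` and `lock_path` are names invented for the illustration.

```python
import fcntl
import subprocess
import sys


def run_and_emit(command, lock_path, out=sys.stdout):
    """Run a shell command, then write all of its output as one locked block."""
    # The recipe runs to completion first; its output is held in memory.
    result = subprocess.run(command, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # wait here if another recipe is printing
        try:
            out.write(result.stdout)
            out.flush()
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return result.returncode
```

The lock is held only while the buffered text is written, so the recipes themselves still run fully in parallel; only the final writes take turns.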

This system works reliably and has given us de-scrambled output for
the last 2 years.

Regards,

Tim



You could help some brave and decent people to have access to
uncensored news by making a donation at:

http://www.thezimbabwean.co.uk/


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 5:27 PM, Tim Murphy tnmur...@gmail.com wrote:

 No, I understand better than you think as I do huge parallel builds every day.

 The shell wrapper buffers the recipe output and then grabs a semaphore
 before writing the output to its stdout.  If another recipe has
 completed and is in the process of outputting to the stdout then it
 has to wait a few microseconds.

The use of semaphore may impair performance.



-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Edward Welbourne

 Make can redirect every parallelly issued shell's output to an
 temporary file,  and output the stored output serially, as if in a
 serial make.

+1 for wanting this as a make feature.
We have a hack in some of our makefiles to implement essentially exactly
the above.  For reference, here's the essence of a kludge known to
work:

EACHOUT = $@.out
ALLOUT  = $(GENROOT)/build_log.out
LOGEACH = ; flock $(ALLOUT) \
sh -c 'echo $1 $) | cat - $(EACHOUT) >> $(ALLOUT)'; \
$(RM) $(EACHOUT)

Get every command whose output you care about to redirect to
$(EACHOUT); at the end of each recipe that's doing any of that (in
many cases this'll be right after the redirect, but some command-lines
may be fiddlier) append $(LOGEACH).  That's assuming you combine
stdout and stderr; if not, you need to do similar for EACHERR and
ALLERR.  For bonus points, add an || failed=yes to each command in
each affected recipe and a ; [ -z $$failed ] to the end of LOGEACH,
so that make knows which recipes fail.

 And, what about stdout vs. stderr?
Yup, there are endless problems with this.
On the other hand, there are situations that really do beg for it.

In particular, automated build systems are an important part of many
development methodologies; obviously, using -j in them is desirable;
and getting coherent logs out of them, in which each compilation
unit's errors and warnings appear as a single block, is crucial.
These typically don't even bother with tee, they just > output.txt 2>&1
directly.

The results need to be tidy, because they are typically parsed by
something that converts them to HTML, needs to make the errors and
warnings stand out and may also want to compare warnings against
those from earlier builds in order to flag them as regressions.  Such
parsing gets severely messed up when output from several commands gets
interleaved.

 The scenario like make -j4 2>/dev/null may be very rare, but
 scenario like make -j4 2>&1 | tee output.txt may be common.
I concur.
Those discarding make's output probably discard all of it, not
selectively stderr.

 When user of make provide the optional output-serializing command line
 option, he known what he need, he is responsible for the mess
 situation.

Making this an optional feature makes sense, particularly given the
myriad complications Paul points out.  I suggest options along the
following lines: <sketch>

 --demux-stdout[=filename]
 --demux-stderr[=filename]
 --demux-outerr[=filename]

When using -j, capture each recipe's standard output, standard
error or both (interleaved as by 2>&1) and emit it atomically
(i.e. not interleaved with the corresponding output of any
other recipe), optionally to filename.

(Even when -j is not used, --demux-stdout and --demux-stderr
can be used to separate the standard output and error streams
of commands.)

Note that some commands may behave differently if output is
not to a terminal (as will be the case if these flags are
used).  Some commands may emit output that only makes sense
when stdout and stderr are interleaved in the order in which
they were output.  Use at your own peril.

</sketch> Quite how implementable that is, I leave Paul to decree.

Eddy.


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Eli Zaretskii

 Date: Fri, 30 Jul 2010 17:08:55 +0800
 From: Chiheng Xu chiheng...@gmail.com
 Cc: psm...@gnu.org, bug-make@gnu.org

 On Fri, Jul 30, 2010 at 4:53 PM, Eli Zaretskii e...@gnu.org wrote:
  This is not to say that I think the original idea is easy to
  implement.

  Actually, it is not entirely clear to me what is meant by
  serialization in this context.  Commands are invoked by Make in
  parallel, and the output is produced in the order of their execution
  (modulo buffering issues).  What would serialization look like in this
  context?  Can someone show a simple example?

 Parallelly invoked shells(and commands the shells invoke) may print to
 output randomly, render the output messy.

I asked for an example.  Could you please show a messy output and
the output you'd like to have after serialization?

TIA


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskii e...@gnu.org wrote:

 I asked for an example.  Could you please show a messy output and
 the output you'd like to have after serialization?

 TIA


serially make : execute  A, B, C programs, they print:

A:  Hello, I'm A, I am from Earth.
B:  The moon is my home.
C:  Welcome to Mars, It's an amazing planet.

parallely make : the output of A, B, C programs interleave :

C:  Welcome to
B:  The moon is my
A:  Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.




-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

On Fri, Jul 30, 2010 at 6:01 PM, Howard Chu h...@highlandsun.com wrote:
 Chiheng Xu wrote:

 On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskiie...@gnu.org  wrote:

 I asked for an example.  Could you please show a messy output and
 the output you'd like to have after serialization?

 TIA


 serially make : execute  A, B, C programs, they print:

 A:  Hello, I'm A, I am from Earth.
 B:  The moon is my home.
 C:  Welcome to Mars, It's an amazing planet.

 parallely make : the output of A, B, C programs interleave :

 C:  Welcome to
 B:  The moon is my
 A:  Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.

 This seems like quite an extreme example. stdout is line buffered by
 default, so individual lines would get written atomically unless the
 programs you're running are doing weird things with their output. In the
 common case interleaving like this doesn't happen within lines, it only
 happens between lines of multi-line output. stderr may skew things since
 it's usually nonbuffered, but again, that's not the common case.


I use make -j 4 to build and test gcc, the situation above is very common.




-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Howard Chu


Chiheng Xu wrote:

On Fri, Jul 30, 2010 at 5:35 PM, Eli Zaretskiie...@gnu.org  wrote:


I asked for an example.  Could you please show a messy output and
the output you'd like to have after serialization?

TIA



serially make : execute  A, B, C programs, they print:

A:  Hello, I'm A, I am from Earth.
B:  The moon is my home.
C:  Welcome to Mars, It's an amazing planet.

parallely make : the output of A, B, C programs interleave :

C:  Welcome to
B:  The moon is my
A:  Hello, I'm A, I am from Earth.home.Mars, It's an amazing planet.


This seems like quite an extreme example. stdout is line buffered by default, 
so individual lines would get written atomically unless the programs you're 
running are doing weird things with their output. In the common case 
interleaving like this doesn't happen within lines, it only happens between 
lines of multi-line output. stderr may skew things since it's usually 
nonbuffered, but again, that's not the common case.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Eli Zaretskii

 Date: Fri, 30 Jul 2010 12:10:36 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org

 gcc -o fred fred.cpp
 perl makedef.pl -i something.def
 perl prepdef.pl  -i otherthing.def
 error: fred.cpp: syntax error on line 345
 ERROR: File not Found

 Which file was missing?  If you can't change the tool to print every
 detail then what do you do?

 a) run make --debug=j and see which command failed.

 b) run without -j and see which command failed.


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Edward Welbourne

 The shell wrapper buffers the recipe output and then grabs a semaphore
 before writing the output to its stdout.  If another recipe has
 completed and is in the process of outputting to the stdout then it
 has to wait a few microseconds.
 The use of a semaphore may impair performance.
 And the serialization you mean is not the same as what I mean.

 I believe Paul and Edward fully understand what I mean.

The approach described above is - as far as I can tell - exactly what
my kludge does, except done cleanly.  In particular, my kludge has the
same semaphore problem (it uses flock) and I'm quite sure this doesn't
impair performance - because each recipe does the thing that actually
matters (compiles the code) before getting stuck waiting for an
opportunity to deliver its output (synchronously) to the common pool.
So the only cost is that the process that ran the recipe is sitting
idle for a while after it finishes its work, before it reports its
results.  This might lose you some performance if your -j count is too
low, but I always use a generous -j limited by a suitable -l in any
case (i.e. I let make run as many processes as it finds useful, so
long as it doesn't take the load above a limit I set).
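
A minimal sketch of the buffer-then-lock idea described above, in Python rather than as a shell wrapper. It is not the talon code or the flock kludge themselves; the function name run_serialized and the lock-file path are assumptions made for illustration, and fcntl.flock limits it to POSIX systems.

```python
import fcntl
import subprocess
import sys
import tempfile


def run_serialized(cmd, lock_path, out=None):
    """Run cmd, buffer its output, then emit it in one piece under a lock."""
    out = out if out is not None else sys.stdout.buffer
    # Buffer stdout and stderr together in an anonymous temporary file,
    # so the recipe does its real work without holding the lock.
    with tempfile.TemporaryFile() as buf:
        status = subprocess.call(cmd, stdout=buf, stderr=subprocess.STDOUT)
        buf.seek(0)
        data = buf.read()
    # Only the final write is serialized: block until no other wrapper
    # holds the lock, dump everything, then release it.
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            out.write(data)
            out.flush()
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return status
```

The lock is taken only after the command has finished, which is why the waiting costs an idle process rather than compile time.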

So I think Tim Murphy understands perfectly well what we mean by
serialization - ensuring the output from each command comes out
cleanly, separate from the output of any other command - and I'm
suddenly interested in finding out more about talon ;-)

Eddy.


Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Paul Smith

On Thu, 2010-07-29 at 22:44 -0700, Howard Chu wrote:
 The scheme that I grew up with on Alliant Concentrix was just to prefix each 
 output line with its job number |xx|blah blah blah. It obviously requires a 
 pipe for each child process' output, so that lines can be read by the parent 
 make and then the prefix attached.

The resource issue is one thing for sure, but even more than that I'm
not sure that would work with make's current, single-threaded design.
Make doesn't really have any central loop where we could add a
select() or whatever to check which children had output ready to be
processed, so _where_ to add this is a big issue... if we don't read the
pipe fast enough then jobs will slow down as they hang on the write().
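
To make the shape of that work concrete, here is a rough sketch of a pipe-per-child reader with a central select-style loop, written in Python rather than in make's C. The function name run_prefixed and the |NN| formatting are assumptions for illustration only; it says nothing about how such a loop would fit into make's actual code.

```python
import os
import selectors
import subprocess
import sys


def run_prefixed(commands, out=None):
    """Run commands concurrently, tagging each output line with |job|."""
    out = out if out is not None else sys.stdout
    sel = selectors.DefaultSelector()
    procs = []
    for job, cmd in enumerate(commands, start=1):
        # One pipe per child; stderr shares it to keep the sketch short.
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        procs.append(proc)
        # Per-pipe state: [job number, bytes of an unfinished line].
        sel.register(proc.stdout, selectors.EVENT_READ, data=[job, b""])
    open_pipes = len(procs)
    while open_pipes:
        for key, _ in sel.select():
            state = key.data
            chunk = os.read(key.fd, 4096)
            if chunk:
                state[1] += chunk
                # Keep the trailing partial line for the next read.
                *lines, state[1] = state[1].split(b"\n")
            else:
                # EOF: flush a final line that lacked a newline.
                lines = [state[1]] if state[1] else []
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_pipes -= 1
            for line in lines:
                text = line.decode(errors="replace")
                out.write("|%02d|%s\n" % (state[0], text))
    sel.close()
    return [proc.wait() for proc in procs]
```

If the loop falls behind, a child blocks in write() once its pipe fills, which is the slowdown described above.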

I think asking make to do this work will simply cause your builds to
slow down a lot, unless we introduced threads and had a separate thread
doing that work.

Or, we could implement the other idea you had for more reliable
jobservers (avoiding the RESTART issue), which had make fork a process
and then had that process fork the job: in that environment there's an
extra process that can be used to manage each child's output.  Of course
this has its own drawbacks on systems with very high process creation
overhead, like Windows.

 And the serialization you mean is not the same as what I mean.
 
 I believe Paul and Edward fully understand what I mean.

I think Tim is saying the same thing: his solution will definitely work,
at least as well as having make do it.  If make did the work then it
would invoke the command with stdout/stderr redirected to a temporary
file, then when the job was complete make would read and print those
files to stdout.
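
As an illustration of the make-side variant just described, the following Python sketch starts every job with stdout/stderr redirected to a temporary file and replays each file when its job exits. The name run_grouped is hypothetical, and the polling loop is a simplification; a real implementation would wait for child exit rather than poll.

```python
import subprocess
import sys
import tempfile
import time


def run_grouped(commands, out=None):
    """Start every command at once; replay each one's output when it exits."""
    out = out if out is not None else sys.stdout.buffer
    running = {}
    for index, cmd in enumerate(commands):
        # stdout and stderr share one temporary file per job.
        buf = tempfile.TemporaryFile()
        proc = subprocess.Popen(cmd, stdout=buf, stderr=subprocess.STDOUT)
        running[index] = (proc, buf)
    codes = [None] * len(commands)
    while running:
        for index, (proc, buf) in list(running.items()):
            if proc.poll() is None:
                continue  # still running
            # Job finished: print its whole output as one block.
            buf.seek(0)
            out.write(buf.read())
            buf.close()
            codes[index] = proc.returncode
            del running[index]
        time.sleep(0.01)  # crude polling, for the sketch only
    out.flush()
    return codes
```

Output blocks appear in order of completion, so two runs of the same build may print them in a different order.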

In Tim's solution, the command that make invokes (really, the shell make
invokes to run the command) saves its OWN output to a temporary file,
then when the command is done it gets a semaphore (to ensure
serialization) and dumps all that output.

Actually I suspect that Tim's solution would be MORE efficient, because
if make is reading large output files and streaming them to stdout,
that's time it DOESN'T spend doing other, make-like things.  If you have
the command itself doing it then you get the advantage of
multi-processing involved.

I certainly don't see how it could be SLOWER; if you want to enforce
serialization then at some point, someone is going to have to
wait--that's more or less the definition of serialization.  I don't see
how the command waiting is any less efficient than make doing basically
the same thing.


This is all assuming that by serialization you mean ONLY that the output
from each command will be grouped together, without interspersing any
other command's output.  If you mean something more, such as that the
output of the commands appears in some deterministic fashion (for
example, given the rule a: b c d that the output of the command to
build b would always come before c and that would always come before
d) then that's much more difficult, and not what I was suggesting.
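
For contrast, a sketch of the stronger, deterministic variant: jobs still run in parallel, but their output is printed in the order the prerequisites were listed, so a slow first job holds back the output of everything after it. The name run_in_listed_order and the jobs parameter are assumptions for illustration; this is not something proposed for make here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_listed_order(commands, jobs=4, out=None):
    """Run commands in parallel but print their output in list order."""
    out = out if out is not None else sys.stdout.buffer

    def capture(cmd):
        done = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT)
        return done.returncode, done.stdout

    codes = []
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(capture, cmd) for cmd in commands]
        # Walk the futures in submission order, not completion order:
        # output of job N is held back until jobs 0..N-1 have printed.
        for future in futures:
            code, data = future.result()
            out.write(data)
            codes.append(code)
    out.flush()
    return codes
```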

-- 
---
 Paul D. Smith psm...@gnu.org  Find some GNU make tips at:
 http://www.gnu.org  http://make.mad-scientist.net
 Please remain calm...I may be mad, but I am a professional. --Mad Scientist



Re: [RFC]serialize the output of parallel make?

2010-07-30 Thread Edward Welbourne

 This seems like quite an extreme example. stdout is line buffered by
 default,

on half-way decent systems - and even then, I'm not sure, it might be
limited to when writing to a TTY.

 I use make -j 4 to build and test gcc, the situation above is very common.

 Then it means you're getting a lot of diagnostics written to stderr, and you 
 should probably look into why you're getting so many.

While I can agree as concerns output while building gcc (at least
after the first pass, so you're using a half-way decent compiler), I
know we've had this problem with a certain compiler a certain
consortium insists on using (so we have to use to get our product on
their platform).  That compiler produces megabytes of warnings per
build, mostly spurious (including - joy - some warnings against code
that isn't actually present in the source; it's actually warning
against the code *it generated* to implement bit-field initializers in
C++ constructors *slaps own forehead*) or, at the very least,
uninteresting - but enough of them need to be checked up (and some of
them are Important, albeit only on that platform, due to deficiencies
of its tool-chain) that we need to have a parser (because we sure
aren't going to read those megabytes ourselves) check the output and
tell us the interesting bits.

We're running GNU make under cygwin to drive this monstrosity (of
course, it's a windows-only development environment - the sort of
consortium that's going to impose such brain-dead-ness on ISVs isn't
likely to believe in other platforms): and when we turned on -j2
(because we'd finally moved the painfully slow auto-builder to a
machine with enough cores to make that useful; we can now use -j4 but
our first test-builds were cautious) our parser promptly broke because
we got interleaved output.  My previously-posted hack was the
work-around for that.

While large amounts of diagnostics from a sensible compiler like gcc
are indeed a good reason to *fix your damn code* so that interleaving
of diagnostics isn't an issue (and this is exactly what we've done on
platforms where we use gcc), there are (lamentably) times when one is
obliged to use less sensible software to earn an honest crust.  If
that spams you with diagnostics, you need to be able to at least
ensure they reach you in a clean enough form to support automatic
sorting of the wheat from the chaff.

Of course, that doesn't mean that the serialization *has* to be done
by make - I *can* hack round the problem and I'm pleased to hear there
are other tools to help - but it remains that this problem is likely
to be prevalent among users of make -j, which *is* a case for make to
provide built-in support for it, if it's practical and someone is
willing to do the work to implement it.

Eddy.


Fwd: [RFC]serialize the output of parallel make?

2010-07-30 Thread Chiheng Xu

-- Forwarded message --
From: Chiheng Xu chiheng...@gmail.com
Date: Fri, Jul 30, 2010 at 6:02 PM
Subject: Re: [RFC]serialize the output of parallel make?
To: Tim Murphy tnmur...@gmail.com

On Fri, Jul 30, 2010 at 5:54 PM, Tim Murphy tnmur...@gmail.com wrote:
 Hi,

 The cost to the system of even starting a shell to invoke a recipe is
 so huge compared to the time needed to reserve a semaphore that it is
 insignificant in comparison.

 The amount of contention is limited by -j i.e. by how many processes
 there are ( 2 * CPUs is usually considered reasonable) and by how long
 the lock is held for which is basically about how long GNU make takes
 to read the output from the process that currently has the lock.
 Since modern computers do not have 1000s of CPUs the degree of contention is
 not high and most of the cost of contention is something you pay for
 no matter what method you use to descramble stuff.

 Our experience indicates that it performs very well.

If my guess is not wrong, the semaphore safeguards the consistency of
the output of one command, not the order of commands.

--
Chiheng Xu
Wuhan,China

-- 
Chiheng Xu
Wuhan,China


Re: Fwd: [RFC]serialize the output of parallel make?

2010-07-30 Thread Howard Chu

Chiheng Xu wrote:

-- Forwarded message --
From: Chiheng Xu chiheng...@gmail.com
Date: Fri, Jul 30, 2010 at 6:02 PM
Subject: Re: [RFC]serialize the output of parallel make?
To: Tim Murphy tnmur...@gmail.com

On Fri, Jul 30, 2010 at 5:54 PM, Tim Murphy tnmur...@gmail.com wrote:

Hi,

The cost to the system of even starting a shell to invoke a recipe is
so huge compared to the time needed to reserve a semaphore that it is
insignificant in comparison.

The amount of contention is limited by -j i.e. by how many processes
there are ( 2 * CPUs is usually considered reasonable) and by how long

2x is too much. 1.5x has been the best in my experience, any more than that 
and you're losing too much CPU to scheduling overhead instead of real work. 
Any less and you're giving up too much in idle or I/O time.

--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


[RFC]serialize the output of parallel make?

2010-07-29 Thread Chiheng Xu

As parallel make is becoming more and more popular, can make
serialize the output of parallel make?

Make could redirect the output of every shell it issues in parallel to
a temporary file, and then print the stored output serially, as if in
a serial make.

-- 
Chiheng Xu
Wuhan,China


Re: [RFC]serialize the output of parallel make?

2010-07-29 Thread Paul Smith

On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote:
 As parallel make is becoming more and more popular, can make
 serialize the output of parallel make?
 
 Make could redirect the output of every shell it issues in parallel to
 a temporary file, and then print the stored output serially, as if in
 a serial make.

This would be a good thing, but as always the details are not quite so
trivial.

We have to ensure that these temporary files are cleaned up properly,
even in the face of users ^C'ing their make invocations.  We also need
to verify that whatever methods we use will work properly on Windows and
VMS and other operating systems make supports (where are their /tmp
equivalents?)

And, what about stdout vs. stderr?  Should we write both to the same
file?  Then we lose the ability to do things like make -j4 2>/dev/null
since all output will be written to stdout (presumably).  Or should we
keep two files per command, one for stdout and one for stderr?  But
that's even worse since then when we printed it we'd have to print all
the stdout first then all the stderr, which could lose important
context.
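
A small sketch of the trade-off in the paragraph above, using a hypothetical child that writes to stdout, then stderr, then stdout again. With one shared capture file the original order survives; with two files, replaying all of stdout first and then all of stderr moves the warning away from its context. The names CHILD, capture_merged and capture_split are made up for this illustration.

```python
import subprocess
import sys
import tempfile

# Hypothetical child: writes stdout, stderr, stdout, flushing each time.
CHILD = (
    "import sys;"
    "print('out 1', flush=True);"
    "print('warn', file=sys.stderr, flush=True);"
    "print('out 2', flush=True)"
)


def capture_merged():
    """One file for both streams: the order of writes is preserved."""
    with tempfile.TemporaryFile() as both:
        subprocess.check_call([sys.executable, "-c", CHILD],
                              stdout=both, stderr=subprocess.STDOUT)
        both.seek(0)
        return both.read()


def capture_split():
    """Two files: replay all of stdout, then all of stderr."""
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        subprocess.check_call([sys.executable, "-c", CHILD],
                              stdout=out, stderr=err)
        out.seek(0)
        err.seek(0)
        return out.read() + err.read()
```

The merged capture keeps the warning between the two stdout lines, but the caller can no longer discard stderr separately; the split capture allows that at the price of the reordering.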

Then there's the possibility that some commands will behave differently
if they are writing to a TTY than they will if they're writing to a
file.  Do we not care about that, or do we try to do something crazy
with PTYs or similar (ugh!)
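
The TTY concern is easy to reproduce. In this sketch a hypothetical child reports whether its stdout is a terminal; once the output is redirected to a temporary file, as a capturing make would do, it is not. The PTY alternative is not shown.

```python
import subprocess
import sys
import tempfile

# Hypothetical child that changes behaviour depending on isatty().
CHILD = "import sys; print('tty' if sys.stdout.isatty() else 'not a tty')"


def child_view_when_captured():
    """Run the child with stdout redirected to a file and return its report."""
    with tempfile.TemporaryFile() as buf:
        subprocess.check_call([sys.executable, "-c", CHILD], stdout=buf)
        buf.seek(0)
        return buf.read().decode().strip()
```

A tool that colours or paginates its output only on a terminal would therefore behave differently under capture than when run directly from an interactive shell.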

And of course we have to have a guaranteed unique naming strategy in
case multiple instances of make are running on the same system at the
same time, maybe running the same makefiles and even building the same
targets.  On POSIX systems we can use tmpfile() or mkstemp() or
something but other systems might need other measures.
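
On POSIX systems, one way to get both the unique name and the cleanup is to create the file with mkstemp() and unlink it immediately, which is roughly what tmpfile() does. A sketch via Python's tempfile.mkstemp follows; the function name and the "make-out-" prefix are assumptions for illustration, and unlinking an open file does not carry over to Windows.

```python
import os
import tempfile


def open_unlinked_tempfile(prefix="make-out-"):
    """Create a uniquely named temp file, then remove its name at once.

    Returns (fd, path); path is only informational, since the name no
    longer exists by the time this function returns.
    """
    # mkstemp creates the file exclusively, so two make instances
    # cannot end up sharing a capture file.
    fd, path = tempfile.mkstemp(prefix=prefix)
    # With the name gone, the storage is released when the descriptor
    # is closed, including when the process is killed by ^C.
    os.unlink(path)
    return fd, path
```

Because nothing is left on disk to delete, an interrupted make needs no signal handler for these files.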

These are just some things I thought of off the top of my head.

It certainly does not mean that it would not be a good thing to have
this ability though.

-- 
---
 Paul D. Smith psm...@gnu.org  Find some GNU make tips at:
 http://www.gnu.org  http://make.mad-scientist.net
 Please remain calm...I may be mad, but I am a professional. --Mad Scientist



Re: [RFC]serialize the output of parallel make?

2010-07-29 Thread Howard Chu


Paul Smith wrote:

On Fri, 2010-07-30 at 09:59 +0800, Chiheng Xu wrote:

As parallel make is becoming more and more popular, can make
serialize the output of parallel make?

Make could redirect the output of every shell it issues in parallel to
a temporary file, and then print the stored output serially, as if in
a serial make.


This would be a good thing, but as always the details are not quite so
trivial.


Aside from the difficulties outlined below, I just am not fond of having 
output batched up instead of appearing in realtime. It tends to complicate the 
polling logic too (though I guess in this case, you just have to cat the 
appropriate file[s] whenever a child process ends.)


The scheme that I grew up with on Alliant Concentrix was just to prefix each 
output line with its job number |xx|blah blah blah. It obviously requires a 
pipe for each child process' output, so that lines can be read by the parent 
make and then the prefix attached. In the original jobserver prototype I used 
unique bytes for each job token so that the token == the job ID, with an eye 
toward adding this support later. But that obviously limited it to supporting 
only 256 concurrent jobs, and these days we already get complaints that it's 
limited to only 4096. Using a pipe per job would likewise cut make's maximum 
job count in half (or worse, if using a separate stderr pipe).


I still favor this latter approach because it keeps the output flowing in 
realtime, and it's easy enough to use grep if you want to zero in on a single 
output stream. But the cost in resources will add up...


We have to ensure that these temporary files are cleaned up properly,
even in the face of users ^C'ing their make invocations.  We also need
to verify that whatever methods we use will work properly on Windows and
VMS and other operating systems make supports (where are their /tmp
equivalents?)

And, what about stdout vs. stderr?  Should we write both to the same
file?  Then we lose the ability to do things like make -j4 2>/dev/null
since all output will be written to stdout (presumably).  Or should we
keep two files per command, one for stdout and one for stderr?  But
that's even worse since then when we printed it we'd have to print all
the stdout first then all the stderr, which could lose important
context.

Then there's the possibility that some commands will behave differently
if they are writing to a TTY than they will if they're writing to a
file.  Do we not care about that, or do we try to do something crazy
with PTYs or similar (ugh!)

And of course we have to have a guaranteed unique naming strategy in
case multiple instances of make are running on the same system at the
same time, maybe running the same makefiles and even building the same
targets.  On POSIX systems we can use tmpfile() or mkstemp() or
something but other systems might need other measures.

These are just some things I thought of off the top of my head.

It certainly does not mean that it would not be a good thing to have
this ability though.




--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
