subject:"\[bug #33138\] .PARLLELSYNC enhancement with patch"

Let me add my voice as a user.  If you are one of the lucky people whose
builds consist mostly of 1 line of output per rule then you will rarely
have any trouble in a good build but try interpreting error messages from
compiler/tool X when they're 10 lines from the file that they refer to and
don't include the filename in the error message.  Try working it out when
it's not your code that's being built.

This suggests an optimisation where you buffer 5 lines in memory and open a
file if there's more but I wouldn't be surprised if the gain is quite small
as I suspect that the average build step does so much IO as to make a
buffered temp file for the output seem a relatively tiny overhead.

I have certainly spent a lot of time wondering why xyz.cpp doesn't have a
line 370.  I faced this  5 years ago and we had to come up with our own
solution rather than wait to convince GNU mission control that we had a
problem - which was the only choice given the great difficulty at that job
of trying to get permission to make open source contributions.  It has
already been a very long time coming.

You've got it now, thank goodness, in whatever shape or form. Please turn
it on. As for reasons:
a) I would be surprised if there weren't other changes in the next release
that give some people pause for thought before upgrading - why worry about
1 more? e.g. globbing differences between various recent releases caused
more pain to me than this is likely to.

b) Demand is usually only going to come from the few anyhow.  I think this
is one of those features that few people will know to turn on but many will
benefit anyhow.  The reason is that you will start to be able to write
generic make output analysis tools. I don't know if you guys have seen
Electric Insight, for example.  I don't intend this as a plug but I can't
help it - it's just the only non-inhouse example that I can think of but I
have seen a number of others at various companies. This is a trivial
example where you can see that parsing is taking most of the time in this
parallel build:
https://docs.google.com/file/d/0BwJNUxZZ7qItQlRrUjBmM0tTTDg/edit?usp=sharing

It lets you see the state of your parallel build - gives you an intuitive
idea of where the time is going and what the errors are and so on. You
can't implement tools like this across a general a lot of builds unless
they have contiguous recipe output and the surest way to get that is for it
to be on by default. You also need recipe separators and timing data but I
think you'll realise that eventually now that this step is taken.

c) Lets benchmark it on something like an android build and you can make a
decision after that? Heck, lets try it with various things and be happy it
works.

Regards,

Tim


On 29 April 2013 03:53, Eli Zaretskii e...@gnu.org wrote:

  From: Paul Smith psm...@gnu.org
  Cc: bug-make@gnu.org
  Date: Sun, 28 Apr 2013 22:03:39 -0400
 
  Now that we seem to have a workable solution for output synchronization
  for both POSIX and Windows systems, I wonder if we shouldn't consider
  enabling it as the default mode when parallel builds are running.

 I don't think we should do this unless users demand it, which means we
 should wait for the next release and some time beyond before making
 the decision.

 FWIW, I'm running 8-way parallel Make jobs for many months now, and
 never had any trouble interpreting their output, neither when they
 succeed, nor when they don't.

 The synchronized output operation annoys me slightly in that there are
 distinct time intervals when Make says nothing at all, and then the
 output appears in large chunks that get dumped en-masse to the
 screen.  IOW, the user experience is different and takes time to get
 used to.

 ___
 Bug-make mailing list
 Bug-make@gnu.org
 https://lists.gnu.org/mailman/listinfo/bug-make




-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

 Date: Mon, 29 Apr 2013 09:58:50 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org bug-make@gnu.org

 try interpreting error messages from compiler/tool X when they're 10
 lines from the file that they refer to and don't include the
 filename in the error message.

That's unrelated: interpreting such output will be hard no matter what
order it comes in.

 b) Demand is usually only going to come from the few anyhow.  I think this
 is one of those features that few people will know to turn on but many will
 benefit anyhow.  The reason is that you will start to be able to write
 generic make output analysis tools.

People who write programs that parse Make output will find and use
this switch anyway; we could announce that prominently in NEWS if you
think the feature could go unnoticed.

But that is irrelevant to interactive human users and their use cases.

 c) Lets benchmark it on something like an android build and you can make a
 decision after that? Heck, lets try it with various things and be happy it
 works.

I already tried it, and I'm not happy, FWIW.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

On 29 April 2013 16:19, Eli Zaretskii e...@gnu.org wrote:

  Date: Mon, 29 Apr 2013 09:58:50 +0100
  From: Tim Murphy tnmur...@gmail.com
  Cc: bug-make@gnu.org bug-make@gnu.org

  try interpreting error messages from compiler/tool X when they're 10
  lines from the file that they refer to and don't include the
  filename in the error message.

 That's unrelated: interpreting such output will be hard no matter what
 order it comes in.

cc fred.c -c -o fred.o
cc bob.c -c -o bob.o
error on line 20 -X

Which one?

It gets worse when you scale up.

Regards,

Tim

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

 Date: Mon, 29 Apr 2013 16:40:03 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org bug-make@gnu.org
 
 cc fred.c -c -o fred.o
 cc bob.c -c -o bob.o
 error on line 20 -X
 
 Which one?

Make will actually tell you which one, something like:

  makefile:342: recipe for target 'oo/i386/acl-errno-valid.o' failed
  make: *** [oo/i386/acl-errno-valid.o] Error 1

(Copied from a failed make -j8 run I just did for Emacs.)

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

2013-04-29 Thread Philip Guenther

On Mon, Apr 29, 2013 at 10:30 AM, Tim Murphy tnmur...@gmail.com wrote:
 cc fred.c -c -o fred.o
 cc bob.c -c -o bob.o
 error on line 20 -X
 error on line 30 -
 error on line 330 -
 makefile:342: recipe for target 'fred.o' failed
 makefile:350: recipe for target 'bob.o' failed

 ?

Doctor, my hammer has a head so large that I always hit my thumb
Throw out that hammer and get a non-broken one

The GNU tool working around the brokenness of some non-free software?
Some would call that collaboration in retaining your chains.

:-/



I haven't had a chance to play with the new functionality in the
repository, so I'll merely caution that having major new functionality
that will affect the user experience be on by default runs the risk of
there being a problem and having the new version be
tarred-and-feathered, and the projects that would most benefit from
the new functionality instead turning it off.  If there are other
fixes and/or features in this release that people are waiting for,
burning them on this would be unfriendly.


Philip Guenther

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

Come now - the broken excuse is an excuse. There's plenty of crap free
software out there and some poor bastard trying to build it who can't
change the source because the people who own it think it should be make's
problem.

:-)


Cheers,

Tim


On 29 April 2013 19:00, Philip Guenther guent...@gmail.com wrote:

 On Mon, Apr 29, 2013 at 10:30 AM, Tim Murphy tnmur...@gmail.com wrote:
  cc fred.c -c -o fred.o
  cc bob.c -c -o bob.o
  error on line 20 -X
  error on line 30 -
  error on line 330 -
  makefile:342: recipe for target 'fred.o' failed
  makefile:350: recipe for target 'bob.o' failed
 
  ?

 Doctor, my hammer has a head so large that I always hit my thumb
 Throw out that hammer and get a non-broken one

 The GNU tool working around the brokenness of some non-free software?
 Some would call that collaboration in retaining your chains.

 :-/

 

 I haven't had a chance to play with the new functionality in the
 repository, so I'll merely caution that having major new functionality
 that will affect the user experience be on by default runs the risk of
 there being a problem and having the new version be
 tarred-and-feathered, and the projects that would most benefit from
 the new functionality instead turning it off.  If there are other
 fixes and/or features in this release that people are waiting for,
 burning them on this would be unfriendly.


 Philip Guenther




-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

 Date: Mon, 29 Apr 2013 18:30:37 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org bug-make@gnu.org
 
 cc fred.c -c -o fred.o
 cc bob.c -c -o bob.o
 error on line 20 -X
 error on line 30 -
 error on line 330 -
 makefile:342: recipe for target 'fred.o' failed
 makefile:350: recipe for target 'bob.o' failed
 
 ?

You need to look in both anyway.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

 Date: Mon, 29 Apr 2013 19:33:10 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: Eli Zaretskii e...@gnu.org, bug-make@gnu.org bug-make@gnu.org

 Come now - the broken excuse is an excuse. There's plenty of crap free
 software out there and some poor bastard trying to build it who can't
 change the source because the people who own it think it should be make's
 problem.

Look, no one is claiming that this feature is useless.  I just spent
two days researching and implementing it on Windows, something I
wouldn't do if I thought it wasn't important to have.

What we are talking here is whether to have this turned on by default.
I hope you are not going to claim that adding -O to a Make command
line to get the new behavior is too hard.  By contrast, having to type
-Onone to have the _old_ behavior back might very well annoy a few.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

One doesn't have to suffer the problems and learn the option exists
afterwards.

In the end I can understand why a new feature might not be default to start
with - until a lot of people have used it and are sure that it works
everywhere.

Cheers,

Tim


On 29 April 2013 20:21, Eli Zaretskii e...@gnu.org wrote:

  Date: Mon, 29 Apr 2013 19:33:10 +0100
  From: Tim Murphy tnmur...@gmail.com
  Cc: Eli Zaretskii e...@gnu.org, bug-make@gnu.org bug-make@gnu.org
 
  Come now - the broken excuse is an excuse. There's plenty of crap free
  software out there and some poor bastard trying to build it who can't
  change the source because the people who own it think it should be make's
  problem.

 Look, no one is claiming that this feature is useless.  I just spent
 two days researching and implementing it on Windows, something I
 wouldn't do if I thought it wasn't important to have.

 What we are talking here is whether to have this turned on by default.
 I hope you are not going to claim that adding -O to a Make command
 line to get the new behavior is too hard.  By contrast, having to type
 -Onone to have the _old_ behavior back might very well annoy a few.




-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

2013-04-29 Thread Edward Welbourne

Eli:
 cc fred.c -c -o fred.o
 cc bob.c -c -o bob.o
 error on line 20 -X
 error on line 30 -
 error on line 330 -
 makefile:342: recipe for target 'fred.o' failed
 makefile:350: recipe for target 'bob.o' failed

 You need to look in both anyway.

That is true of the very specific example Tim had given, in response to
a particular objection which his example did genuinely address.  Think
for a moment longer and you shall see that there are other cases where
your answer won't apply.  The interleaved output from make -j n is often
much harder to read.  (When the output isn't even line-interleaved, but
has some command's output starting part way through a line of the output
of another command, life gets even harder.)

Even in the specific example: on which file should I be looking at line
20, at line 30, or line 330 ?  The answer look in every failing file
for every failing line number scales quadratically, which is not good.
Even when I know which line a message relates to, it can be hard to work
out how that message relates to that line (especially if there's a
preprocessor involved and the undefined symbol xxyyz relates to a line
in which xxyyz never appears, but only arises after expansion of a mess
of macros).  It gets nasty fast to have to look in several files, at
line 30 of each, and work out which of them is the one - relates to;
and it gets easy to end up mistaking which one it relates to and
fixing the wrong one.

In a make with -j 12ish compiling many files, the starts and ends of
compiles interleave with one another and with the output of running
commands.  Some of that output is errors, which (if the commands being
run are well-behaved - not always true) shall mean the affected files
are named by make in the output when they finish.  Some output is
warnings, that may actually be important, but the command succeeds:
there's no marker indicating the end of a file that succeeds, so there
are many files that warning might have come from; most likely one of
those started recently, but not always easy to tell which.  When
hundreds of files are compiling - why else use -j 12ish ? - it can get
hard to keep track of which are currently active at any given line of
output, even when every file has both start and end markers for its
output, e.g. if some files take much longer than others.

The output from make -j several is hard to make sense of.  Your shiny
new feature really does sound (I haven't had occasion to try it) like a
vast improvement.  That *is* a strong case for turning it on by default.
Naturally, before doing so, it's important to do ample testing: and one
way to do that is to initially release make with it off by default but
available, so that lots of folk can try it and give you feed-back.  All
the same, once it's been well tested, if no serious problems arise (and
aren't fixed by your collective ingenuity) this is a good thing to turn
on by default.

Philip
 Doctor, my hammer has a head so large that I always hit my thumb
 Throw out that hammer and get a non-broken one

 The GNU tool working around the brokenness of some non-free software?
 Some would call that collaboration in retaining your chains.

This problem doesn't arise exclusively with non-free software, so this
is an emotive argument, an appeal to bad example.  Programmers,
generally, are slap-dash about the things they throw together to build
the software they're so proud of having (in their own opinions) written
so nicely and well.  They throw together shell scripts to wrap the
compiler with some other commands that they also want to run on the same
files, or that they need run before handing off their output to the
compiler (e.g. they invented their own way of doing localisation,
because it's so much more fun to reinvent the wheel than learn to use
gettext): and they're profoundly negligent of error-reporting because
it's always worked for me so anyone else inheriting their code is stuck
with the results.  They write contorted shell-scripts inline in the
make-file as the command for a rule, using the [ condition ]  action
short-hand for if [ condition ]; then action; fi and tack on || true
to avoid having make report the command as failed when [ condition ] was
false; so the rule never gets reported as failing, even when it does,
but that's OK, because the error messaeges tell you what failed.

The hideous monstrosities of make files that get generated by helper
scripts that work round the deficiencies of make (almost invariably,
in fact, they work round the author's poor knowledge of how to use make
well - but those who inherit the code are stuck with the results) end up
doing perverse things that most on this list would never dream of.  I'd
love to answer that with so replace the bad scripts and hideously
wrapped (and warped) build system with something well written, but the
poor bug-fixer who's inherited the code from the oh-so-clever author is
stuck with the need to get things fixed today, which doesn't

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-28 Thread David Boyce

On Sun, Apr 28, 2013 at 1:34 AM, Paul Smith psm...@gnu.org wrote:
 I'm not excited about that term (job); it's kind of accurate, but in
 the documentation for example we're really mushy about exactly what a
 job is, vs. a recipe or a command line etc.  I'd like to pick some
 terms for this, define them in a solid way, then clean up the
 references.  It would be best to do this before the release to avoid
 changing things later.

 For example, we currently use target as the name; maybe recipe is
 better?

 If anyone has opinions I'm listening.

This seems quite wrong to me. I think it's inherent from the existence
of the --jobs option that job must be synonymous with recipe, even
if that's not what was intended at the time. Therefore my HO is that
not only does this one need a new name but -O target would be better
renamed -O job.

What's wrong with line for the line-by-line case? It's exactly the
way make is documented, that each line of the recipe is fed to the
shell in turn, and the documentation you just wrote for the new
argument is all about lines; I see the word line 4 times in 2
sentences. The whole thing seems clear and well written to me, but
would be even more clear if the new option was called -O line.

So I'd argue for:

-O line (new)
-O job (current -O target)
-O make

And document the notion that job and recipe are equivalent terms
(more precisely, one is the instantiation of the other: job is to
recipe what process is to program).

I also think it wouldn't hurt for the documentation of the default to
contain a forward reference. Pending potential rename:

--- a/doc/make.texi
+++ b/doc/make.texi
@@ -4136,7 +4136,7 @@ specified by giving an argument to the option
(e.g.,  @samp{-Ojob} or
 @table @code
 @item none
 The is the default: all output is sent directly as it is generated and
-no synchronization is performed.
+no synchronization is performed. It is identical to @samp{--output-sync=target}

 @item job
 Output from each individual line of the recipe is grouped and printed

David

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-28 Thread Eli Zaretskii

 From: Paul Smith psm...@gnu.org
 Cc: e...@gnu.org, david.s.bo...@gmail.com, bug-make@gnu.org
 Date: Sun, 28 Apr 2013 01:34:44 -0400

 On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote:
This is useful (to me) because at any time, I know what's running.
([Start] messages minus [End] messages.)

   Thanks, this is the reason I was looking for; that use-case wasn't clear
   to me based on the previous email.

  OK, so what are we going to do about it? Leave, revert, new option?

 I've pushed a change to add a new argument to the -O/--output-sync
 option, job, to write output after each line of the recipe.

What is its purpose?  To avoid mixing in the same screen line
characters from several parallel sub-makes?  (That does happen, albeit
rarely.)  Or is it something else?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-28 Thread Eli Zaretskii

 Date: Sun, 28 Apr 2013 10:22:40 -0400
 From: David Boyce david.s.bo...@gmail.com
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de, Eli Zaretskii e...@gnu.org, 
 bug-make bug-make@gnu.org

 So I'd argue for:

 -O line (new)
 -O job (current -O target)
 -O make

Agree about line (assuming I understood what it means), but disagree
about renaming target: it is quite clear what it means.  By
contrast, make is not so clear; perhaps we should rename it
makefile or something.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-28 Thread Paul Smith

On Sun, 2013-04-28 at 20:00 +0300, Eli Zaretskii wrote:
  I've pushed a change to add a new argument to the -O/--output-sync
  option, job, to write output after each line of the recipe.
 
 What is its purpose?  To avoid mixing in the same screen line
 characters from several parallel sub-makes?  (That does happen, albeit
 rarely.)  Or is it something else?

I'm not sure exactly what you mean by your second sentence.  However I
asked the same question you did about this feature.

Frank had a use-case: he was tracking which jobs were active/still
running by making all his recipes look like this:

  target:
  @echo start: $@
  ... recipe ...
  @echo end: $@

This allows a higher-level, dynamic interface to track which jobs are
running, when they started, etc. and track the build.

Although I implemented this because it was simple, I'm not so sure this
is a real use-case.  Or to be more accurate, I agree that it's a real
use-case but I don't think this is a good solution to the problem.

I suspect that a better solution might be to create a machine
interface mode for make, as some other GNU CLI tools like GDB, etc.
have.  This interface would be well-defined and unchanging and easily
machine-parseable, and allow people to write front-ends to more
accurately examine make's output.

However, for now this new output-sync mode doesn't seem to be harmful.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-28 Thread Paul Smith

On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote:
 % make -Omake # same with -Otarget
 m:2: recipe for target 'foo' failed
 make: *** [foo] Error 1
 foo:error
 
 This seems at least strange to me: The conclusion recipe failed is
 printed before the reason (the messages from the job).
 
 Reverting this part (i.e., moving the sync_output() call to where it
 was before) corrects this:
 
 % make -Otarget # same with -Otarget
 foo:error
 m:2: recipe for target 'foo' failed
 make: *** [foo] Error 1
 
 To fix it without reverting the behaviour seems to require calling
 sync_output() before child_error() (2 places).

Unfortunately that isn't sufficient either.  It's possible for
child_error() to be invoked, but then we don't stop but continue on with
the recipe (if the error is ignored).  If we ran child_error() here we
would show half the recipe output, then the error message, then the rest
later, in a separate grouping.

I went a different way: I modified the child_error() function so that if
there was an output-sync file, the error message would be written to the
end of that file instead of printed directly.  This way when the output
is shown to the user she sees the entire thing, including error messages
from make, in order.

This also allowed me to drop the layer of enter/leave notices around
error messages.

Please check/verify this change in your own environments.


PS. Some may be tsking to themselves noting this change would have been
a lot simpler if we'd kept a FILE* handle instead of a file descriptor.
Unfortunately it's not clear that would work.  I created a test program
to verify that this method of having the parent append messages to the
file after the child exits, and with a FILE* handle I couldn't make it
work right.  My fseek() after the child exited didn't skip past the
child's output and my message printed by the parent ended up overwriting
that output.  I'm sure it would have worked if I'd closed the file in
the parent and re-opened it, but since I'm using tmpfile() that is not
possible.  I might have done something wrong, but anyway it wasn't a
slam-dunk.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

2013-04-28 Thread Paul Smith

Now that we seem to have a workable solution for output synchronization
for both POSIX and Windows systems, I wonder if we shouldn't consider
enabling it as the default mode when parallel builds are running.

I understand that this will be a change that could be visible (beyond
the collection of output) due to using a temp file instead of a
terminal.  Of course people can still use -Onone if they want old
behavior.

However assuming the new mode works and is reliable, and is not a
performance bottleneck, I'm hard-pressed to see why a well-ordered
output would not be preferable to just about everyone, and hence
shouldn't be the default.

Of course it's possible that writing to a file, rather than spewing to
stdout, WOULD be noticeably performance impacting at least in some
situations/systems.


Comments?



___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)

2013-04-28 Thread Eli Zaretskii

 From: Paul Smith psm...@gnu.org
 Cc: bug-make@gnu.org
 Date: Sun, 28 Apr 2013 22:03:39 -0400

 Now that we seem to have a workable solution for output synchronization
 for both POSIX and Windows systems, I wonder if we shouldn't consider
 enabling it as the default mode when parallel builds are running.

I don't think we should do this unless users demand it, which means we
should wait for the next release and some time beyond before making
the decision.

FWIW, I'm running 8-way parallel Make jobs for many months now, and
never had any trouble interpreting their output, neither when they
succeed, nor when they don't.

The synchronized output operation annoys me slightly in that there are
distinct time intervals when Make says nothing at all, and then the
output appears in large chunks that get dumped en-masse to the
screen.  IOW, the user experience is different and takes time to get
used to.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-27 Thread Eli Zaretskii

 From: Paul Smith psm...@gnu.org
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org
 Date: Sun, 21 Apr 2013 19:30:05 -0400
 
 On Fri, 2013-04-19 at 14:09 +0300, Eli Zaretskii wrote:
   Date: Fri, 19 Apr 2013 11:54:05 +0200
   Cc: bo...@kolpackov.net, bug-make@gnu.org
   From: Frank Heckenbach f.heckenb...@fh-soft.de
   
Is there a simple enough Makefile somewhere that could be used to test
this feature, once implemented?
   
   We have a test in the test suite (output-sync). Can you use that?
  
  I hoped for something simpler and not involving Perl or Unixy shell
  features, because I'd like to use this in the native Windows
  environment, where the Windows port of Make runs.  However, if that's
  the best possibility, I guess I'd craft something based on that test.
 
 The basic feature can be tested trivially like this:
 
   all: one two
 
   one two:
   @echo start $@
   @sleep 1
   @echo stop $@
 
 Now if you run this using make -j you'll get:
 
   start one
   start two
   stop one
   stop two
 
 If you run this using make -j -O you should get:
 
   start one
   stop one
   start two
   stop two
 
 There's more to test than that: before it's done we need to test
 recursive make invocations for example.  But the above is simple.

Thanks, I tested this, and also the recursive behavior.  But I'm not
sure whether the results in some less-than-trivial cases are as
intended (although I see exactly the same behavior on GNU/Linux, so
the questions below are not Windows-specific).

E.g., with this Makefile, called mkfsync, (indented 2 spaces for
clarity):

  all: simple recursive

  simple: one two

  one two:
  @echo start $@
  @sleep 1
  @echo stop $@
  @-false

  recursive: rec1 rec2

  rec1 rec2:
  @echo start $@
  $(MAKE) -f mkfsync simple
  @echo stop $@

and invoking Make as gnumake -j -O -f mkfsync, I get this output:

  gnumake -f mkfsync simple
  gnumake -f mkfsync simple
  mkfsync:6: recipe for target 'one' failed
  gnumake: [one] Error 1 (ignored)
  start one
  stop one
  mkfsync:6: recipe for target 'two' failed
  gnumake: [two] Error 1 (ignored)
  start two
  stop two
  start rec1
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  mkfsync:6: recipe for target 'one' failed
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: [one] Error 1 (ignored)
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  start one
  stop one
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  mkfsync:6: recipe for target 'two' failed
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: [two] Error 1 (ignored)
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  start two
  stop two
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  stop rec1
  start rec2
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  mkfsync:6: recipe for target 'one' failed
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: [one] Error 1 (ignored)
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  start one
  stop one
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  mkfsync:6: recipe for target 'two' failed
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: [two] Error 1 (ignored)
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  start two
  stop two
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20'
  stop rec2

Is this intended behavior that these two messages:

  mkfsync:6: recipe for target 'two' failed
  gnumake[1]: [two] Error 1 (ignored)

are separated and wrapped by separate Entering...Leaving blocks,
instead of being produced together?  They are both produced by a call
to 'message', which outputs the message to stdout, so it's not like
they went to two different

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-27 Thread Eli Zaretskii

 Date: Sat, 27 Apr 2013 13:09:02 +0300
 From: Eli Zaretskii e...@gnu.org
 Cc: f.heckenb...@fh-soft.de, bug-make@gnu.org

  The basic feature can be tested trivially like this:

all: one two

one two:
@echo start $@
@sleep 1
@echo stop $@

  Now if you run this using make -j you'll get:

start one
start two
stop one
stop two

  If you run this using make -j -O you should get:

start one
stop one
start two
stop two

  There's more to test than that: before it's done we need to test
  recursive make invocations for example.  But the above is simple.

 Thanks, I tested this, and also the recursive behavior.

The changes needed to make -O work on MS-Windows are now committed to
the master repository, see commits da7df54 and 049f8e8.  Please review
and comment.

I also took liberty in committing a small change posted by Frank in
this thread, whereby, if taking the semaphore fails, output is dumped
unsynchronized, instead of being silently ignored.  (Btw, I don't see
Frank's assignment on file for Make, so I marked this change tiny,
per the FSF procedures.)

Thanks to all who contributed to this (lengthy) discussion and allowed
me to understand how to implement this feature on Windows.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-27 Thread Paul Smith

On Sat, 2013-04-27 at 13:09 +0300, Eli Zaretskii wrote:
 Is this intended behavior that these two messages:
 
   mkfsync:6: recipe for target 'two' failed
   gnumake[1]: [two] Error 1 (ignored)
 
 are separated and wrapped by separate Entering...Leaving blocks,
 instead of being produced together?  They are both produced by a call
 to 'message', which outputs the message to stdout, so it's not like
 they went to two different streams.  Am I missing something?

Frank mentioned this as well and it's a bug that needs to be fixed.
I'll look into it this weekend.  I need to check the algorithm; one
simple fix, if it's too complex to do it another way, would be to have
make write (append) the error message to the temp file rather than
printing it directly.  That way we know it will come out in the right
order when the temp file is dumped to stdout.

However that may not be necessary.

 If this is intended behavior, can we somehow cut down on these
 Entering...Leaving pairs?  They add a lot of clutter and make it
 hard to read the output of a Make run.

Yes, this was also discussed: I've been thinking about a separate way to
allow the user to choose.  We also need to ensure, as best as we can,
that extra unneeded lines are not being generated.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-27 Thread Paul Smith

On Sat, 2013-04-27 at 14:39 +0300, Eli Zaretskii wrote:
 The changes needed to make -O work on MS-Windows are now committed to
 the master repository, see commits da7df54 and 049f8e8.  Please review
 and comment.

Thanks Eli!!

I'll take a look over the next few days.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-27 Thread Paul Smith

On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote:
   This is useful (to me) because at any time, I know what's running.
   ([Start] messages minus [End] messages.)
  
  Thanks, this is the reason I was looking for; that use-case wasn't clear
  to me based on the previous email.
 
 OK, so what are we going to do about it? Leave, revert, new option?

I've pushed a change to add a new argument to the -O/--output-sync
option, job, to write output after each line of the recipe.  Please
give it a try and make sure it works for your situation.  It worked OK
in my more limited testing.


I'm not excited about that term (job); it's kind of accurate, but in
the documentation for example we're really mushy about exactly what a
job is, vs. a recipe or a command line etc.  I'd like to pick some
terms for this, define them in a solid way, then clean up the
references.  It would be best to do this before the release to avoid
changing things later.

For example, we currently use target as the name; maybe recipe is
better?

If anyone has opinions I'm listening.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-26 Thread Eli Zaretskii

 Date: Wed, 24 Apr 2013 21:19:45 +0300
 From: Eli Zaretskii e...@gnu.org
 Cc: bug-make@gnu.org

  Date: Wed, 24 Apr 2013 19:14:01 +0200
  Cc: bug-make@gnu.org
  From: Frank Heckenbach f.heckenb...@fh-soft.de

   Testing clearly shows it doesn't: GetFileInformationByHandle simply
   fails for handles open on console devices and the null device.  And we
   will be comparing handles for console devices in the majority of use
   cases here.

  That's right. Or perhaps pipes (I post-process make's output, to
  handle the directories in messages etc. and I suppose the mentioned
  Emacs mode does something similar).

 GetFileInformationByHandle does work on pipes, at least on shell
 pipes (as in foo | cat).

   I researched this a bit, and eventually succeeded in finding a way
   that is good enough for both disk files, pipes, and console devices.

  Good. (And the null device, I suppose. Though it doesn't really
  matter if we write to the null device combined or separate. ;-)

 No, I didn't find a satisfactory solution for the null device.  But as
 you point out, we don't care.

One more minor issue with this: do we want the comparison of stdin and
stdout for the purposes of combined_output to return zero, although
they refer to the same device on Unix?  I know it's a bit theoretical
for the case in point, but I'd like the code I write to be useful in
other contexts as well, and maybe even Make will need that in the
future.

On Windows, console input and output use different devices, and I can
distinguish between them.  Should I?

Thanks.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-26 Thread Tim Murphy

On 25 April 2013 20:06, Eli Zaretskii e...@gnu.org wrote:

  Date: Thu, 25 Apr 2013 19:36:28 +0100
  From: Tim Murphy tnmur...@gmail.com
  Cc: bug-make@gnu.org bug-make@gnu.org

  1)  sem_timedwait() in posix lets you timeout so in a big build when
  something crashes or just sits around, there is at least the option of
  printing an error message or giving up on locking from that point on
 which
  is a bit of a godsend if it happens overnight and there's no-one to
 restart
  the build.

 How much would you use for the timeout, though?  A sub-Make could
 legitimately run for a very long time, depending on what's in the
 Makefile.

 FWIW, I currently let Make wait forever for the mutex.

It's a horrible question and no answer makes people happy.   It's a kind of
exceedingly blunt tool that you only use because there is no other.  I have
seen systems where the child process watches stdio/stderr activity and
times out if there isn't any and that just gets into trouble with processes
that run long and silent.

On the other hand, having your 24-hour build hang after 2 hours when you
have a lot of people waiting on the result is a sort of costly thing.  Even
if you accept that a lot of the build will be broken, there is still a lot
of feedback to get from the parts which should be able to complete - errors
and warnings, unit tests and so on - that people can take action on while
the build is being fixed.

So timeouts have to be configurable by users who can set whatever they
think is the least bad compromise in their particular case.

Regards,

Tim

-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-26 Thread Eli Zaretskii

 Date: Thu, 25 Apr 2013 02:16:33 +0200
 Cc: e...@gnu.org, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

   On Windows, you said fstat was very expensive, didn't you? Or is
   lseek even worse?

  I think anything that potentially moves the file pointer can be
  sometimes expensive and is best avoided.  (On Windows, I'd use
  GetFileInformationByHandle.)

 OK, if that's so, do that. But I don't think that's true on POSIX.

I don't think it's worth doing on Windows as well, see below.

   Nothing is actually read by lseek (and even if it were, it would
   only need to look at the first and last part of the file, not read
   all the content, if that was the worry).

  Are you sure?  How can lseek jump to the last byte of the file, if
  the file is not contiguous on disk, except by reading some of it?

 lseek doesn't need to read any data. It just sets the current offset
 of the FD to the given position, so the next read (which in this
 case never happens before seeking to the beginning) knows where to
 read. Even in the case of SEEK_END, all it has to do is add the
 given offset (here: 0) to the current file size.

What I meant is that lseek doesn't just return the byte position, it
also makes sure the next read or write happens at that position.  So
at some point, some piece of software needs to tell the disk to move
its reading head to the right point.  Whether this happens as part of
lseek or the subsequent read/write, and whether this requires reading
some of the data on the disk, is a matter of how this is implemented
and what data structures does the filesystem maintain in memory at all
times.

 Instead of testing, I just looked at the implementation (Linux
 3.2.2). The following is really the whole relevant code. As you see,
 nothing's read from the disk, it only handles in-memory data. (Also
 the file size is in memory for open files; even it were not, it
 would be a constant-time access to the inode and wouldn't need to
 touch any data blocks.)

I timed lseek on Windows on very large files (hundreds of MBs), and
found that a single lseek takes less than 1 usec, at least with NTFS
volumes and in my core i7 box.  So, while more efficient ways of
revealing whether the file is empty are possible, I don't think such a
small penalty justifies yet another set of ifdef's.

The only situation where lseek could be really expensive is if the
volume is compressed, because lseek on Windows returns the
uncompressed offsets in that case.  I don't have access to any machine
which has such volumes, so I cannot test that.  (Does Unix support
such filesystems?  If so, what does lseek do there?)

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-26 Thread Eli Zaretskii

 Date: Fri, 26 Apr 2013 09:05:18 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: Paul D. Smith psm...@gnu.org, bug-make@gnu.org bug-make@gnu.org

  How much would you use for the timeout, though?  A sub-Make could
  legitimately run for a very long time, depending on what's in the
  Makefile.

  FWIW, I currently let Make wait forever for the mutex.

 It's a horrible question and no answer makes people happy.

Yes, I know.  Which is why I think we would be better off not asking
it in the first place.

 So timeouts have to be configurable by users who can set whatever they
 think is the least bad compromise in their particular case.

If humans need to be in the loop, they can impose timeouts even if
Make does not offer any such customizations.  E.g., run the build
under the 'timeout' command (from Coreutils), or just plain Ctrl-C the
instance of Make that hogs the mutex.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-26 Thread Frank Heckenbach

Eli Zaretskii wrote:

 You underestimate me ;-)
 
 What I have is actually this:
 [...]

 and I wrote 'fcntl' emulation for Windows that uses a mutex.

Indeed, I had not expected this. :-)

 That said, I'm not wedded to the above approach, and if people like a
 completely disjoint code for Windows, that would be a trivial change.
 I just wanted to comply with what Paul said, viz.:
 
   Also, where is the best place to put the emulated Posix functions?
   Some new file in w32/compat/? 
  
  I'd like to see it there.  I'm thinking I want to move the new stuff out
  of job.c even for POSIX systems.  The ifdefs are really getting to me.
 
 So I now have w32/compat/posixfcn.c with the emulation of fcntl (also
 used for CLOSE_ON_EXEC), and a few support functions it needs.

I see. If Paul agrees, I'm fine with it.

Nothing is actually read by lseek (and even if it were, it would
only need to look at the first and last part of the file, not read
all the content, if that was the worry).
   
   Are you sure?  How can lseek jump to the last byte of the file, if
   the file is not contiguous on disk, except by reading some of it?
  
  lseek doesn't need to read any data. It just sets the current offset
  of the FD to the given position, so the next read (which in this
  case never happens before seeking to the beginning) knows where to
  read. Even in the case of SEEK_END, all it has to do is add the
  given offset (here: 0) to the current file size.
 
 What I meant is that lseek doesn't just return the byte position, it
 also makes sure the next read or write happens at that position.  So
 at some point, some piece of software needs to tell the disk to move
 its reading head to the right point.  Whether this happens as part of
 lseek or the subsequent read/write, and whether this requires reading
 some of the data on the disk, is a matter of how this is implemented
 and what data structures does the filesystem maintain in memory at all
 times.

Sure, in theory it's up to the filesystem, but I'd be surprised if
any fs actually worked this way. There are several layers between an
lseek and the actual moving of disk heads (virtual fs, concrete fs,
block layer, caching, drive hardware etc.). In particular, if you
consider that other processes can read different files
simultaneously, it would be premature to position disk heads after
an lseek already. (If this is even possible; normally the OS sends a
request to read a block to the disk, and the disk controller does
the rest.) And as I said, lseek (SEEK_END, 0) is a rather
well-known idiom, so I don't think any fs implementor would consider
it an optimization to do anything like this. In short, all that's
needed for lseek is the current offset and the file size. Both
values really should be in memory for any open file.

  Instead of testing, I just looked at the implementation (Linux
  3.2.2). The following is really the whole relevant code. As you see,
  nothing's read from the disk, it only handles in-memory data. (Also
  the file size is in memory for open files; even it were not, it
  would be a constant-time access to the inode and wouldn't need to
  touch any data blocks.)
 
 I timed lseek on Windows on very large files (hundreds of MBs), and
 found that a single lseek takes less than 1 usec, at least with NTFS
 volumes and in my core i7 box.  So, while more efficient ways of
 revealing whether the file is empty are possible, I don't think such a
 small penalty justifies yet another set of ifdef's.

As I expected.

 The only situation where lseek could be really expensive is if the
 volume is compressed, because lseek on Windows returns the
 uncompressed offsets in that case.  I don't have access to any machine
 which has such volumes, so I cannot test that.  (Does Unix support
 such filesystems?  If so, what does lseek do there?)

At least I don't have access to one right now. But I still wouldn't
expect it to take long. They surely store the uncompressed size
somewhere, and as I said that's all that's needed. When actually
reading from the new position is the right time to compute the
compressed position. Furthermore, we do this check only on our temp
files. Though the user could put them (i.e., the directory where
tmpfile() will create them, usually /tmp on Unix) on a compressed
fs, that's not exactly the wisest thing to do, and we might not
really have to optimize for this case. :-)

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-25 Thread Tim Murphy

-- Forwarded message --
From: Tim Murphy tnmur...@gmail.com
Date: 25 April 2013 07:13
Subject: Re: [bug #33138] .PARLLELSYNC enhancement with patch
To: Paul D. Smith psm...@gnu.org

To be honest, I have done all this before with named semaphores including
the file that gets left over problem and it's all solvable quite nicely.
You pass the build id in the environment which is, after all, what it's for.

It has been done on huger builds than most people using make ever imagine
and contention doesn't appear to be a problem as the semaphore is held
momentarily while writing output.

I'm not saying that this clever stuff is not a good idea but the reasons
for doing it haven't proved to be valid in my experience.

On 24 April 2013 23:47, Paul Smith psm...@gnu.org wrote:

 On Wed, 2013-04-24 at 22:55 +0100, Tim Murphy wrote:
  why not use a named semaphore wherever possible (windows and linux)
  and lock a file where not instead of trying to pass kernel object
  handles around (seems a bit nasty to me)?

 Hi Tim; I think you're late to the party :-).  Let me summarize a lot of
 discussion then we can use this as a reference for other questions.

 Named semaphores on POSIX are kind of sucky.  They don't get
 automatically cleaned up when the process exits, for one thing.  And my
 suspicion is that they're not very portable across different variations
 of UNIX especially older ones.

 File locking, on the other hand, is very portable, very simple, very
 fast, and any held locks are automatically released when the descriptor
 is closed.  No muss, no fuss.

 If you agree file locking is the right way to go on a POSIX system, then
 we just have to decide what to lock.  It can be anything as long as all
 children of a given top-level make can find it.  We could use a
 temporary file, sure.  If we did that we'd have to pass around either
 the name of the file or, more likely, use tmpfile() and send along the
 file descriptor, handle close-on-exec, etc. like we do with the
 jobserver pipe.  Which we could certainly do: we do it today with
 jobserver.

 However we already have a descriptor available to us: stdout.  If we use
 that we don't have to pass around anything because our children are
 already inheriting it and it's on a well-known FD.  We don't have to
 worry about using + before sub-make rules like we do with jobserver.

 There's one other advantage of using stdout: if a sub-make redirects its
 output to a separate file, that magically starts a new lock domain and
 that sub-make and its children don't contend with the other sub-makes
 and their children.  This is not a big deal and if we didn't get it for
 free, we'd never go to the effort of implementing it.  But it's nice.

 Eli is working on the Windows port; I have no idea how he's decided to
 implement this there yet.

-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/

-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-25 Thread Eli Zaretskii

 Date: Thu, 25 Apr 2013 05:14:41 +0200
 Cc: psm...@gnu.org, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

 My suggestion was under the assumption that we use different
 declarations because of different types anyway -- which I still
 recommend, though Paul might have to decide this one.

Yes, eventually it's Paul's call.

 If I understand it correctly, roughly we'd have now:

 intptr_t sync_handle;

 [...]
 #ifdef POSIX
 sync_handle = (intptr_t) fileno (stdout);
 #else
 sync_handle = (intptr_t) get_mutex_handle (...);
 #endif
 [...]

 #ifdef POSIX
 fcntl ((int) sync_handle, ...);
 #else
 lock_mutex ((mutex *) sync_handle);
 #endif

 I.e., everything about it is separate, apart from the declaration,
 and the latter uses a type that isn't quite right for either one.

You underestimate me ;-)

What I have is actually this:

sync_handle_t sync_handle = -1;
[...]
#if POSIX
  sync_handle = fileno (stdout);
#elif WINDOWS
  sync_handle = create_mutex ();
#endif
[...]
  if (fcntl (sync_handle, F_SETLKW, fl) != -1)
return fl;

and I wrote 'fcntl' emulation for Windows that uses a mutex.

IOW, I didn't add even a single ifdef (the one you see above was
already there, and it can be removed if we try hard enough).

That said, I'm not wedded to the above approach, and if people like a
completely disjoint code for Windows, that would be a trivial change.
I just wanted to comply with what Paul said, viz.:

  Also, where is the best place to put the emulated Posix functions?
  Some new file in w32/compat/? 

 I'd like to see it there.  I'm thinking I want to move the new stuff out
 of job.c even for POSIX systems.  The ifdefs are really getting to me.

So I now have w32/compat/posixfcn.c with the emulation of fcntl (also
used for CLOSE_ON_EXEC), and a few support functions it needs.

 Though I generally like reducing the number of ifdefs, in this case
 it seems more consistent to split the declaration, so it's clear
 that these are really two different things. As a side benefit, if in
 the future someone tried to use it in common code, they would get
 compile-time errors on the other platform, rather than silently
 producing wrong code (like applying some fd operation to a mutex
 handle).

If we really want to separate Posix code from non-Posix one, I would
suggest to have identically named functions, such as
start_job_command, construct_command_argv, etc., in 2 separate files.
These are very large and complex functions, and it is very hard to
follow their logic when large portions of them are more-or-less copied
with non-trivial changes, and you have something like

#ifdef MSDOS
 /* lots of MSDOS stuff */
#else
#ifdef WINDOWS
 /* lots of similar-but-different Windows  stuff */
#else
#ifdef AMIGA
 /* Amiga stuff */
#else
#ifdef _EMX_
 /* EMX stuff */
#else
 /* Posix stuff */

Just paging through the irrelevant portions without losing track of
the logic is non-trivial.  You need a very good editor with folding
capabilities to not become distracted.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-25 Thread Paul Smith

On Thu, 2013-04-25 at 07:14 +0100, Tim Murphy wrote:
 To be honest, I have done all this before with named semaphores
 including the file that gets left over problem and it's all solvable
 quite nicely.  You pass the build id in the environment which is,
 after all, what it's for.

Sure.  Given enough effort all programming problems are solvable.  The
question is what is the effort, and what is the benefit?

At the moment to me it looks like the effort is not insignificant and I
don't see the benefit.  However, that may well just mean I haven't
thought of something.  Can you give a quick outline of how the cleanup /
error handling might work, and what advantages using named semaphores
has?

Cheers!



___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-25 Thread Tim Murphy

Ok, I don't want to be too much of a pain - I can see the way things have
gone and my attitude is oh well but I thought I'd put the case anyhow:

1)  sem_timedwait() in posix lets you timeout so in a big build when
something crashes or just sits around, there is at least the option of
printing an error message or giving up on locking from that point on which
is a bit of a godsend if it happens overnight and there's no-one to restart
the build.   With stdio and file locks I don't know if this is a problem
anyhow for crashes (are locks automatically released?) but the behaviour
between windows and linux might be different.

2) Semaphores have the semantics you want a) the ability to lock b)
system-wide c) named so that processes can access them. So it seems simpler
to me to use that semantic as the model and implement it in whatever is the
most faithful way for each platform.

3) As for POSIX semaphores, I found that one could do a lot by just having
init and delete functions that created and destroyed them at the start and
end of make execution. Only the parent make does this of course. You can
make them live in /tmp which is volatile anyhow and you pre-delete ones
that by incredible luck might exist before make started and have the same
name as the one you want to use.

4) In a way this makes the file-lock and semaphore mechanisms look rather
the same - files in /tmp. This seems easier to understand.

It's no big deal, though.

Regards,

Tim



On 25 April 2013 19:12, Paul Smith psm...@gnu.org wrote:

 On Thu, 2013-04-25 at 07:14 +0100, Tim Murphy wrote:
  To be honest, I have done all this before with named semaphores
  including the file that gets left over problem and it's all solvable
  quite nicely.  You pass the build id in the environment which is,
  after all, what it's for.

 Sure.  Given enough effort all programming problems are solvable.  The
 question is what is the effort, and what is the benefit?

 At the moment to me it looks like the effort is not insignificant and I
 don't see the benefit.  However, that may well just mean I haven't
 thought of something.  Can you give a quick outline of how the cleanup /
 error handling might work, and what advantages using named semaphores
 has?

 Cheers!





-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-25 Thread Eli Zaretskii

 Date: Thu, 25 Apr 2013 19:36:28 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org bug-make@gnu.org

 1)  sem_timedwait() in posix lets you timeout so in a big build when
 something crashes or just sits around, there is at least the option of
 printing an error message or giving up on locking from that point on which
 is a bit of a godsend if it happens overnight and there's no-one to restart
 the build.

How much would you use for the timeout, though?  A sub-Make could
legitimately run for a very long time, depending on what's in the
Makefile.

FWIW, I currently let Make wait forever for the mutex.

 With stdio and file locks I don't know if this is a problem
 anyhow for crashes (are locks automatically released?)

Yes, locks are automatically released once the file descriptor is
closed (which happens on a crash or exit).

 but the behaviour between windows and linux might be different.

It is similar: when a process holding a mutex exits or crashes on
Windows, the mutex is released.  If another process was waiting for
the mutex, it gets a special indication about such calamities (because
in some cases of synchronization, that might mean the shared resource
is in inconsistent state), but in our case, we don't care, at least
the way I wrote the code in its current shape.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

 That's one way.  Another one is to discuss the design more thoroughly
 before the patches are accepted.

I think it was discussed quite extensively. Also in retrospect, I
don't see how the design could have been much different (see below).

 I tried to turn the discussion that
 way when the issue was first brought up, but my comments were largely
 ignored (if I compare what was suggested then with what was eventually
 committed), and most of the discussions were about the details of the
 Unix implementation anyway.

I just reread the whole discussion, and as far as I can see, most of
your questions were answered. Already in
http://lists.gnu.org/archive/html/bug-make/2011-04/msg00033.html
David wrote: Yes, it's conceptually a semaphore. In fact a Windows
port might prefer to use real semaphores.

You wrote:

: Yes, but a few words about how is this semaphore supposed to get job
: done, and in fact what kind of synchronization will this bring to
: Make, would be appreciated.  I don't think you described the feature
: too much in your original post.

Just like David, I really don't know what more you want described. I
can repeat what I wrote yesterday: The only thing really necessary
is that two makes [with the same stdout/stderr] never run
sync_output() simultaneously. (Note: We're only talking about makes,
i.e. cooperating processes; we never need to synchronize the
processes run from the recipes, apart from recursive makes.)

That's really all there is to it, the rest is implementation details
(and may be totally different on Windows). Perhaps you're thinking
of it as much more complicated than is really is. It's just about
the simplest possible use case of an inter-process mutual exclusion
(all processes related, so they can pass info around; all
cooperating; just one code block to lock).

: Btw, there will be other side effects, at least on non-Posix
: platforms, due to the fact that stuff that was supposed to go to the
: screen is redirected to a file instead.  Some programs sense that and
: behave differently, e.g. with binary non-printable characters or with
: special character sequences.  By redirecting the output to files, then
: displaying it on the screen, the output may look strangely, or have
: some more grave effect, like terminating output prematurely.

I don't see why it would terminate prematurely, but indeed it may
display differently, also on Unix, e.g. ls --color=tty may use
colors whwn writing to a terminal and not to a file. So if you use
output-sync and your recipes contains this command, you lose colors.
I'd say you have several options:

- Live with it.

- Change the command (in the ls example, use --color=always).

- If the effect is more serious than losing colors (you say it may
  look strangely), maybe you shouldn't put such a command in a make
  recipe in the first place (at least not in a Makefile that's going
  to be distributed; note that the same problems will happen if the
  user redirects the whole make output to a file which I often do).

- If the Makefile is only used in-house and you definitely want the
  output-device-dependent behaviour, just don't use output-sync.
  It's just an option.

- Provide your own work-around, perhaps write your output to a pty.

What I'm saying is the circumstances where this is a real problem
seem rather exotic, and noone's forcing you to use the option. If we
were talking about changing make's default behaviour, the concern
may be justified, but we aren't.

: We should consider these potential problems, and in any case the
: simple loop that reads from temporary files and writes to
: stdout/stderr may need to become much more complicated, at least
: on DOS/Windows.  (I believe similar, though less serious, problems
: could happen on Unix as well.)

Apart from setting O_BINARY, I don't see why. Can you explain?

WRT O_BINARY, you wrote:

:  . pump_from_tmp_fd will need to switch to_fd to binary mode, since
:this is how tmpfile opens the temporary file; we need output mode
:match the input mode.

I was about to add it, but now I wonder if that's really needed:
If the temp file is opened in binary mode, and this temp file is
passed as stdout/stderr to the shell commands, those will write
their output in binary mode (i.e., without CR, AIUI). Later
pump_from_tmp_fd() will read it back in binary and dump it to the
original stdout/stderr in whatever mode it was (usually non-binary),
so the CRs get added there. Or is there any special handling of
stdout/stderr WRT binary mode during inheritance or so?

I'm attaching a small patch to:

- Factor out is_same_file() into a function.

- Turn file_ok() and fd_not_empty() into functions instead of
  macros. (No real need for macros, and functions may be easier to
  port.)

- Use SEEK_END rather than SEEK_CUR in fd_not_empty(), so it also
  reports true if the fd is at the beginning of a non-empty file
  (such as if the position is not shared between processes
  inheriting

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 05:56:49 +0300
 From: Eli Zaretskii e...@gnu.org
 Cc: f.heckenb...@fh-soft.de, bug-make@gnu.org

 I will see if locking works on console handles; if not, I will have
 to introduce a command-line argument for passing the name or the
 handle of a mutex to a sub-Make.

As I suspected, neither LockFileEx nor LockFile work on console
handles, they all fail with ERROR_INVALID_HANDLE.  The functions do
work on regular files, but since (as it turns out) file locks on
Windows are mandatory, not advisory as on Posix platforms, they are
not really suitable for this job anyway.

So file locking is out, mutex is in.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Thu, 18 Apr 2013 21:57:56 +0300
 From: Eli Zaretskii e...@gnu.org
 Cc: bug-make@gnu.org, bo...@kolpackov.net

  Date: Thu, 18 Apr 2013 19:09:06 +0200
  From: Frank Heckenbach f.heckenb...@fh-soft.de

. calculation of combined_output in start_job_command will need to be
  reimplemented for Windows, since the reliance on st_dev and st_ino
  makes assumptions that are false on Windows.

  What we need is basically is_same_file (fd1, fd2) (which we
  probably could refactor into a separate function). Eli, can you try:

test file1 -ef file2

  where test is either a standalone utility or a built-in of bash or
  any other shell. If any of these works, you can see how they do it.
  Otherwise, a bit of googling turned up this page which has code (in
  Python, but it seems like a thin layer around the system calls, so
  it can probably easily be ported to C):

  http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html

 I know about this method, but I'm not sure it does what we want here.
 The number used by this method is not guaranteed to be unique on some
 versions of Windows.  More importantly, it only works for disk files,
 I don't know whether it reports a meaningful value for the console
 device or the null device.

Testing clearly shows it doesn't: GetFileInformationByHandle simply
fails for handles open on console devices and the null device.  And we
will be comparing handles for console devices in the majority of use
cases here.

I researched this a bit, and eventually succeeded in finding a way
that is good enough for both disk files, pipes, and console devices.

  . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows.

This is to test if somehow make's stdout or stderr has been closed (not
just redirected to /dev/null, but closed).

   Right, got that; but Windows needs a different test for that, as
   there's no fcntl.

  Basically we need just any call that succees for a valid file and
  fails otherwise. fstat() might be a good candidate.

 fstat is too heavy on Windows; there are better methods.

  If that isn't available, maybe even lseek() (as used in
  FD_NOT_EMPTY) will do if we keep the || errno != EBADF test.

 lseek fails for the console as well, so it's not a good candidate.
 But this problem is not difficult, I just mentioned that the code as
 committed won't compile.

  Eli, can you do some tests to see if you find something that works
  reliably?

 Don't worry about that.  My worst problem is lack of time, not lack of
 ideas ;-)

Found a reasonable solution for that as well.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

  I know about this method, but I'm not sure it does what we want here.
  The number used by this method is not guaranteed to be unique on some
  versions of Windows.  More importantly, it only works for disk files,
  I don't know whether it reports a meaningful value for the console
  device or the null device.
 
 Testing clearly shows it doesn't: GetFileInformationByHandle simply
 fails for handles open on console devices and the null device.  And we
 will be comparing handles for console devices in the majority of use
 cases here.

That's right. Or perhaps pipes (I post-process make's output, to
handle the directories in messages etc. and I suppose the mentioned
Emacs mode does something similar).

 I researched this a bit, and eventually succeeded in finding a way
 that is good enough for both disk files, pipes, and console devices.

Good. (And the null device, I suppose. Though it doesn't really
matter if we write to the null device combined or separate. ;-)

   . STREAM_OK uses fcntl and F_GETFD which both don't exist on 
  Windows.
  
   Basically we need just any call that succees for a valid file and
   fails otherwise.

 Found a reasonable solution for that as well.

OK, so only one problem left.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-24 Thread Tim Murphy

Some time ago when solving the same problem in a different way we used
semaphores on Windows and Linux. Compatibility might make it less
interesting but I would suggest pretending that one has semaphores first
with some nice little abstraction and then implementing them in the best
way the platform has to offer.

We didn't need to support so many platforms but you can see the idea here:
https://bitbucket.org/tnmurphy/raptor/src/fbb2e624d320e5eabc0689105e2f2b80d131ca03/util/talon/sema.h?at=default

Basically each build needs a unique id to be passed around in an
environment variable. The top-level make executable's PID should be good
enough. This is used to create the named semaphore.

Platforms without semaphores might use a file and locking or if that
doesn't work then they can't implement the feature and you configure it out.

Regards,

Tim

On 24 April 2013 18:14, Frank Heckenbach f.heckenb...@fh-soft.de wrote:

Eli Zaretskii wrote:

I know about this method, but I'm not sure it does what we want here.
The number used by this method is not guaranteed to be unique on some
versions of Windows. More importantly, it only works for disk files,
I don't know whether it reports a meaningful value for the console
device or the null device.

Testing clearly shows it doesn't: GetFileInformationByHandle simply
fails for handles open on console devices and the null device. And we
will be comparing handles for console devices in the majority of use
cases here.

That's right. Or perhaps pipes (I post-process make's output, to
handle the directories in messages etc. and I suppose the mentioned
Emacs mode does something similar).

I researched this a bit, and eventually succeeded in finding a way
that is good enough for both disk files, pipes, and console devices.

Good. (And the null device, I suppose. Though it doesn't really
matter if we write to the null device combined or separate. ;-)

. STREAM_OK uses fcntl and F_GETFD which both don't exist on
Windows.

Basically we need just any call that succees for a valid file and
fails otherwise.

Found a reasonable solution for that as well.

OK, so only one problem left.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

--
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-24 Thread David Boyce

On Wed, Apr 24, 2013 at 10:26 AM, Tim Murphy tnmur...@gmail.com wrote:

  I would suggest pretending that one has semaphores first with some nice
 little abstraction and then implementing them in the best way the platform
 has to offer.


How about we introduce functions called acquire_semaphore() and
release_semaphore()? Oh, wait, we did.

Platforms without semaphores might use a file and locking or if that
 doesn't work then they can't implement the feature and you configure it out.


This kind of (re)invention is not known to be needed at this time. I have
the impression Eli has the Windows port in hand, he's just skeptical about
a few details. And what's there now is already the LCD for most other
platforms as far as I know.

David
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 18:50:15 +0200
 Cc: bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

  That's one way.  Another one is to discuss the design more thoroughly
  before the patches are accepted.

 I think it was discussed quite extensively. Also in retrospect, I
 don't see how the design could have been much different (see below).

Fine, then this discussion should probably be terminated.  The last
thing I want is to annoy people.  If you think that everything is and
was like it should have, then I will have to live with that.  It's not
the first time, either, and Make is not the first project where I saw
this happening.

  I tried to turn the discussion that
  way when the issue was first brought up, but my comments were largely
  ignored (if I compare what was suggested then with what was eventually
  committed), and most of the discussions were about the details of the
  Unix implementation anyway.

 I just reread the whole discussion, and as far as I can see, most of
 your questions were answered.

Answered, yes.  But the real reason behind my questions was not
really to have them answered.  It was to affect the design and the
implementation of the feature.  In that, I failed.  You might argue
that I failed because there was nothing to change anyway, but I beg to
differ.

 : Yes, but a few words about how is this semaphore supposed to get job
 : done, and in fact what kind of synchronization will this bring to
 : Make, would be appreciated.  I don't think you described the feature
 : too much in your original post.

 Just like David, I really don't know what more you want described.

Nothing.  I already understood what's there to understand, thanks.
I'm half way into implementing this on Windows, which wouldn't have
been possible _unless_ I understood this completely.  The problems,
whatever they were, and the reasons for my rant are history now,
i.e. immutable and uncorrectable.  We can only behave differently in
the future.  Or not.

 : Btw, there will be other side effects, at least on non-Posix
 : platforms, due to the fact that stuff that was supposed to go to the
 : screen is redirected to a file instead.  Some programs sense that and
 : behave differently, e.g. with binary non-printable characters or with
 : special character sequences.  By redirecting the output to files, then
 : displaying it on the screen, the output may look strangely, or have
 : some more grave effect, like terminating output prematurely.

 I don't see why it would terminate prematurely

It was long ago, but I presume I thought about the ^Z character that
some programs write or interpret to signal end of file.  Writing that
to the console stops output, AFAIR.

 - Change the command (in the ls example, use --color=always).

This will only work on a Posix console.  The Windows console doesn't
support the SGR sequences, and will display them as ugly binary
garbage.  (When 'ls' or 'grep' run on Windows and their stdout is
connected to a console, they use special console output commands to
produce color, but those commands cannot be recorded in a file and
replayed, as you can with SGR sequences.)

 What I'm saying is the circumstances where this is a real problem
 seem rather exotic, and noone's forcing you to use the option. If we
 were talking about changing make's default behaviour, the concern
 may be justified, but we aren't.

I just mentioned some caveats that people might find surprising,
that's all.  Perhaps those caveats should be mentioned in the
documentation.  No other intentions.

 : We should consider these potential problems, and in any case the
 : simple loop that reads from temporary files and writes to
 : stdout/stderr may need to become much more complicated, at least
 : on DOS/Windows.  (I believe similar, though less serious, problems
 : could happen on Unix as well.)

 Apart from setting O_BINARY, I don't see why. Can you explain?

You just explained it yourself.  Sure, if we don't want to do anything
about those issues, then no complications are needed.

 WRT O_BINARY, you wrote:

 :  . pump_from_tmp_fd will need to switch to_fd to binary mode, since
 :this is how tmpfile opens the temporary file; we need output mode
 :match the input mode.

 I was about to add it

Please don't.  I'm actively working on this, will test it, and hope to
commit the changes in a few days.  Making such changes in parallel
will just make my merging harder.

 but now I wonder if that's really needed: If the temp file is opened
 in binary mode, and this temp file is passed as stdout/stderr to the
 shell commands, those will write their output in binary mode

No, they won't.  The text/binary mode is per process (and per file
descriptor in the process), it is not global and not per resource.
That's because the text mode is implemented in the C library and uses
data private to each process.

Therefore, even if the handle is inherited by child processes, those
processes will not magically start

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 19:14:01 +0200
 Cc: bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

  Testing clearly shows it doesn't: GetFileInformationByHandle simply
  fails for handles open on console devices and the null device.  And we
  will be comparing handles for console devices in the majority of use
  cases here.

 That's right. Or perhaps pipes (I post-process make's output, to
 handle the directories in messages etc. and I suppose the mentioned
 Emacs mode does something similar).

GetFileInformationByHandle does work on pipes, at least on shell
pipes (as in foo | cat).

  I researched this a bit, and eventually succeeded in finding a way
  that is good enough for both disk files, pipes, and console devices.

 Good. (And the null device, I suppose. Though it doesn't really
 matter if we write to the null device combined or separate. ;-)

No, I didn't find a satisfactory solution for the null device.  But as
you point out, we don't care.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 10:34:20 -0700
 From: David Boyce david.s.bo...@gmail.com
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de,
   bug-make@gnu.org bug-make@gnu.org

 On Wed, Apr 24, 2013 at 10:26 AM, Tim Murphy tnmur...@gmail.com wrote:

   I would suggest pretending that one has semaphores first with some nice
  little abstraction and then implementing them in the best way the platform
  has to offer.

 How about we introduce functions called acquire_semaphore() and
 release_semaphore()? Oh, wait, we did.

Sorry, but this is not funny.  There are, indeed, such functions, but
that's about all there is about modularity in the implementation.  The
rest is code scattered all over the place, that _knows_ all too well
that it is dealing with file descriptors, not with some abstract
semaphores or mutexes.  Here's a good example:

  if (output_sync)
{
  static int combined_output;
  /* If output_sync is turned on, find a resource to
  synchronize on. This block is traversed only once. */
  if (sync_handle == -1)
{
  if (STREAM_OK (stdout))
{
  struct stat stbuf_o, stbuf_e;

  sync_handle = fileno (stdout);
  combined_output =
fstat (fileno (stdout), stbuf_o) == 0 
fstat (fileno (stderr), stbuf_e) == 0 
stbuf_o.st_dev == stbuf_e.st_dev 
stbuf_o.st_ino == stbuf_e.st_ino;
}
  else if (STREAM_OK (stderr))
sync_handle = fileno (stderr);
  else
{
  perror_with_name (output-sync suppressed: , stderr);
  output_sync = 0;
}
}

There are 2 separate things intermixed here: determination of
combined_output and determination of the resource to use as a mutex.
They are mixed because they both deal with file descriptors, and it
was just convenient, by sheer luck (or maybe because of something
else) to save a few if's and do them together.

But this is exactly where abstractions leak and break.

It will come as small surprise, I'm sure, that the corresponding code
in the Windows branch looks different, because there the two decisions
are uncoupled.

 I have the impression Eli has the Windows port in hand

Almost, yes.  I needed to invest some time into researching how to do
is_same_file, STREAM_OK, CLOSE_ON_EXEC, etc.  But I'm getting there.

 he's just skeptical about a few details.

Not anymore, thanks.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

On Wed, 2013-04-24 at 21:17 +0300, Eli Zaretskii wrote:
 There's one issue that perhaps needs discussing.  A mutex is
 identified by a handle, which on Windows is actually a pointer to an
 opaque object (maintained by the kernel).  As such, using just 'int'
 for sync_handle is not wide enough, certainly not in 64-bit builds.
 Is it OK to use intptr_t instead?  Doing this cleanly might require to
 have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor
 that is passed to a Posix fcntl; on Windows that macro will be a
 no-op, and on Posix platforms it's a simple cast.
 
 Is this OK?

I think the lock or lock handle or whatever can be encapsulated into
a typedef, which will be different between the different platforms.
That sounds like a good abstraction to me.

We can even generalize the way in which we communicate the handle to
sub-makes; for example by calling a function with the handle that
returns a char*: if that value is non-NULL it's added to the command
line (maybe as a third element to the current jobs-fd argument).  If
it's null, nothing is added.  Then the submake can parse it out and hand
it back to a function that returns a handle again
(serialize/deserialize).  I'm not sure if this is necessary; it depends
on the details of the Windows model.

  +/* Test whether a file contains any data. */
  +static int
  +fd_not_empty (int fd)
  +{
  +  return fd = 0  lseek (fd, 0, SEEK_END)  0;
  +}
 
 Isn't this unnecessarily expensive (with large output volumes)?  Why
 not use fstat?

This lseek() doesn't actually move the file reference: SEEK_END plus an
offset of 0 is a no-op so it doesn't matter how large the file is.  This
is just seeing if the position has moved since we opened the file (still
at 0 or not); it just returns the current position in the file, which is
known to the system directly without having to go ask anyone (it has to
be so, since each file descriptor has its own position).

I would be greatly surprised if fstat(), which has to go to the
directory (probably) to look up all the information on the file such as
ownership, permissions, etc., is faster.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

  : Btw, there will be other side effects, at least on non-Posix
  : platforms, due to the fact that stuff that was supposed to go to the
  : screen is redirected to a file instead.  Some programs sense that and
  : behave differently, e.g. with binary non-printable characters or with
  : special character sequences.  By redirecting the output to files, then
  : displaying it on the screen, the output may look strangely, or have
  : some more grave effect, like terminating output prematurely.
  
  I don't see why it would terminate prematurely
 
 It was long ago, but I presume I thought about the ^Z character that
 some programs write or interpret to signal end of file.  Writing that
 to the console stops output, AFAIR.

Oh, that's still done? When I last used Dos in the 1990s it was
already obsolete and few programs did it. OK, so you'll have to
strip them. Fortunately, that's not too hard, I assume you have
already implemented it.

  What I'm saying is the circumstances where this is a real problem
  seem rather exotic, and noone's forcing you to use the option. If we
  were talking about changing make's default behaviour, the concern
  may be justified, but we aren't.
 
 I just mentioned some caveats that people might find surprising,
 that's all.  Perhaps those caveats should be mentioned in the
 documentation.  No other intentions.

OK, we can add a few sentences in the docs. Have you already written
something (just to avoid further dupliation of work)?

  - Factor out is_same_file() into a function.
 
 I already wrote this in my branch, please wait until I commit.
 
  - Turn file_ok() and fd_not_empty() into functions instead of
macros. (No real need for macros, and functions may be easier to
port.)
 
 Fixed that as well here.

OK, so Paul now has two versions to choose from. :-)

Really, I just tried to be helpful. Since you were constantly
complaining about the design, I did the (small) things I saw to
improve about it. Other than that I didn't (and still don't) know
what's so bad about the design.

 There's one issue that perhaps needs discussing.  A mutex is
 identified by a handle, which on Windows is actually a pointer to an
 opaque object (maintained by the kernel).

And the pointer is the same between different processes?

 As such, using just 'int'
 for sync_handle is not wide enough, certainly not in 64-bit builds.
 Is it OK to use intptr_t instead?  Doing this cleanly might require to
 have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor
 that is passed to a Posix fcntl; on Windows that macro will be a
 no-op, and on Posix platforms it's a simple cast.
 
 Is this OK?

I can't follow you here. On POSIX, we don't need to pass a fd
because it's always stdout/stderr. Or do mean something else?

  +/* Test whether a file contains any data. */
  +static int
  +fd_not_empty (int fd)
  +{
  +  return fd = 0  lseek (fd, 0, SEEK_END)  0;
  +}
 
 Isn't this unnecessarily expensive (with large output volumes)?  Why
 not use fstat?

On POSIX both lseek and fstat are not very expensive. On Windows,
you said fstat was very expensive, didn't you? Or is lseek even
worse?

  How about we introduce functions called acquire_semaphore() and
  release_semaphore()? Oh, wait, we did.
 
 Sorry, but this is not funny.  There are, indeed, such functions, but
 that's about all there is about modularity in the implementation.  The
 rest is code scattered all over the place, that _knows_ all too well
 that it is dealing with file descriptors, not with some abstract
 semaphores or mutexes.  Here's a good example:

 [...]

 There are 2 separate things intermixed here: determination of
 combined_output and determination of the resource to use as a mutex.
 They are mixed because they both deal with file descriptors, and it
 was just convenient, by sheer luck (or maybe because of something
 else) to save a few if's and do them together.

I talked about this yesterday or so. I still don't see why you can't
keep this code as it. You can use sync_handle as a flag to make sure
the code is executed only once and ignore it for the locking
purpose.

Of course, you can rewrite the code to separate the two things, but
I still don't see that it's necessary. The checks to be done
(file_ok, is_same_file) are the same anyway, so I don't see it as a
design flaw.

Paul Smith wrote:

   +/* Test whether a file contains any data. */
   +static int
   +fd_not_empty (int fd)
   +{
   +  return fd = 0  lseek (fd, 0, SEEK_END)  0;
   +}
  
  Isn't this unnecessarily expensive (with large output volumes)?  Why
  not use fstat?
 
 This lseek() doesn't actually move the file reference: SEEK_END plus an
 offset of 0 is a no-op so it doesn't matter how large the file is.  This
 is just seeing if the position has moved since we opened the file (still
 at 0 or not); it just returns the current position in the file, which is
 known to the system directly without having to go ask anyone (it has to
 be so,

Re: [bug #33138] .PARLLELSYNC enhancement with patch

I'm fully prepared to accept the blame for not doing the best job
getting buy-in etc. on this effort.  Can we leave the discussion on the
process behind?  I'd prefer that, unless there are real constructive
comments on how to do better next time rather than rehashing what was
done wrong.  I think we have discussed that enough, and I'd prefer to
avoid going further along that road.

Thanks.


On Wed, 2013-04-24 at 20:46 +0200, Frank Heckenbach wrote:
 That's true about SEEK_CUR which was there originally. I actually
 changed it to SEEK_END, which does move the position to the end.

Oh right.  My head cold is keeping me foggy.  What was the reason to
change to SEEK_END, again?

I'm not so sure fstat() is that cheap.  struct stat contains a lot of
information.  Although I guess since we are only ever talking about temp
files, not NFS files or something, it's probably not too bad.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 From: Paul Smith psm...@gnu.org
 Cc: e...@gnu.org, bug-make@gnu.org
 Date: Wed, 24 Apr 2013 15:07:21 -0400

 I'm not so sure fstat() is that cheap.  struct stat contains a lot of
 information.  Although I guess since we are only ever talking about temp
 files, not NFS files or something, it's probably not too bad.

We could time it if we are afraid of the cost, but I'd be surprised if
'fstat' wasn't extremely fast on Posix platforms.  Most of the
information you get in struct stat is already available when the file
is open.  In particular, the OS tracks the file's size as it is being
written.

(On Windows, we will have to use a different method.)

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 20:46:14 +0200
 Cc: p...@mad-scientist.net, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

   I don't see why it would terminate prematurely

  It was long ago, but I presume I thought about the ^Z character that
  some programs write or interpret to signal end of file.  Writing that
  to the console stops output, AFAIR.

 Oh, that's still done? When I last used Dos in the 1990s it was
 already obsolete and few programs did it. OK, so you'll have to
 strip them. Fortunately, that's not too hard, I assume you have
 already implemented it.

No need: if the file is read and written in binary mode, ^Z is just a
character like any other.

  I just mentioned some caveats that people might find surprising,
  that's all.  Perhaps those caveats should be mentioned in the
  documentation.  No other intentions.

 OK, we can add a few sentences in the docs. Have you already written
 something (just to avoid further dupliation of work)?

No.  Go ahead, if you have time.

 Really, I just tried to be helpful.

I realize, and I am grateful.  The discussion was very helpful.

  There's one issue that perhaps needs discussing.  A mutex is
  identified by a handle, which on Windows is actually a pointer to an
  opaque object (maintained by the kernel).

 And the pointer is the same between different processes?

I arrange for it to be inherited, therefore yes.  This way, I don't
have to invent a unique name and don't risk a slim chance that some
stale process left a mutex by the same name around.  (Still need to
make sure that nameless mutexes will do the job in this case, though.)

  As such, using just 'int'
  for sync_handle is not wide enough, certainly not in 64-bit builds.
  Is it OK to use intptr_t instead?  Doing this cleanly might require to
  have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor
  that is passed to a Posix fcntl; on Windows that macro will be a
  no-op, and on Posix platforms it's a simple cast.

  Is this OK?

 I can't follow you here. On POSIX, we don't need to pass a fd
 because it's always stdout/stderr. Or do mean something else?

I meant the file descriptor passed to fcntl to put a lock on it.  The
value of that descriptor is stored in sync_handle, whose type is
'int'.

   +/* Test whether a file contains any data. */
   +static int
   +fd_not_empty (int fd)
   +{
   +  return fd = 0  lseek (fd, 0, SEEK_END)  0;
   +}

  Isn't this unnecessarily expensive (with large output volumes)?  Why
  not use fstat?

 On POSIX both lseek and fstat are not very expensive.

But fstat should be cheaper, as it already tracks the size of the
file, which is what you want here.

 On Windows, you said fstat was very expensive, didn't you? Or is
 lseek even worse?

I think anything that potentially moves the file pointer can be
sometimes expensive and is best avoided.  (On Windows, I'd use
GetFileInformationByHandle.)

  There are 2 separate things intermixed here: determination of
  combined_output and determination of the resource to use as a mutex.
  They are mixed because they both deal with file descriptors, and it
  was just convenient, by sheer luck (or maybe because of something
  else) to save a few if's and do them together.

 I talked about this yesterday or so. I still don't see why you can't
 keep this code as it.

Because the Windows implementation doesn't care whether stdout or
stderr (or both) are OK.  It doesn't use their file descriptor for
locking, so it doesn't care.  It just needs to know, for the purposes
of combined_output, whether they point to the same file/device.  But
for determining what to use for sync_handle, this is irrelevant.  So
the logic looks different.

 You can use sync_handle as a flag to make sure the code is executed
 only once and ignore it for the locking purpose.

I need to store the handle for the mutex somewhere, and sync_handle
sounds just like the right place.

 Nothing is actually read by lseek (and even if it were, it would
 only need to look at the first and last part of the file, not read
 all the content, if that was the worry).

Are you sure?  How can lseek jump to the last byte of the file, if
the file is not contiguous on disk, except by reading some of it?

Anyway, we should time this instead of arguing.

Or maybe we should abandon this optimization and take the lock
regardless.  How bad can that be?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

On Wed, 2013-04-24 at 22:25 +0300, Eli Zaretskii wrote:
  From: Paul Smith psm...@gnu.org
  Cc: e...@gnu.org, bug-make@gnu.org
  Date: Wed, 24 Apr 2013 15:07:21 -0400

  I'm not so sure fstat() is that cheap.  struct stat contains a lot of
  information.  Although I guess since we are only ever talking about temp
  files, not NFS files or something, it's probably not too bad.

 We could time it if we are afraid of the cost, but I'd be surprised if
 'fstat' wasn't extremely fast on Posix platforms.  Most of the
 information you get in struct stat is already available when the file
 is open.  In particular, the OS tracks the file's size as it is being
 written.

True.  It's probably all right there and not measurably different if you
have a file descriptor already.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

On Wed, 2013-04-24 at 22:39 +0300, Eli Zaretskii wrote:
  Nothing is actually read by lseek (and even if it were, it would
  only need to look at the first and last part of the file, not read
  all the content, if that was the worry).
 
 Are you sure?  How can lseek jump to the last byte of the file, if
 the file is not contiguous on disk, except by reading some of it?

If fstat() can get the size from an internal structure then lseek() can
do the same, then just update the file descriptor's position.  I don't
think there's more to it than setting that value, but it could be.
Certainly at the filesystem layer we don't know, and we don't care,
about things like whether the file is kept contiguously at the block
layer.

As you say, we should just measure.

 Or maybe we should abandon this optimization and take the lock
 regardless.  How bad can that be?

Well, we want to know if the file has any content anyway: for example we
don't want to output the enter/leave notifications if there's nothing to
print.  So there's no extra cost to avoiding the lock here.



___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-24 Thread Tim Murphy

why not use a named semaphore wherever possible (windows and linux) and
lock a file where not instead of trying to pass kernel object handles
around (seems a bit nasty to me)?


On 24 April 2013 21:19, Paul Smith psm...@gnu.org wrote:

 On Wed, 2013-04-24 at 22:39 +0300, Eli Zaretskii wrote:
   Nothing is actually read by lseek (and even if it were, it would
   only need to look at the first and last part of the file, not read
   all the content, if that was the worry).
 
  Are you sure?  How can lseek jump to the last byte of the file, if
  the file is not contiguous on disk, except by reading some of it?

 If fstat() can get the size from an internal structure then lseek() can
 do the same, then just update the file descriptor's position.  I don't
 think there's more to it than setting that value, but it could be.
 Certainly at the filesystem layer we don't know, and we don't care,
 about things like whether the file is kept contiguously at the block
 layer.

 As you say, we should just measure.

  Or maybe we should abandon this optimization and take the lock
  regardless.  How bad can that be?

 Well, we want to know if the file has any content anyway: for example we
 don't want to output the enter/leave notifications if there's nothing to
 print.  So there's no extra cost to avoiding the lock here.



 ___
 Bug-make mailing list
 Bug-make@gnu.org
 https://lists.gnu.org/mailman/listinfo/bug-make




-- 
You could help some brave and decent people to have access to uncensored
news by making a donation at:

http://www.thezimbabwean.co.uk/friends/
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

On Wed, 2013-04-24 at 22:55 +0100, Tim Murphy wrote:
 why not use a named semaphore wherever possible (windows and linux)
 and lock a file where not instead of trying to pass kernel object
 handles around (seems a bit nasty to me)?

Hi Tim; I think you're late to the party :-).  Let me summarize a lot of
discussion then we can use this as a reference for other questions.

Named semaphores on POSIX are kind of sucky.  They don't get
automatically cleaned up when the process exits, for one thing.  And my
suspicion is that they're not very portable across different variations
of UNIX especially older ones.

File locking, on the other hand, is very portable, very simple, very
fast, and any held locks are automatically released when the descriptor
is closed.  No muss, no fuss.

If you agree file locking is the right way to go on a POSIX system, then
we just have to decide what to lock.  It can be anything as long as all
children of a given top-level make can find it.  We could use a
temporary file, sure.  If we did that we'd have to pass around either
the name of the file or, more likely, use tmpfile() and send along the
file descriptor, handle close-on-exec, etc. like we do with the
jobserver pipe.  Which we could certainly do: we do it today with
jobserver.

However we already have a descriptor available to us: stdout.  If we use
that we don't have to pass around anything because our children are
already inheriting it and it's on a well-known FD.  We don't have to
worry about using + before sub-make rules like we do with jobserver.

There's one other advantage of using stdout: if a sub-make redirects its
output to a separate file, that magically starts a new lock domain and
that sub-make and its children don't contend with the other sub-makes
and their children.  This is not a big deal and if we didn't get it for
free, we'd never go to the effort of implementing it.  But it's nice.


Eli is working on the Windows port; I have no idea how he's decided to
implement this there yet.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Paul Smith wrote:

 On Wed, 2013-04-24 at 20:46 +0200, Frank Heckenbach wrote:
  That's true about SEEK_CUR which was there originally. I actually
  changed it to SEEK_END, which does move the position to the end.
 
 Oh right.  My head cold is keeping me foggy.  What was the reason to
 change to SEEK_END, again?

lseek (SEEK_END, 0) seeks to the end and then returns the current
position, i.e. the file size.

lseek (SEEK_CUR, 0) stays at the current position and returns it. So
if the fd (in the parent) remains at the start while the child wrote
something, it would return 0, whereas the former wouldn't.

AFAIK, lseek (SEEK_END, 0) is one of two canonical ways (besides
fstat) to get the size of a file.

Eli Zaretskii wrote:

   I just mentioned some caveats that people might find surprising,
   that's all.  Perhaps those caveats should be mentioned in the
   documentation.  No other intentions.
  
  OK, we can add a few sentences in the docs. Have you already written
  something (just to avoid further dupliation of work)?
 
 No.  Go ahead, if you have time.

Attached.

   As such, using just 'int'
   for sync_handle is not wide enough, certainly not in 64-bit builds.
   Is it OK to use intptr_t instead?  Doing this cleanly might require to
   have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor
   that is passed to a Posix fcntl; on Windows that macro will be a
   no-op, and on Posix platforms it's a simple cast.
   
   Is this OK?
  
  I can't follow you here. On POSIX, we don't need to pass a fd
  because it's always stdout/stderr. Or do mean something else?
 
 I meant the file descriptor passed to fcntl to put a lock on it.  The
 value of that descriptor is stored in sync_handle, whose type is
 'int'.

So you want to overload sync_handle to contain an fd on POSIX and a
pointer on Windows? I'm not sure I'd like this. Since it's used in
separate code branches (ifdef'd or perhaps in different source files
in compat) anyway, why not define separate variables of the
appropriate types (preferably with different names, to reduce
confusion)?

  On Windows, you said fstat was very expensive, didn't you? Or is
  lseek even worse?
 
 I think anything that potentially moves the file pointer can be
 sometimes expensive and is best avoided.  (On Windows, I'd use
 GetFileInformationByHandle.)

OK, if that's so, do that. But I don't think that's true on POSIX.

  Nothing is actually read by lseek (and even if it were, it would
  only need to look at the first and last part of the file, not read
  all the content, if that was the worry).
 
 Are you sure?  How can lseek jump to the last byte of the file, if
 the file is not contiguous on disk, except by reading some of it?

lseek doesn't need to read any data. It just sets the current offset
of the FD to the given position, so the next read (which in this
case never happens before seeking to the beginning) knows where to
read. Even in the case of SEEK_END, all it has to do is add the
given offset (here: 0) to the current file size.

Instead of testing, I just looked at the implementation (Linux
3.2.2). The following is really the whole relevant code. As you see,
nothing's read from the disk, it only handles in-memory data. (Also
the file size is in memory for open files; even it were not, it
would be a constant-time access to the inode and wouldn't need to
touch any data blocks.)

loff_t
generic_file_llseek_size(struct file *file, loff_t offset, int origin,
loff_t maxsize)
{
struct inode *inode = file-f_mapping-host;

switch (origin) {
case SEEK_END:
offset += i_size_read(inode);
break;
// skipped other cases
}

return lseek_execute(file, inode, offset, maxsize);
}

static loff_t lseek_execute(struct file *file, struct inode *inode,
loff_t offset, loff_t maxsize)
{
if (offset  0  !unsigned_offsets(file))
return -EINVAL;
if (offset  maxsize)
return -EINVAL;

if (offset != file-f_pos) {
file-f_pos = offset;
file-f_version = 0;
}
return offset;
}

static inline int unsigned_offsets(struct file *file)
{
return file-f_mode  FMODE_UNSIGNED_OFFSET;
}

static inline loff_t i_size_read(const struct inode *inode)
{
// skipped code to deal with preemption and synchronized access in SMP
return inode-i_size;
}
--- doc/make.texi.orig	2013-04-24 16:30:05.0 +0200
+++ doc/make.texi	2013-04-25 02:08:58.0 +0200
@@ -8639,6 +8639,17 @@
 complex parallel build in the background and checking its output
 afterwards.
 
+Note that with this option, each job's output is redirected to a
+temporary file first. Some commands can behave differently depending
+on the type of their standard output or standard error. E.g.,
+@code{ls --color=tty} might display a colored directory listing when
+standard output is connected to a terminal.

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 From: Paul Smith p...@mad-scientist.net
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org
 Date: Wed, 24 Apr 2013 16:03:56 -0400

  Or maybe we should abandon this optimization and take the lock
  regardless.  How bad can that be?

 Well, we want to know if the file has any content anyway: for example we
 don't want to output the enter/leave notifications if there's nothing to
 print.

Why not?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

  From: Paul Smith p...@mad-scientist.net
  Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org
  Date: Wed, 24 Apr 2013 16:03:56 -0400

   Or maybe we should abandon this optimization and take the lock
   regardless.  How bad can that be?

  Well, we want to know if the file has any content anyway: for example we
  don't want to output the enter/leave notifications if there's nothing to
  print.

 Why not?

To avoid clutter. There are probably still too many of these
messages anyway, as discussed before.

Note, we still give the regular enter/leave messages as we do
without output-sync, so no information is lost. We just avoid
additional enter/leave messages with nothing in between.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Wed, 24 Apr 2013 22:55:59 +0100
 From: Tim Murphy tnmur...@gmail.com
 Cc: bug-make@gnu.org bug-make@gnu.org

 trying to pass kernel object handles around (seems a bit nasty to
 me)?

Why is that nasty?  The handle is returned by a documented interface,
and communicating it to another process is an officially documented
technique.  How is that different from passing file descriptors, which
are indices into a kernel-maintained array?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Thu, 25 Apr 2013 02:16:33 +0200
 Cc: e...@gnu.org, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

   I can't follow you here. On POSIX, we don't need to pass a fd
   because it's always stdout/stderr. Or do mean something else?

  I meant the file descriptor passed to fcntl to put a lock on it.  The
  value of that descriptor is stored in sync_handle, whose type is
  'int'.

 So you want to overload sync_handle to contain an fd on POSIX and a
 pointer on Windows?

I don't see it as an overloading.  'sync_handle' is already opaque,
even its name says so.  It just so happens that on Unix it is actually
a file descriptor.  And it just so happens that on Windows it will be
a handle to a mutex.

 I'm not sure I'd like this. Since it's used in separate code
 branches (ifdef'd or perhaps in different source files in compat)
 anyway, why not define separate variables of the appropriate types
 (preferably with different names, to reduce confusion)?

I don't see any confusion (how probable is it that someone who is
interested in Unix will ever look at the details of the Windows
implementation, and vice versa?).  And we never do that in Make, we
generally try to use the same variables.  If nothing else, it keeps
the number of ifdef's down.
 --- doc/make.texi.orig2013-04-24 16:30:05.0 +0200
 +++ doc/make.texi 2013-04-25 02:08:58.0 +0200
 @@ -8639,6 +8639,17 @@
  complex parallel build in the background and checking its output
  afterwards.

 +Note that with this option, each job's output is redirected to a
 +temporary file first. Some commands can behave differently depending
 +on the type of their standard output or standard error. E.g.,
 +@code{ls --color=tty} might display a colored directory listing when
 +standard output is connected to a terminal. When using this option,
 +colors would be disabled because the output of the command goes to a
 +file. Note, however, that it is generally best to avoid such
 +commands in makefiles, independent of this option, since the
 +different behavior would also be triggered when users redirect the
 +whole output of @code{make}, e.g. to a log file.

Thanks.  I think you need two spaces between sentences (that's GNU
Coding Standards' requirement), and command lines people type should
be in @kbd, not @code.  Also, e.g. needs either a comma or a @:
after it, to signal to TeX it's not an end of a sentence.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

  Date: Thu, 25 Apr 2013 02:16:33 +0200
  Cc: e...@gnu.org, bug-make@gnu.org
  From: Frank Heckenbach f.heckenb...@fh-soft.de

I can't follow you here. On POSIX, we don't need to pass a fd
because it's always stdout/stderr. Or do mean something else?

   I meant the file descriptor passed to fcntl to put a lock on it.  The
   value of that descriptor is stored in sync_handle, whose type is
   'int'.

  So you want to overload sync_handle to contain an fd on POSIX and a
  pointer on Windows?

 I don't see it as an overloading.  'sync_handle' is already opaque,
 even its name says so.

Alright, in this case I'd suggest renaming it to sync_fd on the
POSIX side.

 It just so happens that on Unix it is actually
 a file descriptor.  And it just so happens that on Windows it will be
 a handle to a mutex.

  I'm not sure I'd like this. Since it's used in separate code
  branches (ifdef'd or perhaps in different source files in compat)
  anyway, why not define separate variables of the appropriate types
  (preferably with different names, to reduce confusion)?

 I don't see any confusion (how probable is it that someone who is
 interested in Unix will ever look at the details of the Windows
 implementation, and vice versa?).

I often grep through sources in order to quickly find all places a
declaration is used.

 And we never do that in Make, we
 generally try to use the same variables.  If nothing else, it keeps
 the number of ifdef's down.

My suggestion was under the assumption that we use different
declarations because of different types anyway -- which I still
recommend, though Paul might have to decide this one.

If I understand it correctly, roughly we'd have now:

intptr_t sync_handle;

[...]
#ifdef POSIX
sync_handle = (intptr_t) fileno (stdout);
#else
sync_handle = (intptr_t) get_mutex_handle (...);
#endif
[...]

#ifdef POSIX
fcntl ((int) sync_handle, ...);
#else
lock_mutex ((mutex *) sync_handle);
#endif

I.e., everything about it is separate, apart from the declaration,
and the latter uses a type that isn't quite right for either one.
Though I generally like reducing the number of ifdefs, in this case
it seems more consistent to split the declaration, so it's clear
that these are really two different things. As a side benefit, if in
the future someone tried to use it in common code, they would get
compile-time errors on the other platform, rather than silently
producing wrong code (like applying some fd operation to a mutex
handle).

 Thanks.  I think you need two spaces between sentences (that's GNU
 Coding Standards' requirement), and command lines people type should
 be in @kbd, not @code.  Also, e.g. needs either a comma or a @:
 after it, to signal to TeX it's not an end of a sentence.

Changed.
--- doc/make.texi.orig	2013-04-24 16:30:05.0 +0200
+++ doc/make.texi	2013-04-25 02:08:58.0 +0200
@@ -8639,6 +8639,17 @@
 complex parallel build in the background and checking its output
 afterwards.

+Note that with this option, each job's output is redirected to a
+temporary file first.  Some commands can behave differently depending
+on the type of their standard output or standard error.  E.g.,
+@kbd{ls --color=tty} might display a colored directory listing when
+standard output is connected to a terminal.  When using this option,
+colors would be disabled because the output of the command goes to a
+file.  Note, however, that it is generally best to avoid such
+commands in makefiles, independent of this option, since the
+different behavior would also be triggered when users redirect the
+whole output of @code{make}, e.g.@: to a log file.
+
 @item -q
 @cindex @code{-q}
 @itemx --question
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Fri, 19 Apr 2013 11:54:05 +0200
 Cc: bo...@kolpackov.net, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

 Eli Zaretskii wrote:

  Initial investigation indicates that tmpfile should do the job just
  fine: the file is deleted only when the last descriptor for it is
  closed.  That includes any duplicated descriptors.

 Great.

  As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated.
  In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK
  flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the
  job.  I will need to see how it works in reality, though.

 OK.

Upon a second look, I'm not sure I understand how this feature works,
exactly, and why you-all thought making it work on Windows is a matter
of a few functions.  I sincerely hope I'm missing something, please
bear with me.

First, most of the meat of OUTPUT_SYNC code, which sets up the stage
when running child jobs, is in a branch that isn't compiled on Windows
(#if !defined(__MSDOS__)  !defined(_AMIGA)  !defined(WINDOWS32)
on line 1482 of job.c).  So currently that part is not even run on
Windows.  Please tell me that nothing in this feature relies on
'fork', with its copying of handles and other data structures.
Because if it does, we have no hope of making it work on Windows, at
least not using the same algorithms as on Unix.

More importantly, how exactly locking the (redirected) stdout/stderr
of the child is supposed to cause synchronization, and why do we need
it at all?  Isn't synchronization already achieved by redirecting
child's output to a file, and only dumping it to screen when the child
exits?  What does lock add to this?  Who else will be writing what to
where, that we want to prevent by holding the lock/semaphore?

In an old thread, Paul explained something similar:

 David, can you explain why you needed to lock the files?  Also, what
 region(s) of the file you are locking?  fcntl with F_WRLCK won't work
 on Windows, so the question is how to emulate it.

David wants to interlock between ALL instances of make printing output,
so that even during recursive makes no matter how many you have running
concurrently, only one will print its output at a time.

There is no specific region of the file that's locked: the lockfile is
basically a file-based, system-wide semaphore.  The entire file is
locked; it's empty and has no content.

Assuming this all is still basically true, I guess I still don't
understand what exactly is being locked and why.  E.g., why do we only
want to interlock instances of Make, but not the programs they run?
Also, acquire_semaphore is used only in sync_output, which is called
only when a child exits.  IOW, nothing is locked while the child
runs, only when its output is ready.

In addition, we are locking stdout.  But doesn't each instance of Make
have, or can have, its own stdout?  If so, how will the interlock
work?

What am I missing?  Probably a lot.

TIA

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread David Boyce

Since you asked basic questions I'm going to start this at a basic level.
Apologies if it covers some stuff you already know or if I misinterpreted
the questions. Note that I haven't actually looked at the patch that went
in so this is generally wrt the original.

The first thing is get the word lock out of your mind because we aren't
really locking anything. Yes, that API is in use but it's only to create a
semaphore or baton. Nobody is ever prevented from doing anything. It just
happens that on Unix the most portable (i.e. oldest) way of implementing a
semaphore is with the advisory locking API. All cooperating processes agree
not to proceed unless and until they are able to acquire the exclusive lock
on a shared file descriptor, but it's not necessary to ever actually write
anything to that descriptor.

Second, the original implementation (not sure if I ever sent that one one
in though) actually created a temp file to use as the semaphore fd. But
then I discovered that stdout can be locked in the same way, which is
simpler. But applying the lock to stdout is just a frill; it could be a
temp file, especially if some platform turned out to need it that way. I
just figured that stdout is always available, or at least if it's closed
you don't have to worry about synchronizing output.

Third, yes, nothing is locked while the child runs. If a shared resource
was locked during child runs it would have the effect of re-serializing the
build as each supposedly parallel child waited on the lock. So what happens
here is really very simple: each child (aka recipe) runs asynchronously,
assuming -j of course, and dumps its output to one or two temp files. Only
when the child has finished and wants to report results does it enter the
queue waiting for the baton. When it gets it, it holds it just long enough
to copy its output from the temp files to stdout/stderr and then lets the
next guy have his turn. Thus, assuming the average job runs for
a significant amount of time (multiples of a write() system call anyway)
there will not be much contention on the semaphore and it won't be a
bottleneck.

You're right that simply writing to temp files and dumping everything at
once when the job finished would be likely to reduce the incidence of
garbling even without the semaphore, but not to zero.

It may be that the locking of stdout is only useful on Unix due to the fact
that it's inherited into child processes. I don't know what Paul or Frank
is thinking, and as mentioned I haven't looked at the current version, but
my thinking originally was that Windows could easily handle this using its
own far richer set of semaphore/locking APIs. I'd actually expect this to
be easier and more natural on Windows than Unix. All that's required is to
choose a semaphore to synchronize on, dump output to temp files, and copy
it to stdout/stderr only after acquiring the semaphore. And remove the temp
files of course.

-David Boyce




On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote:

  Date: Fri, 19 Apr 2013 11:54:05 +0200
  Cc: bo...@kolpackov.net, bug-make@gnu.org
  From: Frank Heckenbach f.heckenb...@fh-soft.de
 
  Eli Zaretskii wrote:
 
   Initial investigation indicates that tmpfile should do the job just
   fine: the file is deleted only when the last descriptor for it is
   closed.  That includes any duplicated descriptors.
 
  Great.
 
   As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated.
   In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK
   flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the
   job.  I will need to see how it works in reality, though.
 
  OK.

 Upon a second look, I'm not sure I understand how this feature works,
 exactly, and why you-all thought making it work on Windows is a matter
 of a few functions.  I sincerely hope I'm missing something, please
 bear with me.

 First, most of the meat of OUTPUT_SYNC code, which sets up the stage
 when running child jobs, is in a branch that isn't compiled on Windows
 (#if !defined(__MSDOS__)  !defined(_AMIGA)  !defined(WINDOWS32)
 on line 1482 of job.c).  So currently that part is not even run on
 Windows.  Please tell me that nothing in this feature relies on
 'fork', with its copying of handles and other data structures.
 Because if it does, we have no hope of making it work on Windows, at
 least not using the same algorithms as on Unix.

 More importantly, how exactly locking the (redirected) stdout/stderr
 of the child is supposed to cause synchronization, and why do we need
 it at all?  Isn't synchronization already achieved by redirecting
 child's output to a file, and only dumping it to screen when the child
 exits?  What does lock add to this?  Who else will be writing what to
 where, that we want to prevent by holding the lock/semaphore?

 In an old thread, Paul explained something similar:

  David, can you explain why you needed to lock the files?  Also, what
  region(s) of the file you

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Tue, 23 Apr 2013 11:29:35 -0700
 From: David Boyce david.s.bo...@gmail.com
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make bug-make@gnu.org

 Since you asked basic questions I'm going to start this at a basic level.
 Apologies if it covers some stuff you already know or if I misinterpreted
 the questions. Note that I haven't actually looked at the patch that went
 in so this is generally wrt the original.
 [...]

Thanks, I will dwell on this.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread Paul Smith

On Tue, 2013-04-23 at 21:40 +0300, Eli Zaretskii wrote:
  Date: Tue, 23 Apr 2013 11:29:35 -0700
  From: David Boyce david.s.bo...@gmail.com
  Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make bug-make@gnu.org

  Since you asked basic questions I'm going to start this at a basic level.
  Apologies if it covers some stuff you already know or if I misinterpreted
  the questions. Note that I haven't actually looked at the patch that went
  in so this is generally wrt the original.
  [...]

 Thanks, I will dwell on this.

When thinking about this, remember that it's not enough to consider how
a single make invocation will work.  If you run with a single make
instance under -j, then redirecting each job's output to a temp file and
then when make reaps each job, copying the contents of that temp file to
stdout, is a sufficient solution.  You just need to be able to redirect
stdout/stderr of a given job to temporary files.  In UNIX of course this
is done by dup()'ing the file descriptors after the fork and before the
exec.  Presumably on Windows there's some other way to do this.

However, in a situation where we have recursive make instances all
running at the same time under -j then each one of those make instances
is also running one or more jobs in parallel.  In this case it's not
good enough for each make to synchronize its own jobs' output.

So in addition to the temp file change above, you ALSO need a way to
synchronize the use of the single resource (stdout) that is being shared
by all instances of recursive make.  On UNIX we have chosen to use an
advisory lock on the stdout file descriptor: it's handy, and it's the
resource being contended for, so it makes sense.

You asked, what if someone redirected the stdout of a sub-make.  In that
case things still work: that sub-make will not participate in the same
locking as the other sub-makes, it's true, but that's OK because the
output is going to a different location from the other sub-makes so
there's no need to synchronize them.  Meanwhile any sub-sub-makes using
the same output file will still be synchronous with each other.

I'm not sure if the lock locks the FD (so that if you dup'd the FD but
it still pointed to the same output, you could take exclusive locks on
both), or if it locks the underlying resource.  If the former I guess
it's possible to break the synchrony if you worked at it.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 From: Paul Smith psm...@gnu.org
 Cc: David Boyce david.s.bo...@gmail.com, f.heckenb...@fh-soft.de, 
   bug-make@gnu.org
 Date: Tue, 23 Apr 2013 15:04:39 -0400

 When thinking about this, remember that it's not enough to consider how
 a single make invocation will work.  If you run with a single make
 instance under -j, then redirecting each job's output to a temp file and
 then when make reaps each job, copying the contents of that temp file to
 stdout, is a sufficient solution.  You just need to be able to redirect
 stdout/stderr of a given job to temporary files.  In UNIX of course this
 is done by dup()'ing the file descriptors after the fork and before the
 exec.  Presumably on Windows there's some other way to do this.

It's the same way, actually (modulo the fork/exec dance).

 So in addition to the temp file change above, you ALSO need a way to
 synchronize the use of the single resource (stdout) that is being shared
 by all instances of recursive make.  On UNIX we have chosen to use an
 advisory lock on the stdout file descriptor: it's handy, and it's the
 resource being contended for, so it makes sense.

I still don't know how does Make achieve on Unix the interlocking with
its sub-Make's.  Is it because the lock is inherited as part of fork?
If so, we will need a special command-line argument on Windows to pass
the name of the mutex, or the value of its handle, down the sub-make
chain, because even if the handle is inherited, its value needs to be
communicated.

(I wish the design and implementation of this feature were less
Posix-centric...)

 I'm not sure if the lock locks the FD (so that if you dup'd the FD but
 it still pointed to the same output, you could take exclusive locks on
 both), or if it locks the underlying resource.  If the former I guess
 it's possible to break the synchrony if you worked at it.

Sorry, you lost me here.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread David Boyce

On Tue, Apr 23, 2013 at 12:04 PM, Paul Smith psm...@gnu.org wrote:

 I'm not sure if the lock locks the FD (so that if you dup'd the FD but
 it still pointed to the same output, you could take exclusive locks on
 both), or if it locks the underlying resource.


I'm pretty sure it's the underlying resource that gets locked. The man page
is not quite explicit but implicitly fairly clear. The opening paragraph
says:

Apply, test or remove a POSIX lock on a section of an open file.  The file
is specified by fd, a file descriptor open for writing


David
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread Frank Heckenbach

(TL;DR: Probably irrelevant.)

Paul Smith wrote:

 I'm not sure if the lock locks the FD (so that if you dup'd the FD but
 it still pointed to the same output, you could take exclusive locks on
 both), or if it locks the underlying resource.  If the former I guess
 it's possible to break the synchrony if you worked at it.

The Linux manpage says:

: As well as being removed by an explicit F_UNLCK, record locks are auto-
: matically released when the process terminates or if it closes any file
: descriptor  referring  to a file on which locks are held.  This is bad:
: it means that a process can lose the locks on a file  like  /etc/passwd
: or  /etc/mtab  when for some reason a library function decides to open,
: read and close it.
:
: Record locks are not inherited by a child created via fork(2), but  are
: preserved across an execve(2).

A quick test indicates that the lock indeed refers to the resource,
but several locks to the same resource are merged by the system
(rather than deadlocking itself):

#include fcntl.h

int main ()
{
  int a = open (/dev/null, O_RDWR);
  int b = open (/dev/null, O_RDWR);  // or b = dup (a)
  struct flock fl;
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 1;
  if (fcntl (a, F_SETLKW, fl))
perror (a);
  if (fcntl (b, F_SETLKW, fl))
perror (b);
  return 0;
}

Though I don't think any of this really matters, at least with our
current implementation, since we don't dup the FD locked or fork or
exec during this time. And it's possible break the synchrony if you
worked at it since our locking is only advisory, anyway. (I.e.,
within make; not so easy in the jobs run since they don't see the FD
used for locking, i.e. the original stdout.)

PS: Paul, the following comment is obsolete; I forgot to remove it:

  fl.l_start = 0; /* lock just one byte according to pid */

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread Frank Heckenbach

David Boyce wrote:

 The first thing is get the word lock out of your mind because we aren't
 really locking anything. Yes, that API is in use but it's only to create a
 semaphore or baton. Nobody is ever prevented from doing anything. It just
 happens that on Unix the most portable (i.e. oldest) way of implementing a
 semaphore is with the advisory locking API. All cooperating processes agree
 not to proceed unless and until they are able to acquire the exclusive lock
 on a shared file descriptor, but it's not necessary to ever actually write
 anything to that descriptor.

Just to clarify: It's not necessary to write anything to that
descriptor in order for the locking to work. We do actually write to
the descriptor when we have the lock, but that's just an
implementation detail, i.e. the lock could be something else, as
long as the different make instances agree not to write to the
descriptor without holding the lock.

 You're right that simply writing to temp files and dumping everything at
 once when the job finished would be likely to reduce the incidence of
 garbling even without the semaphore, but not to zero.

 It may be that the locking of stdout is only useful on Unix due to the fact
 that it's inherited into child processes. I don't know what Paul or Frank
 is thinking, and as mentioned I haven't looked at the current version, but
 my thinking originally was that Windows could easily handle this using its
 own far richer set of semaphore/locking APIs. I'd actually expect this to
 be easier and more natural on Windows than Unix. All that's required is to
 choose a semaphore to synchronize on, dump output to temp files, and copy
 it to stdout/stderr only after acquiring the semaphore. And remove the temp
 files of course.

Yes, as I wrote in another mail, even a completely global semaphore
should do. Sure, it's excluding much more than necessary, but since
the critical section is very short, this shouldn't hurt much. (In
other words, if make jobs produce such huge output that copying it
around takes a significant amount of time, nobody's ever going to
read that output anyway, or someone is post-processing it which will
take much longer than copying it, anyway.)

Indeed, as David wrote, under Unix, locking stdout/stderr is most
straightforward because it's available without further setup.
Incidentally, this way of locking is also as fine-grained as
possible (considering recursive makes). However, as I wrote, this
fine-grained locking is not really necessary, so it's not worth
worrying about replicating it on Windows if this causes trouble.

 On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote:
 
  Please tell me that nothing in this feature relies on
  'fork', with its copying of handles and other data structures.

All it requires is inheriting the redirected stdout/stderr to child
processes. This was already possible under Dos (with the exception
that since there was no fork, you had to redirect in the parent
process, call the child, then undirect in the parent, IIRC).

It's just like shell redirections, i.e. if you do foo  bar  baz,
the stdout of foo and all processes called by foo is redirected to a
file called bar, while baz and the rest of the shell continue
running with their original stdout. If that's the same under Windows
(at least using bash or a similar shell; no idea what Windows's own
shell does), there should be no problem (or the bash sources should
reveal what needs to be done). Perhaps you know all of this already
and perhaps it's trivial, or perhaps it's impossible ... (I really
don't know how different things are under Windows.)

  In an old thread, Paul explained something similar:
 
   David, can you explain why you needed to lock the files?  Also, what
   region(s) of the file you are locking?  fcntl with F_WRLCK won't work
   on Windows, so the question is how to emulate it.
 
  David wants to interlock between ALL instances of make printing output,
  so that even during recursive makes no matter how many you have running
  concurrently, only one will print its output at a time.
 
  There is no specific region of the file that's locked: the lockfile is
  basically a file-based, system-wide semaphore.  The entire file is
  locked; it's empty and has no content.
 
  Assuming this all is still basically true,

It's almost still true. As written above, we now don't use an extra,
empty, file for locking, but stdout itself. Otherwise it's still so,
we don't lock a particular region of the file, but use it as a
semaphore.

  I guess I still don't
  understand what exactly is being locked and why.  E.g., why do we only
  want to interlock instances of Make, but not the programs they run?
  Also, acquire_semaphore is used only in sync_output, which is called
  only when a child exits.  IOW, nothing is locked while the child
  runs, only when its output is ready.

As David wrote, that's necessary to preserve parallelism.

  In

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 Date: Tue, 23 Apr 2013 21:41:22 +0200
 From: Frank Heckenbach f.heckenb...@fh-soft.de

 Yes, as I wrote in another mail, even a completely global semaphore
 should do.

Not healthy, IMHO.  Some snafu could inadvertently and completely
silently stop an unrelated build somewhere on the same system.

 Sure, it's excluding much more than necessary, but since
 the critical section is very short, this shouldn't hurt much. (In
 other words, if make jobs produce such huge output that copying it
 around takes a significant amount of time, nobody's ever going to
 read that output anyway, or someone is post-processing it which will
 take much longer than copying it, anyway.)

It doesn't matter whether someone reads the output or not, that output
gets written and takes its time.  We sometimes invoke tools with
options we later regret, because we don't anticipate the results.

 Indeed, as David wrote, under Unix, locking stdout/stderr is most
 straightforward because it's available without further setup.
 Incidentally, this way of locking is also as fine-grained as
 possible (considering recursive makes). However, as I wrote, this
 fine-grained locking is not really necessary, so it's not worth
 worrying about replicating it on Windows if this causes trouble.

It is not easy to know what needs to be replicated and what doesn't,
because the fine print of the implementation is not clear, at least
to me, and therefore it could easily happen that non-replication of
some seemingly secondary detail completely breaks the design.  After
all, the basic models of Unix and Windows regarding processes and
signals are very different.

  On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote:

   Please tell me that nothing in this feature relies on
   'fork', with its copying of handles and other data structures.

 All it requires is inheriting the redirected stdout/stderr to child
 processes. This was already possible under Dos (with the exception
 that since there was no fork, you had to redirect in the parent
 process, call the child, then undirect in the parent, IIRC).

Inheritance is not the problem; _finding_ the inherited object or its
handle is.  With stdout, its name is global, but not so with a handle
to some mutex I can create or its name (unless we indeed make the name
fixed and thus system-global, which I don't like, if only because it's
not how Make works on Unix).

 It's just like shell redirections, i.e. if you do foo  bar  baz,
 the stdout of foo and all processes called by foo is redirected to a
 file called bar, while baz and the rest of the shell continue
 running with their original stdout. If that's the same under Windows
 (at least using bash or a similar shell; no idea what Windows's own
 shell does), there should be no problem (or the bash sources should
 reveal what needs to be done). Perhaps you know all of this already
 and perhaps it's trivial, or perhaps it's impossible ... (I really
 don't know how different things are under Windows.)

They aren't different, and all this is pretty basic.  It's not what
worries me.

 They can have their own stdout, in particular with the
 --output-sync=make option. But that's actually the harmless case:
 Each sub-make runs with its stdout already redirected to a temp file
 by the main make. In turn, it redirects the stdout of its children
 to separate temp files, and when they are done, collects the data to
 its stdout, i.e. the temp file from the main make. When the sub make
 is finished, the main make collects its output to the original
 stdout. So unless I'm mistaken, no locking is actually required in
 this case.

But we still acquire the semaphore in this case, don't we?  Perhaps we
shouldn't.

  I still don't know how does Make achieve on Unix the interlocking with
  its sub-Make's.  Is it because the lock is inherited as part of fork?

 The fd is inherited as part of fork. Each make that needs to takes a
 lock on the fd.

On the same fd, or, rather, on an fd connected to the same resource.
That's what I thought (it isn't spelled out anywhere in the comments,
and the code that implements that is quite scattered).

I need to check whether LockFileEx works on handles connected to the
console; somehow I'm not sure.  And LockFileEx is not ideal because it
doesn't work on Windows 9X.  So maybe I will have to use a mutex after
all.

  (I wish the design and implementation of this feature were less
  Posix-centric...)

 The implementation my be (though it's only two functions,
 acquire_semaphore(), release_semaphore()) that can be completely
 replaced (note, again, the fact that the stdout or stderr fd also
 serves for locking, is just an implementation detail and not central
 to the design).

I beg to differ.  The implementation relies heavily on the fact that
the semaphore's name or descriptor is globally known to every program
in the chain.  Writing a replacement for F_SETLKW using LockFileEx
took me no more than 5 minutes; it is only when I

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread Paul Smith

On Tue, 2013-04-23 at 23:16 +0300, Eli Zaretskii wrote:
  All it requires is inheriting the redirected stdout/stderr to child
  processes. This was already possible under Dos (with the exception
  that since there was no fork, you had to redirect in the parent
  process, call the child, then undirect in the parent, IIRC).
 
 Inheritance is not the problem; _finding_ the inherited object or its
 handle is.  With stdout, its name is global, but not so with a handle
 to some mutex I can create or its name (unless we indeed make the name
 fixed and thus system-global, which I don't like, if only because it's
 not how Make works on Unix).

You're right: we cannot use a fixed name.  That would allow only one
(top-level) invocation of make per system to use -O.  That's not
acceptable.

  They can have their own stdout, in particular with the
  --output-sync=make option. But that's actually the harmless case:
  Each sub-make runs with its stdout already redirected to a temp file
  by the main make. In turn, it redirects the stdout of its children
  to separate temp files, and when they are done, collects the data to
  its stdout, i.e. the temp file from the main make. When the sub make
  is finished, the main make collects its output to the original
  stdout. So unless I'm mistaken, no locking is actually required in
  this case.
 
 But we still acquire the semaphore in this case, don't we?  Perhaps we
 shouldn't.

We do acquire it but it's on a different resource.  We still do need the
semaphore because we want the output directed to the other file to be
synchronized as well, at least within itself.  If you run make -O and
your output to stdout is synchronized by you run make -O out and the
contents of the out file are not synchronized, that would be bogus.

However, I'm not worried about losing this ability to change the lock
domain for sub-makes based on redirection.  If all the sub-makes were
synchronized together even though some redirected to a different file
that is fine.

First, I sincerely doubt there are many makefiles which run sub-makes
with IO redirected like that.  And second, the slight loss in
parallelization is not worth a huge increase in complexity.  It's a nice
thing that we get it for free with stdout but I wouldn't expend days
trying to implement it if we didn't... in fact without stdout I just
don't see that it's even possible.  I guess if the child could somehow
detect that it was using a different stdout it could be done, but that
would require the parent passing information about stdout to the
child... ugh.  Not worth it.

 IOW, the top-level design is indeed quite general, but the low-level
 algorithmic details are not, and therefore just replacing these
 functions will not necessarily do the job.

Right.  The locking is the part that's not portable, and we need a lock
that can be shared between all the make instances for one top-level
make, but not shared with another top-level make.

Without knowing what kind of resource Windows can take locks on, we
can't really know how to help with that.  There must be precedent for
this kind of shared between a subset of processes lock on Windows.
What handle do we need to know, and how can that handle be communicated?
We can pass a filename through the environment, we can add a
command-line flag to the sub-makes, we can do whatever the equivalent of
inheriting an open file descriptor is on Windows...

 If we really want to make this reasonably portable (and without that,
 I cannot see how Paul's dream about less ifdef's is going to
 materialize), this code needs IMO to be refactored to make the
 algorithm know less about the implementation details.

Personally I've never had any luck trying to create portable code from
scratch, at least not unless I'm very familiar with all the platforms
which is certainly not the case here.

Once we see the second implementation it will be a lot more obvious what
needs to be done to make this more portable.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-23 Thread Frank Heckenbach

Eli Zaretskii wrote:

  Date: Tue, 23 Apr 2013 21:41:22 +0200
  From: Frank Heckenbach f.heckenb...@fh-soft.de

  Sure, it's excluding much more than necessary, but since
  the critical section is very short, this shouldn't hurt much. (In
  other words, if make jobs produce such huge output that copying it
  around takes a significant amount of time, nobody's ever going to
  read that output anyway, or someone is post-processing it which will
  take much longer than copying it, anyway.)

 It doesn't matter whether someone reads the output or not, that output
 gets written and takes its time.  We sometimes invoke tools with
 options we later regret, because we don't anticipate the results.

I was just saying we don't need to optimize for such a strange case.
I'm not saying we should fail in such a case, just that it may take
a little longer if locking is too coarse. Producing the output by
the jobs takes some time anyway, and if the additional delay caused
by locking is too much, they just shouldn't use this option. It's an
option after all.

  Indeed, as David wrote, under Unix, locking stdout/stderr is most
  straightforward because it's available without further setup.
  Incidentally, this way of locking is also as fine-grained as
  possible (considering recursive makes). However, as I wrote, this
  fine-grained locking is not really necessary, so it's not worth
  worrying about replicating it on Windows if this causes trouble.

 It is not easy to know what needs to be replicated and what doesn't,
 because the fine print of the implementation is not clear, at least
 to me, and therefore it could easily happen that non-replication of
 some seemingly secondary detail completely breaks the design.

I think it was discussed extensively in the original discussion (in
which you also participated) and here. If anything is still unclear,
feel free to ask. But just to repeat: The only thing really
necessary is that two makes with the same stdout/stderr never run
sync_output() simultaneously. If *any* two sub-makes never run
sync_output() simultaneously, that is also fine.

   On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote:

Please tell me that nothing in this feature relies on
'fork', with its copying of handles and other data structures.

  All it requires is inheriting the redirected stdout/stderr to child
  processes. This was already possible under Dos (with the exception
  that since there was no fork, you had to redirect in the parent
  process, call the child, then undirect in the parent, IIRC).

 Inheritance is not the problem; _finding_ the inherited object or its
 handle is.  With stdout, its name is global, but not so with a handle
 to some mutex I can create or its name (unless we indeed make the name
 fixed and thus system-global, which I don't like, if only because it's
 not how Make works on Unix).

Now we're talking about different things. I was replying to your
question what it requires about fork etc. (answer: just
stdout/stderr redirection like in the shell). You're worried about
how to pass a reference to a mutex around. Paul just replied to
that.

  They can have their own stdout, in particular with the
  --output-sync=make option. But that's actually the harmless case:
  Each sub-make runs with its stdout already redirected to a temp file
  by the main make. In turn, it redirects the stdout of its children
  to separate temp files, and when they are done, collects the data to
  its stdout, i.e. the temp file from the main make. When the sub make
  is finished, the main make collects its output to the original
  stdout. So unless I'm mistaken, no locking is actually required in
  this case.

 But we still acquire the semaphore in this case, don't we?  Perhaps we
 shouldn't.

It wouldn't really make things easier, just add another conditional
which needs to be tested etc.

   I still don't know how does Make achieve on Unix the interlocking with
   its sub-Make's.  Is it because the lock is inherited as part of fork?

  The fd is inherited as part of fork. Each make that needs to takes a
  lock on the fd.

 On the same fd, or, rather, on an fd connected to the same resource.
 That's what I thought (it isn't spelled out anywhere in the comments,
 and the code that implements that is quite scattered).

It's the same fd (namely, stdout or stderr) via inheritance. Of
course, you can say it's not the same fd, but two different fd's
(i.e., stdout of different processes) referring to the same
resource, but that's splitting hairs (and, again, probably
irrelevant since you're going to implement it differently anyway).

   (I wish the design and implementation of this feature were less
   Posix-centric...)

  The implementation my be (though it's only two functions,
  acquire_semaphore(), release_semaphore()) that can be completely
  replaced (note, again, the fact that the stdout or stderr fd also
  serves for locking, is just an implementation detail and

Re: [bug #33138] .PARLLELSYNC enhancement with patch

 From: Paul Smith p...@mad-scientist.net
 Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org
 Date: Tue, 23 Apr 2013 16:54:33 -0400

 Without knowing what kind of resource Windows can take locks on, we
 can't really know how to help with that.

The only resources that don't need their names passed to sub-Make are
the standard streams.  I will see if locking works on console handles;
if not, I will have to introduce a command-line argument for passing
the name or the handle of a mutex to a sub-Make.

 we can do whatever the equivalent of inheriting an open file
 descriptor is on Windows...

Doesn't help unless it's a file descriptor of one of the 3 standard
streams.  Any other descriptor needs to be passed on the command line,
for the same reasons we have --job-server-fds option.

  If we really want to make this reasonably portable (and without that,
  I cannot see how Paul's dream about less ifdef's is going to
  materialize), this code needs IMO to be refactored to make the
  algorithm know less about the implementation details.

 Personally I've never had any luck trying to create portable code from
 scratch, at least not unless I'm very familiar with all the platforms
 which is certainly not the case here.

 Once we see the second implementation it will be a lot more obvious what
 needs to be done to make this more portable.

That's one way.  Another one is to discuss the design more thoroughly
before the patches are accepted.  I tried to turn the discussion that
way when the issue was first brought up, but my comments were largely
ignored (if I compare what was suggested then with what was eventually
committed), and most of the discussions were about the details of the
Unix implementation anyway.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-21 Thread Paul Smith

On Fri, 2013-04-19 at 14:09 +0300, Eli Zaretskii wrote:
  Date: Fri, 19 Apr 2013 11:54:05 +0200
  Cc: bo...@kolpackov.net, bug-make@gnu.org
  From: Frank Heckenbach f.heckenb...@fh-soft.de

   Is there a simple enough Makefile somewhere that could be used to test
   this feature, once implemented?

  We have a test in the test suite (output-sync). Can you use that?

 I hoped for something simpler and not involving Perl or Unixy shell
 features, because I'd like to use this in the native Windows
 environment, where the Windows port of Make runs.  However, if that's
 the best possibility, I guess I'd craft something based on that test.

The basic feature can be tested trivially like this:

  all: one two

  one two:
  @echo start $@
  @sleep 1
  @echo stop $@

Now if you run this using make -j you'll get:

  start one
  start two
  stop one
  stop two

If you run this using make -j -O you should get:

  start one
  stop one
  start two
  stop two

There's more to test than that: before it's done we need to test
recursive make invocations for example.  But the above is simple.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-21 Thread Paul Smith

On Fri, 2013-04-19 at 12:36 +0300, Eli Zaretskii wrote:
 Also, where is the best place to put the emulated Posix functions?
 Some new file in w32/compat/? 

I'd like to see it there.  I'm thinking I want to move the new stuff out
of job.c even for POSIX systems.  The ifdefs are really getting to me.
I was thinking of creating a posix.c file for example.  The problem is
there's a lot of variation even in POSIX, and there's a lot of POSIX-y
stuff we actually do use in Windows too.  We might need to break it down
further than just posix.c.

For this feature I probably won't have a clear feel for how to split it
until I see what the Windows version looks like.

Anyway I'll probably wait until after the next release to do major
renovations.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-19 Thread Eli Zaretskii

 Date: Fri, 19 Apr 2013 00:06:52 +0300
 From: Eli Zaretskii e...@gnu.org
 Cc: bo...@kolpackov.net, bug-make@gnu.org

Indeed, as you suggested earlier, it might be useful to use the main
part of open_tmpfile() (i.e. without the fdopen()), though we'd have
to manually remove the file then. On Unix, we could unlink it right
after opening (since we never need the filename again, unlike the
other users of open_tmpfile()). On Windows, though, this might need
to be delayed, AIUI.

   You can do that on Windows as well, just not if you use open/creat to
   create the file.  You need to use the Windows file APIs with options
   that are not exposed to Posix-style functions like open.

  Can you get an fd from a file opened this way?

 Yes.

  If so, it should be alright, it could all be encapsulated in
  open_tmpfd().

 I'm not sure it's worth the hassle.  Using tmpfile is so much easier
 (if it works, which is to be determined).

Initial investigation indicates that tmpfile should do the job just
fine: the file is deleted only when the last descriptor for it is
closed.  That includes any duplicated descriptors.

As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated.
In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK
flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the
job.  I will need to see how it works in reality, though.

Is there a simple enough Makefile somewhere that could be used to test
this feature, once implemented?

Also, where is the best place to put the emulated Posix functions?
Some new file in w32/compat/?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-19 Thread Frank Heckenbach

Eli Zaretskii wrote:

 Initial investigation indicates that tmpfile should do the job just
 fine: the file is deleted only when the last descriptor for it is
 closed.  That includes any duplicated descriptors.

Great.

 As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated.
 In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK
 flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the
 job.  I will need to see how it works in reality, though.

OK.

 Is there a simple enough Makefile somewhere that could be used to test
 this feature, once implemented?

We have a test in the test suite (output-sync). Can you use that?

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-19 Thread Eli Zaretskii

 Date: Fri, 19 Apr 2013 11:54:05 +0200
 Cc: bo...@kolpackov.net, bug-make@gnu.org
 From: Frank Heckenbach f.heckenb...@fh-soft.de

  Is there a simple enough Makefile somewhere that could be used to test
  this feature, once implemented?

 We have a test in the test suite (output-sync). Can you use that?

I hoped for something simpler and not involving Perl or Unixy shell
features, because I'd like to use this in the native Windows
environment, where the Windows port of Make runs.  However, if that's
the best possibility, I guess I'd craft something based on that test.

Thanks.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Paul Smith wrote:

 I've applied the patch from Frank.

Thanks. I did some tests and so far everything works in my setup.

Since I was away for a day, I couldn't follow the discussion
earlier, so allow me to respond to several mails at once ...

 I changed the behavior so that the entire recipe
 is printed together, rather than each individual job being printed together,
 so that even without .ONESHELL you still have the whole recipe as a single
 output.  I can't justify to myself the need for individual job output but if
 someone really wants it please provide a reason.

I almost do. I use a mechanism to give me a progress indicator while
make is running. This mechanism does not require any special make
features and works outside of make. How it works is roughly, my
rules (at least the longer running ones) look like this (where
[Start] and [End] are in fact some different strings that do not
easily occur in normal messages):

foo.o:
+@echo [Start] Compiling foo.c
@sleep 1 # cc -o foo.o foo.c
+@echo [End] Compiling foo.c

My program that interpretes make's output catches those special
messages and shows me a list of currently running tasks. (In
non-parallel builds, the non-silent make output serves this purpose,
you can just see the last command shown which is the one running
now. In parallel builds, this is not so; a long-running command may
have scrolled far away by subsequent shorter commands run in
parallel, so it's not easy to see what's actually running.
Furthermore, many typical command lines these days (including mine)
have so many compiler options etc. that they're hard to read by
themselves. The above messages make it much more comfortable for me
to see what's going on.)

This mechanism was unaffected by my output-sync patch, and I
expected your change broke it. In fact, it didn't, that's why I said
almost. That's because of the + in the echo commands which
prevents make from considering the above a single recipe. (The
reason I use + is so I can run make -n and count those messages
before the actual run in order to get the total number that will be
produced which allows me to draw a progress bar. Not terribly
important, I admit, but in this case it incidentally saved my ass.)

So at the moment, it works for me, but it indicates there may be
legitimate reasons to want the output grouped per job rather than
per recipe.

So I'd plead to revert this bit (since one can still use .ONESHELL
if wanted). Or we could add another mode like --output-sync=job.
Shouldn't be too hard now (if you like, I can implement it).

Paul Smith wrote:

 On Tue, 2013-04-16 at 11:30 +0300, Eli Zaretskii wrote:

   . why is leaving directory FOO displayed _before_ releasing the
 stderr output? it should be after, I think.
 
 The enter/leave output needs investigation for sure.

The idea was since log_working_directory() writes its messages to
stdout, we need to wrap the stdout data with them. If stderr is the
same as stdout, they are combined, so the first pump_from_tmp_fd
handles all of them and they are properly wrapped. If they are
separate, it doesn't matter when we give the leave message, since
they go to different places anyway. So I think it's not necessary
to move the leave message after the stderr output, though it
wouldn't hurt either (just require another if (outfd_not_empty)).

 I think we're
 doing too much here.  I would have left it alone but I think some tools
 that parse make results expect to see that output to help them figure
 out what's going on.

As I described in https://savannah.gnu.org/bugs/?33138#comment3, the
problem is how to interpret messages that contain file names when
different jobs run in different subdirectories. A typical message
looks like this:

foo.c:42: syntax error blah

In fact, also make itself gives such messages which depend on the
current directory:

Makefile:7: recipe for target 'foo-fail' failed

With no context, there's no straightforward way to find the
offending file. One could search all subdirectories for a file named
foo.c (but there may be several ones in different directories), or
leave it up to the user to find the right file. But that's not
necessary if we have proper context. As I understand, Emacs has a
directory change tracking mode which does just that (I don't use
it myself, but a utility which does the same).

In non-parallel builds, the normal enter/leave messages from make
provide enough context. Not so in parallel builds, especially with
target-level synchronization. Adding more of these messages helps to
establish the context. As I wrote in the abovementioned comment, we
now output rather too many messages, though trimming them better
might be nontrivial (I did the easy part by giving them only if
outfd_not_empty).

   . btw, why an error in acquire_semaphore and release_semaphore causes
 Make to bail out?  isn't it better to fall back on the normal
 operation mode instead?
 
 I'll look at this.  I guess the problem is

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread Paul Smith

On Thu, 2013-04-18 at 19:09 +0200, Frank Heckenbach wrote:
 This mechanism was unaffected by my output-sync patch, and I
 expected your change broke it.

I was reading your email with interest, waiting for the punch-line, but
then after all that description you just said that the change broke it,
with no explanation :-).  My question is, what about the new behavior
does not work for you?

The change I made was to have all the jobs in a target recipe write
(appending of course) to the same temp file, then after the last job in
the recipe is completed the output for the entire recipe is generated.
So for example, if you have:

  foo bar:
  : $@ one
  : $@ two
  : $@ three

and you run make -j -O, the way you had it the output from each job
would be mixed something like this (the actual result order will be
slightly random):

  foo one
  bar one
  foo two
  bar two
  bar three
  foo three

with my change you still get the same results but only after all jobs
are complete, and they will be collected:

  foo one
  foo two
  foo three
  bar one
  bar two
  bar three

 So I'd plead to revert this bit (since one can still use .ONESHELL
 if wanted). Or we could add another mode like --output-sync=job.
 Shouldn't be too hard now (if you like, I can implement it).

I'd prefer to not add another option here unless there's a compelling
case for it, even though you're correct that we have the flexibility
now.

  I think we're
  doing too much here.  I would have left it alone but I think some tools
  that parse make results expect to see that output to help them figure
  out what's going on.
 
 As I described in https://savannah.gnu.org/bugs/?33138#comment3, the
 problem is how to interpret messages that contain file names when
 different jobs run in different subdirectories.

I do get it.  I just think we might need to do more drastic surgery on
this to REALLY get it right.  Probably we'll want to allow the user to
have more control over it, as well.  Maybe a similar flag that lets you
choose whether to trace on a per-target or per-make basis.  And, if you
choose per-target you don't need to ALSO generate output per-make.

We do need to be careful about tracing non-target output, for example
output generated by $(info ...), $(warning ...), and $(error ...).

  I guess this points out a potential issue with the current UNIX
  implementation as well: if you had some utility you were running as part
  of your build that was also trying to lock stdout, then it would
  interfere with make.
 
 I don't think so, since its stdout is now our temp file which it may
 lock as it pleases. (In fact, that's just what recursive makes do
 with --output-sync=make.)

True.  And if they were trying to use stdout like we are, to lock
between different recipes, they just can't use -O at all since by
definition each target will be writing to a different file.

 PS: In assign_child_tempfiles(), the following declarations are no more
 needed and can be removed:
 
   FILE *outstrm = NULL, *errstrm = NULL;
   const char *suppressed = output-sync suppressed: ;
   char *failmode = NULL;

Yeah, I have a commit in my local repo that fixes this.

Thanks!


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread David Boyce

On Thu, Apr 18, 2013 at 10:09 AM, Frank Heckenbach
f.heckenb...@fh-soft.dewrote:

 FILE is an opaque structure which should never be allocated by user
 code, so its layout can be implementation specific.


Of course it's an opaque structure. The problem is that the implementation
can't change its size without breaking binary compatibility with every
previously-compiled program that uses stdio. This is a well-known problem,
easily googled for. Here are a couple of references for Solaris:

https://groups.google.com/forum/?fromgroups=#!topic/comp.unix.solaris/yky1fsJ3PMI
http://www.oracle.com/technetwork/server-storage/solaris10/stdio-256-136698.html

 In particular, on Linux the field is an int ...

Linux is a pretty elastic term, covering both 32- and 64-bit variants
plus many kernels and distros. All 64-bit builds of any system will expand
the original unsigned char; the opportunity to break binary compatibility
allows for a lot of fixes. For all I know Linux even offers a clever fix in
32-bit versions, but there remain plenty of other 32-bit platforms with the
problem.

Bottom line, in 32-bit code you have to program for the likelihood that no
more than 256 streams may be open at once.

-David Boyce
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Paul Smith wrote:

 On Thu, 2013-04-18 at 19:09 +0200, Frank Heckenbach wrote:
  This mechanism was unaffected by my output-sync patch, and I
  expected your change broke it.
 
 I was reading your email with interest, waiting for the punch-line, but
 then after all that description you just said that the change broke it,
 with no explanation :-).  My question is, what about the new behavior
 does not work for you?

I said I *expected* your change broke it, but it didn't. :-)
Let me explain:

 The change I made was to have all the jobs in a target recipe write
 (appending of course) to the same temp file, then after the last job in
 the recipe is completed the output for the entire recipe is generated.
 So for example, if you have:
 
   foo bar:
   : $@ one
   : $@ two
   : $@ three
 
 and you run make -j -O, the way you had it the output from each job
 would be mixed something like this (the actual result order will be
 slightly random):
 
   foo one
   bar one
   foo two
   bar two
   bar three
   foo three

And with my progress mechanism, that's exactly what I want. In my
case it'd look like this:

[Start] Compiling foo.c
[Start] Compiling bar.c
# time passes
foo.c: some error
# time passes
bar.c: some error
# time passes
[End] Compiling bar.c
# time passes
[End] Compiling foo.c

This is useful (to me) because at any time, I know what's running.
([Start] messages minus [End] messages.)

 with my change you still get the same results but only after all jobs
 are complete, and they will be collected:
 
   foo one
   foo two
   foo three
   bar one
   bar two
   bar three

In my case:

# time passes
[Start] Compiling foo.c
foo.c: some error
[End] Compiling foo.c
# time passes
[Start] Compiling bar.c
bar.c: some error
[End] Compiling bar.c

Here, I don't see what's running, since the start and end messages
always appear together, and I don't see a job started until it's
already finished.

I expected(!) your patch would do this to my output which would
break my mechanism. In fact, it didn't, but only because I have +
before the echo rules (for a different reason) which causes make to
consider them separate recipes (at least for the purpose of this
check). So, it's OK for me at the moment, but just coincidentally.
(I checked, if I remove the +, it does just what I expected and
breaks my mechanism.)

Of course, you could say I'm misusing the same message channel for
different purposes here, but that's just the only channel readily
available without adding some message-passing framework or whatever.
(Sure, there are two channels, stdout and stderr, but separating the
messages between them wouldn't help here.)

   I think we're
   doing too much here.  I would have left it alone but I think some tools
   that parse make results expect to see that output to help them figure
   out what's going on.
  
  As I described in https://savannah.gnu.org/bugs/?33138#comment3, the
  problem is how to interpret messages that contain file names when
  different jobs run in different subdirectories.
 
 I do get it.  I just think we might need to do more drastic surgery on
 this to REALLY get it right.

I'm not saying it's perfect now, but my priorities are first correct
(i.e. sufficient context), then nice (i.e. no redundant messages).

 Probably we'll want to allow the user to
 have more control over it, as well.  Maybe a similar flag that lets you
 choose whether to trace on a per-target or per-make basis.

I think it should in principle be possible without requiring the
user to specify any more options. But it would be some work,
requiring make to keep track of which directory message was output
last, delay the leaving message in case the next one will be
entering the same directory etc., and synchronize this among
recursive makes in the different modes.

 And, if you
 choose per-target you don't need to ALSO generate output per-make.

Probably, though I didn't check it in detail. I just didn't want to
remove any messages that were output before.

 We do need to be careful about tracing non-target output, for example
 output generated by $(info ...), $(warning ...), and $(error ...).

The latter two are covered by the changes in error() and fatal(),
but $(info) was missing. At least in function.c, this seems to be
the only place that was forgotten:

--- function.c.orig 2013-04-18 11:39:15.0 +0200
+++ function.c  2013-04-18 20:08:35.0 +0200
@@ -1106,8 +1106,12 @@
   break;
 
 case 'i':
+  if (output_sync)
+log_working_directory (1, 1);
   printf (%s\n, msg);
   fflush(stdout);
+  if (output_sync)
+log_working_directory (0, 1);
   break;
 
 default:

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

David Boyce wrote:

 On Thu, Apr 18, 2013 at 10:09 AM, Frank Heckenbach
 f.heckenb...@fh-soft.dewrote:
 
  FILE is an opaque structure which should never be allocated by user
  code, so its layout can be implementation specific.

 Of course it's an opaque structure. The problem is that the implementation
 can't change its size without breaking binary compatibility with every
 previously-compiled program that uses stdio.

1. It can change its size, precisely because user code does not
   declare variables of type FILE, just pointers, and does not
   allocate FILE structures itself (but uses fopen() etc.). So as
   long as the size of the pointer remains the same (which is
   trivial), there's no binary compatibility problem. In fact, a
   libc could even use differently sized FILE structures within one
   program (kind of OOP), and user code wouldn't notice.

2. Even if it couldn't change the size, this would only apply within
   one platform. You said: My understanding, and I believe this
   applies to all Unix variants, is that because the original stdio
   FILE structure used an 8-bit int to hold the file descriptor, the
   number of available streams is 256 always and forever (in 32-bit
   variants. That's not so, as evidenced by my test program. If
   Linux has always had an int there, there's no compatibility
   problem, since programs compiled under original Unix are not
   binary compatible with Linux anyway.

  In particular, on Linux the field is an int ...
 
 Linux is a pretty elastic term,

Sorry, i686, 32-bit, kernel 3.2.2, Debian squeeze (but I suppose it
applies to all x86 variants, 32-bit, and 64-bit anyway).

 Bottom line, in 32-bit code you have to program for the likelihood that no
 more than 256 streams may be open at once.

This might be right (I was just objecting to your claim that it was
necessarily so on any 32-bit Unix), and I'd prefer to use fd's
anyway.

Indeed, as you suggested earlier, it might be useful to use the main
part of open_tmpfile() (i.e. without the fdopen()), though we'd have
to manually remove the file then. On Unix, we could unlink it right
after opening (since we never need the filename again, unlike the
other users of open_tmpfile()). On Windows, though, this might need
to be delayed, AIUI.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread Paul Smith

On Thu, 2013-04-18 at 20:36 +0200, Frank Heckenbach wrote:
 And with my progress mechanism, that's exactly what I want. In my
 case it'd look like this:
 
 [Start] Compiling foo.c
 [Start] Compiling bar.c
 # time passes
 foo.c: some error
 # time passes
 bar.c: some error
 # time passes
 [End] Compiling bar.c
 # time passes
 [End] Compiling foo.c
 
 This is useful (to me) because at any time, I know what's running.
 ([Start] messages minus [End] messages.)

Thanks, this is the reason I was looking for; that use-case wasn't clear
to me based on the previous email.

  Probably we'll want to allow the user to
  have more control over it, as well.  Maybe a similar flag that lets you
  choose whether to trace on a per-target or per-make basis.
 
 I think it should in principle be possible without requiring the
 user to specify any more options.

I was thinking more that the user may want to not want all the
enter/leave output even if it is ambiguous from a programmatic sense: I
know where my code lives and so if I see the my_foo.o fails, I know that
the my_foo.c file lives in src/my/my_foo.c.  I might prefer to see a
cleaner output from make and rely on my innate knowledge of the codebase
to navigate.

But maybe you'd still like to see the per-make enter/leave, even if
you're running with -Otarget.

 But it would be some work, requiring make to keep track of which
 directory message was output last, delay the leaving message in case
 the next one will be entering the same directory etc., and
 synchronize this among recursive makes in the different modes.

Synchronization between recursive makes is not something I want to get
into.  As long as the messages are coherent within a single make
instance, either before/after everything the make instance does or
before/after each target (or job, if that is needed) that's enough for
me.


___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread Eli Zaretskii

 Date: Thu, 18 Apr 2013 19:09:06 +0200
 From: Frank Heckenbach f.heckenb...@fh-soft.de

   . calculation of combined_output in start_job_command will need to be
 reimplemented for Windows, since the reliance on st_dev and st_ino
 makes assumptions that are false on Windows.

 What we need is basically is_same_file (fd1, fd2) (which we
 probably could refactor into a separate function). Eli, can you try:

   test file1 -ef file2

 where test is either a standalone utility or a built-in of bash or
 any other shell. If any of these works, you can see how they do it.
 Otherwise, a bit of googling turned up this page which has code (in
 Python, but it seems like a thin layer around the system calls, so
 it can probably easily be ported to C):

 http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html

I know about this method, but I'm not sure it does what we want here.
The number used by this method is not guaranteed to be unique on some
versions of Windows.  More importantly, it only works for disk files,
I don't know whether it reports a meaningful value for the console
device or the null device.  So this will be a bit more tricky than
that URL pretends.

 . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows.

   This is to test if somehow make's stdout or stderr has been closed (not
   just redirected to /dev/null, but closed).

  Right, got that; but Windows needs a different test for that, as
  there's no fcntl.

 Basically we need just any call that succees for a valid file and
 fails otherwise. fstat() might be a good candidate.

fstat is too heavy on Windows; there are better methods.

 If that isn't available, maybe even lseek() (as used in
 FD_NOT_EMPTY) will do if we keep the || errno != EBADF test.

lseek fails for the console as well, so it's not a good candidate.
But this problem is not difficult, I just mentioned that the code as
committed won't compile.

 Eli, can you do some tests to see if you find something that works
 reliably?

Don't worry about that.  My worst problem is lack of time, not lack of
ideas ;-)

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread Eli Zaretskii

 Date: Thu, 18 Apr 2013 20:39:44 +0200
 Cc: psm...@gnu.org, e...@gnu.org, bo...@kolpackov.net
 From: Frank Heckenbach f.heckenb...@fh-soft.de

 Indeed, as you suggested earlier, it might be useful to use the main
 part of open_tmpfile() (i.e. without the fdopen()), though we'd have
 to manually remove the file then. On Unix, we could unlink it right
 after opening (since we never need the filename again, unlike the
 other users of open_tmpfile()). On Windows, though, this might need
 to be delayed, AIUI.

You can do that on Windows as well, just not if you use open/creat to
create the file.  You need to use the Windows file APIs with options
that are not exposed to Posix-style functions like open.

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

2013-04-18 Thread David Boyce

On Thu, Apr 18, 2013 at 2:39 PM, Frank Heckenbach f.heckenb...@fh-soft.dewrote


 This might be right (I was just objecting to your claim that it was
 necessarily so on any 32-bit Unix), and I'd prefer to use fd's
 anyway.


Well ... Linux is not Unix, as partisans will proudly tell you, so
technically what I said is unobjectionable.  I won't pretend that that's
what I meant though. Presumably the real distinction is between statically
and dynamically linked binaries; anything linked statically with libc would
need to contain an actual FILE struct, not just a pointer to one.

But that aside, we're all in violent agreement so we can let this
sub-thread drop.

-David
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch

Eli Zaretskii wrote:

  Date: Thu, 18 Apr 2013 20:39:44 +0200
  Cc: psm...@gnu.org, e...@gnu.org, bo...@kolpackov.net
  From: Frank Heckenbach f.heckenb...@fh-soft.de

  Indeed, as you suggested earlier, it might be useful to use the main
  part of open_tmpfile() (i.e. without the fdopen()), though we'd have
  to manually remove the file then. On Unix, we could unlink it right
  after opening (since we never need the filename again, unlike the
  other users of open_tmpfile()). On Windows, though, this might need
  to be delayed, AIUI.

 You can do that on Windows as well, just not if you use open/creat to
 create the file.  You need to use the Windows file APIs with options
 that are not exposed to Posix-style functions like open.

Can you get an fd from a file opened this way? If so, it should be
alright, it could all be encapsulated in open_tmpfd().

___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #33138] .PARLLELSYNC enhancement with patch