Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Mon, Sep 30, 2013 at 3:12 PM, Frank Heckenbach f.heckenb...@fh-soft.dewrote: I tested the new version and found no new issues, so as far as I'm concerned this feature can now be considered finished. Thanks for the initial patch, David, and the integration into GNU make, Paul! And special thanks to Frank and Eli for pushing it over the goal line! More than anything I'll be happy to see this thread die of natural causes, since it's going on ~2.5 years with an embarrassing misspelling by me in the subject line. David ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
I tested the new version and found no new issues, so as far as I'm concerned this feature can now be considered finished. Thanks for the initial patch, David, and the integration into GNU make, Paul! ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
As always thanks for your thorough testing Frank. On Tue, 2013-09-24 at 20:41 +0200, Frank Heckenbach wrote: Paul Smith wrote: On Thu, 2013-09-19 at 14:47 +0200, Frank Heckenbach wrote: Paul Smith wrote: I didn't fix it this way. Instead I used the existing MAKE_RESTART environment variable to communicate from the current make to the restarted make that the enter message was already printed (if it was) so it wouldn't be printed again. This ensures we don't get a stream of enter/leave statements as we re-exec. This works now (in my repo). It works for me too. However, since MAKE_RESTARTS is a documented variable, couldn't this change confuse user Makefiles? That's a good point. I rearranged this to ensure we set the make variable to the integer value, so the user can't see the special token. If so, the use of output_context might be slightly irritating (though not wrong) -- at first I wondered where the log_working_directory() output after the pump_from_tmp() calls was going to and whether it didn't need pumping too if it was going to the temp file, but apparently this never happens. ... which apparently does lead to a problem here (non-deterministic like many -j problems): AIUI, it dumps out to stdout/stderr, but prints Enter/Leave to output_context (which might be dumped much later), so out's contents are not properly enclosed. Since we dump out to stdout/stderr (and we got the semaphore for writing to those), it seems logical to me to print Enter/Leave there as well, so this seems to fix it for me (and again would obviate the need for the first parameter to log_working_directory()): I agree with this change. 8. Like job.c, I think function.c should check output_context-err = 0, to avoid redirecting to -1 when no temp file for stderr was set up. Yes, sounds right. 9. - if (! output_context || output_sync == OUTPUT_SYNC_RECURSE) + if (! output_sync || output_sync == OUTPUT_SYNC_RECURSE) This looks right to me as well. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sat, 2013-09-21 at 07:28 +, Edward Welbourne wrote: I've never understood why someone would use $(shell ...) in a recipe... I mean, the recipe will be run in the shell!! I remember we once had a library where the command-line to the archiver was too long (about a quarter megabyte, IIRC). We worked round this by having a temporary scratch dir, hard-linking every .o file into it, then running the archiver in that directory, so as to trim all the paths off .o files and get the command-line short enough. We *would* have populated the scratch-dir by generating rules with define and eval and having those do the job, but some project managers weren't happy to let build machines take any software changes, not even the make upgrade to get define/eval working properly, so we hacked it by having the archiver command start by populating the scratch-dir. Of course, it couldn't do that by running a single command (the command-line would have been too long ...) so we did it with a $(shell ...) per object file via $(foreach ...). It was kinda ugly but it worked ;^ Yeah, I guess that's a legit reason. The next release of GNU make will have the $(file ...) function which will make that a LOT more efficient. Maybe you can convince TPTB to finally take a change :-). (Separate infrastructure for auto-creating directories looked after the .exists target, creating its directory.) I think we could, with hind-sight, have used the foreach to generate one command per object file, all separated with semicolons, so that no single command was too big for the shell, No, that wouldn't work. It's not the individual command (between simicolons) that's too long, the problem is that make can't invoke the shell itself because the command line + environment is too large. The only way to work around this limitation is to avoid invoking a single command that's too long. All but one of those I've looked into is manifestly less powerful than make (and I'm still waiting for my former colleagues to get the exception released to open source). I have no illusions that there aren't very serious issues with make and I have nothing bad to say about people who choose a different tool, or even write a different tool. Make is weighed down by standardization and history, so starting over from scratch has a lot of appeal. On the other hand starting over from scratch with an entirely different syntax means you have a significant disadvantage in terms of familiarity. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
No, that wouldn't work. It's not the individual command (between simicolons) that's too long, the problem is that make can't invoke the shell itself because the command line + environment is too large. The only way to work around this limitation is to avoid invoking a single command that's too long. Ah - OK - kudos belongs to Joakim Bengtsson, FWIW; it was his hack, I was the one too lazy to work out what the actual problem was. All but one of those I've looked into is manifestly less powerful than make (and I'm still waiting for my former colleagues to get the exception released to open source). I have no illusions that there aren't very serious issues with make and I have nothing bad to say about people who choose a different tool, or even write a different tool. Make is weighed down by standardization and history, so starting over from scratch has a lot of appeal. Indeed - and that's sorta the story behind my one exception - but I've seen *many* claims of this is *hugely* better than make that were grounded only in a profound ignorance of what make (or, at the very least, GNU make, which was usually what they actually meant) could do. Eventually I had a (bright young) colleague who was faced with a mess of make-files (that I'd made less of a mess, so folk asked me about, despite my reservations about how it was after my efforts) and tried earnestly to make them better (as opposed to *less horrendous* - which was my modest goal) a few times, then finally ended on the we need something better than make road (and had the generosity to still ask me for advice on that road) where he would have made the same mistakes I'd seen several times over, but I told him about them; and he's done something I want you all to see, mainly to get a less partial (he's my friend and I helped him design it, so I'm very very biassed) view of the tool he came up with. It *may* be better (*I* think it is, so does he, and he's a smart lad) but I'm *very* skeptical of such evaluations from those who have been implicated in developing tools that are allegedly better than make, becauase all the ones I *haven't* had a personal stake in have *very* *very* *very* obviously sucked compared to GNU make (even if I restrict to 3.80, as - I think I mentioned them before - some project managers forced me to do). Speaking of which, Paul: kudos for what you've done. Maintenance is a (usually) thankless business torn by contradictory demands. Whether the flow-er is or not better than make, gmake is *the* tool I reach for when I need a build system (and don't yet have flower available, outside my old-old job). I've been doing that for as long as I have actually understood how make works (about a decade and a half) and I've *repeatedly* seen projects waste developer-*years* on Gettting It Wrong, usually because they haven't learned to use the tool right. *I* learned how make worked by inheriting one of those replace make with something better projects. The seminal recursive make considered harmful paper (google it) isn't entirely right but the lessons it teaches fix nearly everything that is percieved as being wrong with make - nearly all of which is the make-file maintainer didn't do the home-work (or, well, actually - usually there was no-one that actually took on the responsibility of maintaining the make-files, so they got bodged around by a bunch of different developers with different hammer-sets, who thus viewed different things as nails in different ways, with horrendous results). Lamentable though it is: many, that don't have command over the tool they get to use, have - thanks to GNU make - an industry (de facto) standard that everything else has to live up to and (even taking into account sporadic paranoid project managers who won't allow upgrades on build machines) it is *way* *way* better than the other versions of make that I've had occasion to be lumbered with working with. These days I'm senior enough to Lay Down The Law on project managers and tell them that the version of make doesn't affect the binaries that the (always always always GNU, even if antiquated because that's the target system) compile-and-link tool-chain creates - so there *is no reason at all* to object to upgrading the version of (GNU) make on build machines - but I've had to fight that war several times before I learned how to do the (dumb stupid) *rhetoric* needed to win it. Less experienced (in the horrors of politics) developers have to live with stupidity (within which, 20th century versions of GNU make are the gentle alternatives) and you are giving them a credible path out of that nightmare. Thank you. I'll just say that again: THANK YOU. Eddy. -- /me abstains from banging head on wall because the architects of this building probably didn't take into account the possibility of folk with 5% and more neanderthal ancestry. I have a very solid head - which has saved my life at leeast once, where the opposition was a
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: On Thu, 2013-09-19 at 14:47 +0200, Frank Heckenbach wrote: Hm. This is pretty contrived. I have a hard time imagining a real makefile wanting to do this for a good reason. However, it does seem that the solution may be simple enough. I also doubt someone would do it intentionally, but of course, there could be a $(shell) command that writes some error or warning to stderr (perhaps only under certain circumstances, not expected by the original author). Then again, this situation may be so rare that it might not warrant extra effort. I've never understood why someone would use $(shell ...) in a recipe... I mean, the recipe will be run in the shell!! True. Though what about shells such as the Dos one which don't have something like `...` AFAIR? Also, a slight difference is with `...` the echoed command shows the sub-command, whereas with $(shell) it shows its output. Not saying either one is better in general, but perhaps depending on the situation someone might prefer one or the other. FWIW, the way I came across it now wasn't actually from my own usage in a Makefile, just by thorough testing of the features. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: *sigh* If it weren't for the enter/leave messaging, the output-sync feature would have been quite straightforward! :-/ :-). I'm afraid so. But I think we're almost there now. On Mon, 2013-09-16 at 12:18 +0200, Frank Heckenbach wrote: 1. Twice Entering, once Leaving. I think there's an output_close() missing before reexec (which might also lose other messages in other situations): I didn't fix it this way. Instead I used the existing MAKE_RESTART environment variable to communicate from the current make to the restarted make that the enter message was already printed (if it was) so it wouldn't be printed again. This ensures we don't get a stream of enter/leave statements as we re-exec. This works now (in my repo). This seems even better. However, you still might have to call at least output_dump() before re-exec, especially if you sync the makefile parsing as discussed below. 2. I'm not sure I agree with that assessment. I spent a bit of time considering how recurse should work, and it's very different than line/target. This is because in recurse mode we know that our parent make has redirected all our output to a temp file, so it won't get dumped until the entire make is complete, while in the other modes we do not redirect output from recursive command lines so any output generated by them goes directly to stdout. As a result, when it comes to enter/leave the recurse mode is actually the same as non-sync mode (or even non-parallel mode): we only need to write enter/leave at the beginning and end of the entire make process, not around every recipe. The individual recipes are still sync'd, of course, but we don't need enter/leave around each one because they all get written to a single temp file and that temp file will have a global enter/leave when it is dumped. Thanks for the explanation, that's what I was missing. Now I understand your rationale for doing it the way you did. 3. Hm. I suppose you mean that this will bundle the output from makefile parsing along with the output of the first synchronized output (line or target). I don't think that will work well, since the makefile output is not synchronized the enter message could be a long way from the first target or line output. I think synchronizing the makefile parsing will yield better results. I think you're right, that's the way to go. 4. Here, with -Otarget or -Oline there's no Enter/Leave messages if there's no other output. Not sure if you consider this an actual problem. The solution seems to be to redirect stderr (only) before $(shell) executions just like we do for recipe jobs. Hm. This is pretty contrived. I have a hard time imagining a real makefile wanting to do this for a good reason. However, it does seem that the solution may be simple enough. I also doubt someone would do it intentionally, but of course, there could be a $(shell) command that writes some error or warning to stderr (perhaps only under certain circumstances, not expected by the original author). Then again, this situation may be so rare that it might not warrant extra effort. 7. job.c:1258: OUTPUT_UNSET(); Just wondering, is this necessary? (It's before OUTPUT_SET().) It is, because if you look above you'll see there's a label next_command: and later in the function we goto that label, perhaps with OUTPUT_SET(). PS. I know this use of goto is nightmarish: I didn't write this code :-) Oh yes, sorry, I missed this. (I'm not trained to look for gotos. ;-) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-09-12 at 04:19 -0400, Paul Smith wrote: I think there may still be some change needed for directory tracking for the -Orecurse mode. If we're collecting output for an entire recursive make invocation we don't need enter/leave notifications around each individual target or line, which is what we have now. We only need it around the entire makefile. I'll look into this. OK, this should be all fixed now. I believe it's operating the way I want. Please try it out and report anything unusual or that you have problems with. I have a few more bugs to fix then RC2. Hopefully RC2 this weekend. Please make an effort to test things if you have access to Git. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sun, 2013-05-05 at 04:37 +0200, Frank Heckenbach wrote: COMMANDS_RECURSE _does_ mean to recurse. The reason for the '+' prerequisite is to tell make that this line, even though it may not look like it, will run a recursive make. OK, let me just say that the meaning of recursive may not be perfectly clear. Though the manual says: The @samp{+} prefix marks those recipe lines as ``recursive'' so that they will be executed despite use of the @samp{-t} flag., the example immediately preceding this sentence has: +touch archive.a +ranlib -t archive.a which are clearly not recursive make invocations. I gather that make uses recursive in a wider sense as anything to be run regardless (because it probably arrages by itself not to do anything serious in a dry run or so), while the current implementation of output-sync uses it in the more specific meaning of a recursive invocation of GNU make (which will do its own syncing). It's not just this new feature that relies on this meaning. The jobserver feature, which also wants to know which commands are running recursive make, also does. If people misuse it then they'll get odd behavior. I don't see that there's anything we can or should do about that. You're right, though, that this example in the make manual might not be the best. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
: I've tested your changes and so far -O job works for my use cases. I shouldn't have written that. :-( Shortly afterwards, I found a bug or perhaps two: foo: @echo foo +@echo bar (a) % make -Ojob foo bar foo (b) % make -Otarget bar foo As you see, (a) -Ojob writes foo twice and (b) -Otarget writes the messages in the wrong order. -Omake is correct. I debugged (a) and found that sync_output() is called twice without an assign_child_tempfiles() in between which would ftruncate() the temp files. The second one does not call it because flags COMMANDS_RECURSE is set. This led me to this ChangeLog entry: : Fri Sep 17 00:37:18 1993 Roland McGrath (rol...@churchy.gnu.ai.mit.edu) : : * job.c (start_job_command): Set COMMANDS_RECURSE in FLAGS if we : see a `+' at the beginning of the command line. Well, I haven't often found an answer I was looking for in a 20 years old ChangeLog entry. :) However, that's kind of bad news since I'd written my changes under the assumption that COMMANDS_RECURSE means, well, to recurse. (a), which is clearly a bug, can be fixed if we make sync_output() ftruncate() the temp files (the original patch closed them at this point, and let the next job open new ones). The patch below does this. I think it also adds some clarity and resilience WRT future changes, since it makes sure that the contents are purged as soon as they were output. In (b), since make thinks the second line is recursive, and the first one not, it syncs the second one first, then outputs the first one normally. In fact, you can see that + lines are not synced at all with -Otarget: test: foo bar foo: +@echo foo1 +sleep 1 +@echo foo2 bar: +@echo bar1 +sleep 1 +@echo bar2 % make -Otarget -j foo1 bar1 sleep 1 sleep 1 foo2 bar2 Maybe you consider this correct if + lines are indeed considered recursive and this isn't just an implementation artifact. If so, alright (it doesn't affect me personally, since I use -Ojob). But if not, this might require slightly bigger changes such as splitting COMMANDS_RECURSE into two flags. Paul? PS: I wrote: : The following patch makes these changes and fixes a precedence bug : that had crept in during the introduction of child_out (although I : can't actually test it on VMS :). : - if (exit_code 1 != 0) : + if ((exit_code 1) != 0) Strictly speaking, it's not a bug, since exit_code 1 != 0 parses as exit_code (1 != 0) which is the same as exit_code 1 and thus (exit_code 1) != 0, but I doubt that was the intention. ;) --- job.c.orig 2013-05-04 04:08:09.0 +0200 +++ job.c 2013-05-04 08:32:43.0 +0200 @@ -617,25 +617,7 @@ CLOSE_ON_EXEC (c-errfd); } } - - return 1; } - - /* We already have a temp file. On per-job output, truncate and reset. */ - if (output_sync == OUTPUT_SYNC_JOB) -{ - if (c-outfd = 0) -{ - lseek(c-outfd, 0, SEEK_SET); - ftruncate(c-outfd, 0); -} - if (c-errfd = 0 c-errfd != c-outfd) -{ - lseek(c-errfd, 0, SEEK_SET); - ftruncate(c-outfd, 0); -} -} - return 1; } @@ -679,6 +661,11 @@ } } + /* Truncate and reset input file to avoid duplicating data + in case it will be used again. */ + lseek (from_fd, 0, SEEK_SET); + ftruncate (from_fd, 0); + finished: #ifdef WINDOWS32 ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Sat, 04 May 2013 04:42:32 +0200 Cc: e...@gnu.org, david.s.bo...@gmail.com, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de : /* This is needed to avoid the label at end of compound statement : diagnostics on Posix platforms. */ I don't think that's a POSIX thing (which is a standard about operating systems, not compiler diagnostics), is it? You quote this out of context, which is this: finished: #ifdef WINDOWS32 /* Switch to_fd back to its original mode, so that log messages by Make have the same EOL format as without --output-sync. */ _setmode (to_fd, prev_mode); #endif /* This is needed to avoid the label at end of compound statement diagnostics on Posix platforms. */ return; } I used Posix platforms here (not just POSIX the name of the standard) in contrast to WINDOWS32, where there's some code between the label and the closing brace, which prevents the (stupid, IMO) warning. Also, I think in the Windows branch, the former condition also should say just output_sync instead of output_sync == OUTPUT_SYNC_TARGET, a recent change during the introduction of -O job. You are right. This is a consequence of having 2 separate branches in the code. I fixed this one. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sat, 2013-05-04 at 08:57 +0200, Frank Heckenbach wrote: I shouldn't have written that. :-( Shortly afterwards, I found a bug or perhaps two: foo: @echo foo +@echo bar (a) % make -Ojob foo bar foo (b) % make -Otarget bar foo As you see, (a) -Ojob writes foo twice and (b) -Otarget writes the messages in the wrong order. The second one is known and I mentioned it the other day (hard to keep up with all the messages, I know). I'm working on a fix. The first one I've seen but hadn't had time to debug. I'll look at your patch. I left the truncate where it was rather than doing it after the sync_output() because I was hoping to avoid truncating a file that we'll never use again anyway, but I guess it isn't a big deal. COMMANDS_RECURSE _does_ mean to recurse. The reason for the '+' prerequisite is to tell make that this line, even though it may not look like it, will run a recursive make. Since make doesn't parse the command line it can't know for sure which ones actually recurse. It uses a heuristic, by looking for $(MAKE) or ${MAKE} in the unexpanded line. But this is easily defeated if your sub-make invocation doesn't use that variable for some reason. Hence, using + to force it. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: The first one I've seen but hadn't had time to debug. I'll look at your patch. I left the truncate where it was rather than doing it after the sync_output() because I was hoping to avoid truncating a file that we'll never use again anyway, but I guess it isn't a big deal. One could probably optimize it to avoid truncating if it's not going to be reused, but I'm not sure it's worth the effort (including future maintenance). COMMANDS_RECURSE _does_ mean to recurse. The reason for the '+' prerequisite is to tell make that this line, even though it may not look like it, will run a recursive make. OK, let me just say that the meaning of recursive may not be perfectly clear. Though the manual says: The @samp{+} prefix marks those recipe lines as ``recursive'' so that they will be executed despite use of the @samp{-t} flag., the example immediately preceding this sentence has: +touch archive.a +ranlib -t archive.a which are clearly not recursive make invocations. I gather that make uses recursive in a wider sense as anything to be run regardless (because it probably arrages by itself not to do anything serious in a dry run or so), while the current implementation of output-sync uses it in the more specific meaning of a recursive invocation of GNU make (which will do its own syncing). Similar concerns seem to apply to other places where COMMANDS_RECURSE is checked, e.g. (right after the output-sync code in question): /* If we aren't running a recursive command and we have a jobserver pipe, close it before exec'ing. */ if (!(flags COMMANDS_RECURSE) job_fds[0] = 0) { close (job_fds[0]); close (job_fds[1]); } I don't think those touch and ranlib commands above need the jobserver pipe (though, of course, this one doesn't hurt much, just costs 2 available fds). Since make doesn't parse the command line it can't know for sure which ones actually recurse. It uses a heuristic, by looking for $(MAKE) or ${MAKE} in the unexpanded line. But this is easily defeated if your sub-make invocation doesn't use that variable for some reason. Hence, using + to force it. If this is the actual intention, and the example is really a misuse of the feature (maybe not ideal to put this in the manual then?), this means indeed that + lines are exempt from output synchronization and I see no easy way around it. A (hypothetical?) use case would be another make-like tool which is invoked from a Makefile. Unless it implements its own output-sync, compatible with ours, its output remains unsynced. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Let me add my voice as a user. If you are one of the lucky people whose builds consist mostly of 1 line of output per rule then you will rarely have any trouble in a good build but try interpreting error messages from compiler/tool X when they're 10 lines from the file that they refer to and don't include the filename in the error message. Try working it out when it's not your code that's being built. This suggests an optimisation where you buffer 5 lines in memory and open a file if there's more but I wouldn't be surprised if the gain is quite small as I suspect that the average build step does so much IO as to make a buffered temp file for the output seem a relatively tiny overhead. I have certainly spent a lot of time wondering why xyz.cpp doesn't have a line 370. I faced this 5 years ago and we had to come up with our own solution rather than wait to convince GNU mission control that we had a problem - which was the only choice given the great difficulty at that job of trying to get permission to make open source contributions. It has already been a very long time coming. You've got it now, thank goodness, in whatever shape or form. Please turn it on. As for reasons: a) I would be surprised if there weren't other changes in the next release that give some people pause for thought before upgrading - why worry about 1 more? e.g. globbing differences between various recent releases caused more pain to me than this is likely to. b) Demand is usually only going to come from the few anyhow. I think this is one of those features that few people will know to turn on but many will benefit anyhow. The reason is that you will start to be able to write generic make output analysis tools. I don't know if you guys have seen Electric Insight, for example. I don't intend this as a plug but I can't help it - it's just the only non-inhouse example that I can think of but I have seen a number of others at various companies. This is a trivial example where you can see that parsing is taking most of the time in this parallel build: https://docs.google.com/file/d/0BwJNUxZZ7qItQlRrUjBmM0tTTDg/edit?usp=sharing It lets you see the state of your parallel build - gives you an intuitive idea of where the time is going and what the errors are and so on. You can't implement tools like this across a general a lot of builds unless they have contiguous recipe output and the surest way to get that is for it to be on by default. You also need recipe separators and timing data but I think you'll realise that eventually now that this step is taken. c) Lets benchmark it on something like an android build and you can make a decision after that? Heck, lets try it with various things and be happy it works. Regards, Tim On 29 April 2013 03:53, Eli Zaretskii e...@gnu.org wrote: From: Paul Smith psm...@gnu.org Cc: bug-make@gnu.org Date: Sun, 28 Apr 2013 22:03:39 -0400 Now that we seem to have a workable solution for output synchronization for both POSIX and Windows systems, I wonder if we shouldn't consider enabling it as the default mode when parallel builds are running. I don't think we should do this unless users demand it, which means we should wait for the next release and some time beyond before making the decision. FWIW, I'm running 8-way parallel Make jobs for many months now, and never had any trouble interpreting their output, neither when they succeed, nor when they don't. The synchronized output operation annoys me slightly in that there are distinct time intervals when Make says nothing at all, and then the output appears in large chunks that get dumped en-masse to the screen. IOW, the user experience is different and takes time to get used to. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Date: Mon, 29 Apr 2013 09:58:50 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org try interpreting error messages from compiler/tool X when they're 10 lines from the file that they refer to and don't include the filename in the error message. That's unrelated: interpreting such output will be hard no matter what order it comes in. b) Demand is usually only going to come from the few anyhow. I think this is one of those features that few people will know to turn on but many will benefit anyhow. The reason is that you will start to be able to write generic make output analysis tools. People who write programs that parse Make output will find and use this switch anyway; we could announce that prominently in NEWS if you think the feature could go unnoticed. But that is irrelevant to interactive human users and their use cases. c) Lets benchmark it on something like an android build and you can make a decision after that? Heck, lets try it with various things and be happy it works. I already tried it, and I'm not happy, FWIW. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
On 29 April 2013 16:19, Eli Zaretskii e...@gnu.org wrote: Date: Mon, 29 Apr 2013 09:58:50 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org try interpreting error messages from compiler/tool X when they're 10 lines from the file that they refer to and don't include the filename in the error message. That's unrelated: interpreting such output will be hard no matter what order it comes in. cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X Which one? It gets worse when you scale up. Regards, Tim http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Date: Mon, 29 Apr 2013 16:40:03 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X Which one? Make will actually tell you which one, something like: makefile:342: recipe for target 'oo/i386/acl-errno-valid.o' failed make: *** [oo/i386/acl-errno-valid.o] Error 1 (Copied from a failed make -j8 run I just did for Emacs.) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
On Mon, Apr 29, 2013 at 10:30 AM, Tim Murphy tnmur...@gmail.com wrote: cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X error on line 30 - error on line 330 - makefile:342: recipe for target 'fred.o' failed makefile:350: recipe for target 'bob.o' failed ? Doctor, my hammer has a head so large that I always hit my thumb Throw out that hammer and get a non-broken one The GNU tool working around the brokenness of some non-free software? Some would call that collaboration in retaining your chains. :-/ I haven't had a chance to play with the new functionality in the repository, so I'll merely caution that having major new functionality that will affect the user experience be on by default runs the risk of there being a problem and having the new version be tarred-and-feathered, and the projects that would most benefit from the new functionality instead turning it off. If there are other fixes and/or features in this release that people are waiting for, burning them on this would be unfriendly. Philip Guenther ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Come now - the broken excuse is an excuse. There's plenty of crap free software out there and some poor bastard trying to build it who can't change the source because the people who own it think it should be make's problem. :-) Cheers, Tim On 29 April 2013 19:00, Philip Guenther guent...@gmail.com wrote: On Mon, Apr 29, 2013 at 10:30 AM, Tim Murphy tnmur...@gmail.com wrote: cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X error on line 30 - error on line 330 - makefile:342: recipe for target 'fred.o' failed makefile:350: recipe for target 'bob.o' failed ? Doctor, my hammer has a head so large that I always hit my thumb Throw out that hammer and get a non-broken one The GNU tool working around the brokenness of some non-free software? Some would call that collaboration in retaining your chains. :-/ I haven't had a chance to play with the new functionality in the repository, so I'll merely caution that having major new functionality that will affect the user experience be on by default runs the risk of there being a problem and having the new version be tarred-and-feathered, and the projects that would most benefit from the new functionality instead turning it off. If there are other fixes and/or features in this release that people are waiting for, burning them on this would be unfriendly. Philip Guenther -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Date: Mon, 29 Apr 2013 18:30:37 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X error on line 30 - error on line 330 - makefile:342: recipe for target 'fred.o' failed makefile:350: recipe for target 'bob.o' failed ? You need to look in both anyway. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Date: Mon, 29 Apr 2013 19:33:10 +0100 From: Tim Murphy tnmur...@gmail.com Cc: Eli Zaretskii e...@gnu.org, bug-make@gnu.org bug-make@gnu.org Come now - the broken excuse is an excuse. There's plenty of crap free software out there and some poor bastard trying to build it who can't change the source because the people who own it think it should be make's problem. Look, no one is claiming that this feature is useless. I just spent two days researching and implementing it on Windows, something I wouldn't do if I thought it wasn't important to have. What we are talking here is whether to have this turned on by default. I hope you are not going to claim that adding -O to a Make command line to get the new behavior is too hard. By contrast, having to type -Onone to have the _old_ behavior back might very well annoy a few. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
One doesn't have to suffer the problems and learn the option exists afterwards. In the end I can understand why a new feature might not be default to start with - until a lot of people have used it and are sure that it works everywhere. Cheers, Tim On 29 April 2013 20:21, Eli Zaretskii e...@gnu.org wrote: Date: Mon, 29 Apr 2013 19:33:10 +0100 From: Tim Murphy tnmur...@gmail.com Cc: Eli Zaretskii e...@gnu.org, bug-make@gnu.org bug-make@gnu.org Come now - the broken excuse is an excuse. There's plenty of crap free software out there and some poor bastard trying to build it who can't change the source because the people who own it think it should be make's problem. Look, no one is claiming that this feature is useless. I just spent two days researching and implementing it on Windows, something I wouldn't do if I thought it wasn't important to have. What we are talking here is whether to have this turned on by default. I hope you are not going to claim that adding -O to a Make command line to get the new behavior is too hard. By contrast, having to type -Onone to have the _old_ behavior back might very well annoy a few. -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Eli: cc fred.c -c -o fred.o cc bob.c -c -o bob.o error on line 20 -X error on line 30 - error on line 330 - makefile:342: recipe for target 'fred.o' failed makefile:350: recipe for target 'bob.o' failed You need to look in both anyway. That is true of the very specific example Tim had given, in response to a particular objection which his example did genuinely address. Think for a moment longer and you shall see that there are other cases where your answer won't apply. The interleaved output from make -j n is often much harder to read. (When the output isn't even line-interleaved, but has some command's output starting part way through a line of the output of another command, life gets even harder.) Even in the specific example: on which file should I be looking at line 20, at line 30, or line 330 ? The answer look in every failing file for every failing line number scales quadratically, which is not good. Even when I know which line a message relates to, it can be hard to work out how that message relates to that line (especially if there's a preprocessor involved and the undefined symbol xxyyz relates to a line in which xxyyz never appears, but only arises after expansion of a mess of macros). It gets nasty fast to have to look in several files, at line 30 of each, and work out which of them is the one - relates to; and it gets easy to end up mistaking which one it relates to and fixing the wrong one. In a make with -j 12ish compiling many files, the starts and ends of compiles interleave with one another and with the output of running commands. Some of that output is errors, which (if the commands being run are well-behaved - not always true) shall mean the affected files are named by make in the output when they finish. Some output is warnings, that may actually be important, but the command succeeds: there's no marker indicating the end of a file that succeeds, so there are many files that warning might have come from; most likely one of those started recently, but not always easy to tell which. When hundreds of files are compiling - why else use -j 12ish ? - it can get hard to keep track of which are currently active at any given line of output, even when every file has both start and end markers for its output, e.g. if some files take much longer than others. The output from make -j several is hard to make sense of. Your shiny new feature really does sound (I haven't had occasion to try it) like a vast improvement. That *is* a strong case for turning it on by default. Naturally, before doing so, it's important to do ample testing: and one way to do that is to initially release make with it off by default but available, so that lots of folk can try it and give you feed-back. All the same, once it's been well tested, if no serious problems arise (and aren't fixed by your collective ingenuity) this is a good thing to turn on by default. Philip Doctor, my hammer has a head so large that I always hit my thumb Throw out that hammer and get a non-broken one The GNU tool working around the brokenness of some non-free software? Some would call that collaboration in retaining your chains. This problem doesn't arise exclusively with non-free software, so this is an emotive argument, an appeal to bad example. Programmers, generally, are slap-dash about the things they throw together to build the software they're so proud of having (in their own opinions) written so nicely and well. They throw together shell scripts to wrap the compiler with some other commands that they also want to run on the same files, or that they need run before handing off their output to the compiler (e.g. they invented their own way of doing localisation, because it's so much more fun to reinvent the wheel than learn to use gettext): and they're profoundly negligent of error-reporting because it's always worked for me so anyone else inheriting their code is stuck with the results. They write contorted shell-scripts inline in the make-file as the command for a rule, using the [ condition ] action short-hand for if [ condition ]; then action; fi and tack on || true to avoid having make report the command as failed when [ condition ] was false; so the rule never gets reported as failing, even when it does, but that's OK, because the error messaeges tell you what failed. The hideous monstrosities of make files that get generated by helper scripts that work round the deficiencies of make (almost invariably, in fact, they work round the author's poor knowledge of how to use make well - but those who inherit the code are stuck with the results) end up doing perverse things that most on this list would never dream of. I'd love to answer that with so replace the bad scripts and hideously wrapped (and warped) build system with something well written, but the poor bug-fixer who's inherited the code from the oh-so-clever author is stuck with the need to get things fixed today, which doesn't
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sun, Apr 28, 2013 at 1:34 AM, Paul Smith psm...@gnu.org wrote: I'm not excited about that term (job); it's kind of accurate, but in the documentation for example we're really mushy about exactly what a job is, vs. a recipe or a command line etc. I'd like to pick some terms for this, define them in a solid way, then clean up the references. It would be best to do this before the release to avoid changing things later. For example, we currently use target as the name; maybe recipe is better? If anyone has opinions I'm listening. This seems quite wrong to me. I think it's inherent from the existence of the --jobs option that job must be synonymous with recipe, even if that's not what was intended at the time. Therefore my HO is that not only does this one need a new name but -O target would be better renamed -O job. What's wrong with line for the line-by-line case? It's exactly the way make is documented, that each line of the recipe is fed to the shell in turn, and the documentation you just wrote for the new argument is all about lines; I see the word line 4 times in 2 sentences. The whole thing seems clear and well written to me, but would be even more clear if the new option was called -O line. So I'd argue for: -O line (new) -O job (current -O target) -O make And document the notion that job and recipe are equivalent terms (more precisely, one is the instantiation of the other: job is to recipe what process is to program). I also think it wouldn't hurt for the documentation of the default to contain a forward reference. Pending potential rename: --- a/doc/make.texi +++ b/doc/make.texi @@ -4136,7 +4136,7 @@ specified by giving an argument to the option (e.g., @samp{-Ojob} or @table @code @item none The is the default: all output is sent directly as it is generated and -no synchronization is performed. +no synchronization is performed. It is identical to @samp{--output-sync=target} @item job Output from each individual line of the recipe is grouped and printed David ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith psm...@gnu.org Cc: e...@gnu.org, david.s.bo...@gmail.com, bug-make@gnu.org Date: Sun, 28 Apr 2013 01:34:44 -0400 On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote: This is useful (to me) because at any time, I know what's running. ([Start] messages minus [End] messages.) Thanks, this is the reason I was looking for; that use-case wasn't clear to me based on the previous email. OK, so what are we going to do about it? Leave, revert, new option? I've pushed a change to add a new argument to the -O/--output-sync option, job, to write output after each line of the recipe. What is its purpose? To avoid mixing in the same screen line characters from several parallel sub-makes? (That does happen, albeit rarely.) Or is it something else? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Sun, 28 Apr 2013 10:22:40 -0400 From: David Boyce david.s.bo...@gmail.com Cc: Frank Heckenbach f.heckenb...@fh-soft.de, Eli Zaretskii e...@gnu.org, bug-make bug-make@gnu.org So I'd argue for: -O line (new) -O job (current -O target) -O make Agree about line (assuming I understood what it means), but disagree about renaming target: it is quite clear what it means. By contrast, make is not so clear; perhaps we should rename it makefile or something. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sun, 2013-04-28 at 20:00 +0300, Eli Zaretskii wrote: I've pushed a change to add a new argument to the -O/--output-sync option, job, to write output after each line of the recipe. What is its purpose? To avoid mixing in the same screen line characters from several parallel sub-makes? (That does happen, albeit rarely.) Or is it something else? I'm not sure exactly what you mean by your second sentence. However I asked the same question you did about this feature. Frank had a use-case: he was tracking which jobs were active/still running by making all his recipes look like this: target: @echo start: $@ ... recipe ... @echo end: $@ This allows a higher-level, dynamic interface to track which jobs are running, when they started, etc. and track the build. Although I implemented this because it was simple, I'm not so sure this is a real use-case. Or to be more accurate, I agree that it's a real use-case but I don't think this is a good solution to the problem. I suspect that a better solution might be to create a machine interface mode for make, as some other GNU CLI tools like GDB, etc. have. This interface would be well-defined and unchanging and easily machine-parseable, and allow people to write front-ends to more accurately examine make's output. However, for now this new output-sync mode doesn't seem to be harmful. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote: % make -Omake # same with -Otarget m:2: recipe for target 'foo' failed make: *** [foo] Error 1 foo:error This seems at least strange to me: The conclusion recipe failed is printed before the reason (the messages from the job). Reverting this part (i.e., moving the sync_output() call to where it was before) corrects this: % make -Otarget # same with -Otarget foo:error m:2: recipe for target 'foo' failed make: *** [foo] Error 1 To fix it without reverting the behaviour seems to require calling sync_output() before child_error() (2 places). Unfortunately that isn't sufficient either. It's possible for child_error() to be invoked, but then we don't stop but continue on with the recipe (if the error is ignored). If we ran child_error() here we would show half the recipe output, then the error message, then the rest later, in a separate grouping. I went a different way: I modified the child_error() function so that if there was an output-sync file, the error message would be written to the end of that file instead of printed directly. This way when the output is shown to the user she sees the entire thing, including error messages from make, in order. This also allowed me to drop the layer of enter/leave notices around error messages. Please check/verify this change in your own environments. PS. Some may be tsking to themselves noting this change would have been a lot simpler if we'd kept a FILE* handle instead of a file descriptor. Unfortunately it's not clear that would work. I created a test program to verify that this method of having the parent append messages to the file after the child exits, and with a FILE* handle I couldn't make it work right. My fseek() after the child exited didn't skip past the child's output and my message printed by the parent ended up overwriting that output. I'm sure it would have worked if I'd closed the file in the parent and re-opened it, but since I'm using tmpfile() that is not possible. I might have done something wrong, but anyway it wasn't a slam-dunk. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
Now that we seem to have a workable solution for output synchronization for both POSIX and Windows systems, I wonder if we shouldn't consider enabling it as the default mode when parallel builds are running. I understand that this will be a change that could be visible (beyond the collection of output) due to using a temp file instead of a terminal. Of course people can still use -Onone if they want old behavior. However assuming the new mode works and is reliable, and is not a performance bottleneck, I'm hard-pressed to see why a well-ordered output would not be preferable to just about everyone, and hence shouldn't be the default. Of course it's possible that writing to a file, rather than spewing to stdout, WOULD be noticeably performance impacting at least in some situations/systems. Comments? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Default output-sync setting (was: Re: [bug #33138] .PARLLELSYNC enhancement with patch)
From: Paul Smith psm...@gnu.org Cc: bug-make@gnu.org Date: Sun, 28 Apr 2013 22:03:39 -0400 Now that we seem to have a workable solution for output synchronization for both POSIX and Windows systems, I wonder if we shouldn't consider enabling it as the default mode when parallel builds are running. I don't think we should do this unless users demand it, which means we should wait for the next release and some time beyond before making the decision. FWIW, I'm running 8-way parallel Make jobs for many months now, and never had any trouble interpreting their output, neither when they succeed, nor when they don't. The synchronized output operation annoys me slightly in that there are distinct time intervals when Make says nothing at all, and then the output appears in large chunks that get dumped en-masse to the screen. IOW, the user experience is different and takes time to get used to. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith psm...@gnu.org Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org Date: Sun, 21 Apr 2013 19:30:05 -0400 On Fri, 2013-04-19 at 14:09 +0300, Eli Zaretskii wrote: Date: Fri, 19 Apr 2013 11:54:05 +0200 Cc: bo...@kolpackov.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Is there a simple enough Makefile somewhere that could be used to test this feature, once implemented? We have a test in the test suite (output-sync). Can you use that? I hoped for something simpler and not involving Perl or Unixy shell features, because I'd like to use this in the native Windows environment, where the Windows port of Make runs. However, if that's the best possibility, I guess I'd craft something based on that test. The basic feature can be tested trivially like this: all: one two one two: @echo start $@ @sleep 1 @echo stop $@ Now if you run this using make -j you'll get: start one start two stop one stop two If you run this using make -j -O you should get: start one stop one start two stop two There's more to test than that: before it's done we need to test recursive make invocations for example. But the above is simple. Thanks, I tested this, and also the recursive behavior. But I'm not sure whether the results in some less-than-trivial cases are as intended (although I see exactly the same behavior on GNU/Linux, so the questions below are not Windows-specific). E.g., with this Makefile, called mkfsync, (indented 2 spaces for clarity): all: simple recursive simple: one two one two: @echo start $@ @sleep 1 @echo stop $@ @-false recursive: rec1 rec2 rec1 rec2: @echo start $@ $(MAKE) -f mkfsync simple @echo stop $@ and invoking Make as gnumake -j -O -f mkfsync, I get this output: gnumake -f mkfsync simple gnumake -f mkfsync simple mkfsync:6: recipe for target 'one' failed gnumake: [one] Error 1 (ignored) start one stop one mkfsync:6: recipe for target 'two' failed gnumake: [two] Error 1 (ignored) start two stop two start rec1 gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' mkfsync:6: recipe for target 'one' failed gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: [one] Error 1 (ignored) gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' start one stop one gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' mkfsync:6: recipe for target 'two' failed gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: [two] Error 1 (ignored) gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' start two stop two gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' stop rec1 start rec2 gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' mkfsync:6: recipe for target 'one' failed gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: [one] Error 1 (ignored) gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' start one stop one gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' mkfsync:6: recipe for target 'two' failed gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: [two] Error 1 (ignored) gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Entering directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' start two stop two gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' gnumake[1]: Leaving directory 'D:/gnu/make-3.82.90_GIT_2013-04-20' stop rec2 Is this intended behavior that these two messages: mkfsync:6: recipe for target 'two' failed gnumake[1]: [two] Error 1 (ignored) are separated and wrapped by separate Entering...Leaving blocks, instead of being produced together? They are both produced by a call to 'message', which outputs the message to stdout, so it's not like they went to two different
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Sat, 27 Apr 2013 13:09:02 +0300 From: Eli Zaretskii e...@gnu.org Cc: f.heckenb...@fh-soft.de, bug-make@gnu.org The basic feature can be tested trivially like this: all: one two one two: @echo start $@ @sleep 1 @echo stop $@ Now if you run this using make -j you'll get: start one start two stop one stop two If you run this using make -j -O you should get: start one stop one start two stop two There's more to test than that: before it's done we need to test recursive make invocations for example. But the above is simple. Thanks, I tested this, and also the recursive behavior. The changes needed to make -O work on MS-Windows are now committed to the master repository, see commits da7df54 and 049f8e8. Please review and comment. I also took liberty in committing a small change posted by Frank in this thread, whereby, if taking the semaphore fails, output is dumped unsynchronized, instead of being silently ignored. (Btw, I don't see Frank's assignment on file for Make, so I marked this change tiny, per the FSF procedures.) Thanks to all who contributed to this (lengthy) discussion and allowed me to understand how to implement this feature on Windows. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sat, 2013-04-27 at 13:09 +0300, Eli Zaretskii wrote: Is this intended behavior that these two messages: mkfsync:6: recipe for target 'two' failed gnumake[1]: [two] Error 1 (ignored) are separated and wrapped by separate Entering...Leaving blocks, instead of being produced together? They are both produced by a call to 'message', which outputs the message to stdout, so it's not like they went to two different streams. Am I missing something? Frank mentioned this as well and it's a bug that needs to be fixed. I'll look into it this weekend. I need to check the algorithm; one simple fix, if it's too complex to do it another way, would be to have make write (append) the error message to the temp file rather than printing it directly. That way we know it will come out in the right order when the temp file is dumped to stdout. However that may not be necessary. If this is intended behavior, can we somehow cut down on these Entering...Leaving pairs? They add a lot of clutter and make it hard to read the output of a Make run. Yes, this was also discussed: I've been thinking about a separate way to allow the user to choose. We also need to ensure, as best as we can, that extra unneeded lines are not being generated. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Sat, 2013-04-27 at 14:39 +0300, Eli Zaretskii wrote: The changes needed to make -O work on MS-Windows are now committed to the master repository, see commits da7df54 and 049f8e8. Please review and comment. Thanks Eli!! I'll take a look over the next few days. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-04-18 at 22:36 +0200, Frank Heckenbach wrote: This is useful (to me) because at any time, I know what's running. ([Start] messages minus [End] messages.) Thanks, this is the reason I was looking for; that use-case wasn't clear to me based on the previous email. OK, so what are we going to do about it? Leave, revert, new option? I've pushed a change to add a new argument to the -O/--output-sync option, job, to write output after each line of the recipe. Please give it a try and make sure it works for your situation. It worked OK in my more limited testing. I'm not excited about that term (job); it's kind of accurate, but in the documentation for example we're really mushy about exactly what a job is, vs. a recipe or a command line etc. I'd like to pick some terms for this, define them in a solid way, then clean up the references. It would be best to do this before the release to avoid changing things later. For example, we currently use target as the name; maybe recipe is better? If anyone has opinions I'm listening. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 21:19:45 +0300 From: Eli Zaretskii e...@gnu.org Cc: bug-make@gnu.org Date: Wed, 24 Apr 2013 19:14:01 +0200 Cc: bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Testing clearly shows it doesn't: GetFileInformationByHandle simply fails for handles open on console devices and the null device. And we will be comparing handles for console devices in the majority of use cases here. That's right. Or perhaps pipes (I post-process make's output, to handle the directories in messages etc. and I suppose the mentioned Emacs mode does something similar). GetFileInformationByHandle does work on pipes, at least on shell pipes (as in foo | cat). I researched this a bit, and eventually succeeded in finding a way that is good enough for both disk files, pipes, and console devices. Good. (And the null device, I suppose. Though it doesn't really matter if we write to the null device combined or separate. ;-) No, I didn't find a satisfactory solution for the null device. But as you point out, we don't care. One more minor issue with this: do we want the comparison of stdin and stdout for the purposes of combined_output to return zero, although they refer to the same device on Unix? I know it's a bit theoretical for the case in point, but I'd like the code I write to be useful in other contexts as well, and maybe even Make will need that in the future. On Windows, console input and output use different devices, and I can distinguish between them. Should I? Thanks. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
On 25 April 2013 20:06, Eli Zaretskii e...@gnu.org wrote: Date: Thu, 25 Apr 2013 19:36:28 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org 1) sem_timedwait() in posix lets you timeout so in a big build when something crashes or just sits around, there is at least the option of printing an error message or giving up on locking from that point on which is a bit of a godsend if it happens overnight and there's no-one to restart the build. How much would you use for the timeout, though? A sub-Make could legitimately run for a very long time, depending on what's in the Makefile. FWIW, I currently let Make wait forever for the mutex. It's a horrible question and no answer makes people happy. It's a kind of exceedingly blunt tool that you only use because there is no other. I have seen systems where the child process watches stdio/stderr activity and times out if there isn't any and that just gets into trouble with processes that run long and silent. On the other hand, having your 24-hour build hang after 2 hours when you have a lot of people waiting on the result is a sort of costly thing. Even if you accept that a lot of the build will be broken, there is still a lot of feedback to get from the parts which should be able to complete - errors and warnings, unit tests and so on - that people can take action on while the build is being fixed. So timeouts have to be configurable by users who can set whatever they think is the least bad compromise in their particular case. Regards, Tim -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 25 Apr 2013 02:16:33 +0200 Cc: e...@gnu.org, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de On Windows, you said fstat was very expensive, didn't you? Or is lseek even worse? I think anything that potentially moves the file pointer can be sometimes expensive and is best avoided. (On Windows, I'd use GetFileInformationByHandle.) OK, if that's so, do that. But I don't think that's true on POSIX. I don't think it's worth doing on Windows as well, see below. Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? lseek doesn't need to read any data. It just sets the current offset of the FD to the given position, so the next read (which in this case never happens before seeking to the beginning) knows where to read. Even in the case of SEEK_END, all it has to do is add the given offset (here: 0) to the current file size. What I meant is that lseek doesn't just return the byte position, it also makes sure the next read or write happens at that position. So at some point, some piece of software needs to tell the disk to move its reading head to the right point. Whether this happens as part of lseek or the subsequent read/write, and whether this requires reading some of the data on the disk, is a matter of how this is implemented and what data structures does the filesystem maintain in memory at all times. Instead of testing, I just looked at the implementation (Linux 3.2.2). The following is really the whole relevant code. As you see, nothing's read from the disk, it only handles in-memory data. (Also the file size is in memory for open files; even it were not, it would be a constant-time access to the inode and wouldn't need to touch any data blocks.) I timed lseek on Windows on very large files (hundreds of MBs), and found that a single lseek takes less than 1 usec, at least with NTFS volumes and in my core i7 box. So, while more efficient ways of revealing whether the file is empty are possible, I don't think such a small penalty justifies yet another set of ifdef's. The only situation where lseek could be really expensive is if the volume is compressed, because lseek on Windows returns the uncompressed offsets in that case. I don't have access to any machine which has such volumes, so I cannot test that. (Does Unix support such filesystems? If so, what does lseek do there?) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Fri, 26 Apr 2013 09:05:18 +0100 From: Tim Murphy tnmur...@gmail.com Cc: Paul D. Smith psm...@gnu.org, bug-make@gnu.org bug-make@gnu.org How much would you use for the timeout, though? A sub-Make could legitimately run for a very long time, depending on what's in the Makefile. FWIW, I currently let Make wait forever for the mutex. It's a horrible question and no answer makes people happy. Yes, I know. Which is why I think we would be better off not asking it in the first place. So timeouts have to be configurable by users who can set whatever they think is the least bad compromise in their particular case. If humans need to be in the loop, they can impose timeouts even if Make does not offer any such customizations. E.g., run the build under the 'timeout' command (from Coreutils), or just plain Ctrl-C the instance of Make that hogs the mutex. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: You underestimate me ;-) What I have is actually this: [...] and I wrote 'fcntl' emulation for Windows that uses a mutex. Indeed, I had not expected this. :-) That said, I'm not wedded to the above approach, and if people like a completely disjoint code for Windows, that would be a trivial change. I just wanted to comply with what Paul said, viz.: Also, where is the best place to put the emulated Posix functions? Some new file in w32/compat/? I'd like to see it there. I'm thinking I want to move the new stuff out of job.c even for POSIX systems. The ifdefs are really getting to me. So I now have w32/compat/posixfcn.c with the emulation of fcntl (also used for CLOSE_ON_EXEC), and a few support functions it needs. I see. If Paul agrees, I'm fine with it. Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? lseek doesn't need to read any data. It just sets the current offset of the FD to the given position, so the next read (which in this case never happens before seeking to the beginning) knows where to read. Even in the case of SEEK_END, all it has to do is add the given offset (here: 0) to the current file size. What I meant is that lseek doesn't just return the byte position, it also makes sure the next read or write happens at that position. So at some point, some piece of software needs to tell the disk to move its reading head to the right point. Whether this happens as part of lseek or the subsequent read/write, and whether this requires reading some of the data on the disk, is a matter of how this is implemented and what data structures does the filesystem maintain in memory at all times. Sure, in theory it's up to the filesystem, but I'd be surprised if any fs actually worked this way. There are several layers between an lseek and the actual moving of disk heads (virtual fs, concrete fs, block layer, caching, drive hardware etc.). In particular, if you consider that other processes can read different files simultaneously, it would be premature to position disk heads after an lseek already. (If this is even possible; normally the OS sends a request to read a block to the disk, and the disk controller does the rest.) And as I said, lseek (SEEK_END, 0) is a rather well-known idiom, so I don't think any fs implementor would consider it an optimization to do anything like this. In short, all that's needed for lseek is the current offset and the file size. Both values really should be in memory for any open file. Instead of testing, I just looked at the implementation (Linux 3.2.2). The following is really the whole relevant code. As you see, nothing's read from the disk, it only handles in-memory data. (Also the file size is in memory for open files; even it were not, it would be a constant-time access to the inode and wouldn't need to touch any data blocks.) I timed lseek on Windows on very large files (hundreds of MBs), and found that a single lseek takes less than 1 usec, at least with NTFS volumes and in my core i7 box. So, while more efficient ways of revealing whether the file is empty are possible, I don't think such a small penalty justifies yet another set of ifdef's. As I expected. The only situation where lseek could be really expensive is if the volume is compressed, because lseek on Windows returns the uncompressed offsets in that case. I don't have access to any machine which has such volumes, so I cannot test that. (Does Unix support such filesystems? If so, what does lseek do there?) At least I don't have access to one right now. But I still wouldn't expect it to take long. They surely store the uncompressed size somewhere, and as I said that's all that's needed. When actually reading from the new position is the right time to compute the compressed position. Furthermore, we do this check only on our temp files. Though the user could put them (i.e., the directory where tmpfile() will create them, usually /tmp on Unix) on a compressed fs, that's not exactly the wisest thing to do, and we might not really have to optimize for this case. :-) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
-- Forwarded message -- From: Tim Murphy tnmur...@gmail.com Date: 25 April 2013 07:13 Subject: Re: [bug #33138] .PARLLELSYNC enhancement with patch To: Paul D. Smith psm...@gnu.org To be honest, I have done all this before with named semaphores including the file that gets left over problem and it's all solvable quite nicely. You pass the build id in the environment which is, after all, what it's for. It has been done on huger builds than most people using make ever imagine and contention doesn't appear to be a problem as the semaphore is held momentarily while writing output. I'm not saying that this clever stuff is not a good idea but the reasons for doing it haven't proved to be valid in my experience. On 24 April 2013 23:47, Paul Smith psm...@gnu.org wrote: On Wed, 2013-04-24 at 22:55 +0100, Tim Murphy wrote: why not use a named semaphore wherever possible (windows and linux) and lock a file where not instead of trying to pass kernel object handles around (seems a bit nasty to me)? Hi Tim; I think you're late to the party :-). Let me summarize a lot of discussion then we can use this as a reference for other questions. Named semaphores on POSIX are kind of sucky. They don't get automatically cleaned up when the process exits, for one thing. And my suspicion is that they're not very portable across different variations of UNIX especially older ones. File locking, on the other hand, is very portable, very simple, very fast, and any held locks are automatically released when the descriptor is closed. No muss, no fuss. If you agree file locking is the right way to go on a POSIX system, then we just have to decide what to lock. It can be anything as long as all children of a given top-level make can find it. We could use a temporary file, sure. If we did that we'd have to pass around either the name of the file or, more likely, use tmpfile() and send along the file descriptor, handle close-on-exec, etc. like we do with the jobserver pipe. Which we could certainly do: we do it today with jobserver. However we already have a descriptor available to us: stdout. If we use that we don't have to pass around anything because our children are already inheriting it and it's on a well-known FD. We don't have to worry about using + before sub-make rules like we do with jobserver. There's one other advantage of using stdout: if a sub-make redirects its output to a separate file, that magically starts a new lock domain and that sub-make and its children don't contend with the other sub-makes and their children. This is not a big deal and if we didn't get it for free, we'd never go to the effort of implementing it. But it's nice. Eli is working on the Windows port; I have no idea how he's decided to implement this there yet. -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 25 Apr 2013 05:14:41 +0200 Cc: psm...@gnu.org, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de My suggestion was under the assumption that we use different declarations because of different types anyway -- which I still recommend, though Paul might have to decide this one. Yes, eventually it's Paul's call. If I understand it correctly, roughly we'd have now: intptr_t sync_handle; [...] #ifdef POSIX sync_handle = (intptr_t) fileno (stdout); #else sync_handle = (intptr_t) get_mutex_handle (...); #endif [...] #ifdef POSIX fcntl ((int) sync_handle, ...); #else lock_mutex ((mutex *) sync_handle); #endif I.e., everything about it is separate, apart from the declaration, and the latter uses a type that isn't quite right for either one. You underestimate me ;-) What I have is actually this: sync_handle_t sync_handle = -1; [...] #if POSIX sync_handle = fileno (stdout); #elif WINDOWS sync_handle = create_mutex (); #endif [...] if (fcntl (sync_handle, F_SETLKW, fl) != -1) return fl; and I wrote 'fcntl' emulation for Windows that uses a mutex. IOW, I didn't add even a single ifdef (the one you see above was already there, and it can be removed if we try hard enough). That said, I'm not wedded to the above approach, and if people like a completely disjoint code for Windows, that would be a trivial change. I just wanted to comply with what Paul said, viz.: Also, where is the best place to put the emulated Posix functions? Some new file in w32/compat/? I'd like to see it there. I'm thinking I want to move the new stuff out of job.c even for POSIX systems. The ifdefs are really getting to me. So I now have w32/compat/posixfcn.c with the emulation of fcntl (also used for CLOSE_ON_EXEC), and a few support functions it needs. Though I generally like reducing the number of ifdefs, in this case it seems more consistent to split the declaration, so it's clear that these are really two different things. As a side benefit, if in the future someone tried to use it in common code, they would get compile-time errors on the other platform, rather than silently producing wrong code (like applying some fd operation to a mutex handle). If we really want to separate Posix code from non-Posix one, I would suggest to have identically named functions, such as start_job_command, construct_command_argv, etc., in 2 separate files. These are very large and complex functions, and it is very hard to follow their logic when large portions of them are more-or-less copied with non-trivial changes, and you have something like #ifdef MSDOS /* lots of MSDOS stuff */ #else #ifdef WINDOWS /* lots of similar-but-different Windows stuff */ #else #ifdef AMIGA /* Amiga stuff */ #else #ifdef _EMX_ /* EMX stuff */ #else /* Posix stuff */ Just paging through the irrelevant portions without losing track of the logic is non-trivial. You need a very good editor with folding capabilities to not become distracted. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-04-25 at 07:14 +0100, Tim Murphy wrote: To be honest, I have done all this before with named semaphores including the file that gets left over problem and it's all solvable quite nicely. You pass the build id in the environment which is, after all, what it's for. Sure. Given enough effort all programming problems are solvable. The question is what is the effort, and what is the benefit? At the moment to me it looks like the effort is not insignificant and I don't see the benefit. However, that may well just mean I haven't thought of something. Can you give a quick outline of how the cleanup / error handling might work, and what advantages using named semaphores has? Cheers! ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
Ok, I don't want to be too much of a pain - I can see the way things have gone and my attitude is oh well but I thought I'd put the case anyhow: 1) sem_timedwait() in posix lets you timeout so in a big build when something crashes or just sits around, there is at least the option of printing an error message or giving up on locking from that point on which is a bit of a godsend if it happens overnight and there's no-one to restart the build. With stdio and file locks I don't know if this is a problem anyhow for crashes (are locks automatically released?) but the behaviour between windows and linux might be different. 2) Semaphores have the semantics you want a) the ability to lock b) system-wide c) named so that processes can access them. So it seems simpler to me to use that semantic as the model and implement it in whatever is the most faithful way for each platform. 3) As for POSIX semaphores, I found that one could do a lot by just having init and delete functions that created and destroyed them at the start and end of make execution. Only the parent make does this of course. You can make them live in /tmp which is volatile anyhow and you pre-delete ones that by incredible luck might exist before make started and have the same name as the one you want to use. 4) In a way this makes the file-lock and semaphore mechanisms look rather the same - files in /tmp. This seems easier to understand. It's no big deal, though. Regards, Tim On 25 April 2013 19:12, Paul Smith psm...@gnu.org wrote: On Thu, 2013-04-25 at 07:14 +0100, Tim Murphy wrote: To be honest, I have done all this before with named semaphores including the file that gets left over problem and it's all solvable quite nicely. You pass the build id in the environment which is, after all, what it's for. Sure. Given enough effort all programming problems are solvable. The question is what is the effort, and what is the benefit? At the moment to me it looks like the effort is not insignificant and I don't see the benefit. However, that may well just mean I haven't thought of something. Can you give a quick outline of how the cleanup / error handling might work, and what advantages using named semaphores has? Cheers! -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: Fwd: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 25 Apr 2013 19:36:28 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org 1) sem_timedwait() in posix lets you timeout so in a big build when something crashes or just sits around, there is at least the option of printing an error message or giving up on locking from that point on which is a bit of a godsend if it happens overnight and there's no-one to restart the build. How much would you use for the timeout, though? A sub-Make could legitimately run for a very long time, depending on what's in the Makefile. FWIW, I currently let Make wait forever for the mutex. With stdio and file locks I don't know if this is a problem anyhow for crashes (are locks automatically released?) Yes, locks are automatically released once the file descriptor is closed (which happens on a crash or exit). but the behaviour between windows and linux might be different. It is similar: when a process holding a mutex exits or crashes on Windows, the mutex is released. If another process was waiting for the mutex, it gets a special indication about such calamities (because in some cases of synchronization, that might mean the shared resource is in inconsistent state), but in our case, we don't care, at least the way I wrote the code in its current shape. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: That's one way. Another one is to discuss the design more thoroughly before the patches are accepted. I think it was discussed quite extensively. Also in retrospect, I don't see how the design could have been much different (see below). I tried to turn the discussion that way when the issue was first brought up, but my comments were largely ignored (if I compare what was suggested then with what was eventually committed), and most of the discussions were about the details of the Unix implementation anyway. I just reread the whole discussion, and as far as I can see, most of your questions were answered. Already in http://lists.gnu.org/archive/html/bug-make/2011-04/msg00033.html David wrote: Yes, it's conceptually a semaphore. In fact a Windows port might prefer to use real semaphores. You wrote: : Yes, but a few words about how is this semaphore supposed to get job : done, and in fact what kind of synchronization will this bring to : Make, would be appreciated. I don't think you described the feature : too much in your original post. Just like David, I really don't know what more you want described. I can repeat what I wrote yesterday: The only thing really necessary is that two makes [with the same stdout/stderr] never run sync_output() simultaneously. (Note: We're only talking about makes, i.e. cooperating processes; we never need to synchronize the processes run from the recipes, apart from recursive makes.) That's really all there is to it, the rest is implementation details (and may be totally different on Windows). Perhaps you're thinking of it as much more complicated than is really is. It's just about the simplest possible use case of an inter-process mutual exclusion (all processes related, so they can pass info around; all cooperating; just one code block to lock). : Btw, there will be other side effects, at least on non-Posix : platforms, due to the fact that stuff that was supposed to go to the : screen is redirected to a file instead. Some programs sense that and : behave differently, e.g. with binary non-printable characters or with : special character sequences. By redirecting the output to files, then : displaying it on the screen, the output may look strangely, or have : some more grave effect, like terminating output prematurely. I don't see why it would terminate prematurely, but indeed it may display differently, also on Unix, e.g. ls --color=tty may use colors whwn writing to a terminal and not to a file. So if you use output-sync and your recipes contains this command, you lose colors. I'd say you have several options: - Live with it. - Change the command (in the ls example, use --color=always). - If the effect is more serious than losing colors (you say it may look strangely), maybe you shouldn't put such a command in a make recipe in the first place (at least not in a Makefile that's going to be distributed; note that the same problems will happen if the user redirects the whole make output to a file which I often do). - If the Makefile is only used in-house and you definitely want the output-device-dependent behaviour, just don't use output-sync. It's just an option. - Provide your own work-around, perhaps write your output to a pty. What I'm saying is the circumstances where this is a real problem seem rather exotic, and noone's forcing you to use the option. If we were talking about changing make's default behaviour, the concern may be justified, but we aren't. : We should consider these potential problems, and in any case the : simple loop that reads from temporary files and writes to : stdout/stderr may need to become much more complicated, at least : on DOS/Windows. (I believe similar, though less serious, problems : could happen on Unix as well.) Apart from setting O_BINARY, I don't see why. Can you explain? WRT O_BINARY, you wrote: : . pump_from_tmp_fd will need to switch to_fd to binary mode, since :this is how tmpfile opens the temporary file; we need output mode :match the input mode. I was about to add it, but now I wonder if that's really needed: If the temp file is opened in binary mode, and this temp file is passed as stdout/stderr to the shell commands, those will write their output in binary mode (i.e., without CR, AIUI). Later pump_from_tmp_fd() will read it back in binary and dump it to the original stdout/stderr in whatever mode it was (usually non-binary), so the CRs get added there. Or is there any special handling of stdout/stderr WRT binary mode during inheritance or so? I'm attaching a small patch to: - Factor out is_same_file() into a function. - Turn file_ok() and fd_not_empty() into functions instead of macros. (No real need for macros, and functions may be easier to port.) - Use SEEK_END rather than SEEK_CUR in fd_not_empty(), so it also reports true if the fd is at the beginning of a non-empty file (such as if the position is not shared between processes inheriting
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 05:56:49 +0300 From: Eli Zaretskii e...@gnu.org Cc: f.heckenb...@fh-soft.de, bug-make@gnu.org I will see if locking works on console handles; if not, I will have to introduce a command-line argument for passing the name or the handle of a mutex to a sub-Make. As I suspected, neither LockFileEx nor LockFile work on console handles, they all fail with ERROR_INVALID_HANDLE. The functions do work on regular files, but since (as it turns out) file locks on Windows are mandatory, not advisory as on Posix platforms, they are not really suitable for this job anyway. So file locking is out, mutex is in. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 18 Apr 2013 21:57:56 +0300 From: Eli Zaretskii e...@gnu.org Cc: bug-make@gnu.org, bo...@kolpackov.net Date: Thu, 18 Apr 2013 19:09:06 +0200 From: Frank Heckenbach f.heckenb...@fh-soft.de . calculation of combined_output in start_job_command will need to be reimplemented for Windows, since the reliance on st_dev and st_ino makes assumptions that are false on Windows. What we need is basically is_same_file (fd1, fd2) (which we probably could refactor into a separate function). Eli, can you try: test file1 -ef file2 where test is either a standalone utility or a built-in of bash or any other shell. If any of these works, you can see how they do it. Otherwise, a bit of googling turned up this page which has code (in Python, but it seems like a thin layer around the system calls, so it can probably easily be ported to C): http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html I know about this method, but I'm not sure it does what we want here. The number used by this method is not guaranteed to be unique on some versions of Windows. More importantly, it only works for disk files, I don't know whether it reports a meaningful value for the console device or the null device. Testing clearly shows it doesn't: GetFileInformationByHandle simply fails for handles open on console devices and the null device. And we will be comparing handles for console devices in the majority of use cases here. I researched this a bit, and eventually succeeded in finding a way that is good enough for both disk files, pipes, and console devices. . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows. This is to test if somehow make's stdout or stderr has been closed (not just redirected to /dev/null, but closed). Right, got that; but Windows needs a different test for that, as there's no fcntl. Basically we need just any call that succees for a valid file and fails otherwise. fstat() might be a good candidate. fstat is too heavy on Windows; there are better methods. If that isn't available, maybe even lseek() (as used in FD_NOT_EMPTY) will do if we keep the || errno != EBADF test. lseek fails for the console as well, so it's not a good candidate. But this problem is not difficult, I just mentioned that the code as committed won't compile. Eli, can you do some tests to see if you find something that works reliably? Don't worry about that. My worst problem is lack of time, not lack of ideas ;-) Found a reasonable solution for that as well. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: I know about this method, but I'm not sure it does what we want here. The number used by this method is not guaranteed to be unique on some versions of Windows. More importantly, it only works for disk files, I don't know whether it reports a meaningful value for the console device or the null device. Testing clearly shows it doesn't: GetFileInformationByHandle simply fails for handles open on console devices and the null device. And we will be comparing handles for console devices in the majority of use cases here. That's right. Or perhaps pipes (I post-process make's output, to handle the directories in messages etc. and I suppose the mentioned Emacs mode does something similar). I researched this a bit, and eventually succeeded in finding a way that is good enough for both disk files, pipes, and console devices. Good. (And the null device, I suppose. Though it doesn't really matter if we write to the null device combined or separate. ;-) . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows. Basically we need just any call that succees for a valid file and fails otherwise. Found a reasonable solution for that as well. OK, so only one problem left. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Some time ago when solving the same problem in a different way we used semaphores on Windows and Linux. Compatibility might make it less interesting but I would suggest pretending that one has semaphores first with some nice little abstraction and then implementing them in the best way the platform has to offer. We didn't need to support so many platforms but you can see the idea here: https://bitbucket.org/tnmurphy/raptor/src/fbb2e624d320e5eabc0689105e2f2b80d131ca03/util/talon/sema.h?at=default Basically each build needs a unique id to be passed around in an environment variable. The top-level make executable's PID should be good enough. This is used to create the named semaphore. Platforms without semaphores might use a file and locking or if that doesn't work then they can't implement the feature and you configure it out. Regards, Tim On 24 April 2013 18:14, Frank Heckenbach f.heckenb...@fh-soft.de wrote: Eli Zaretskii wrote: I know about this method, but I'm not sure it does what we want here. The number used by this method is not guaranteed to be unique on some versions of Windows. More importantly, it only works for disk files, I don't know whether it reports a meaningful value for the console device or the null device. Testing clearly shows it doesn't: GetFileInformationByHandle simply fails for handles open on console devices and the null device. And we will be comparing handles for console devices in the majority of use cases here. That's right. Or perhaps pipes (I post-process make's output, to handle the directories in messages etc. and I suppose the mentioned Emacs mode does something similar). I researched this a bit, and eventually succeeded in finding a way that is good enough for both disk files, pipes, and console devices. Good. (And the null device, I suppose. Though it doesn't really matter if we write to the null device combined or separate. ;-) . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows. Basically we need just any call that succees for a valid file and fails otherwise. Found a reasonable solution for that as well. OK, so only one problem left. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, Apr 24, 2013 at 10:26 AM, Tim Murphy tnmur...@gmail.com wrote: I would suggest pretending that one has semaphores first with some nice little abstraction and then implementing them in the best way the platform has to offer. How about we introduce functions called acquire_semaphore() and release_semaphore()? Oh, wait, we did. Platforms without semaphores might use a file and locking or if that doesn't work then they can't implement the feature and you configure it out. This kind of (re)invention is not known to be needed at this time. I have the impression Eli has the Windows port in hand, he's just skeptical about a few details. And what's there now is already the LCD for most other platforms as far as I know. David ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 18:50:15 +0200 Cc: bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de That's one way. Another one is to discuss the design more thoroughly before the patches are accepted. I think it was discussed quite extensively. Also in retrospect, I don't see how the design could have been much different (see below). Fine, then this discussion should probably be terminated. The last thing I want is to annoy people. If you think that everything is and was like it should have, then I will have to live with that. It's not the first time, either, and Make is not the first project where I saw this happening. I tried to turn the discussion that way when the issue was first brought up, but my comments were largely ignored (if I compare what was suggested then with what was eventually committed), and most of the discussions were about the details of the Unix implementation anyway. I just reread the whole discussion, and as far as I can see, most of your questions were answered. Answered, yes. But the real reason behind my questions was not really to have them answered. It was to affect the design and the implementation of the feature. In that, I failed. You might argue that I failed because there was nothing to change anyway, but I beg to differ. : Yes, but a few words about how is this semaphore supposed to get job : done, and in fact what kind of synchronization will this bring to : Make, would be appreciated. I don't think you described the feature : too much in your original post. Just like David, I really don't know what more you want described. Nothing. I already understood what's there to understand, thanks. I'm half way into implementing this on Windows, which wouldn't have been possible _unless_ I understood this completely. The problems, whatever they were, and the reasons for my rant are history now, i.e. immutable and uncorrectable. We can only behave differently in the future. Or not. : Btw, there will be other side effects, at least on non-Posix : platforms, due to the fact that stuff that was supposed to go to the : screen is redirected to a file instead. Some programs sense that and : behave differently, e.g. with binary non-printable characters or with : special character sequences. By redirecting the output to files, then : displaying it on the screen, the output may look strangely, or have : some more grave effect, like terminating output prematurely. I don't see why it would terminate prematurely It was long ago, but I presume I thought about the ^Z character that some programs write or interpret to signal end of file. Writing that to the console stops output, AFAIR. - Change the command (in the ls example, use --color=always). This will only work on a Posix console. The Windows console doesn't support the SGR sequences, and will display them as ugly binary garbage. (When 'ls' or 'grep' run on Windows and their stdout is connected to a console, they use special console output commands to produce color, but those commands cannot be recorded in a file and replayed, as you can with SGR sequences.) What I'm saying is the circumstances where this is a real problem seem rather exotic, and noone's forcing you to use the option. If we were talking about changing make's default behaviour, the concern may be justified, but we aren't. I just mentioned some caveats that people might find surprising, that's all. Perhaps those caveats should be mentioned in the documentation. No other intentions. : We should consider these potential problems, and in any case the : simple loop that reads from temporary files and writes to : stdout/stderr may need to become much more complicated, at least : on DOS/Windows. (I believe similar, though less serious, problems : could happen on Unix as well.) Apart from setting O_BINARY, I don't see why. Can you explain? You just explained it yourself. Sure, if we don't want to do anything about those issues, then no complications are needed. WRT O_BINARY, you wrote: : . pump_from_tmp_fd will need to switch to_fd to binary mode, since :this is how tmpfile opens the temporary file; we need output mode :match the input mode. I was about to add it Please don't. I'm actively working on this, will test it, and hope to commit the changes in a few days. Making such changes in parallel will just make my merging harder. but now I wonder if that's really needed: If the temp file is opened in binary mode, and this temp file is passed as stdout/stderr to the shell commands, those will write their output in binary mode No, they won't. The text/binary mode is per process (and per file descriptor in the process), it is not global and not per resource. That's because the text mode is implemented in the C library and uses data private to each process. Therefore, even if the handle is inherited by child processes, those processes will not magically start
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 19:14:01 +0200 Cc: bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Testing clearly shows it doesn't: GetFileInformationByHandle simply fails for handles open on console devices and the null device. And we will be comparing handles for console devices in the majority of use cases here. That's right. Or perhaps pipes (I post-process make's output, to handle the directories in messages etc. and I suppose the mentioned Emacs mode does something similar). GetFileInformationByHandle does work on pipes, at least on shell pipes (as in foo | cat). I researched this a bit, and eventually succeeded in finding a way that is good enough for both disk files, pipes, and console devices. Good. (And the null device, I suppose. Though it doesn't really matter if we write to the null device combined or separate. ;-) No, I didn't find a satisfactory solution for the null device. But as you point out, we don't care. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 10:34:20 -0700 From: David Boyce david.s.bo...@gmail.com Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org bug-make@gnu.org On Wed, Apr 24, 2013 at 10:26 AM, Tim Murphy tnmur...@gmail.com wrote: I would suggest pretending that one has semaphores first with some nice little abstraction and then implementing them in the best way the platform has to offer. How about we introduce functions called acquire_semaphore() and release_semaphore()? Oh, wait, we did. Sorry, but this is not funny. There are, indeed, such functions, but that's about all there is about modularity in the implementation. The rest is code scattered all over the place, that _knows_ all too well that it is dealing with file descriptors, not with some abstract semaphores or mutexes. Here's a good example: if (output_sync) { static int combined_output; /* If output_sync is turned on, find a resource to synchronize on. This block is traversed only once. */ if (sync_handle == -1) { if (STREAM_OK (stdout)) { struct stat stbuf_o, stbuf_e; sync_handle = fileno (stdout); combined_output = fstat (fileno (stdout), stbuf_o) == 0 fstat (fileno (stderr), stbuf_e) == 0 stbuf_o.st_dev == stbuf_e.st_dev stbuf_o.st_ino == stbuf_e.st_ino; } else if (STREAM_OK (stderr)) sync_handle = fileno (stderr); else { perror_with_name (output-sync suppressed: , stderr); output_sync = 0; } } There are 2 separate things intermixed here: determination of combined_output and determination of the resource to use as a mutex. They are mixed because they both deal with file descriptors, and it was just convenient, by sheer luck (or maybe because of something else) to save a few if's and do them together. But this is exactly where abstractions leak and break. It will come as small surprise, I'm sure, that the corresponding code in the Windows branch looks different, because there the two decisions are uncoupled. I have the impression Eli has the Windows port in hand Almost, yes. I needed to invest some time into researching how to do is_same_file, STREAM_OK, CLOSE_ON_EXEC, etc. But I'm getting there. he's just skeptical about a few details. Not anymore, thanks. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, 2013-04-24 at 21:17 +0300, Eli Zaretskii wrote: There's one issue that perhaps needs discussing. A mutex is identified by a handle, which on Windows is actually a pointer to an opaque object (maintained by the kernel). As such, using just 'int' for sync_handle is not wide enough, certainly not in 64-bit builds. Is it OK to use intptr_t instead? Doing this cleanly might require to have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor that is passed to a Posix fcntl; on Windows that macro will be a no-op, and on Posix platforms it's a simple cast. Is this OK? I think the lock or lock handle or whatever can be encapsulated into a typedef, which will be different between the different platforms. That sounds like a good abstraction to me. We can even generalize the way in which we communicate the handle to sub-makes; for example by calling a function with the handle that returns a char*: if that value is non-NULL it's added to the command line (maybe as a third element to the current jobs-fd argument). If it's null, nothing is added. Then the submake can parse it out and hand it back to a function that returns a handle again (serialize/deserialize). I'm not sure if this is necessary; it depends on the details of the Windows model. +/* Test whether a file contains any data. */ +static int +fd_not_empty (int fd) +{ + return fd = 0 lseek (fd, 0, SEEK_END) 0; +} Isn't this unnecessarily expensive (with large output volumes)? Why not use fstat? This lseek() doesn't actually move the file reference: SEEK_END plus an offset of 0 is a no-op so it doesn't matter how large the file is. This is just seeing if the position has moved since we opened the file (still at 0 or not); it just returns the current position in the file, which is known to the system directly without having to go ask anyone (it has to be so, since each file descriptor has its own position). I would be greatly surprised if fstat(), which has to go to the directory (probably) to look up all the information on the file such as ownership, permissions, etc., is faster. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: : Btw, there will be other side effects, at least on non-Posix : platforms, due to the fact that stuff that was supposed to go to the : screen is redirected to a file instead. Some programs sense that and : behave differently, e.g. with binary non-printable characters or with : special character sequences. By redirecting the output to files, then : displaying it on the screen, the output may look strangely, or have : some more grave effect, like terminating output prematurely. I don't see why it would terminate prematurely It was long ago, but I presume I thought about the ^Z character that some programs write or interpret to signal end of file. Writing that to the console stops output, AFAIR. Oh, that's still done? When I last used Dos in the 1990s it was already obsolete and few programs did it. OK, so you'll have to strip them. Fortunately, that's not too hard, I assume you have already implemented it. What I'm saying is the circumstances where this is a real problem seem rather exotic, and noone's forcing you to use the option. If we were talking about changing make's default behaviour, the concern may be justified, but we aren't. I just mentioned some caveats that people might find surprising, that's all. Perhaps those caveats should be mentioned in the documentation. No other intentions. OK, we can add a few sentences in the docs. Have you already written something (just to avoid further dupliation of work)? - Factor out is_same_file() into a function. I already wrote this in my branch, please wait until I commit. - Turn file_ok() and fd_not_empty() into functions instead of macros. (No real need for macros, and functions may be easier to port.) Fixed that as well here. OK, so Paul now has two versions to choose from. :-) Really, I just tried to be helpful. Since you were constantly complaining about the design, I did the (small) things I saw to improve about it. Other than that I didn't (and still don't) know what's so bad about the design. There's one issue that perhaps needs discussing. A mutex is identified by a handle, which on Windows is actually a pointer to an opaque object (maintained by the kernel). And the pointer is the same between different processes? As such, using just 'int' for sync_handle is not wide enough, certainly not in 64-bit builds. Is it OK to use intptr_t instead? Doing this cleanly might require to have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor that is passed to a Posix fcntl; on Windows that macro will be a no-op, and on Posix platforms it's a simple cast. Is this OK? I can't follow you here. On POSIX, we don't need to pass a fd because it's always stdout/stderr. Or do mean something else? +/* Test whether a file contains any data. */ +static int +fd_not_empty (int fd) +{ + return fd = 0 lseek (fd, 0, SEEK_END) 0; +} Isn't this unnecessarily expensive (with large output volumes)? Why not use fstat? On POSIX both lseek and fstat are not very expensive. On Windows, you said fstat was very expensive, didn't you? Or is lseek even worse? How about we introduce functions called acquire_semaphore() and release_semaphore()? Oh, wait, we did. Sorry, but this is not funny. There are, indeed, such functions, but that's about all there is about modularity in the implementation. The rest is code scattered all over the place, that _knows_ all too well that it is dealing with file descriptors, not with some abstract semaphores or mutexes. Here's a good example: [...] There are 2 separate things intermixed here: determination of combined_output and determination of the resource to use as a mutex. They are mixed because they both deal with file descriptors, and it was just convenient, by sheer luck (or maybe because of something else) to save a few if's and do them together. I talked about this yesterday or so. I still don't see why you can't keep this code as it. You can use sync_handle as a flag to make sure the code is executed only once and ignore it for the locking purpose. Of course, you can rewrite the code to separate the two things, but I still don't see that it's necessary. The checks to be done (file_ok, is_same_file) are the same anyway, so I don't see it as a design flaw. Paul Smith wrote: +/* Test whether a file contains any data. */ +static int +fd_not_empty (int fd) +{ + return fd = 0 lseek (fd, 0, SEEK_END) 0; +} Isn't this unnecessarily expensive (with large output volumes)? Why not use fstat? This lseek() doesn't actually move the file reference: SEEK_END plus an offset of 0 is a no-op so it doesn't matter how large the file is. This is just seeing if the position has moved since we opened the file (still at 0 or not); it just returns the current position in the file, which is known to the system directly without having to go ask anyone (it has to be so,
Re: [bug #33138] .PARLLELSYNC enhancement with patch
I'm fully prepared to accept the blame for not doing the best job getting buy-in etc. on this effort. Can we leave the discussion on the process behind? I'd prefer that, unless there are real constructive comments on how to do better next time rather than rehashing what was done wrong. I think we have discussed that enough, and I'd prefer to avoid going further along that road. Thanks. On Wed, 2013-04-24 at 20:46 +0200, Frank Heckenbach wrote: That's true about SEEK_CUR which was there originally. I actually changed it to SEEK_END, which does move the position to the end. Oh right. My head cold is keeping me foggy. What was the reason to change to SEEK_END, again? I'm not so sure fstat() is that cheap. struct stat contains a lot of information. Although I guess since we are only ever talking about temp files, not NFS files or something, it's probably not too bad. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith psm...@gnu.org Cc: e...@gnu.org, bug-make@gnu.org Date: Wed, 24 Apr 2013 15:07:21 -0400 I'm not so sure fstat() is that cheap. struct stat contains a lot of information. Although I guess since we are only ever talking about temp files, not NFS files or something, it's probably not too bad. We could time it if we are afraid of the cost, but I'd be surprised if 'fstat' wasn't extremely fast on Posix platforms. Most of the information you get in struct stat is already available when the file is open. In particular, the OS tracks the file's size as it is being written. (On Windows, we will have to use a different method.) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 20:46:14 +0200 Cc: p...@mad-scientist.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de I don't see why it would terminate prematurely It was long ago, but I presume I thought about the ^Z character that some programs write or interpret to signal end of file. Writing that to the console stops output, AFAIR. Oh, that's still done? When I last used Dos in the 1990s it was already obsolete and few programs did it. OK, so you'll have to strip them. Fortunately, that's not too hard, I assume you have already implemented it. No need: if the file is read and written in binary mode, ^Z is just a character like any other. I just mentioned some caveats that people might find surprising, that's all. Perhaps those caveats should be mentioned in the documentation. No other intentions. OK, we can add a few sentences in the docs. Have you already written something (just to avoid further dupliation of work)? No. Go ahead, if you have time. Really, I just tried to be helpful. I realize, and I am grateful. The discussion was very helpful. There's one issue that perhaps needs discussing. A mutex is identified by a handle, which on Windows is actually a pointer to an opaque object (maintained by the kernel). And the pointer is the same between different processes? I arrange for it to be inherited, therefore yes. This way, I don't have to invent a unique name and don't risk a slim chance that some stale process left a mutex by the same name around. (Still need to make sure that nameless mutexes will do the job in this case, though.) As such, using just 'int' for sync_handle is not wide enough, certainly not in 64-bit builds. Is it OK to use intptr_t instead? Doing this cleanly might require to have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor that is passed to a Posix fcntl; on Windows that macro will be a no-op, and on Posix platforms it's a simple cast. Is this OK? I can't follow you here. On POSIX, we don't need to pass a fd because it's always stdout/stderr. Or do mean something else? I meant the file descriptor passed to fcntl to put a lock on it. The value of that descriptor is stored in sync_handle, whose type is 'int'. +/* Test whether a file contains any data. */ +static int +fd_not_empty (int fd) +{ + return fd = 0 lseek (fd, 0, SEEK_END) 0; +} Isn't this unnecessarily expensive (with large output volumes)? Why not use fstat? On POSIX both lseek and fstat are not very expensive. But fstat should be cheaper, as it already tracks the size of the file, which is what you want here. On Windows, you said fstat was very expensive, didn't you? Or is lseek even worse? I think anything that potentially moves the file pointer can be sometimes expensive and is best avoided. (On Windows, I'd use GetFileInformationByHandle.) There are 2 separate things intermixed here: determination of combined_output and determination of the resource to use as a mutex. They are mixed because they both deal with file descriptors, and it was just convenient, by sheer luck (or maybe because of something else) to save a few if's and do them together. I talked about this yesterday or so. I still don't see why you can't keep this code as it. Because the Windows implementation doesn't care whether stdout or stderr (or both) are OK. It doesn't use their file descriptor for locking, so it doesn't care. It just needs to know, for the purposes of combined_output, whether they point to the same file/device. But for determining what to use for sync_handle, this is irrelevant. So the logic looks different. You can use sync_handle as a flag to make sure the code is executed only once and ignore it for the locking purpose. I need to store the handle for the mutex somewhere, and sync_handle sounds just like the right place. Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? Anyway, we should time this instead of arguing. Or maybe we should abandon this optimization and take the lock regardless. How bad can that be? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, 2013-04-24 at 22:25 +0300, Eli Zaretskii wrote: From: Paul Smith psm...@gnu.org Cc: e...@gnu.org, bug-make@gnu.org Date: Wed, 24 Apr 2013 15:07:21 -0400 I'm not so sure fstat() is that cheap. struct stat contains a lot of information. Although I guess since we are only ever talking about temp files, not NFS files or something, it's probably not too bad. We could time it if we are afraid of the cost, but I'd be surprised if 'fstat' wasn't extremely fast on Posix platforms. Most of the information you get in struct stat is already available when the file is open. In particular, the OS tracks the file's size as it is being written. True. It's probably all right there and not measurably different if you have a file descriptor already. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, 2013-04-24 at 22:39 +0300, Eli Zaretskii wrote: Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? If fstat() can get the size from an internal structure then lseek() can do the same, then just update the file descriptor's position. I don't think there's more to it than setting that value, but it could be. Certainly at the filesystem layer we don't know, and we don't care, about things like whether the file is kept contiguously at the block layer. As you say, we should just measure. Or maybe we should abandon this optimization and take the lock regardless. How bad can that be? Well, we want to know if the file has any content anyway: for example we don't want to output the enter/leave notifications if there's nothing to print. So there's no extra cost to avoiding the lock here. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
why not use a named semaphore wherever possible (windows and linux) and lock a file where not instead of trying to pass kernel object handles around (seems a bit nasty to me)? On 24 April 2013 21:19, Paul Smith psm...@gnu.org wrote: On Wed, 2013-04-24 at 22:39 +0300, Eli Zaretskii wrote: Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? If fstat() can get the size from an internal structure then lseek() can do the same, then just update the file descriptor's position. I don't think there's more to it than setting that value, but it could be. Certainly at the filesystem layer we don't know, and we don't care, about things like whether the file is kept contiguously at the block layer. As you say, we should just measure. Or maybe we should abandon this optimization and take the lock regardless. How bad can that be? Well, we want to know if the file has any content anyway: for example we don't want to output the enter/leave notifications if there's nothing to print. So there's no extra cost to avoiding the lock here. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make -- You could help some brave and decent people to have access to uncensored news by making a donation at: http://www.thezimbabwean.co.uk/friends/ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, 2013-04-24 at 22:55 +0100, Tim Murphy wrote: why not use a named semaphore wherever possible (windows and linux) and lock a file where not instead of trying to pass kernel object handles around (seems a bit nasty to me)? Hi Tim; I think you're late to the party :-). Let me summarize a lot of discussion then we can use this as a reference for other questions. Named semaphores on POSIX are kind of sucky. They don't get automatically cleaned up when the process exits, for one thing. And my suspicion is that they're not very portable across different variations of UNIX especially older ones. File locking, on the other hand, is very portable, very simple, very fast, and any held locks are automatically released when the descriptor is closed. No muss, no fuss. If you agree file locking is the right way to go on a POSIX system, then we just have to decide what to lock. It can be anything as long as all children of a given top-level make can find it. We could use a temporary file, sure. If we did that we'd have to pass around either the name of the file or, more likely, use tmpfile() and send along the file descriptor, handle close-on-exec, etc. like we do with the jobserver pipe. Which we could certainly do: we do it today with jobserver. However we already have a descriptor available to us: stdout. If we use that we don't have to pass around anything because our children are already inheriting it and it's on a well-known FD. We don't have to worry about using + before sub-make rules like we do with jobserver. There's one other advantage of using stdout: if a sub-make redirects its output to a separate file, that magically starts a new lock domain and that sub-make and its children don't contend with the other sub-makes and their children. This is not a big deal and if we didn't get it for free, we'd never go to the effort of implementing it. But it's nice. Eli is working on the Windows port; I have no idea how he's decided to implement this there yet. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: On Wed, 2013-04-24 at 20:46 +0200, Frank Heckenbach wrote: That's true about SEEK_CUR which was there originally. I actually changed it to SEEK_END, which does move the position to the end. Oh right. My head cold is keeping me foggy. What was the reason to change to SEEK_END, again? lseek (SEEK_END, 0) seeks to the end and then returns the current position, i.e. the file size. lseek (SEEK_CUR, 0) stays at the current position and returns it. So if the fd (in the parent) remains at the start while the child wrote something, it would return 0, whereas the former wouldn't. AFAIK, lseek (SEEK_END, 0) is one of two canonical ways (besides fstat) to get the size of a file. Eli Zaretskii wrote: I just mentioned some caveats that people might find surprising, that's all. Perhaps those caveats should be mentioned in the documentation. No other intentions. OK, we can add a few sentences in the docs. Have you already written something (just to avoid further dupliation of work)? No. Go ahead, if you have time. Attached. As such, using just 'int' for sync_handle is not wide enough, certainly not in 64-bit builds. Is it OK to use intptr_t instead? Doing this cleanly might require to have a macro (I called it FD_FROM_HANDLE) to extract a file descriptor that is passed to a Posix fcntl; on Windows that macro will be a no-op, and on Posix platforms it's a simple cast. Is this OK? I can't follow you here. On POSIX, we don't need to pass a fd because it's always stdout/stderr. Or do mean something else? I meant the file descriptor passed to fcntl to put a lock on it. The value of that descriptor is stored in sync_handle, whose type is 'int'. So you want to overload sync_handle to contain an fd on POSIX and a pointer on Windows? I'm not sure I'd like this. Since it's used in separate code branches (ifdef'd or perhaps in different source files in compat) anyway, why not define separate variables of the appropriate types (preferably with different names, to reduce confusion)? On Windows, you said fstat was very expensive, didn't you? Or is lseek even worse? I think anything that potentially moves the file pointer can be sometimes expensive and is best avoided. (On Windows, I'd use GetFileInformationByHandle.) OK, if that's so, do that. But I don't think that's true on POSIX. Nothing is actually read by lseek (and even if it were, it would only need to look at the first and last part of the file, not read all the content, if that was the worry). Are you sure? How can lseek jump to the last byte of the file, if the file is not contiguous on disk, except by reading some of it? lseek doesn't need to read any data. It just sets the current offset of the FD to the given position, so the next read (which in this case never happens before seeking to the beginning) knows where to read. Even in the case of SEEK_END, all it has to do is add the given offset (here: 0) to the current file size. Instead of testing, I just looked at the implementation (Linux 3.2.2). The following is really the whole relevant code. As you see, nothing's read from the disk, it only handles in-memory data. (Also the file size is in memory for open files; even it were not, it would be a constant-time access to the inode and wouldn't need to touch any data blocks.) loff_t generic_file_llseek_size(struct file *file, loff_t offset, int origin, loff_t maxsize) { struct inode *inode = file-f_mapping-host; switch (origin) { case SEEK_END: offset += i_size_read(inode); break; // skipped other cases } return lseek_execute(file, inode, offset, maxsize); } static loff_t lseek_execute(struct file *file, struct inode *inode, loff_t offset, loff_t maxsize) { if (offset 0 !unsigned_offsets(file)) return -EINVAL; if (offset maxsize) return -EINVAL; if (offset != file-f_pos) { file-f_pos = offset; file-f_version = 0; } return offset; } static inline int unsigned_offsets(struct file *file) { return file-f_mode FMODE_UNSIGNED_OFFSET; } static inline loff_t i_size_read(const struct inode *inode) { // skipped code to deal with preemption and synchronized access in SMP return inode-i_size; } --- doc/make.texi.orig 2013-04-24 16:30:05.0 +0200 +++ doc/make.texi 2013-04-25 02:08:58.0 +0200 @@ -8639,6 +8639,17 @@ complex parallel build in the background and checking its output afterwards. +Note that with this option, each job's output is redirected to a +temporary file first. Some commands can behave differently depending +on the type of their standard output or standard error. E.g., +@code{ls --color=tty} might display a colored directory listing when +standard output is connected to a terminal.
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith p...@mad-scientist.net Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org Date: Wed, 24 Apr 2013 16:03:56 -0400 Or maybe we should abandon this optimization and take the lock regardless. How bad can that be? Well, we want to know if the file has any content anyway: for example we don't want to output the enter/leave notifications if there's nothing to print. Why not? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: From: Paul Smith p...@mad-scientist.net Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org Date: Wed, 24 Apr 2013 16:03:56 -0400 Or maybe we should abandon this optimization and take the lock regardless. How bad can that be? Well, we want to know if the file has any content anyway: for example we don't want to output the enter/leave notifications if there's nothing to print. Why not? To avoid clutter. There are probably still too many of these messages anyway, as discussed before. Note, we still give the regular enter/leave messages as we do without output-sync, so no information is lost. We just avoid additional enter/leave messages with nothing in between. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Wed, 24 Apr 2013 22:55:59 +0100 From: Tim Murphy tnmur...@gmail.com Cc: bug-make@gnu.org bug-make@gnu.org trying to pass kernel object handles around (seems a bit nasty to me)? Why is that nasty? The handle is returned by a documented interface, and communicating it to another process is an officially documented technique. How is that different from passing file descriptors, which are indices into a kernel-maintained array? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 25 Apr 2013 02:16:33 +0200 Cc: e...@gnu.org, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de I can't follow you here. On POSIX, we don't need to pass a fd because it's always stdout/stderr. Or do mean something else? I meant the file descriptor passed to fcntl to put a lock on it. The value of that descriptor is stored in sync_handle, whose type is 'int'. So you want to overload sync_handle to contain an fd on POSIX and a pointer on Windows? I don't see it as an overloading. 'sync_handle' is already opaque, even its name says so. It just so happens that on Unix it is actually a file descriptor. And it just so happens that on Windows it will be a handle to a mutex. I'm not sure I'd like this. Since it's used in separate code branches (ifdef'd or perhaps in different source files in compat) anyway, why not define separate variables of the appropriate types (preferably with different names, to reduce confusion)? I don't see any confusion (how probable is it that someone who is interested in Unix will ever look at the details of the Windows implementation, and vice versa?). And we never do that in Make, we generally try to use the same variables. If nothing else, it keeps the number of ifdef's down. --- doc/make.texi.orig2013-04-24 16:30:05.0 +0200 +++ doc/make.texi 2013-04-25 02:08:58.0 +0200 @@ -8639,6 +8639,17 @@ complex parallel build in the background and checking its output afterwards. +Note that with this option, each job's output is redirected to a +temporary file first. Some commands can behave differently depending +on the type of their standard output or standard error. E.g., +@code{ls --color=tty} might display a colored directory listing when +standard output is connected to a terminal. When using this option, +colors would be disabled because the output of the command goes to a +file. Note, however, that it is generally best to avoid such +commands in makefiles, independent of this option, since the +different behavior would also be triggered when users redirect the +whole output of @code{make}, e.g. to a log file. Thanks. I think you need two spaces between sentences (that's GNU Coding Standards' requirement), and command lines people type should be in @kbd, not @code. Also, e.g. needs either a comma or a @: after it, to signal to TeX it's not an end of a sentence. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: Date: Thu, 25 Apr 2013 02:16:33 +0200 Cc: e...@gnu.org, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de I can't follow you here. On POSIX, we don't need to pass a fd because it's always stdout/stderr. Or do mean something else? I meant the file descriptor passed to fcntl to put a lock on it. The value of that descriptor is stored in sync_handle, whose type is 'int'. So you want to overload sync_handle to contain an fd on POSIX and a pointer on Windows? I don't see it as an overloading. 'sync_handle' is already opaque, even its name says so. Alright, in this case I'd suggest renaming it to sync_fd on the POSIX side. It just so happens that on Unix it is actually a file descriptor. And it just so happens that on Windows it will be a handle to a mutex. I'm not sure I'd like this. Since it's used in separate code branches (ifdef'd or perhaps in different source files in compat) anyway, why not define separate variables of the appropriate types (preferably with different names, to reduce confusion)? I don't see any confusion (how probable is it that someone who is interested in Unix will ever look at the details of the Windows implementation, and vice versa?). I often grep through sources in order to quickly find all places a declaration is used. And we never do that in Make, we generally try to use the same variables. If nothing else, it keeps the number of ifdef's down. My suggestion was under the assumption that we use different declarations because of different types anyway -- which I still recommend, though Paul might have to decide this one. If I understand it correctly, roughly we'd have now: intptr_t sync_handle; [...] #ifdef POSIX sync_handle = (intptr_t) fileno (stdout); #else sync_handle = (intptr_t) get_mutex_handle (...); #endif [...] #ifdef POSIX fcntl ((int) sync_handle, ...); #else lock_mutex ((mutex *) sync_handle); #endif I.e., everything about it is separate, apart from the declaration, and the latter uses a type that isn't quite right for either one. Though I generally like reducing the number of ifdefs, in this case it seems more consistent to split the declaration, so it's clear that these are really two different things. As a side benefit, if in the future someone tried to use it in common code, they would get compile-time errors on the other platform, rather than silently producing wrong code (like applying some fd operation to a mutex handle). Thanks. I think you need two spaces between sentences (that's GNU Coding Standards' requirement), and command lines people type should be in @kbd, not @code. Also, e.g. needs either a comma or a @: after it, to signal to TeX it's not an end of a sentence. Changed. --- doc/make.texi.orig 2013-04-24 16:30:05.0 +0200 +++ doc/make.texi 2013-04-25 02:08:58.0 +0200 @@ -8639,6 +8639,17 @@ complex parallel build in the background and checking its output afterwards. +Note that with this option, each job's output is redirected to a +temporary file first. Some commands can behave differently depending +on the type of their standard output or standard error. E.g., +@kbd{ls --color=tty} might display a colored directory listing when +standard output is connected to a terminal. When using this option, +colors would be disabled because the output of the command goes to a +file. Note, however, that it is generally best to avoid such +commands in makefiles, independent of this option, since the +different behavior would also be triggered when users redirect the +whole output of @code{make}, e.g.@: to a log file. + @item -q @cindex @code{-q} @itemx --question ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Fri, 19 Apr 2013 11:54:05 +0200 Cc: bo...@kolpackov.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Eli Zaretskii wrote: Initial investigation indicates that tmpfile should do the job just fine: the file is deleted only when the last descriptor for it is closed. That includes any duplicated descriptors. Great. As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated. In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the job. I will need to see how it works in reality, though. OK. Upon a second look, I'm not sure I understand how this feature works, exactly, and why you-all thought making it work on Windows is a matter of a few functions. I sincerely hope I'm missing something, please bear with me. First, most of the meat of OUTPUT_SYNC code, which sets up the stage when running child jobs, is in a branch that isn't compiled on Windows (#if !defined(__MSDOS__) !defined(_AMIGA) !defined(WINDOWS32) on line 1482 of job.c). So currently that part is not even run on Windows. Please tell me that nothing in this feature relies on 'fork', with its copying of handles and other data structures. Because if it does, we have no hope of making it work on Windows, at least not using the same algorithms as on Unix. More importantly, how exactly locking the (redirected) stdout/stderr of the child is supposed to cause synchronization, and why do we need it at all? Isn't synchronization already achieved by redirecting child's output to a file, and only dumping it to screen when the child exits? What does lock add to this? Who else will be writing what to where, that we want to prevent by holding the lock/semaphore? In an old thread, Paul explained something similar: David, can you explain why you needed to lock the files? Also, what region(s) of the file you are locking? fcntl with F_WRLCK won't work on Windows, so the question is how to emulate it. David wants to interlock between ALL instances of make printing output, so that even during recursive makes no matter how many you have running concurrently, only one will print its output at a time. There is no specific region of the file that's locked: the lockfile is basically a file-based, system-wide semaphore. The entire file is locked; it's empty and has no content. Assuming this all is still basically true, I guess I still don't understand what exactly is being locked and why. E.g., why do we only want to interlock instances of Make, but not the programs they run? Also, acquire_semaphore is used only in sync_output, which is called only when a child exits. IOW, nothing is locked while the child runs, only when its output is ready. In addition, we are locking stdout. But doesn't each instance of Make have, or can have, its own stdout? If so, how will the interlock work? What am I missing? Probably a lot. TIA ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Since you asked basic questions I'm going to start this at a basic level. Apologies if it covers some stuff you already know or if I misinterpreted the questions. Note that I haven't actually looked at the patch that went in so this is generally wrt the original. The first thing is get the word lock out of your mind because we aren't really locking anything. Yes, that API is in use but it's only to create a semaphore or baton. Nobody is ever prevented from doing anything. It just happens that on Unix the most portable (i.e. oldest) way of implementing a semaphore is with the advisory locking API. All cooperating processes agree not to proceed unless and until they are able to acquire the exclusive lock on a shared file descriptor, but it's not necessary to ever actually write anything to that descriptor. Second, the original implementation (not sure if I ever sent that one one in though) actually created a temp file to use as the semaphore fd. But then I discovered that stdout can be locked in the same way, which is simpler. But applying the lock to stdout is just a frill; it could be a temp file, especially if some platform turned out to need it that way. I just figured that stdout is always available, or at least if it's closed you don't have to worry about synchronizing output. Third, yes, nothing is locked while the child runs. If a shared resource was locked during child runs it would have the effect of re-serializing the build as each supposedly parallel child waited on the lock. So what happens here is really very simple: each child (aka recipe) runs asynchronously, assuming -j of course, and dumps its output to one or two temp files. Only when the child has finished and wants to report results does it enter the queue waiting for the baton. When it gets it, it holds it just long enough to copy its output from the temp files to stdout/stderr and then lets the next guy have his turn. Thus, assuming the average job runs for a significant amount of time (multiples of a write() system call anyway) there will not be much contention on the semaphore and it won't be a bottleneck. You're right that simply writing to temp files and dumping everything at once when the job finished would be likely to reduce the incidence of garbling even without the semaphore, but not to zero. It may be that the locking of stdout is only useful on Unix due to the fact that it's inherited into child processes. I don't know what Paul or Frank is thinking, and as mentioned I haven't looked at the current version, but my thinking originally was that Windows could easily handle this using its own far richer set of semaphore/locking APIs. I'd actually expect this to be easier and more natural on Windows than Unix. All that's required is to choose a semaphore to synchronize on, dump output to temp files, and copy it to stdout/stderr only after acquiring the semaphore. And remove the temp files of course. -David Boyce On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote: Date: Fri, 19 Apr 2013 11:54:05 +0200 Cc: bo...@kolpackov.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Eli Zaretskii wrote: Initial investigation indicates that tmpfile should do the job just fine: the file is deleted only when the last descriptor for it is closed. That includes any duplicated descriptors. Great. As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated. In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the job. I will need to see how it works in reality, though. OK. Upon a second look, I'm not sure I understand how this feature works, exactly, and why you-all thought making it work on Windows is a matter of a few functions. I sincerely hope I'm missing something, please bear with me. First, most of the meat of OUTPUT_SYNC code, which sets up the stage when running child jobs, is in a branch that isn't compiled on Windows (#if !defined(__MSDOS__) !defined(_AMIGA) !defined(WINDOWS32) on line 1482 of job.c). So currently that part is not even run on Windows. Please tell me that nothing in this feature relies on 'fork', with its copying of handles and other data structures. Because if it does, we have no hope of making it work on Windows, at least not using the same algorithms as on Unix. More importantly, how exactly locking the (redirected) stdout/stderr of the child is supposed to cause synchronization, and why do we need it at all? Isn't synchronization already achieved by redirecting child's output to a file, and only dumping it to screen when the child exits? What does lock add to this? Who else will be writing what to where, that we want to prevent by holding the lock/semaphore? In an old thread, Paul explained something similar: David, can you explain why you needed to lock the files? Also, what region(s) of the file you
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Tue, 23 Apr 2013 11:29:35 -0700 From: David Boyce david.s.bo...@gmail.com Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make bug-make@gnu.org Since you asked basic questions I'm going to start this at a basic level. Apologies if it covers some stuff you already know or if I misinterpreted the questions. Note that I haven't actually looked at the patch that went in so this is generally wrt the original. [...] Thanks, I will dwell on this. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Tue, 2013-04-23 at 21:40 +0300, Eli Zaretskii wrote: Date: Tue, 23 Apr 2013 11:29:35 -0700 From: David Boyce david.s.bo...@gmail.com Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make bug-make@gnu.org Since you asked basic questions I'm going to start this at a basic level. Apologies if it covers some stuff you already know or if I misinterpreted the questions. Note that I haven't actually looked at the patch that went in so this is generally wrt the original. [...] Thanks, I will dwell on this. When thinking about this, remember that it's not enough to consider how a single make invocation will work. If you run with a single make instance under -j, then redirecting each job's output to a temp file and then when make reaps each job, copying the contents of that temp file to stdout, is a sufficient solution. You just need to be able to redirect stdout/stderr of a given job to temporary files. In UNIX of course this is done by dup()'ing the file descriptors after the fork and before the exec. Presumably on Windows there's some other way to do this. However, in a situation where we have recursive make instances all running at the same time under -j then each one of those make instances is also running one or more jobs in parallel. In this case it's not good enough for each make to synchronize its own jobs' output. So in addition to the temp file change above, you ALSO need a way to synchronize the use of the single resource (stdout) that is being shared by all instances of recursive make. On UNIX we have chosen to use an advisory lock on the stdout file descriptor: it's handy, and it's the resource being contended for, so it makes sense. You asked, what if someone redirected the stdout of a sub-make. In that case things still work: that sub-make will not participate in the same locking as the other sub-makes, it's true, but that's OK because the output is going to a different location from the other sub-makes so there's no need to synchronize them. Meanwhile any sub-sub-makes using the same output file will still be synchronous with each other. I'm not sure if the lock locks the FD (so that if you dup'd the FD but it still pointed to the same output, you could take exclusive locks on both), or if it locks the underlying resource. If the former I guess it's possible to break the synchrony if you worked at it. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith psm...@gnu.org Cc: David Boyce david.s.bo...@gmail.com, f.heckenb...@fh-soft.de, bug-make@gnu.org Date: Tue, 23 Apr 2013 15:04:39 -0400 When thinking about this, remember that it's not enough to consider how a single make invocation will work. If you run with a single make instance under -j, then redirecting each job's output to a temp file and then when make reaps each job, copying the contents of that temp file to stdout, is a sufficient solution. You just need to be able to redirect stdout/stderr of a given job to temporary files. In UNIX of course this is done by dup()'ing the file descriptors after the fork and before the exec. Presumably on Windows there's some other way to do this. It's the same way, actually (modulo the fork/exec dance). So in addition to the temp file change above, you ALSO need a way to synchronize the use of the single resource (stdout) that is being shared by all instances of recursive make. On UNIX we have chosen to use an advisory lock on the stdout file descriptor: it's handy, and it's the resource being contended for, so it makes sense. I still don't know how does Make achieve on Unix the interlocking with its sub-Make's. Is it because the lock is inherited as part of fork? If so, we will need a special command-line argument on Windows to pass the name of the mutex, or the value of its handle, down the sub-make chain, because even if the handle is inherited, its value needs to be communicated. (I wish the design and implementation of this feature were less Posix-centric...) I'm not sure if the lock locks the FD (so that if you dup'd the FD but it still pointed to the same output, you could take exclusive locks on both), or if it locks the underlying resource. If the former I guess it's possible to break the synchrony if you worked at it. Sorry, you lost me here. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Tue, Apr 23, 2013 at 12:04 PM, Paul Smith psm...@gnu.org wrote: I'm not sure if the lock locks the FD (so that if you dup'd the FD but it still pointed to the same output, you could take exclusive locks on both), or if it locks the underlying resource. I'm pretty sure it's the underlying resource that gets locked. The man page is not quite explicit but implicitly fairly clear. The opening paragraph says: Apply, test or remove a POSIX lock on a section of an open file. The file is specified by fd, a file descriptor open for writing David ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
(TL;DR: Probably irrelevant.) Paul Smith wrote: I'm not sure if the lock locks the FD (so that if you dup'd the FD but it still pointed to the same output, you could take exclusive locks on both), or if it locks the underlying resource. If the former I guess it's possible to break the synchrony if you worked at it. The Linux manpage says: : As well as being removed by an explicit F_UNLCK, record locks are auto- : matically released when the process terminates or if it closes any file : descriptor referring to a file on which locks are held. This is bad: : it means that a process can lose the locks on a file like /etc/passwd : or /etc/mtab when for some reason a library function decides to open, : read and close it. : : Record locks are not inherited by a child created via fork(2), but are : preserved across an execve(2). A quick test indicates that the lock indeed refers to the resource, but several locks to the same resource are merged by the system (rather than deadlocking itself): #include fcntl.h int main () { int a = open (/dev/null, O_RDWR); int b = open (/dev/null, O_RDWR); // or b = dup (a) struct flock fl; fl.l_type = F_WRLCK; fl.l_whence = SEEK_SET; fl.l_start = 0; fl.l_len = 1; if (fcntl (a, F_SETLKW, fl)) perror (a); if (fcntl (b, F_SETLKW, fl)) perror (b); return 0; } Though I don't think any of this really matters, at least with our current implementation, since we don't dup the FD locked or fork or exec during this time. And it's possible break the synchrony if you worked at it since our locking is only advisory, anyway. (I.e., within make; not so easy in the jobs run since they don't see the FD used for locking, i.e. the original stdout.) PS: Paul, the following comment is obsolete; I forgot to remove it: fl.l_start = 0; /* lock just one byte according to pid */ ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
David Boyce wrote: The first thing is get the word lock out of your mind because we aren't really locking anything. Yes, that API is in use but it's only to create a semaphore or baton. Nobody is ever prevented from doing anything. It just happens that on Unix the most portable (i.e. oldest) way of implementing a semaphore is with the advisory locking API. All cooperating processes agree not to proceed unless and until they are able to acquire the exclusive lock on a shared file descriptor, but it's not necessary to ever actually write anything to that descriptor. Just to clarify: It's not necessary to write anything to that descriptor in order for the locking to work. We do actually write to the descriptor when we have the lock, but that's just an implementation detail, i.e. the lock could be something else, as long as the different make instances agree not to write to the descriptor without holding the lock. You're right that simply writing to temp files and dumping everything at once when the job finished would be likely to reduce the incidence of garbling even without the semaphore, but not to zero. It may be that the locking of stdout is only useful on Unix due to the fact that it's inherited into child processes. I don't know what Paul or Frank is thinking, and as mentioned I haven't looked at the current version, but my thinking originally was that Windows could easily handle this using its own far richer set of semaphore/locking APIs. I'd actually expect this to be easier and more natural on Windows than Unix. All that's required is to choose a semaphore to synchronize on, dump output to temp files, and copy it to stdout/stderr only after acquiring the semaphore. And remove the temp files of course. Yes, as I wrote in another mail, even a completely global semaphore should do. Sure, it's excluding much more than necessary, but since the critical section is very short, this shouldn't hurt much. (In other words, if make jobs produce such huge output that copying it around takes a significant amount of time, nobody's ever going to read that output anyway, or someone is post-processing it which will take much longer than copying it, anyway.) Indeed, as David wrote, under Unix, locking stdout/stderr is most straightforward because it's available without further setup. Incidentally, this way of locking is also as fine-grained as possible (considering recursive makes). However, as I wrote, this fine-grained locking is not really necessary, so it's not worth worrying about replicating it on Windows if this causes trouble. On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote: Please tell me that nothing in this feature relies on 'fork', with its copying of handles and other data structures. All it requires is inheriting the redirected stdout/stderr to child processes. This was already possible under Dos (with the exception that since there was no fork, you had to redirect in the parent process, call the child, then undirect in the parent, IIRC). It's just like shell redirections, i.e. if you do foo bar baz, the stdout of foo and all processes called by foo is redirected to a file called bar, while baz and the rest of the shell continue running with their original stdout. If that's the same under Windows (at least using bash or a similar shell; no idea what Windows's own shell does), there should be no problem (or the bash sources should reveal what needs to be done). Perhaps you know all of this already and perhaps it's trivial, or perhaps it's impossible ... (I really don't know how different things are under Windows.) In an old thread, Paul explained something similar: David, can you explain why you needed to lock the files? Also, what region(s) of the file you are locking? fcntl with F_WRLCK won't work on Windows, so the question is how to emulate it. David wants to interlock between ALL instances of make printing output, so that even during recursive makes no matter how many you have running concurrently, only one will print its output at a time. There is no specific region of the file that's locked: the lockfile is basically a file-based, system-wide semaphore. The entire file is locked; it's empty and has no content. Assuming this all is still basically true, It's almost still true. As written above, we now don't use an extra, empty, file for locking, but stdout itself. Otherwise it's still so, we don't lock a particular region of the file, but use it as a semaphore. I guess I still don't understand what exactly is being locked and why. E.g., why do we only want to interlock instances of Make, but not the programs they run? Also, acquire_semaphore is used only in sync_output, which is called only when a child exits. IOW, nothing is locked while the child runs, only when its output is ready. As David wrote, that's necessary to preserve parallelism. In
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Tue, 23 Apr 2013 21:41:22 +0200 From: Frank Heckenbach f.heckenb...@fh-soft.de Yes, as I wrote in another mail, even a completely global semaphore should do. Not healthy, IMHO. Some snafu could inadvertently and completely silently stop an unrelated build somewhere on the same system. Sure, it's excluding much more than necessary, but since the critical section is very short, this shouldn't hurt much. (In other words, if make jobs produce such huge output that copying it around takes a significant amount of time, nobody's ever going to read that output anyway, or someone is post-processing it which will take much longer than copying it, anyway.) It doesn't matter whether someone reads the output or not, that output gets written and takes its time. We sometimes invoke tools with options we later regret, because we don't anticipate the results. Indeed, as David wrote, under Unix, locking stdout/stderr is most straightforward because it's available without further setup. Incidentally, this way of locking is also as fine-grained as possible (considering recursive makes). However, as I wrote, this fine-grained locking is not really necessary, so it's not worth worrying about replicating it on Windows if this causes trouble. It is not easy to know what needs to be replicated and what doesn't, because the fine print of the implementation is not clear, at least to me, and therefore it could easily happen that non-replication of some seemingly secondary detail completely breaks the design. After all, the basic models of Unix and Windows regarding processes and signals are very different. On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote: Please tell me that nothing in this feature relies on 'fork', with its copying of handles and other data structures. All it requires is inheriting the redirected stdout/stderr to child processes. This was already possible under Dos (with the exception that since there was no fork, you had to redirect in the parent process, call the child, then undirect in the parent, IIRC). Inheritance is not the problem; _finding_ the inherited object or its handle is. With stdout, its name is global, but not so with a handle to some mutex I can create or its name (unless we indeed make the name fixed and thus system-global, which I don't like, if only because it's not how Make works on Unix). It's just like shell redirections, i.e. if you do foo bar baz, the stdout of foo and all processes called by foo is redirected to a file called bar, while baz and the rest of the shell continue running with their original stdout. If that's the same under Windows (at least using bash or a similar shell; no idea what Windows's own shell does), there should be no problem (or the bash sources should reveal what needs to be done). Perhaps you know all of this already and perhaps it's trivial, or perhaps it's impossible ... (I really don't know how different things are under Windows.) They aren't different, and all this is pretty basic. It's not what worries me. They can have their own stdout, in particular with the --output-sync=make option. But that's actually the harmless case: Each sub-make runs with its stdout already redirected to a temp file by the main make. In turn, it redirects the stdout of its children to separate temp files, and when they are done, collects the data to its stdout, i.e. the temp file from the main make. When the sub make is finished, the main make collects its output to the original stdout. So unless I'm mistaken, no locking is actually required in this case. But we still acquire the semaphore in this case, don't we? Perhaps we shouldn't. I still don't know how does Make achieve on Unix the interlocking with its sub-Make's. Is it because the lock is inherited as part of fork? The fd is inherited as part of fork. Each make that needs to takes a lock on the fd. On the same fd, or, rather, on an fd connected to the same resource. That's what I thought (it isn't spelled out anywhere in the comments, and the code that implements that is quite scattered). I need to check whether LockFileEx works on handles connected to the console; somehow I'm not sure. And LockFileEx is not ideal because it doesn't work on Windows 9X. So maybe I will have to use a mutex after all. (I wish the design and implementation of this feature were less Posix-centric...) The implementation my be (though it's only two functions, acquire_semaphore(), release_semaphore()) that can be completely replaced (note, again, the fact that the stdout or stderr fd also serves for locking, is just an implementation detail and not central to the design). I beg to differ. The implementation relies heavily on the fact that the semaphore's name or descriptor is globally known to every program in the chain. Writing a replacement for F_SETLKW using LockFileEx took me no more than 5 minutes; it is only when I
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Tue, 2013-04-23 at 23:16 +0300, Eli Zaretskii wrote: All it requires is inheriting the redirected stdout/stderr to child processes. This was already possible under Dos (with the exception that since there was no fork, you had to redirect in the parent process, call the child, then undirect in the parent, IIRC). Inheritance is not the problem; _finding_ the inherited object or its handle is. With stdout, its name is global, but not so with a handle to some mutex I can create or its name (unless we indeed make the name fixed and thus system-global, which I don't like, if only because it's not how Make works on Unix). You're right: we cannot use a fixed name. That would allow only one (top-level) invocation of make per system to use -O. That's not acceptable. They can have their own stdout, in particular with the --output-sync=make option. But that's actually the harmless case: Each sub-make runs with its stdout already redirected to a temp file by the main make. In turn, it redirects the stdout of its children to separate temp files, and when they are done, collects the data to its stdout, i.e. the temp file from the main make. When the sub make is finished, the main make collects its output to the original stdout. So unless I'm mistaken, no locking is actually required in this case. But we still acquire the semaphore in this case, don't we? Perhaps we shouldn't. We do acquire it but it's on a different resource. We still do need the semaphore because we want the output directed to the other file to be synchronized as well, at least within itself. If you run make -O and your output to stdout is synchronized by you run make -O out and the contents of the out file are not synchronized, that would be bogus. However, I'm not worried about losing this ability to change the lock domain for sub-makes based on redirection. If all the sub-makes were synchronized together even though some redirected to a different file that is fine. First, I sincerely doubt there are many makefiles which run sub-makes with IO redirected like that. And second, the slight loss in parallelization is not worth a huge increase in complexity. It's a nice thing that we get it for free with stdout but I wouldn't expend days trying to implement it if we didn't... in fact without stdout I just don't see that it's even possible. I guess if the child could somehow detect that it was using a different stdout it could be done, but that would require the parent passing information about stdout to the child... ugh. Not worth it. IOW, the top-level design is indeed quite general, but the low-level algorithmic details are not, and therefore just replacing these functions will not necessarily do the job. Right. The locking is the part that's not portable, and we need a lock that can be shared between all the make instances for one top-level make, but not shared with another top-level make. Without knowing what kind of resource Windows can take locks on, we can't really know how to help with that. There must be precedent for this kind of shared between a subset of processes lock on Windows. What handle do we need to know, and how can that handle be communicated? We can pass a filename through the environment, we can add a command-line flag to the sub-makes, we can do whatever the equivalent of inheriting an open file descriptor is on Windows... If we really want to make this reasonably portable (and without that, I cannot see how Paul's dream about less ifdef's is going to materialize), this code needs IMO to be refactored to make the algorithm know less about the implementation details. Personally I've never had any luck trying to create portable code from scratch, at least not unless I'm very familiar with all the platforms which is certainly not the case here. Once we see the second implementation it will be a lot more obvious what needs to be done to make this more portable. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: Date: Tue, 23 Apr 2013 21:41:22 +0200 From: Frank Heckenbach f.heckenb...@fh-soft.de Sure, it's excluding much more than necessary, but since the critical section is very short, this shouldn't hurt much. (In other words, if make jobs produce such huge output that copying it around takes a significant amount of time, nobody's ever going to read that output anyway, or someone is post-processing it which will take much longer than copying it, anyway.) It doesn't matter whether someone reads the output or not, that output gets written and takes its time. We sometimes invoke tools with options we later regret, because we don't anticipate the results. I was just saying we don't need to optimize for such a strange case. I'm not saying we should fail in such a case, just that it may take a little longer if locking is too coarse. Producing the output by the jobs takes some time anyway, and if the additional delay caused by locking is too much, they just shouldn't use this option. It's an option after all. Indeed, as David wrote, under Unix, locking stdout/stderr is most straightforward because it's available without further setup. Incidentally, this way of locking is also as fine-grained as possible (considering recursive makes). However, as I wrote, this fine-grained locking is not really necessary, so it's not worth worrying about replicating it on Windows if this causes trouble. It is not easy to know what needs to be replicated and what doesn't, because the fine print of the implementation is not clear, at least to me, and therefore it could easily happen that non-replication of some seemingly secondary detail completely breaks the design. I think it was discussed extensively in the original discussion (in which you also participated) and here. If anything is still unclear, feel free to ask. But just to repeat: The only thing really necessary is that two makes with the same stdout/stderr never run sync_output() simultaneously. If *any* two sub-makes never run sync_output() simultaneously, that is also fine. On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii e...@gnu.org wrote: Please tell me that nothing in this feature relies on 'fork', with its copying of handles and other data structures. All it requires is inheriting the redirected stdout/stderr to child processes. This was already possible under Dos (with the exception that since there was no fork, you had to redirect in the parent process, call the child, then undirect in the parent, IIRC). Inheritance is not the problem; _finding_ the inherited object or its handle is. With stdout, its name is global, but not so with a handle to some mutex I can create or its name (unless we indeed make the name fixed and thus system-global, which I don't like, if only because it's not how Make works on Unix). Now we're talking about different things. I was replying to your question what it requires about fork etc. (answer: just stdout/stderr redirection like in the shell). You're worried about how to pass a reference to a mutex around. Paul just replied to that. They can have their own stdout, in particular with the --output-sync=make option. But that's actually the harmless case: Each sub-make runs with its stdout already redirected to a temp file by the main make. In turn, it redirects the stdout of its children to separate temp files, and when they are done, collects the data to its stdout, i.e. the temp file from the main make. When the sub make is finished, the main make collects its output to the original stdout. So unless I'm mistaken, no locking is actually required in this case. But we still acquire the semaphore in this case, don't we? Perhaps we shouldn't. It wouldn't really make things easier, just add another conditional which needs to be tested etc. I still don't know how does Make achieve on Unix the interlocking with its sub-Make's. Is it because the lock is inherited as part of fork? The fd is inherited as part of fork. Each make that needs to takes a lock on the fd. On the same fd, or, rather, on an fd connected to the same resource. That's what I thought (it isn't spelled out anywhere in the comments, and the code that implements that is quite scattered). It's the same fd (namely, stdout or stderr) via inheritance. Of course, you can say it's not the same fd, but two different fd's (i.e., stdout of different processes) referring to the same resource, but that's splitting hairs (and, again, probably irrelevant since you're going to implement it differently anyway). (I wish the design and implementation of this feature were less Posix-centric...) The implementation my be (though it's only two functions, acquire_semaphore(), release_semaphore()) that can be completely replaced (note, again, the fact that the stdout or stderr fd also serves for locking, is just an implementation detail and
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith p...@mad-scientist.net Cc: Frank Heckenbach f.heckenb...@fh-soft.de, bug-make@gnu.org Date: Tue, 23 Apr 2013 16:54:33 -0400 Without knowing what kind of resource Windows can take locks on, we can't really know how to help with that. The only resources that don't need their names passed to sub-Make are the standard streams. I will see if locking works on console handles; if not, I will have to introduce a command-line argument for passing the name or the handle of a mutex to a sub-Make. we can do whatever the equivalent of inheriting an open file descriptor is on Windows... Doesn't help unless it's a file descriptor of one of the 3 standard streams. Any other descriptor needs to be passed on the command line, for the same reasons we have --job-server-fds option. If we really want to make this reasonably portable (and without that, I cannot see how Paul's dream about less ifdef's is going to materialize), this code needs IMO to be refactored to make the algorithm know less about the implementation details. Personally I've never had any luck trying to create portable code from scratch, at least not unless I'm very familiar with all the platforms which is certainly not the case here. Once we see the second implementation it will be a lot more obvious what needs to be done to make this more portable. That's one way. Another one is to discuss the design more thoroughly before the patches are accepted. I tried to turn the discussion that way when the issue was first brought up, but my comments were largely ignored (if I compare what was suggested then with what was eventually committed), and most of the discussions were about the details of the Unix implementation anyway. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Fri, 2013-04-19 at 14:09 +0300, Eli Zaretskii wrote: Date: Fri, 19 Apr 2013 11:54:05 +0200 Cc: bo...@kolpackov.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Is there a simple enough Makefile somewhere that could be used to test this feature, once implemented? We have a test in the test suite (output-sync). Can you use that? I hoped for something simpler and not involving Perl or Unixy shell features, because I'd like to use this in the native Windows environment, where the Windows port of Make runs. However, if that's the best possibility, I guess I'd craft something based on that test. The basic feature can be tested trivially like this: all: one two one two: @echo start $@ @sleep 1 @echo stop $@ Now if you run this using make -j you'll get: start one start two stop one stop two If you run this using make -j -O you should get: start one stop one start two stop two There's more to test than that: before it's done we need to test recursive make invocations for example. But the above is simple. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Fri, 2013-04-19 at 12:36 +0300, Eli Zaretskii wrote: Also, where is the best place to put the emulated Posix functions? Some new file in w32/compat/? I'd like to see it there. I'm thinking I want to move the new stuff out of job.c even for POSIX systems. The ifdefs are really getting to me. I was thinking of creating a posix.c file for example. The problem is there's a lot of variation even in POSIX, and there's a lot of POSIX-y stuff we actually do use in Windows too. We might need to break it down further than just posix.c. For this feature I probably won't have a clear feel for how to split it until I see what the Windows version looks like. Anyway I'll probably wait until after the next release to do major renovations. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Fri, 19 Apr 2013 00:06:52 +0300 From: Eli Zaretskii e...@gnu.org Cc: bo...@kolpackov.net, bug-make@gnu.org Indeed, as you suggested earlier, it might be useful to use the main part of open_tmpfile() (i.e. without the fdopen()), though we'd have to manually remove the file then. On Unix, we could unlink it right after opening (since we never need the filename again, unlike the other users of open_tmpfile()). On Windows, though, this might need to be delayed, AIUI. You can do that on Windows as well, just not if you use open/creat to create the file. You need to use the Windows file APIs with options that are not exposed to Posix-style functions like open. Can you get an fd from a file opened this way? Yes. If so, it should be alright, it could all be encapsulated in open_tmpfd(). I'm not sure it's worth the hassle. Using tmpfile is so much easier (if it works, which is to be determined). Initial investigation indicates that tmpfile should do the job just fine: the file is deleted only when the last descriptor for it is closed. That includes any duplicated descriptors. As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated. In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the job. I will need to see how it works in reality, though. Is there a simple enough Makefile somewhere that could be used to test this feature, once implemented? Also, where is the best place to put the emulated Posix functions? Some new file in w32/compat/? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: Initial investigation indicates that tmpfile should do the job just fine: the file is deleted only when the last descriptor for it is closed. That includes any duplicated descriptors. Great. As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated. In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the job. I will need to see how it works in reality, though. OK. Is there a simple enough Makefile somewhere that could be used to test this feature, once implemented? We have a test in the test suite (output-sync). Can you use that? ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Fri, 19 Apr 2013 11:54:05 +0200 Cc: bo...@kolpackov.net, bug-make@gnu.org From: Frank Heckenbach f.heckenb...@fh-soft.de Is there a simple enough Makefile somewhere that could be used to test this feature, once implemented? We have a test in the test suite (output-sync). Can you use that? I hoped for something simpler and not involving Perl or Unixy shell features, because I'd like to use this in the native Windows environment, where the Windows port of Make runs. However, if that's the best possibility, I guess I'd craft something based on that test. Thanks. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: I've applied the patch from Frank. Thanks. I did some tests and so far everything works in my setup. Since I was away for a day, I couldn't follow the discussion earlier, so allow me to respond to several mails at once ... I changed the behavior so that the entire recipe is printed together, rather than each individual job being printed together, so that even without .ONESHELL you still have the whole recipe as a single output. I can't justify to myself the need for individual job output but if someone really wants it please provide a reason. I almost do. I use a mechanism to give me a progress indicator while make is running. This mechanism does not require any special make features and works outside of make. How it works is roughly, my rules (at least the longer running ones) look like this (where [Start] and [End] are in fact some different strings that do not easily occur in normal messages): foo.o: +@echo [Start] Compiling foo.c @sleep 1 # cc -o foo.o foo.c +@echo [End] Compiling foo.c My program that interpretes make's output catches those special messages and shows me a list of currently running tasks. (In non-parallel builds, the non-silent make output serves this purpose, you can just see the last command shown which is the one running now. In parallel builds, this is not so; a long-running command may have scrolled far away by subsequent shorter commands run in parallel, so it's not easy to see what's actually running. Furthermore, many typical command lines these days (including mine) have so many compiler options etc. that they're hard to read by themselves. The above messages make it much more comfortable for me to see what's going on.) This mechanism was unaffected by my output-sync patch, and I expected your change broke it. In fact, it didn't, that's why I said almost. That's because of the + in the echo commands which prevents make from considering the above a single recipe. (The reason I use + is so I can run make -n and count those messages before the actual run in order to get the total number that will be produced which allows me to draw a progress bar. Not terribly important, I admit, but in this case it incidentally saved my ass.) So at the moment, it works for me, but it indicates there may be legitimate reasons to want the output grouped per job rather than per recipe. So I'd plead to revert this bit (since one can still use .ONESHELL if wanted). Or we could add another mode like --output-sync=job. Shouldn't be too hard now (if you like, I can implement it). Paul Smith wrote: On Tue, 2013-04-16 at 11:30 +0300, Eli Zaretskii wrote: . why is leaving directory FOO displayed _before_ releasing the stderr output? it should be after, I think. The enter/leave output needs investigation for sure. The idea was since log_working_directory() writes its messages to stdout, we need to wrap the stdout data with them. If stderr is the same as stdout, they are combined, so the first pump_from_tmp_fd handles all of them and they are properly wrapped. If they are separate, it doesn't matter when we give the leave message, since they go to different places anyway. So I think it's not necessary to move the leave message after the stderr output, though it wouldn't hurt either (just require another if (outfd_not_empty)). I think we're doing too much here. I would have left it alone but I think some tools that parse make results expect to see that output to help them figure out what's going on. As I described in https://savannah.gnu.org/bugs/?33138#comment3, the problem is how to interpret messages that contain file names when different jobs run in different subdirectories. A typical message looks like this: foo.c:42: syntax error blah In fact, also make itself gives such messages which depend on the current directory: Makefile:7: recipe for target 'foo-fail' failed With no context, there's no straightforward way to find the offending file. One could search all subdirectories for a file named foo.c (but there may be several ones in different directories), or leave it up to the user to find the right file. But that's not necessary if we have proper context. As I understand, Emacs has a directory change tracking mode which does just that (I don't use it myself, but a utility which does the same). In non-parallel builds, the normal enter/leave messages from make provide enough context. Not so in parallel builds, especially with target-level synchronization. Adding more of these messages helps to establish the context. As I wrote in the abovementioned comment, we now output rather too many messages, though trimming them better might be nontrivial (I did the easy part by giving them only if outfd_not_empty). . btw, why an error in acquire_semaphore and release_semaphore causes Make to bail out? isn't it better to fall back on the normal operation mode instead? I'll look at this. I guess the problem is
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-04-18 at 19:09 +0200, Frank Heckenbach wrote: This mechanism was unaffected by my output-sync patch, and I expected your change broke it. I was reading your email with interest, waiting for the punch-line, but then after all that description you just said that the change broke it, with no explanation :-). My question is, what about the new behavior does not work for you? The change I made was to have all the jobs in a target recipe write (appending of course) to the same temp file, then after the last job in the recipe is completed the output for the entire recipe is generated. So for example, if you have: foo bar: : $@ one : $@ two : $@ three and you run make -j -O, the way you had it the output from each job would be mixed something like this (the actual result order will be slightly random): foo one bar one foo two bar two bar three foo three with my change you still get the same results but only after all jobs are complete, and they will be collected: foo one foo two foo three bar one bar two bar three So I'd plead to revert this bit (since one can still use .ONESHELL if wanted). Or we could add another mode like --output-sync=job. Shouldn't be too hard now (if you like, I can implement it). I'd prefer to not add another option here unless there's a compelling case for it, even though you're correct that we have the flexibility now. I think we're doing too much here. I would have left it alone but I think some tools that parse make results expect to see that output to help them figure out what's going on. As I described in https://savannah.gnu.org/bugs/?33138#comment3, the problem is how to interpret messages that contain file names when different jobs run in different subdirectories. I do get it. I just think we might need to do more drastic surgery on this to REALLY get it right. Probably we'll want to allow the user to have more control over it, as well. Maybe a similar flag that lets you choose whether to trace on a per-target or per-make basis. And, if you choose per-target you don't need to ALSO generate output per-make. We do need to be careful about tracing non-target output, for example output generated by $(info ...), $(warning ...), and $(error ...). I guess this points out a potential issue with the current UNIX implementation as well: if you had some utility you were running as part of your build that was also trying to lock stdout, then it would interfere with make. I don't think so, since its stdout is now our temp file which it may lock as it pleases. (In fact, that's just what recursive makes do with --output-sync=make.) True. And if they were trying to use stdout like we are, to lock between different recipes, they just can't use -O at all since by definition each target will be writing to a different file. PS: In assign_child_tempfiles(), the following declarations are no more needed and can be removed: FILE *outstrm = NULL, *errstrm = NULL; const char *suppressed = output-sync suppressed: ; char *failmode = NULL; Yeah, I have a commit in my local repo that fixes this. Thanks! ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, Apr 18, 2013 at 10:09 AM, Frank Heckenbach f.heckenb...@fh-soft.dewrote: FILE is an opaque structure which should never be allocated by user code, so its layout can be implementation specific. Of course it's an opaque structure. The problem is that the implementation can't change its size without breaking binary compatibility with every previously-compiled program that uses stdio. This is a well-known problem, easily googled for. Here are a couple of references for Solaris: https://groups.google.com/forum/?fromgroups=#!topic/comp.unix.solaris/yky1fsJ3PMI http://www.oracle.com/technetwork/server-storage/solaris10/stdio-256-136698.html In particular, on Linux the field is an int ... Linux is a pretty elastic term, covering both 32- and 64-bit variants plus many kernels and distros. All 64-bit builds of any system will expand the original unsigned char; the opportunity to break binary compatibility allows for a lot of fixes. For all I know Linux even offers a clever fix in 32-bit versions, but there remain plenty of other 32-bit platforms with the problem. Bottom line, in 32-bit code you have to program for the likelihood that no more than 256 streams may be open at once. -David Boyce ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: On Thu, 2013-04-18 at 19:09 +0200, Frank Heckenbach wrote: This mechanism was unaffected by my output-sync patch, and I expected your change broke it. I was reading your email with interest, waiting for the punch-line, but then after all that description you just said that the change broke it, with no explanation :-). My question is, what about the new behavior does not work for you? I said I *expected* your change broke it, but it didn't. :-) Let me explain: The change I made was to have all the jobs in a target recipe write (appending of course) to the same temp file, then after the last job in the recipe is completed the output for the entire recipe is generated. So for example, if you have: foo bar: : $@ one : $@ two : $@ three and you run make -j -O, the way you had it the output from each job would be mixed something like this (the actual result order will be slightly random): foo one bar one foo two bar two bar three foo three And with my progress mechanism, that's exactly what I want. In my case it'd look like this: [Start] Compiling foo.c [Start] Compiling bar.c # time passes foo.c: some error # time passes bar.c: some error # time passes [End] Compiling bar.c # time passes [End] Compiling foo.c This is useful (to me) because at any time, I know what's running. ([Start] messages minus [End] messages.) with my change you still get the same results but only after all jobs are complete, and they will be collected: foo one foo two foo three bar one bar two bar three In my case: # time passes [Start] Compiling foo.c foo.c: some error [End] Compiling foo.c # time passes [Start] Compiling bar.c bar.c: some error [End] Compiling bar.c Here, I don't see what's running, since the start and end messages always appear together, and I don't see a job started until it's already finished. I expected(!) your patch would do this to my output which would break my mechanism. In fact, it didn't, but only because I have + before the echo rules (for a different reason) which causes make to consider them separate recipes (at least for the purpose of this check). So, it's OK for me at the moment, but just coincidentally. (I checked, if I remove the +, it does just what I expected and breaks my mechanism.) Of course, you could say I'm misusing the same message channel for different purposes here, but that's just the only channel readily available without adding some message-passing framework or whatever. (Sure, there are two channels, stdout and stderr, but separating the messages between them wouldn't help here.) I think we're doing too much here. I would have left it alone but I think some tools that parse make results expect to see that output to help them figure out what's going on. As I described in https://savannah.gnu.org/bugs/?33138#comment3, the problem is how to interpret messages that contain file names when different jobs run in different subdirectories. I do get it. I just think we might need to do more drastic surgery on this to REALLY get it right. I'm not saying it's perfect now, but my priorities are first correct (i.e. sufficient context), then nice (i.e. no redundant messages). Probably we'll want to allow the user to have more control over it, as well. Maybe a similar flag that lets you choose whether to trace on a per-target or per-make basis. I think it should in principle be possible without requiring the user to specify any more options. But it would be some work, requiring make to keep track of which directory message was output last, delay the leaving message in case the next one will be entering the same directory etc., and synchronize this among recursive makes in the different modes. And, if you choose per-target you don't need to ALSO generate output per-make. Probably, though I didn't check it in detail. I just didn't want to remove any messages that were output before. We do need to be careful about tracing non-target output, for example output generated by $(info ...), $(warning ...), and $(error ...). The latter two are covered by the changes in error() and fatal(), but $(info) was missing. At least in function.c, this seems to be the only place that was forgotten: --- function.c.orig 2013-04-18 11:39:15.0 +0200 +++ function.c 2013-04-18 20:08:35.0 +0200 @@ -1106,8 +1106,12 @@ break; case 'i': + if (output_sync) +log_working_directory (1, 1); printf (%s\n, msg); fflush(stdout); + if (output_sync) +log_working_directory (0, 1); break; default: ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
David Boyce wrote: On Thu, Apr 18, 2013 at 10:09 AM, Frank Heckenbach f.heckenb...@fh-soft.dewrote: FILE is an opaque structure which should never be allocated by user code, so its layout can be implementation specific. Of course it's an opaque structure. The problem is that the implementation can't change its size without breaking binary compatibility with every previously-compiled program that uses stdio. 1. It can change its size, precisely because user code does not declare variables of type FILE, just pointers, and does not allocate FILE structures itself (but uses fopen() etc.). So as long as the size of the pointer remains the same (which is trivial), there's no binary compatibility problem. In fact, a libc could even use differently sized FILE structures within one program (kind of OOP), and user code wouldn't notice. 2. Even if it couldn't change the size, this would only apply within one platform. You said: My understanding, and I believe this applies to all Unix variants, is that because the original stdio FILE structure used an 8-bit int to hold the file descriptor, the number of available streams is 256 always and forever (in 32-bit variants. That's not so, as evidenced by my test program. If Linux has always had an int there, there's no compatibility problem, since programs compiled under original Unix are not binary compatible with Linux anyway. In particular, on Linux the field is an int ... Linux is a pretty elastic term, Sorry, i686, 32-bit, kernel 3.2.2, Debian squeeze (but I suppose it applies to all x86 variants, 32-bit, and 64-bit anyway). Bottom line, in 32-bit code you have to program for the likelihood that no more than 256 streams may be open at once. This might be right (I was just objecting to your claim that it was necessarily so on any 32-bit Unix), and I'd prefer to use fd's anyway. Indeed, as you suggested earlier, it might be useful to use the main part of open_tmpfile() (i.e. without the fdopen()), though we'd have to manually remove the file then. On Unix, we could unlink it right after opening (since we never need the filename again, unlike the other users of open_tmpfile()). On Windows, though, this might need to be delayed, AIUI. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, 2013-04-18 at 20:36 +0200, Frank Heckenbach wrote: And with my progress mechanism, that's exactly what I want. In my case it'd look like this: [Start] Compiling foo.c [Start] Compiling bar.c # time passes foo.c: some error # time passes bar.c: some error # time passes [End] Compiling bar.c # time passes [End] Compiling foo.c This is useful (to me) because at any time, I know what's running. ([Start] messages minus [End] messages.) Thanks, this is the reason I was looking for; that use-case wasn't clear to me based on the previous email. Probably we'll want to allow the user to have more control over it, as well. Maybe a similar flag that lets you choose whether to trace on a per-target or per-make basis. I think it should in principle be possible without requiring the user to specify any more options. I was thinking more that the user may want to not want all the enter/leave output even if it is ambiguous from a programmatic sense: I know where my code lives and so if I see the my_foo.o fails, I know that the my_foo.c file lives in src/my/my_foo.c. I might prefer to see a cleaner output from make and rely on my innate knowledge of the codebase to navigate. But maybe you'd still like to see the per-make enter/leave, even if you're running with -Otarget. But it would be some work, requiring make to keep track of which directory message was output last, delay the leaving message in case the next one will be entering the same directory etc., and synchronize this among recursive makes in the different modes. Synchronization between recursive makes is not something I want to get into. As long as the messages are coherent within a single make instance, either before/after everything the make instance does or before/after each target (or job, if that is needed) that's enough for me. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 18 Apr 2013 19:09:06 +0200 From: Frank Heckenbach f.heckenb...@fh-soft.de . calculation of combined_output in start_job_command will need to be reimplemented for Windows, since the reliance on st_dev and st_ino makes assumptions that are false on Windows. What we need is basically is_same_file (fd1, fd2) (which we probably could refactor into a separate function). Eli, can you try: test file1 -ef file2 where test is either a standalone utility or a built-in of bash or any other shell. If any of these works, you can see how they do it. Otherwise, a bit of googling turned up this page which has code (in Python, but it seems like a thin layer around the system calls, so it can probably easily be ported to C): http://timgolden.me.uk/python/win32_how_do_i/see_if_two_files_are_the_same_file.html I know about this method, but I'm not sure it does what we want here. The number used by this method is not guaranteed to be unique on some versions of Windows. More importantly, it only works for disk files, I don't know whether it reports a meaningful value for the console device or the null device. So this will be a bit more tricky than that URL pretends. . STREAM_OK uses fcntl and F_GETFD which both don't exist on Windows. This is to test if somehow make's stdout or stderr has been closed (not just redirected to /dev/null, but closed). Right, got that; but Windows needs a different test for that, as there's no fcntl. Basically we need just any call that succees for a valid file and fails otherwise. fstat() might be a good candidate. fstat is too heavy on Windows; there are better methods. If that isn't available, maybe even lseek() (as used in FD_NOT_EMPTY) will do if we keep the || errno != EBADF test. lseek fails for the console as well, so it's not a good candidate. But this problem is not difficult, I just mentioned that the code as committed won't compile. Eli, can you do some tests to see if you find something that works reliably? Don't worry about that. My worst problem is lack of time, not lack of ideas ;-) ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 18 Apr 2013 20:39:44 +0200 Cc: psm...@gnu.org, e...@gnu.org, bo...@kolpackov.net From: Frank Heckenbach f.heckenb...@fh-soft.de Indeed, as you suggested earlier, it might be useful to use the main part of open_tmpfile() (i.e. without the fdopen()), though we'd have to manually remove the file then. On Unix, we could unlink it right after opening (since we never need the filename again, unlike the other users of open_tmpfile()). On Windows, though, this might need to be delayed, AIUI. You can do that on Windows as well, just not if you use open/creat to create the file. You need to use the Windows file APIs with options that are not exposed to Posix-style functions like open. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Thu, Apr 18, 2013 at 2:39 PM, Frank Heckenbach f.heckenb...@fh-soft.dewrote This might be right (I was just objecting to your claim that it was necessarily so on any 32-bit Unix), and I'd prefer to use fd's anyway. Well ... Linux is not Unix, as partisans will proudly tell you, so technically what I said is unobjectionable. I won't pretend that that's what I meant though. Presumably the real distinction is between statically and dynamically linked binaries; anything linked statically with libc would need to contain an actual FILE struct, not just a pointer to one. But that aside, we're all in violent agreement so we can let this sub-thread drop. -David ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Eli Zaretskii wrote: Date: Thu, 18 Apr 2013 20:39:44 +0200 Cc: psm...@gnu.org, e...@gnu.org, bo...@kolpackov.net From: Frank Heckenbach f.heckenb...@fh-soft.de Indeed, as you suggested earlier, it might be useful to use the main part of open_tmpfile() (i.e. without the fdopen()), though we'd have to manually remove the file then. On Unix, we could unlink it right after opening (since we never need the filename again, unlike the other users of open_tmpfile()). On Windows, though, this might need to be delayed, AIUI. You can do that on Windows as well, just not if you use open/creat to create the file. You need to use the Windows file APIs with options that are not exposed to Posix-style functions like open. Can you get an fd from a file opened this way? If so, it should be alright, it could all be encapsulated in open_tmpfd(). ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Paul Smith wrote: On Thu, 2013-04-18 at 20:36 +0200, Frank Heckenbach wrote: And with my progress mechanism, that's exactly what I want. In my case it'd look like this: [Start] Compiling foo.c [Start] Compiling bar.c # time passes foo.c: some error # time passes bar.c: some error # time passes [End] Compiling bar.c # time passes [End] Compiling foo.c This is useful (to me) because at any time, I know what's running. ([Start] messages minus [End] messages.) Thanks, this is the reason I was looking for; that use-case wasn't clear to me based on the previous email. OK, so what are we going to do about it? Leave, revert, new option? Well, right now I did find a problem with the new behaviour: foo: @echo foo:error @false % make -Omake # same with -Otarget m:2: recipe for target 'foo' failed make: *** [foo] Error 1 foo:error This seems at least strange to me: The conclusion recipe failed is printed before the reason (the messages from the job). Reverting this part (i.e., moving the sync_output() call to where it was before) corrects this: % make -Otarget # same with -Otarget foo:error m:2: recipe for target 'foo' failed make: *** [foo] Error 1 To fix it without reverting the behaviour seems to require calling sync_output() before child_error() (2 places). Probably we'll want to allow the user to have more control over it, as well. Maybe a similar flag that lets you choose whether to trace on a per-target or per-make basis. I think it should in principle be possible without requiring the user to specify any more options. I was thinking more that the user may want to not want all the enter/leave output even if it is ambiguous from a programmatic sense: I know where my code lives and so if I see the my_foo.o fails, I know that the my_foo.c file lives in src/my/my_foo.c. I might prefer to see a cleaner output from make and rely on my innate knowledge of the codebase to navigate. But maybe you'd still like to see the per-make enter/leave, even if you're running with -Otarget. I for my part don't care about individual enter/leave messages since my script (much like the Emacs mode) interprets them. I only care that the last enter-message before each message from the job is correct; any redundant messages before are simply ignored. So as long as this behaviour remains (or at least an option exists to keep it), I'm fine with any changes. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
Date: Thu, 18 Apr 2013 22:05:33 +0200 Cc: bug-make@gnu.org, david.s.bo...@gmail.com, psm...@gnu.org, bo...@kolpackov.net From: Frank Heckenbach f.heckenb...@fh-soft.de Eli Zaretskii wrote: Date: Thu, 18 Apr 2013 20:39:44 +0200 Cc: psm...@gnu.org, e...@gnu.org, bo...@kolpackov.net From: Frank Heckenbach f.heckenb...@fh-soft.de Indeed, as you suggested earlier, it might be useful to use the main part of open_tmpfile() (i.e. without the fdopen()), though we'd have to manually remove the file then. On Unix, we could unlink it right after opening (since we never need the filename again, unlike the other users of open_tmpfile()). On Windows, though, this might need to be delayed, AIUI. You can do that on Windows as well, just not if you use open/creat to create the file. You need to use the Windows file APIs with options that are not exposed to Posix-style functions like open. Can you get an fd from a file opened this way? Yes. If so, it should be alright, it could all be encapsulated in open_tmpfd(). I'm not sure it's worth the hassle. Using tmpfile is so much easier (if it works, which is to be determined). ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
From: Paul Smith psm...@gnu.org Cc: bo...@kolpackov.net, bug-make@gnu.org, f.heckenb...@fh-soft.de Date: Tue, 16 Apr 2013 12:56:21 -0400 On Tue, 2013-04-16 at 19:20 +0300, Eli Zaretskii wrote: From: Paul Smith psm...@gnu.org Cc: bo...@kolpackov.net, bug-make@gnu.org, f.heckenb...@fh-soft.de Date: Tue, 16 Apr 2013 10:44:39 -0400 On Tue, 2013-04-16 at 16:43 +0300, Eli Zaretskii wrote: I'm not sure what the semantics of tmpfile() are on Windows. The file is automatically deleted when closed. But the documentation doesn't say what happens if it is open on more than one descriptor, or what happens if the original descriptor is dup'ed. I will need to test that, and perhaps provide a work-around. It might be that we have to allow use of a file handle on Windows, rather than a descriptor. That doesn't matter, really. One can get one from the other on Windows. Ah interesting. In UNIX you can get _A_ file handle back from a file descriptor (using fdopen()), but it's not guaranteed to be the SAME file handle you had originally. That is, if you run: FILE* f1 = fopen(...); int fd = fileno(f1); FILE* f2 = fdopen(fd, ...); fclose(f2); you don't get back f2 == f1. And although fd will be closed here, I'm pretty sure not all the resources associated with f1 are freed, which is a resource leak that will eventually lead to running out of file handles. That could be a misunderstanding on my part: I didn't realize that by handle you mean a FILE object. I thought you meant Windows specific HANDLE objects (which underly every open file). Anyway, I'm not sure why the current code calls tmpfile, which produces a FILE object, but then only uses its file descriptor and read/write functions. Why not keep the FILE object in the child struct, and use fread/fwrite instead? As a nice benefit, you get to avoid leaking the resources due to the fact that no one calls fclose on those FILE objects, or so it seems. The descriptor-based mutex has the very slight advantage over a system-wide mutex in that if a sub-make's output is redirected it now has its own lock domain. I didn't mean a system-wide mutex, I meant a process-wide mutex. Will this be OK? I don't think so: especially now that we support full jobserver capabilities in Windows we can have recursive make invocations all running jobs in parallel, and we'd want them to synchronize their output across multiple processes. If we were only concerned about a single process we really wouldn't need even mutexes since make is single-threaded. Right. E.g., Make sees that both are connected to the same device and redirects them to the same file, but then the job redirects stderr to a separate file using shell redirection (2 foo). Or vice versa. Sure... but I don't see the problem. Maybe I've lost the thread. When the command starts both stdout and stdin are writing to the same destination. If the command does nothing special then the output will be a combination of stdout and stderr in the order in which the command generated them, which is good. If the command script changes one or both of stdin/stdout, then they won't be cojoined anymore, yes, but that's what the user asked for...? Maybe there's no problem, I don't know. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make
Re: [bug #33138] .PARLLELSYNC enhancement with patch
On Wed, 2013-04-17 at 19:10 +0300, Eli Zaretskii wrote: That could be a misunderstanding on my part: I didn't realize that by handle you mean a FILE object. I thought you meant Windows specific HANDLE objects (which underly every open file). I'm not very familiar with Windows terminology. Is a HANDLE equivalent to a UNIX file descriptor? Or is it a third thing, different from UNIX fd's or C standard FILE*'s? Anyway, I'm not sure why the current code calls tmpfile, which produces a FILE object, but then only uses its file descriptor and read/write functions. Why not keep the FILE object in the child struct, and use fread/fwrite instead? I believe the thinking is that some implementations may have a much smaller number of open streams (FILE*) allowed, than open file descriptors. The POSIX standard, for example, allows this: Some implementations may limit {STREAM_MAX} to 20 but allow {OPEN_MAX} to be considerably larger. Also, a stream is much more resource-heavy than a file descriptor, as it implies buffering etc. in addition to the open file. We wouldn't use the buffering, but it's still there. We might need two different temp files per running job, and for high values of -j (people are doing -j builds on very large systems these days) that may be significant. However, I don't know if it's worth it in reality systems. As a nice benefit, you get to avoid leaking the resources due to the fact that no one calls fclose on those FILE objects, or so it seems. They are closed in open_tmpfd(), that's why we dup() the file descriptor first (so when we close the FILE* we don't lose the underlying file). But you're right, for Windows this may not make sense and we should simiply use the FILE*: this is what I was referring to by some kind of portability layer. Sure... but I don't see the problem. Maybe there's no problem, I don't know. OK. I think it will behave just as I want, but I'm the one who suggested this behavior so I'm probably biased. Let me know of you think of something that doesn't work about it. ___ Bug-make mailing list Bug-make@gnu.org https://lists.gnu.org/mailman/listinfo/bug-make