Since you asked basic questions I'm going to start this at a basic level.
Apologies if it covers some stuff you already know or if I misinterpreted
the questions. Note that I haven't actually looked at the patch that went
in so this is generally wrt the original.

The first thing is to get the word "lock" out of your mind because we aren't
really locking anything. Yes, that API is in use but it's only to create a
semaphore or baton. Nobody is ever prevented from doing anything. It just
happens that on Unix the most portable (i.e. oldest) way of implementing a
semaphore is with the advisory locking API. All cooperating processes agree
not to proceed unless and until they are able to acquire the exclusive lock
on a shared file descriptor, but it's not necessary to ever actually write
anything to that descriptor.
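
To make the baton idea concrete, here is a minimal sketch in Python rather
than make's C. It is not the code from the patch: the names acquire_baton,
release_baton and child_can_take_baton are made up for illustration, and it
assumes a Unix system where Python's fcntl.lockf() (documented as a wrapper
around the fcntl() locking calls) is available.

```python
import fcntl
import os
import tempfile


def acquire_baton(fd):
    """Block until this process holds the exclusive advisory lock on fd."""
    # Without LOCK_NB this waits, like fcntl(fd, F_SETLKW, ...) would.
    fcntl.lockf(fd, fcntl.LOCK_EX)


def release_baton(fd):
    """Give the baton up so the next waiting process can proceed."""
    fcntl.lockf(fd, fcntl.LOCK_UN)


def child_can_take_baton(fd):
    """Fork a child and report whether it could take the baton at once.

    Hypothetical helper, for demonstration only. The child inherits fd
    but not the parent's lock, so it competes for the baton like any
    other cooperating process.
    """
    pid = os.fork()
    if pid == 0:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(0)  # got the baton; it is dropped when the child exits
        except OSError:
            os._exit(1)  # somebody else is holding it
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

Note that nothing is ever written to the descriptor. It could be an empty
temp file, as in the helper's callers here, or any other descriptor that
all the cooperating processes share.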

Second, the original implementation (not sure if I ever sent that one
in though) actually created a temp file to use as the semaphore fd. But
then I discovered that stdout can be locked in the same way, which is
simpler. But applying the lock to stdout is just a frill; it could be a
temp file, especially if some platform turned out to need it that way. I
just figured that stdout is always available, or at least if it's closed
you don't have to worry about synchronizing output.

Third, yes, nothing is locked while the child runs. If a shared resource
were locked during child runs, it would have the effect of re-serializing the
build as each supposedly parallel child waited on the lock. So what happens
here is really very simple: each child (aka recipe) runs asynchronously,
assuming -j of course, and dumps its output to one or two temp files. Only
when the child has finished and wants to report results does it enter the
queue waiting for the baton. When it gets it, it holds it just long enough
to copy its output from the temp files to stdout/stderr and then lets the
next guy have his turn. Thus, assuming the average job runs for
a significant amount of time (multiples of a write() system call anyway)
there will not be much contention on the semaphore and it won't be a
bottleneck.
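
The sequence just described can be sketched roughly as follows. Again this
is Python for illustration, not what job.c does line for line: run_job_synced
and its parameters are hypothetical names, and lock_fd stands for whichever
shared descriptor the cooperating processes agreed on (stdout in the scheme
above, a temp file in a quick experiment).

```python
import fcntl
import shutil
import subprocess
import sys
import tempfile


def run_job_synced(argv, lock_fd, out=None, err=None):
    """Run argv, capture its output in temp files, emit it under the baton."""
    out = out if out is not None else sys.stdout.buffer
    err = err if err is not None else sys.stderr.buffer
    with tempfile.TemporaryFile() as tmp_out, \
            tempfile.TemporaryFile() as tmp_err:
        # The job itself runs with no lock held, so jobs stay parallel.
        status = subprocess.call(argv, stdout=tmp_out, stderr=tmp_err)
        tmp_out.seek(0)
        tmp_err.seek(0)
        # Only now join the queue for the baton.
        fcntl.lockf(lock_fd, fcntl.LOCK_EX)
        try:
            # Hold it just long enough to copy the captured output.
            shutil.copyfileobj(tmp_out, out)
            shutil.copyfileobj(tmp_err, err)
            out.flush()
            err.flush()
        finally:
            fcntl.lockf(lock_fd, fcntl.LOCK_UN)
    # The temp files are removed when the with block closes them.
    return status
```

The lock is held only around the two copies, which is why contention stays
low as long as jobs run much longer than it takes to write their output.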

You're right that simply writing to temp files and dumping everything at
once when the job finished would be likely to reduce the incidence of
garbling even without the semaphore, but not to zero.

It may be that the locking of stdout is only useful on Unix due to the fact
that it's inherited into child processes. I don't know what Paul or Frank
is thinking, and as mentioned I haven't looked at the current version, but
my thinking originally was that Windows could easily handle this using its
own far richer set of semaphore/locking APIs. I'd actually expect this to
be easier and more natural on Windows than Unix. All that's required is to
choose a semaphore to synchronize on, dump output to temp files, and copy
it to stdout/stderr only after acquiring the semaphore. And remove the temp
files of course.

-David Boyce




On Tue, Apr 23, 2013 at 10:50 AM, Eli Zaretskii <e...@gnu.org> wrote:

> > Date: Fri, 19 Apr 2013 11:54:05 +0200
> > Cc: bo...@kolpackov.net, bug-make@gnu.org
> > From: Frank Heckenbach <f.heckenb...@fh-soft.de>
> >
> > Eli Zaretskii wrote:
> >
> > > Initial investigation indicates that tmpfile should do the job just
> > > fine: the file is deleted only when the last descriptor for it is
> > > closed.  That includes any duplicated descriptors.
> >
> > Great.
> >
> > > As for fcntl, F_SETLKW, and F_GETFD, they will need to be emulated.
> > > In particular, it looks like LockFileEx with LOCKFILE_EXCLUSIVE_LOCK
> > > flag set and LOCKFILE_FAIL_IMMEDIATELY flag cleared should do the
> > > job.  I will need to see how it works in reality, though.
> >
> > OK.
>
> Upon a second look, I'm not sure I understand how this feature works,
> exactly, and why you-all thought making it work on Windows is a matter
> of a few functions.  I sincerely hope I'm missing something, please
> bear with me.
>
> First, most of the meat of OUTPUT_SYNC code, which sets up the stage
> when running child jobs, is in a branch that isn't compiled on Windows
> ("#if !defined(__MSDOS__) && !defined(_AMIGA) && !defined(WINDOWS32)"
> on line 1482 of job.c).  So currently that part is not even run on
> Windows.  Please tell me that nothing in this feature relies on
> 'fork', with its copying of handles and other data structures.
> Because if it does, we have no hope of making it work on Windows, at
> least not using the same algorithms as on Unix.
>
> More importantly, how exactly locking the (redirected) stdout/stderr
> of the child is supposed to cause synchronization, and why do we need
> it at all?  Isn't synchronization already achieved by redirecting
> child's output to a file, and only dumping it to screen when the child
> exits?  What does lock add to this?  Who else will be writing what to
> where, that we want to prevent by holding the lock/semaphore?
>
> In an old thread, Paul explained something similar:
>
>     > David, can you explain why you needed to lock the files?  Also, what
>     > region(s) of the file you are locking?  fcntl with F_WRLCK won't work
>     > on Windows, so the question is how to emulate it.
>
>     David wants to interlock between ALL instances of make printing output,
>     so that even during recursive makes no matter how many you have running
>     concurrently, only one will print its output at a time.
>
>     There is no specific region of the file that's locked: the lockfile is
>     basically a file-based, system-wide semaphore.  The entire file is
>     "locked"; it's empty and has no content.
>
> Assuming this all is still basically true, I guess I still don't
> understand what exactly is being locked and why.  E.g., why do we only
> want to interlock instances of Make, but not the programs they run?
> Also, acquire_semaphore is used only in sync_output, which is called
> only when a child exits.  IOW, nothing is locked while the child
> runs, only when its output is ready.
>
> In addition, we are locking stdout.  But doesn't each instance of Make
> have, or can have, its own stdout?  If so, how will the interlock
> work?
>
> What am I missing?  Probably a lot.
>
> TIA
>
> _______________________________________________
> Bug-make mailing list
> Bug-make@gnu.org
> https://lists.gnu.org/mailman/listinfo/bug-make
>
