Re: [ccache] gmake and ccache conspiring together in creating gremlins

2021-02-08 Thread Bernd Petrovitsch via ccache
Hi all!

On 07/02/2021 16:45, Sam Varshavchik via ccache wrote:
[...]
> In the interim I discovered a workaround that seems to work, for
> anyone who's also affected by this. Instead of setting CCACHE_PREFIX
> to distcc, I pointed CCACHE_PREFIX to this:
> 
> #! /bin/sh
> 
> unset MFLAGS ${!MAKE@}
> exec distcc "$@"
> 
> This clears make variables from the environment and when the linker
> invokes make for ...whatever it needs to invoke it for, against a
> manufactured makefile it generates on the fly... it doesn't go look
> for the job server file descriptors, and go sideways.

Hmm, with the whole thread: Shouldn't better the linker (or whatever
tool) cleanup the environment variables before invoking `make` (or
whatever programs) to fix the - IMHO - core cause of the bug?

MfG,
Bernd
-- 
Bernd Petrovitsch  Email : be...@petrovitsch.priv.at
 LUGA : http://www.luga.at

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] gmake and ccache conspiring together in creating gremlins

2021-02-08 Thread Sam Varshavchik via ccache
On Mon, Feb 8, 2021 at 2:38 PM Paul Smith  wrote:
>
> On Mon, 2021-02-08 at 10:43 +, Edward Welbourne wrote:
> > Sounds to me like that's a bug: when the descriptors are closed, the
> > part of MAKEFLAGS that claims they're make's jobserver file
> > descriptors should be removed, since that's when the claim stops
> > being true.
>
> I believe there have been other similar issues reported recently.
>
> Certainly fixing MAKEFLAGS when we run without jobserver available is
> something that could be done.
>
> There is a loss of debugging information if we make this change: today
> make can detect if it was invoked in a way that _should_ expect to
> receive a jobserver context, but _didn't_ receive that context.  That
> is, if make sees that jobserver-auth is set but it can't open the
> jobserver pipes it can warn the user that most likely there's a problem
> in their environment or with the setup of their makefiles.
>
> Without this warning there's no way to know when this situation occurs.
>  It's easy to create a situation where every sub-make will create its
> own completely unique jobserver domain.  So you start the top make with
> -j4 and run 4 sub-makes; if you do it wrong then each of 4 sub-makes
> could create a new jobserver domain, and now you're running 16 jobs in
> parallel instead of 4... there's no way for make to warn you about this
> situation.

One thought occurred to me. Specifically: when make executes what it
believes to be something other than a recursive invocation of $(MAKE),
and it closes the job server pipe file descriptors for that, it can
also:

1) Add an additional parameter to MAKEFLAGS, let's call it
"--no-jobserver", and perhaps remove the --jobserver-auth parameter
completely. It might be easier just to append something there, instead
of surgically removing this.

2) Make checks for a --no-jobserver in MAKEFLAGS when it starts. If
it's there it does NOT attempt to validate the file descriptors that
are given in --jobserver-auth (if this parameter is preserved). It's a
given that they're not there:

  if (!FD_OK (job_fds[0]) || !FD_OK (job_fds[1]) || make_job_rfd () < 0)

Don't even do that. What happens right now a warning message gets
printed and make runs without a job server. This change should have
the same result, print the warning but skip the FD_OK tests.

This will result in the same warning, but it should avoid triggering
the bug that I found.

However that might cause a minor regression in LTO linking. I think
that this prevents the LTO linker's internal invocation of make from
finding that it can attach to the original make process's job server.

>From sifting through strace dumps, I see that a linker-invoked make
gets its own -j flag. It appears that the linker is courteous enough
to count how many CPUs it has and use it to construct its own -j flag.

How about this, safe approach: once --no-jobserver is there it stays
there, and gets propagated to all recursively invoked makes. If an
invoke make finds that it has both a --no-jobserver and a -j flag,
it'll warn and refuse to create its own job server, and then proceed
executing one command at a time.

This prevents an arithmetic proliferation of job worker processes if
the original job server's file descriptors get lost. Currently
recursively-invoked makes will find, and attach themselves to, an
existing job server. This is nice; but this is vulnerable to an edge
case that I think I'm hitting: a false positive involving a leaked
file descriptor. This change encourages fixing whatever's causing make
to fail to detect a recursive invocation.

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] gmake and ccache conspiring together in creating gremlins

2021-02-08 Thread Sam Varshavchik via ccache
On Sun, Feb 7, 2021 at 7:59 AM Joel Rosdahl  wrote:

> From my perspective, I don't see anything wrong with ccache dup-ing stderr – I
> guess the bug can be triggered by the user running something like
>
>   exec 3>&2
>
> in the shell as well.
>
> I can think of an easy workaround for the problem: only dup2 stderr when 
> running
> the preprocessor or compiler. Done in
> .

In the interim I discovered a workaround that seems to work, for
anyone who's also affected by this. Instead of setting CCACHE_PREFIX
to distcc, I pointed CCACHE_PREFIX to this:

#! /bin/sh

unset MFLAGS ${!MAKE@}
exec distcc "$@"

This clears make variables from the environment and when the linker
invokes make for ...whatever it needs to invoke it for, against a
manufactured makefile it generates on the fly... it doesn't go look
for the job server file descriptors, and go sideways.

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache