Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-19 Thread Tim Murphy
Hi, sounds like a tough situation, and printf()s might be the easiest, but I
thought I might suggest one other complicated idea :-). Surely someone has
put together a cross-compiler for this OS/hardware combination? The idea of
compiling on-device would certainly have been impossibly slow until fairly
recent times.

Regards,

Tim

On 19 November 2016 at 22:03, Jaak Ristioja wrote:

> Hi!
>
> On 13.11.2016 07:37, Tim Murphy wrote:
> > Something like Valgrind might spot some initial problem that doesn't
> > immediately crash but eventually spirals out of control.
>
> I could try valgrind, but (1) I will need to recompile glibc with debug
> symbols to use it, and (2) I don't have an eternity to wait for
> --trace-children to finish, so I'd have to run it without
> --trace-children because afaik valgrind doesn't provide good means to
> trace only certain children (i.e. only "make" processes). Actually the
> RPi2 would probably OOM/die from multiple valgrind processes anyway.
>
> > I don't know what the gcc version is on your Pi but if you have a recent
> > enough one you might manage to use the address sanitiser option to get
> > a similar result.
>
> I'm currently using GCC 5.4, so it's fairly new from that aspect, but I
> won't be able to use its address sanitizer, because it doesn't work with
> a PaX/grsecurity kernel like Gentoo's sys-devel/hardened-sources, due to
> "ASAN assumes/uses hardcoded userland address space size values, which
> breaks when UDEREF is set as it pitches a bit from the size" [1].
> Because of this, it is disabled by default on hardened profiles, hence
> I'd have to both recompile GCC and a kernel without PaX/grsec to
>
> So I guess I'll attempt to recompile glibc and run valgrind on the
> parent "make -j5" process and see whether that turns up anything. If
> not, then I'll try the -fsanitize=address approach. I expect all of this
> to take some time (and perhaps wear out more flash storage) on the slow
> RPi2.
>
> Best regards,
> J
>
>
> [1] http://blog.siphos.be/2013/04/another-gentoo-hardened-month-has-passed/
>
> ___
> Bug-make mailing list
> Bug-make@gnu.org
> https://lists.gnu.org/mailman/listinfo/bug-make
>
___
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make


Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-19 Thread Paul Smith
On Sun, 2016-11-20 at 00:03 +0200, Jaak Ristioja wrote:
> So I guess I'll attempt to recompile glibc and run valgrind on the
> parent "make -j5" process and see whether that turns up anything. If
> not, then I'll try the -fsanitize=address approach. I expect all of this
> to take some time (and perhaps wear out more flash storage) on the slow
> RPi2.

Of course another option is the old standby: add printf calls etc. to
the code to try to figure out what's going on.



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-19 Thread Jaak Ristioja
Hi!

On 13.11.2016 07:37, Tim Murphy wrote:
> Something like Valgrind might spot some initial problem that doesn't
> immediately crash but eventually spirals out of control.

I could try valgrind, but (1) I will need to recompile glibc with debug
symbols to use it, and (2) I don't have an eternity to wait for
--trace-children to finish, so I'd have to run it without
--trace-children because afaik valgrind doesn't provide good means to
trace only certain children (i.e. only "make" processes). Actually the
RPi2 would probably OOM/die from multiple valgrind processes anyway.

> I don't know what the gcc version is on your Pi but if you have a recent
> enough one you might manage to use the address sanitiser option to get
> a similar result.

I'm currently using GCC 5.4, so it's fairly new from that aspect, but I
won't be able to use its address sanitizer, because it doesn't work with
a PaX/grsecurity kernel like Gentoo's sys-devel/hardened-sources, due to
"ASAN assumes/uses hardcoded userland address space size values, which
breaks when UDEREF is set as it pitches a bit from the size" [1].
Because of this, it is disabled by default on hardened profiles, hence
I'd have to both recompile GCC and a kernel without PaX/grsec to

So I guess I'll attempt to recompile glibc and run valgrind on the
parent "make -j5" process and see whether that turns up anything. If
not, then I'll try the -fsanitize=address approach. I expect all of this
to take some time (and perhaps wear out more flash storage) on the slow
RPi2.
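
A minimal sketch of the parent-only run described above. The wrapper name
valgrind_parent_only and the log file name are hypothetical. The option
--trace-children=no is valgrind's default and is spelled out only to make
the intent explicit; --log-file keeps valgrind's report out of the build
output, with %p replaced by the process ID.

```shell
#!/bin/sh
# Run a command under valgrind without following its child processes.
# Assumptions: valgrind is installed and usable on the target, and glibc
# debug symbols are available.
valgrind_parent_only() {
    valgrind --trace-children=no \
             --log-file=valgrind-make.%p.log \
             "$@"
}

# Example invocation (not executed here):
#   valgrind_parent_only make -j5
```

Only the top-level make is instrumented this way, so a crash in a sub-make
would not show up in the log.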

Best regards,
J


[1] http://blog.siphos.be/2013/04/another-gentoo-hardened-month-has-passed/



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-12 Thread Tim Murphy
Something like Valgrind might spot some initial problem that doesn't
immediately crash but eventually spirals out of control. It seems to
support ARM linux now:

"20 October 2016: valgrind-3.12.0 is available. This release supports:
X86/Linux, AMD64/Linux, ARM32/Linux, ARM64/Linux, PPC32/Linux,
PPC64BE/Linux, PPC64LE/Linux, S390X/Linux, MIPS32/Linux, MIPS64/Linux,
ARM/Android, ARM64/Android, MIPS32/Android, X86/Android, X86/Solaris,
AMD64/Solaris, X86/MacOSX 10.10 and AMD64/MacOSX 10.10. There is also
preliminary support for X86/MacOSX 10.11/12, AMD64/MacOSX 10.11/12 and
TILEGX/Linux. For more details see the release notes."

I don't know what the gcc version is on your Pi but if you have a recent
enough one you might manage to use the address sanitiser option to get a
similar result.

Regards,

Tim

On 12 November 2016 at 14:30, Paul Smith wrote:

> On Fri, 2016-11-11 at 19:41 +0200, Jaak Ristioja wrote:
> > After examining about 10 more core files, these all point to job.c:519
> > and job.c:537, similarly to the above:
> >
> >   #0  0x00c2bd74 in child_error (child=0x0, exit_code=0, exit_sig=0,
> > coredump=0, ignored=0) at job.c:519
> >   pre = 0x0
> >   post = 0x0
> >   dump = 0x0
> >   f = 0x0
> >   flocp = 0x0
> >   nm = 0x0
> >   l = 0
> >   #1  0x00c2bd8e in child_handler (sig=0) at job.c:537
> >   No locals.
> >   #2  0x00c67cc0 in ?? ()
> >   No symbol table info available.
> >   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> >
> > Any ideas?
>
> Unfortunately this doesn't help much.  It's pretty clear that either ARM
> debug information is very limited, or GDB is problematic on ARM, or else
> the core file is very corrupted: there's no way that all the values in
> the argument lists to these functions are really 0/NULL.  Also, as you
> point out, in no way does child_handler() (which is a signal handler)
> ever call child_error().
>
> In fact, if your config.h from GNU make has HAVE_PSELECT set (which I
> would expect it would since this is Linux) there's no way the
> child_handler() function should ever be invoked at all.
>


Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-12 Thread Paul Smith
On Fri, 2016-11-11 at 19:41 +0200, Jaak Ristioja wrote:
> After examining about 10 more core files, these all point to job.c:519
> and job.c:537, similarly to the above:
> 
>   #0  0x00c2bd74 in child_error (child=0x0, exit_code=0, exit_sig=0,
> coredump=0, ignored=0) at job.c:519
>   pre = 0x0
>   post = 0x0
>   dump = 0x0
>   f = 0x0
>   flocp = 0x0
>   nm = 0x0
>   l = 0
>   #1  0x00c2bd8e in child_handler (sig=0) at job.c:537
>   No locals.
>   #2  0x00c67cc0 in ?? ()
>   No symbol table info available.
>   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> Any ideas?

Unfortunately this doesn't help much.  It's pretty clear that either ARM
debug information is very limited, or GDB is problematic on ARM, or else
the core file is very corrupted: there's no way that all the values in
the argument lists to these functions are really 0/NULL.  Also, as you
point out, in no way does child_handler() (which is a signal handler)
ever call child_error().

In fact, if your config.h from GNU make has HAVE_PSELECT set (which I
would expect it would since this is Linux) there's no way the
child_handler() function should ever be invoked at all.



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-12 Thread Paul Smith
On Sat, 2016-11-12 at 13:06 +0200, Jaak Ristioja wrote:
> I'm guessing that PATH_MAX is 4096 on most Linux systems, while the stack is
> 8192.

There's no way the stack is so small.  Virtually no userspace program
can run with an 8k stack, regardless of whether they use alloca() or
not.

I think you might be misled by the output of ulimit -s as "8192";
however, the doc says:

> Values are in 1024-byte increments

so really the default is an 8M stack, not an 8K stack.

Also, traditional Linux systems set the hard limit on the stack size to
"unlimited" (run 'ulimit -H -s' to see it), and GNU make will reset its own
stack size to the maximum when it starts.
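
The unit conversion can be checked with a short sketch. The helper name
stack_limit_bytes is hypothetical; the -S and -H flags select the soft and
hard limits in bash and dash. The last lines size the alloca() request
quoted in this thread, taking the guessed PATH_MAX of 4096 as the filename
length, which is an assumption rather than a measured value.

```shell
#!/bin/sh
# Convert a value printed by `ulimit -s` (units of 1024 bytes) to bytes.
stack_limit_bytes() {
    case $1 in
        unlimited) echo unlimited ;;
        *) echo $(( $1 * 1024 )) ;;
    esac
}

soft=$(ulimit -S -s)   # soft limit: what a new process starts with
hard=$(ulimit -H -s)   # hard limit: ceiling the soft limit may be raised to
echo "soft stack limit: $soft -> $(stack_limit_bytes "$soft") bytes"
echo "hard stack limit: $hard -> $(stack_limit_bytes "$hard") bytes"

# strlen (flocp->filenm) + 1 + 11 + 1, with 4096 assumed for the length.
alloca_request=$(( 4096 + 1 + 11 + 1 ))
echo "alloca request for a 4096-byte name: $alloca_request bytes"
```

With a soft limit of 8192 the first line reports 8388608 bytes, so a
request of roughly 4 KiB is small compared with the available stack.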

I don't know if there are special features of ARM which make alloca()
more problematic than other systems, but I've certainly never heard of
any issues like this on ARM.

I suspect this is a red herring.



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-12 Thread Jaak Ristioja
On 11.11.2016 19:41, Jaak Ristioja wrote:
> On 10.11.2016 09:55, Jaak Ristioja wrote:
>> On 09.11.2016 22:58, Paul Smith wrote:
>>> On Wed, 2016-11-09 at 22:42 +0200, Jaak Ristioja wrote:
>>>> I have no ARM experience myself. I don't even know where to look for
>>>> ABI
>>>> documentation. This is the best I can currently get from the core:
>>>>
>>>> (gdb) thread apply all bt full
>>>>
>>>> Thread 1 (LWP 15210):
>>>> #0  0x0d33b0bc in ?? ()
>>>> No symbol table info available.
>>>> #1
>>>> No symbol table info available.
>>>> #2  0x64a2a8b0 in strlen () from /lib/libc.so.6
>>>> No symbol table info available.
>>>> #3  0x0d340370 in concat ()
>>>> No symbol table info available.
>>>> #4  0x0d680d34 in ?? ()
>>>> No symbol table info available.
>>>> Backtrace stopped: previous frame identical to this frame (corrupt
>>>> stack?)
>>>
>>> You won't need any ABI docs.  This is a good first step, but if you can
>>> rebuild GNU make with debugging (-g) and without optimization (-O0) you
>>> will hopefully get a much more interesting and useable core.  I'm
>>> assuming, although I'm not familiar with working on ARM.
>>>
>>> It looks like somewhere in the GNU make code we're passing an invalid
>>> pointer to strlen(); either NULL or pointing to invalid memory of some
>>> kind.
>>
>> After re-compiling make 4.2.1 with "-O0 -pipe -mcpu=cortex-a7
>> -mfpu=neon-vfpv4 -mfloat-abi=hard -ggdb" instead of the regular "-O2
>> -pipe -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard" I got:
>>
>> Thread 1 (LWP 20416):
>> #0  0x0c5cbd74 in child_error (child=0xbf78e700, exit_code=1900259124,
>> exit_sig=-1082595584, coredump=1900259184, ignored=0) at job.c:519
>> #1  0x0c5cbd8e in child_handler (sig=1935828325) at job.c:537
>> #2  0x0008 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>
>> Which looks even more weird. I'm not even sure it's the same crash.
>> Something seriously seems to corrupt the stack in both cases. As far as
>> I can tell, child_handler() does not call child_error() directly or
>> indirectly.
> 
> After examining about 10 more core files, these all point to job.c:519
> and job.c:537, similarly to the above:
> 
>   #0  0x00c2bd74 in child_error (child=0x0, exit_code=0, exit_sig=0,
> coredump=0, ignored=0) at job.c:519
>   pre = 0x0
>   post = 0x0
>   dump = 0x0
>   f = 0x0
>   flocp = 0x0
>   nm = 0x0
>   l = 0
>   #1  0x00c2bd8e in child_handler (sig=0) at job.c:537
>   No locals.
>   #2  0x00c67cc0 in ?? ()
>   No symbol table info available.
>   Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> Any ideas?

Looking at the code, job.c (and other source code files in GNU Make)
uses alloca (3) to allocate memory!? This definitely looks like one
possible source for stack overflows! Quoting `man 3 alloca`:

BUGS:
There is no error indication if the stack frame cannot
be extended. (However, after a failed allocation, the
program is likely to receive a SIGSEGV signal if it
attempts to access the unallocated space.)

Personally, I would never use this function in regular code. In
child_error() it is used to allocate enough space for a filename:

char *a = alloca (strlen (flocp->filenm) + 1 + 11 + 1);

Assuming that flocp->filenm points to a path and not just the name of
the single file, one could easily overflow that stack, IMHO. I'm
guessing that PATH_MAX is 4096 on most Linux systems, while the stack is
8192.

Are you sure this is safe?

J




Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-11 Thread Jaak Ristioja
On 10.11.2016 09:55, Jaak Ristioja wrote:
> On 09.11.2016 22:58, Paul Smith wrote:
>> On Wed, 2016-11-09 at 22:42 +0200, Jaak Ristioja wrote:
>>> I have no ARM experience myself. I don't even know where to look for
>>> ABI
>>> documentation. This is the best I can currently get from the core:
>>>
>>> (gdb) thread apply all bt full
>>>
>>> Thread 1 (LWP 15210):
>>> #0  0x0d33b0bc in ?? ()
>>> No symbol table info available.
>>> #1  
>>> No symbol table info available.
>>> #2  0x64a2a8b0 in strlen () from /lib/libc.so.6
>>> No symbol table info available.
>>> #3  0x0d340370 in concat ()
>>> No symbol table info available.
>>> #4  0x0d680d34 in ?? ()
>>> No symbol table info available.
>>> Backtrace stopped: previous frame identical to this frame (corrupt
>>> stack?)
>>
>> You won't need any ABI docs.  This is a good first step, but if you can
>> rebuild GNU make with debugging (-g) and without optimization (-O0) you
>> will hopefully get a much more interesting and useable core.  I'm
>> assuming, although I'm not familiar with working on ARM.
>>
>> It looks like somewhere in the GNU make code we're passing an invalid
>> pointer to strlen(); either NULL or pointing to invalid memory of some
>> kind.
> 
> After re-compiling make 4.2.1 with "-O0 -pipe -mcpu=cortex-a7
> -mfpu=neon-vfpv4 -mfloat-abi=hard -ggdb" instead of the regular "-O2
> -pipe -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard" I got:
> 
> Thread 1 (LWP 20416):
> #0  0x0c5cbd74 in child_error (child=0xbf78e700, exit_code=1900259124,
> exit_sig=-1082595584, coredump=1900259184, ignored=0) at job.c:519
> #1  0x0c5cbd8e in child_handler (sig=1935828325) at job.c:537
> #2  0x0008 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> Which looks even more weird. I'm not even sure it's the same crash.
> Something seriously seems to corrupt the stack in both cases. As far as
> I can tell, child_handler() does not call child_error() directly or
> indirectly.

After examining about 10 more core files, these all point to job.c:519
and job.c:537, similarly to the above:

  #0  0x00c2bd74 in child_error (child=0x0, exit_code=0, exit_sig=0,
coredump=0, ignored=0) at job.c:519
  pre = 0x0
  post = 0x0
  dump = 0x0
  f = 0x0
  flocp = 0x0
  nm = 0x0
  l = 0
  #1  0x00c2bd8e in child_handler (sig=0) at job.c:537
  No locals.
  #2  0x00c67cc0 in ?? ()
  No symbol table info available.
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Any ideas?
J



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-10 Thread Andreas Schwab
On Nov 09 2016, Jaak Ristioja wrote:

> Is this some known GNU Make bug on ARM?

Works fine here:

https://build.opensuse.org/project/show/openSUSE:Factory:ARM

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Jaak Ristioja
On 09.11.2016 22:58, Paul Smith wrote:
> On Wed, 2016-11-09 at 22:42 +0200, Jaak Ristioja wrote:
>> I have no ARM experience myself. I don't even know where to look for
>> ABI
>> documentation. This is the best I can currently get from the core:
>>
>> (gdb) thread apply all bt full
>>
>> Thread 1 (LWP 15210):
>> #0  0x0d33b0bc in ?? ()
>> No symbol table info available.
>> #1  
>> No symbol table info available.
>> #2  0x64a2a8b0 in strlen () from /lib/libc.so.6
>> No symbol table info available.
>> #3  0x0d340370 in concat ()
>> No symbol table info available.
>> #4  0x0d680d34 in ?? ()
>> No symbol table info available.
>> Backtrace stopped: previous frame identical to this frame (corrupt
>> stack?)
> 
> You won't need any ABI docs.  This is a good first step, but if you can
> rebuild GNU make with debugging (-g) and without optimization (-O0) you
> will hopefully get a much more interesting and useable core.  I'm
> assuming, although I'm not familiar with working on ARM.
> 
> It looks like somewhere in the GNU make code we're passing an invalid
> pointer to strlen(); either NULL or pointing to invalid memory of some
> kind.

After re-compiling make 4.2.1 with "-O0 -pipe -mcpu=cortex-a7
-mfpu=neon-vfpv4 -mfloat-abi=hard -ggdb" instead of the regular "-O2
-pipe -mcpu=cortex-a7 -mfpu=neon-vfpv4 -mfloat-abi=hard" I got:

Thread 1 (LWP 20416):
#0  0x0c5cbd74 in child_error (child=0xbf78e700, exit_code=1900259124,
exit_sig=-1082595584, coredump=1900259184, ignored=0) at job.c:519
#1  0x0c5cbd8e in child_handler (sig=1935828325) at job.c:537
#2  0x0008 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Which looks even more weird. I'm not even sure it's the same crash.
Something seriously seems to corrupt the stack in both cases. As far as
I can tell, child_handler() does not call child_error() directly or
indirectly.

J



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Paul Smith
On Wed, 2016-11-09 at 22:42 +0200, Jaak Ristioja wrote:
> I have no ARM experience myself. I don't even know where to look for
> ABI
> documentation. This is the best I can currently get from the core:
> 
> (gdb) thread apply all bt full
> 
> Thread 1 (LWP 15210):
> #0  0x0d33b0bc in ?? ()
> No symbol table info available.
> #1  
> No symbol table info available.
> #2  0x64a2a8b0 in strlen () from /lib/libc.so.6
> No symbol table info available.
> #3  0x0d340370 in concat ()
> No symbol table info available.
> #4  0x0d680d34 in ?? ()
> No symbol table info available.
> Backtrace stopped: previous frame identical to this frame (corrupt
> stack?)

You won't need any ABI docs.  This is a good first step, but if you can
rebuild GNU make with debugging (-g) and without optimization (-O0) you
will hopefully get a much more interesting and useable core.  I'm
assuming, although I'm not familiar with working on ARM.

It looks like somewhere in the GNU make code we're passing an invalid
pointer to strlen(); either NULL or pointing to invalid memory of some
kind.
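
The rebuild suggested above can be sketched as follows. The helper name
debug_cflags is hypothetical: it drops any existing optimisation level from
a flag string and adds -O0 and -g. Passing CFLAGS on the configure command
line is standard for autoconf-based packages, though a distribution's
package manager may expect the flags to be set elsewhere.

```shell
#!/bin/sh
# Turn an existing CFLAGS string into a debug-friendly one:
# remove every -O<level> flag, then add -O0 and -g.
# The unquoted $1 is split on whitespace on purpose; flags containing
# glob characters are not handled by this sketch.
debug_cflags() {
    out=
    for flag in $1; do
        case $flag in
            -O*) ;;                    # drop the old optimisation level
            *) out="$out $flag" ;;     # keep everything else in order
        esac
    done
    echo "-O0${out} -g"
}

debug_cflags "-O2 -pipe"   # prints: -O0 -pipe -g

# Example rebuild in an unpacked GNU make source tree (not executed here):
#   ./configure CFLAGS="$(debug_cflags "$CFLAGS")" && make
```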



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Jaak Ristioja
On 09.11.2016 21:44, Paul Smith wrote:
> On Wed, 2016-11-09 at 21:31 +0200, Jaak Ristioja wrote:
>> I'm attaching[*] the core and the binaries for 4.2.1, but I don't
>> know how to debug it myself.
> 
> It's unlikely anyone here will be able to help debug a random ARM core
> file (for sure I can't).  At the very least we would need a stacktrace
> from the core to even begin to look at this.
> 
> You, or someone who can reproduce the problem, need to install gdb on
> your system, and run:
> 
>   gdb -c core make
> 
> then run the "bt" command to get a backtrace.  You can choose a frame
> with "fr N" where N is the frame number, and print the value of
> variables with "p VAR", or "p *VAR" if it's a pointer and
> you want to see what it points to.


I have no ARM experience myself. I don't even know where to look for ABI
documentation. This is the best I can currently get from the core:

(gdb) thread apply all bt full

Thread 1 (LWP 15210):
#0  0x0d33b0bc in ?? ()
No symbol table info available.
#1  
No symbol table info available.
#2  0x64a2a8b0 in strlen () from /lib/libc.so.6
No symbol table info available.
#3  0x0d340370 in concat ()
No symbol table info available.
#4  0x0d680d34 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)


J



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Paul Smith
On Wed, 2016-11-09 at 21:31 +0200, Jaak Ristioja wrote:
> I'm attaching[*] the core and the binaries for 4.2.1, but I don't
> know how to debug it myself.

It's unlikely anyone here will be able to help debug a random ARM core
file (for sure I can't).  At the very least we would need a stacktrace
from the core to even begin to look at this.

You, or someone who can reproduce the problem, need to install gdb on
your system, and run:

  gdb -c core make

then run the "bt" command to get a backtrace.  You can choose a frame
with "fr N" where N is the frame number, and print the value of
variables with "p VAR", or "p *VAR" if it's a pointer and
you want to see what it points to.
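
For collecting several core files, the same backtrace can be taken without
an interactive session. This is a sketch under assumptions: the function
name bt_from_core and the example paths are hypothetical, and it relies on
gdb's -batch and -ex options plus the usual "gdb PROGRAM CORE" argument
order.

```shell
#!/bin/sh
# Print a full backtrace of all threads from a core file, then exit.
# $1 = path to the executable that crashed, $2 = path to the core file.
bt_from_core() {
    binary=$1
    core=$2
    gdb -batch -ex "thread apply all bt full" "$binary" "$core"
}

# Example invocation (not executed here):
#   bt_from_core /usr/bin/gmake ./core > backtrace.txt
```

The output is only as good as the symbols in the binary, so a build with
debug information gives far more useful frames.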



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Jaak Ristioja
On 09.11.2016 19:55, Paul Smith wrote:
> On Wed, 2016-11-09 at 19:29 +0200, Jaak Ristioja wrote:
>> GNU Make seems to randomly crash on a Raspberry Pi 2 with
>>
>>    INTERNAL: Exiting with 2 jobserver tokens available; should be 5!
>>
>> or similar when emerging Gentoo Linux packages using multiple jobs
>> (e.g. -j5). The kernel log then has lines like
>>
>>   Segmentation fault occurred at (nil) in
>> /usr/bin/gmake[make:23312]
>> uid/euid:250/250 gid/egid:250/250, parent /bin/bash[sh:23311]
>> uid/euid:250/250 gid/egid:250/250
> 
> Internal errors don't dump core in GNU make.
> 
> I think what is happening is that GNU make is crashing, and that's
> causing it to lose jobserver tokens (if an instance of GNU make owns an
> extra token and crashes, then no one is available to release that token
> again and you'll get an error about mismatched numbers of tokens at the
> end of the build).
> 
> In other words, cause and effect here are backwards.  You'll need to
> figure out why GNU make is throwing a segfault (is there a core file
> you can examine for example): fixing that will likely solve the token
> count issue.
> 
>> Is this some known GNU Make bug on ARM?
> 
> I'm not aware of any such bug on ARM.
> 
> It would be helpful if you mentioned which version of GNU make you're
> using.

# make --version
GNU Make 4.2.1
Built for armv7a-hardfloat-linux-gnueabi

But this also happened with 4.1 and 4.2.

J



Re: INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Paul Smith
On Wed, 2016-11-09 at 19:29 +0200, Jaak Ristioja wrote:
> GNU Make seems to randomly crash on a Raspberry Pi 2 with
> 
>    INTERNAL: Exiting with 2 jobserver tokens available; should be 5!
> 
> or similar when emerging Gentoo Linux packages using multiple jobs
> (e.g. -j5). The kernel log then has lines like
> 
>   Segmentation fault occurred at    (nil) in
> /usr/bin/gmake[make:23312]
> uid/euid:250/250 gid/egid:250/250, parent /bin/bash[sh:23311]
> uid/euid:250/250 gid/egid:250/250

Internal errors don't dump core in GNU make.

I think what is happening is that GNU make is crashing, and that's
causing it to lose jobserver tokens (if an instance of GNU make owns an
extra token and crashes, then no one is available to release that token
again and you'll get an error about mismatched numbers of tokens at the
end of the build).

In other words, cause and effect here are backwards.  You'll need to
figure out why GNU make is throwing a segfault (is there a core file
you can examine for example): fixing that will likely solve the token
count issue.
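
The token accounting described above can be illustrated with a toy model.
This is not GNU make's implementation: make passes tokens through a pipe,
while this sketch represents each token as a file in a temporary directory
purely to stay short and portable. The names pool, acquire and release are
hypothetical, and the final message only imitates the wording of the error
in the subject line.

```shell
#!/bin/sh
# Toy model of jobserver token accounting for "-j5".
pool=$(mktemp -d)
slots=5

# The top-level make keeps one implicit slot; the other slots-1 tokens
# go into the shared pool.
i=1
while [ "$i" -lt "$slots" ]; do
    : > "$pool/token.$i"
    i=$((i + 1))
done

acquire() {   # remove one token from the pool and print its name
    for t in "$pool"/token.*; do
        [ -e "$t" ] || return 1
        rm -f "$t"
        basename "$t"
        return 0
    done
}
release() {   # put a token back into the pool
    : > "$pool/$1"
}

# A well-behaved job returns its token when it finishes.
tok=$(acquire) && release "$tok"

# A job that crashes while holding a token never reaches release.
tok=$(acquire)   # simulated segfault here: the token is lost

available=$(( $(ls "$pool" | wc -l) + 1 ))   # pool plus the implicit slot
if [ "$available" -ne "$slots" ]; then
    echo "INTERNAL: Exiting with $available jobserver tokens available; should be $slots!"
fi
rm -rf "$pool"
```

Each crashed holder lowers the final count by one, which is why the
mismatch is a symptom of the segfault rather than its cause.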

> Is this some known GNU Make bug on ARM?

I'm not aware of any such bug on ARM.

It would be helpful if you mentioned which version of GNU make you're
using.



INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

2016-11-09 Thread Jaak Ristioja
Hello!

GNU Make seems to randomly crash on a Raspberry Pi 2 with

   INTERNAL: Exiting with 2 jobserver tokens available; should be 5!

or similar when emerging Gentoo Linux packages using multiple jobs (e.g.
-j5). The kernel log then has lines like

  Segmentation fault occurred at (nil) in /usr/bin/gmake[make:23312]
uid/euid:250/250 gid/egid:250/250, parent /bin/bash[sh:23311]
uid/euid:250/250 gid/egid:250/250

Is this some known GNU Make bug on ARM?


Best regards,
Jaak Ristioja
