So after having to do some debugging and sysadmin on one of my Xen
machines, I found I wanted a version of emacs installed on it, and not
having one handy I thought I'd try building a static-linked emacs25-nox11.

I have long used static-linked emacs up to and including 23.2 on
NetBSD-5.  Start-up time of the static-linked binary is phenomenally
faster, especially on slower systems, and especially with the x11
version.  Some hacks are needed to make it configure, but nothing more
than fiddling with the order and completeness of "ld -l" parameters.
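
To illustrate why the order and completeness of the "-l" arguments matter for a static link, here is a toy sketch of single-pass archive resolution, roughly the way a traditional linker scans static libraries from left to right. The library and symbol names are hypothetical, and the model is deliberately simplified; it is not how any particular ld is implemented.

```python
def unresolved_after_link(needed, archives):
    """Toy model of a left-to-right static link.

    needed:   symbols the program's own objects leave undefined
    archives: ordered list of archives; each archive is a list of
              (defines, needs) pairs, one pair per archive member
    Returns the set of symbols still undefined after the last archive.
    """
    defined = set()
    undefined = set(needed)
    for members in archives:
        progress = True
        while progress:  # rescan this archive until no new member is pulled in
            progress = False
            for defines, needs in members:
                if defines & undefined:
                    defined |= defines
                    undefined |= needs
                    undefined -= defined
                    progress = True
    return undefined


# Hypothetical libraries: "libgui" needs a symbol that "libnet" provides.
libgui = [({"draw"}, {"connect"})]
libnet = [({"connect"}, set())]

good_order = unresolved_after_link({"draw"}, [libgui, libnet])
bad_order = unresolved_after_link({"draw"}, [libnet, libgui])
print(good_order, bad_order)
```

In the second ordering the provider archive is scanned before anything needs its symbol, so that symbol is left undefined; this is the kind of failure that reordering or repeating "-l" arguments works around.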

Unfortunately emacs25 will no longer build static-linked on more
recent-ish NetBSD/amd64 systems.  I've not tried NetBSD-6 (I'm guessing
it wouldn't work), nor a very recent -current (though I don't expect
recent changes will help either).  I suspect it will work on NetBSD-5,
though I'll have to kick off a build of quite a few things to try it.

This is on NetBSD/amd64-7.99.34, which I think was updated 2016/07/23.

Here is what the core file looks like:

$ file src/bootstrap-emacs lisp/bootstrap-emacs.core
src/bootstrap-emacs:      ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, for NetBSD 7.99.34, PaX: -ASLR, not stripped
lisp/bootstrap-emacs.core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), NetBSD-style, from 'emacs' (signal 11)
$ gdb src/bootstrap-emacs lisp/bootstrap-emacs.core
GNU gdb (GDB) 7.10.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from src/bootstrap-emacs...done.
[New process 1]
Core was generated by `bootstrap-emacs'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pthread__self () at /build/woods/future/current-amd64-destdir/usr/include/amd64/mcontext.h:91
91              __asm volatile("movq %%fs:0, %0" : "=r" (__tmp));
(gdb) bt
#0  pthread__self () at /build/woods/future/current-amd64-destdir/usr/include/amd64/mcontext.h:91
#1  pthread_mutex_lock (ptm=0xba8b00 <__atexit_mutex>) at /building/work/woods/m-NetBSD-current/lib/libpthread/pthread_mutex.c:194
#2  0x0000000000591140 in __cxa_atexit (func=0x5b9d30 <_fini>, arg=0x0, dso=0x0)
    at /building/work/woods/m-NetBSD-current/lib/libc/stdlib/atexit.c:156
#3  0x0000000000400227 in ___start ()
#4  0x0000000000000000 in ?? ()
(gdb) up
#1  pthread_mutex_lock (ptm=0xba8b00 <__atexit_mutex>) at /building/work/woods/m-NetBSD-current/lib/libpthread/pthread_mutex.c:194
194             self = pthread__self();
(gdb) up
#2  0x0000000000591140 in __cxa_atexit (func=0x5b9d30 <_fini>, arg=0x0, dso=0x0)
    at /building/work/woods/m-NetBSD-current/lib/libc/stdlib/atexit.c:156
156             mutex_lock(&__atexit_mutex);
(gdb) print __atexit_mutex
$1 = {ptm_magic = 858980355, ptm_errorcheck = 0 '\000', ptm_pad1 = "\000\000", {ptm_ceiling = 0 '\000', ptm_unused = 0 '\000'},
  ptm_pad2 = "\000\000", ptm_owner = 0x2, ptm_waiters = 0x0, ptm_recursed = 0, ptm_spare2 = 0x0}
(gdb) down
#1  pthread_mutex_lock (ptm=0xba8b00 <__atexit_mutex>) at /building/work/woods/m-NetBSD-current/lib/libpthread/pthread_mutex.c:194
194             self = pthread__self();
(gdb) down
#0  pthread__self () at /build/woods/future/current-amd64-destdir/usr/include/amd64/mcontext.h:91
91              __asm volatile("movq %%fs:0, %0" : "=r" (__tmp));
(gdb) info registers 
rax            0x0      0
rbx            0x0      0
rcx            0x82a840 8562752
rdx            0x0      0
rsi            0x0      0
rdi            0xba8b00 12225280
rbp            0x5b9d30 0x5b9d30 <_fini>
rsp            0x7f7fffffc0b8   0x7f7fffffc0b8
r8             0x101010101010101        72340172838076673
r9             0x8080808080808080       -9187201950435737472
r10            0x1      1
r11            0x202    514
r12            0x0      0
r13            0x1      1
r14            0x701da5a1a808   123272635197448
r15            0x1      1
rip            0x54ffaa 0x54ffaa <pthread_mutex_lock+10>
eflags         0x10246  [ PF ZF IF RF ]
cs             0xe033   57395
ss             0xe02b   57387
ds             0x82003f 8519743
es             0xffff003f       -65473
fs             0x0      0
gs             0x0      0
(gdb) 


I no longer understand the differences between temacs and the "dumped"
version, and I don't see why ___start() should behave differently in
each, yet something different is clearly happening.  Since the
failure is still in ___start(), I can see that _libc_init() must have
been called, and indeed the __atexit_mutex object does appear to have
been initialised (or at least it contains the right _PT_MUTEX_MAGIC
value, and has otherwise been zeroed out except for ptm_owner), so how
or why would mutex_lock() fail?
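
As a quick sanity check on the magic value mentioned above, the decimal ptm_magic that gdb printed can be converted to hex and compared with the expected constant. This is only a sketch: the constant 0x33330003 is an assumption, taken from what NetBSD's libpthread headers are believed to define as _PT_MUTEX_MAGIC, and should be confirmed against the source tree actually in use.

```python
# Assumption: _PT_MUTEX_MAGIC as believed to be defined in NetBSD's
# libpthread headers; verify against the tree that built this libc.
PT_MUTEX_MAGIC = 0x33330003

# Value copied from the "print __atexit_mutex" output in the gdb session.
ptm_magic = 858980355

magic_hex = hex(ptm_magic)
print(magic_hex, ptm_magic == PT_MUTEX_MAGIC)
```

If the two agree, the mutex structure itself at least looks initialised, which fits the observation that the fault happens in pthread__self() rather than in the mutex code proper.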

I suppose this could be a bug in the NetBSD support for emacs unexec()
since a normal static-linked binary created by 'ld' works.  Indeed the
initial "temacs" binary runs successfully and creates this unexec()'ed
binary.

Are there any patches to anything that I could try to make this work?

BTW, the latest emacs from git.sv.gnu.org fails in a similar place, but
in a different way on NetBSD-5/amd64 (where static-linked emacs-23.3
works fine):

$ file src/bootstrap-emacs lisp/bootstrap-emacs.core
src/bootstrap-emacs:       ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, for NetBSD 5.2, not stripped
lisp/bootstrap-emacs.core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), NetBSD-style, from 'bootstrap-emacs' (signal 11)
$ gdb src/bootstrap-emacs lisp/bootstrap-emacs.core
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64--netbsd"...
Core was generated by `bootstrap-emacs'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000064939f in choose_arena_hard () at /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c:1545
1545    /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c: No such file or directory.
        in /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c
(gdb) bt
#0  0x000000000064939f in choose_arena_hard () at /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c:1545
#1  0x000000000064a50d in imalloc (size=16) at /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c:1572
#2  0x000000000064ab4b in malloc (size=0) at /once/rest/work/woods/m-NetBSD-5/lib/libc/stdlib/jemalloc.c:3700
#3  0x0000000000633f5c in _pthread_atfork (prepare=0, parent=0, child=0x62e210 <pthread__fork_callback>)
    at /once/rest/work/woods/m-NetBSD-5/lib/libc/gen/pthread_atfork.c:119
#4  0x000000000062e11d in pthread__init () at /once/rest/work/woods/m-NetBSD-5/lib/libpthread/pthread.c:234
#5  0x000000000065e163 in __libc_init () at /once/rest/work/woods/m-NetBSD-5/lib/libc/misc/initfini.c:58
#6  0x000000000068845b in __do_global_ctors_aux ()
#7  0x000000000040015e in _init ()
#8  0x00007f7fffffffe0 in ?? ()
#9  0x000000000040024a in ___start (argc=9, argv=0x7f7fffffce20, envp=<value optimized out>, cleanup=0, obj=0x400180,
    ps_strings=0x7f7fffffffe0) at /once/rest/work/woods/m-NetBSD-5/lib/csu/x86_64/crt0.c:92
#10 0x0000000000000009 in ?? ()
#11 0x00007f7ffffff268 in ?? ()
#12 0x00007f7ffffff27f in ?? ()
#13 0x00007f7ffffff286 in ?? ()
#14 0x00007f7ffffff295 in ?? ()
#15 0x00007f7ffffff2a4 in ?? ()
#16 0x00007f7ffffff2ab in ?? ()
#17 0x00007f7ffffff2c6 in ?? ()
#18 0x00007f7ffffff2c9 in ?? ()
#19 0x00007f7ffffff2dc in ?? ()
#20 0x0000000000000000 in ?? ()
(gdb) 

That seems even more mysterious to me, though perhaps there's a good
explanation somehow.  (E.g. don't malloc() in a dumped binary because
that's already been done?  Or maybe the dumped heap is inconsistent?)

-- 
                                        Greg A. Woods <gwo...@acm.org>

+1 250 762-7675                           RoboHack <wo...@robohack.ca>
Planix, Inc. <wo...@planix.com>     Avoncote Farms <wo...@avoncote.ca>
