I believe that the files in WRKSRC/prebuilt/32-bit-big-endian are
broken: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=26854

The diff below adds a post-patch target that moves the prebuilt files
out of the way, so the build ignores them.  This fixes the build for
me, but the build is slow: it takes about 24 hours on my G4 at 666 MHz.

On Sun, 8 Dec 2019 13:42:38 -0500
George Koehler <kern...@gmail.com> wrote:

> ...  Some code
> might put bad pointers in program objects.  I modified guile to look for
> such code.  I added a global "scm_t_uint32 aaa;" and added some checks
> like "aaa = *pointer".  One such check crashed at vm-engine.c:1654
> "make-closure":
> 
>       UNPACK_24 (op, dst);
>       offset = ip[1];
>       UNPACK_24 (ip[2], nfree);
> 
>       // FIXME: Assert range of nfree?
>       SYNC_IP ();
>       closure = scm_inline_words (thread, scm_tc7_program | (nfree << 16),
>                                   nfree + 2);
>       aaa = *(ip + offset);
>       SCM_SET_CELL_WORD_1 (closure, ip + offset);
>       // FIXME: Elide these initializations?
>       for (n = 0; n < nfree; n++)
>         SCM_PROGRAM_FREE_VARIABLE_SET (closure, n, SCM_BOOL_F);
>       SP_SET (dst, closure);
>       NEXT (3);
> 
> (gdb) print ip   
> $12 = (scm_t_uint32 *) 0xcf1ea3b8
> (gdb) print offset
> $13 = -1005191168
> (gdb) print *(ip + offset)
> Cannot access memory at address 0xdf76a3b8
> (gdb) print ip[1]
> Cannot access memory at address 0xcf1ea3bc
> 
> I can't read ip[1] in the core dump, but the program did read ip[1] in
> "offset = ip[1];" before the crash.  The call to scm_inline_words(), to
> allocate the scm_tc7_program object, seems to have also freed the memory
> where ip points.  This might be a problem with the garbage collector.

The failure to read ip[1] was a red herring.  Before the crash, `ip`
pointed into a file mapped with mmap(2).  ktrace(1) showed that the
file was somewhere under prebuilt/32-bit-big-endian.  The mapping is
absent from the core dump, so GDB can't read it.

`offset` = -1005191168 is 0xc4160000 as an unsigned 32-bit value.  This
looks like the wrong byte order.  The correct value might be
0x000016c4 = 5828.  That would make more sense, if ip + offset should
point inside the file!
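
The arithmetic above can be checked with a small C sketch.  The helper
names swap32() and target_address() are mine, not guile's, and the
sketch assumes 32-bit pointers and 4-byte instruction words, as on
powerpc:

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t
swap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00) |
	    ((x << 8) & 0x00ff0000) | (x << 24);
}

/*
 * Address of "ip + offset" when ip is a 32-bit pointer to 4-byte
 * words.  The sum wraps modulo 2^32, like pointer arithmetic in a
 * 32-bit process.  (Hypothetical helper, for illustration only.)
 */
uint32_t
target_address(uint32_t ip, int32_t offset)
{
	return ip + (uint32_t)offset * 4;
}
```

swap32(0xc4160000) gives 0x000016c4.  target_address(0xcf1ea3b8,
-1005191168) gives 0xdf76a3b8, the address that GDB could not access;
with the swapped offset 5828 it gives 0xcf1efec8, only 23312 bytes
past `ip`.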

modules/system/vm/assembler.scm can byte-swap values when it emits
bytecode for a different-endian machine.  If a little-endian machine
wrote the prebuilt/32-bit-big-endian files, and assembler.scm forgot
to swap `offset`, then it would cause this bug.
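
To illustrate the suspected failure (this is not how assembler.scm is
written; it is Scheme, and emit_u32() and read_u32() are hypothetical
names), here is a C sketch of writing a 32-bit word in a chosen byte
order and reading it back in another:

```c
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Store value into buf[0..3] in the given byte order. */
void
emit_u32(uint8_t *buf, uint32_t value, enum byte_order order)
{
	int i;

	for (i = 0; i < 4; i++) {
		/* Big-endian puts the most significant byte first. */
		int shift = (order == ORDER_BIG) ? 24 - 8 * i : 8 * i;
		buf[i] = (uint8_t)(value >> shift);
	}
}

/* Load a 32-bit word from buf[0..3], as a machine of that order would. */
uint32_t
read_u32(const uint8_t *buf, enum byte_order order)
{
	uint32_t value = 0;
	int i;

	for (i = 0; i < 4; i++) {
		int shift = (order == ORDER_BIG) ? 24 - 8 * i : 8 * i;
		value |= (uint32_t)buf[i] << shift;
	}
	return value;
}
```

If 5828 is emitted with ORDER_LITTLE (the swap was forgotten) and then
read with ORDER_BIG, the result is 0xc4160000, the bad `offset` seen in
GDB.  Emitted with ORDER_BIG, it reads back as 5828.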

powerpc might be the only 32-bit-big-endian arch where OpenBSD builds
packages.  mips64 and sparc64 might be 64-bit-big-endian (but there is
no prebuilt/64-bit-big-endian, so those arches would bootstrap without
prebuilt files), and the other arches might be *-little-endian.

With no prebuilt files, the build ran some slow "bootstrap" commands on
my 666 MHz CPU.  (The MPC7447A in my PowerBook G4 can run at 1333 MHz
using apmd(8) and apm -A, but I left it at 666 MHz.)  The first
bootstrap command took more than 100 minutes.  The second command took
just over 4 hours.  The next commands continued overnight, and the whole
build might have taken almost 24 hours.  The build passes most tests:

SKIP: test-pthread-create-secondary
FAIL: test-stack-overflow
FAIL: test-out-of-memory
==================================
2 of 38 tests failed
(1 test was not run)

Here's the diff.  I didn't set REVISION because powerpc had no package,
and I guess that other arches would ignore prebuilt/32-bit-big-endian.
  --George

Index: Makefile
===================================================================
RCS file: /cvs/ports/lang/guile2/Makefile,v
retrieving revision 1.23
diff -u -p -r1.23 Makefile
--- Makefile    16 Jul 2019 21:29:41 -0000      1.23
+++ Makefile    12 Dec 2019 01:02:07 -0000
@@ -3,8 +3,6 @@
 # When updating, check that x11/gnome/aisleriot MODGNOME_CPPFLAGS references the
 # proper guile2 includes directory
 
-BROKEN-powerpc=                Segmentation fault (core dumped)
-
 COMMENT=               GNU's Ubiquitous Intelligent Language for Extension
 # '
 
@@ -51,6 +49,10 @@ CONFIGURE_ARGS=              --program-suffix=${V}
 # Needed because otherwise regress tests won't build:
 # warning: format '%ji' expects type 'intmax_t', but argument 4 has type 'scm_t_intmax'
 CONFIGURE_ARGS +=      --disable-error-on-warning
+
+# powerpc: Prevent "Segmentation fault (core dumped)" during build.
+post-patch:
+       mv ${WRKSRC}/prebuilt/32-bit-big-endian{,-broken}
 
 post-install:
        install -d ${PREFIX}/share/guile/site/${V}/
