Re: Issues with debugging GC-related crashes #2

Matthias Klumpp via Digitalmars-d Thu, 19 Apr 2018 17:15:49 -0700

On Thursday, 19 April 2018 at 18:45:41 UTC, kinke wrote:

On Thursday, 19 April 2018 at 17:01:48 UTC, Matthias Klumpp wrote:
Something that maybe is relevant though: I occasionally get the following SIGABRT crash in the tool on machines which have the SIGSEGV crash:
```
Thread 53 "appstream-gener" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fdfe98d4700 (LWP 7326)]
0x00007ffff5040428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff5040428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff504202a in __GI_abort () at abort.c:89
#2  0x0000000000780ae0 in core.thread.Fiber.allocStack(ulong,ulong) (this=0x7fde0758a680, guardPageSize=4096, sz=20480) at src/core/thread.d:4606
#3  0x00000000007807fc in _D4core6thread5Fiber6__ctorMFNbDFZvmmZCQBlQBjQBf (this=0x7fde0758a680, guardPageSize=4096, sz=16384, dg=...)
    at src/core/thread.d:4134
#4  0x00000000006f9b31 in _D3std11concurrency__T9GeneratorTAyaZQp6__ctorMFDFZvZCQCaQBz__TQBpTQBiZQBx (this=0x7fde0758a680, dg=...) at /home/ubuntu/dtc/dmd/generated/linux/debug/64/../../../../../druntime/import/core/thread.d:4126
#5  0x00000000006e9467 in _D5asgen8handlers11iconhandler5Theme21matchingIconFilenamesMFAyaSQCl5utils9ImageSizebZC3std11concurrency__T9GeneratorTQCfZQp (this=0x7fdea2747800, relaxedScalingRules=true, size=..., iname=...) at ../src/asgen/handlers/iconhandler.d:196
#6  0x00000000006ea75a in _D5asgen8handlers11iconhandler11IconHandler21possibleIconFilenamesMFAyaSQCs5utils9ImageSizebZ9__lambda4MFZv (this=0x7fde0752bd00)
    at ../src/asgen/handlers/iconhandler.d:392
#7  0x000000000082fdfa in core.thread.Fiber.run() (this=0x7fde07528580) at src/core/thread.d:4436
#8  0x000000000082fd5d in fiber_entryPoint () at src/core/thread.d:3665
#9  0x0000000000000000 in  ()
```
You probably already figured that the new Fiber seems to be allocating its 16 KB stack, with an additional 4 KB guard page at its bottom, via a 20 KB mmap() call. The abort seems to be triggered by mprotect() returning -1, i.e., a failure to disallow all access to the guard page; so checking `errno` should help.
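
For illustration, the allocation pattern described above can be sketched in C roughly as follows. This is not druntime's actual code: the function name `alloc_guarded_stack` is hypothetical and the mmap() flags are assumptions. It only shows a single mapping whose lowest pages are turned into a guard page, with `errno` reported when mprotect() fails.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper (not from druntime): map stack_size + guard_size
 * bytes and make the lowest guard_size bytes inaccessible.
 * Returns the base of the mapping, or NULL on failure with errno set. */
void *alloc_guarded_stack(size_t stack_size, size_t guard_size)
{
    size_t total = stack_size + guard_size;

    /* Assumed flags: a private, anonymous, read/write mapping. */
    void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        fprintf(stderr, "mmap(%zu) failed: errno=%d (%s)\n",
                total, errno, strerror(errno));
        return NULL;
    }

    /* Disallow all access to the guard page at the bottom of the mapping. */
    if (mprotect(base, guard_size, PROT_NONE) != 0) {
        int saved = errno; /* keep errno across fprintf/munmap */
        fprintf(stderr, "mprotect(%p, %zu) failed: errno=%d (%s)\n",
                base, guard_size, saved, strerror(saved));
        munmap(base, total);
        errno = saved;
        return NULL;
    }
    return base;
}
```

Calling `alloc_guarded_stack(16384, 4096)` mirrors the sizes in the backtrace (sz=16384, guardPageSize=4096, 20480 bytes in total); the printed `errno` is the value that would tell why the guard page could not be set up.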

Yup, I did that already, it just took a really long time to run because when I made the change to print errno I also enabled detailed GC profiling (via the PRINTF* debug options). Enabling the INVARIANT option for the GC is completely broken, by the way: I forced it to compile by casting to shared, with the result that the GC locks up forever at the start of the program.

Anyway, I think for a change I actually produced some useful information via the GC debug options:

Given the following crash:
```
#0  0x00000000007f1d94 in _D2gc4impl12conservativeQw3Gcx4markMFNbNlPvQcZv (this=..., ptop=0x7fdfce7fc010, pbot=0x7fdfcdbfc010)
    at src/gc/impl/conservative/gc.d:1990
        p1 = 0x7fdfcdbfc010
        p2 = 0x7fdfce7fc010
        stackPos = 0
[...]
```

The scanned range seemed fairly odd to me, so I searched for it in the (very verbose!) GC debug output, which yielded:

```
235.244445: 0xc4f090.Gcx::addRange(0x8264230, 0x8264270)
235.244460: 0xc4f090.Gcx::addRange(0x7fdfcdbfc010, 0x7fdfce7fc010)
235.253861: 0xc4f090.Gcx::addRange(0x8264300, 0x8264340)
235.253873: 0xc4f090.Gcx::addRange(0x8264390, 0x82643d0)
```
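
To make the oddity concrete: the range at 0x7fdfcdbfc010 spans 0xC00000 bytes (12 MiB), while its neighbours in the log span 0x40 bytes (64 bytes) each. A small C sketch for pulling such outliers out of the verbose log could look like the following. It assumes the line format shown above; the function names `parse_add_range` and `report_large_ranges` are hypothetical helpers, not part of any GC tooling.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the two addresses from a log line of the
 * form "...Gcx::addRange(0xBOT, 0xTOP)". Returns 1 on success, 0 otherwise. */
int parse_add_range(const char *line,
                    unsigned long long *bot, unsigned long long *top)
{
    const char *p = strstr(line, "addRange(");
    if (p == NULL)
        return 0;
    /* %llx accepts an optional 0x prefix. */
    return sscanf(p, "addRange(%llx, %llx)", bot, top) == 2;
}

/* Print every addRange entry in the stream whose size is at least
 * threshold bytes. */
void report_large_ranges(FILE *in, unsigned long long threshold)
{
    char line[512];
    while (fgets(line, sizeof line, in) != NULL) {
        unsigned long long bot, top;
        if (parse_add_range(line, &bot, &top)
                && top > bot && top - bot >= threshold) {
            printf("0x%llx..0x%llx: %llu bytes\n", bot, top, top - bot);
        }
    }
}
```

Running `report_large_ranges` over the log with a threshold of, say, one megabyte would list only the suspicious entries, which narrows down where to look for the matching addRange call.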

So, something is calling addRange explicitly there, causing the GC to scan a range that it shouldn't scan. Since my code doesn't add ranges to the GC, and the code generated by girtod/GtkD looks fine to me, I am currently looking into EMSI containers[1] as the possible culprit. That library being the issue would also make perfect sense, because this issue started to appear with such frequency only after containers were added (there was a GC-related crash before, but that might have been a different one).


So, I will look into that addRange call next.

[1]: https://github.com/dlang-community/containers
