On Jul 2 15:25, Ken Brown wrote:
> On 7/2/2015 8:20 AM, Corinna Vinschen wrote:
> >On Jul 2 14:13, Corinna Vinschen wrote:
> >>On Jul 1 22:10, Ken Brown wrote:
> >>>I may have spoken too soon. As I repeat the experiment on a different
> >>>computer, with a build from a slightly different snapshot of the emacs
> >>>trunk, emacs crashes when I type 'C-x d' with the following stack dump:
> >>>
> >>>Stack trace:
> >>>Frame Function Args
> >>>00100A3E240 00180071CC3 (00000829630, 000008296D0, 00000000000,
> >>>0000082CE00)
> >>>00030000002 001800732BE (00000000000, 00000000002, 00100A48C80,
> >>>00000000002)
> >>>00000000000 00000006B40 (00000000002, 00100A48C80, 00000000002,
> >>>00100A48768)
> >>>00000000000 21000000003 (00000000002, 00100A48C80, 00000000002,
> >>>00100A48768)
> >>>End of stack trace
> >>>
> >>>$ addr2line 00180071CC3 -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exception.h:175
> >>>
> >>>$ addr2line 001800732BE -e /usr/lib/debug/usr/bin/cygwin1.dbg
> >>>/usr/src/debug/cygwin-2.1.0-0.3/winsup/cygwin/exceptions.cc:1639
> >>
> >>That points to a crash while setting up the alternate stack. This is
> >>always a possibility because, in contrast to the kernel signal handler
> >>in a real POSIX system, the Cygwin exception handler is still running on
> >>the stack which triggered the crash up to the point where we call the
> >>signal handler function. Dependent on how the stack overflow occurred,
> >>this additional stack usage may be enough to kill the process for good.
> >>
> >>Out of curiosity, can you add this to the init_sigsegv() function:
> >>
> >> #include <windows.h>
> >> [...]
> >> init_sigsegv (void)
> >> {
> >> [...]
> >> SetThreadStackGuarantee (65536);
> >
> >Of course this only works "per thread", so if init_sigsegv is called
> >for the main thread, only the main thread gets this treatment. For
> >testing this should be enough, though.
>
> That didn't make any difference.

It should have. If you don't also tweak STACK_DANGER_ZONE accordingly,
handle_sigsegv should fail to call siglongjmp. Either way, I tested it
locally as well, and it doesn't work.

In the meantime I found that there's another problem. Assuming you
longjmp out of handle_sigsegv, the stack will still be "broken". It
doesn't have the usual guard pages anymore, and the next time you have
a stack overflow, NTDLL will simply terminate the process.

I create a wrapper function which resets the stack so it has valid guard
pages again and then the stack overflow can be handled repeatedly.

While I was at it, I found that the setup for pthread stacks is not
quite right, either, so right now I'm hacking on this stuff to make it
behave as expected in the usual cases.

> But I do have a little more information.
>
> I tried running emacs under gdb with a breakpoint at handle_sigsegv. The
> breakpoint is hit when I deliberately trigger the stack overflow. Then I
> continue, emacs says it has recovered from the stack overflow, and I type
> 'C-x d'. At this point there's a second SIGSEGV and handle_sigsegv is
> called again. But this time garbage collection is in progress, and
> handle_sigsegv just gives up.

Sounds right to me.

> I don't know what caused the second SIGSEGV but I'll try to figure that out
> when I next have a chance to look at this. I also don't know why the stack
> dump pointed to a crash while setting up the alternate stack, since the
> fatal crash actually seems to have happened later. But maybe the stack was
> just completely messed up after the second SIGSEGV and the stack dump can't
> be trusted.
>
> More later.

Thanks!

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
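
For reference, the POSIX pattern this thread is about (a SIGSEGV handler
that runs on an alternate stack and escapes with siglongjmp) can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the Cygwin or Emacs code: the names on_segv, recurse,
recovery_point and recover_from_overflow are hypothetical, and it relies
on the kernel delivering the signal on the alternate stack, which is
exactly the part that Cygwin has to emulate on top of Windows.

```c
#define _XOPEN_SOURCE 700   /* expose sigaction, sigaltstack, sigsetjmp */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
static sigjmp_buf recovery_point;
static volatile sig_atomic_t overflow_seen;

/* Runs on the alternate stack (SA_ONSTACK), so it still has room to
   execute although the normal stack is exhausted. */
static void on_segv(int signo)
{
    (void)signo;
    overflow_seen = 1;
    siglongjmp(recovery_point, 1);
}

/* Deep recursion that exhausts the stack long before depth reaches 0.
   Each frame hands its buffer to the next one, so the compiler cannot
   turn the recursion into a loop. */
static unsigned long long recurse(unsigned long long depth,
                                  volatile char *prev)
{
    volatile char pad[1024];

    pad[0] = prev ? prev[0] : 0;
    if (depth == 0)
        return 0;
    return recurse(depth - 1, pad) + (unsigned long long)pad[0];
}

/* Returns 1 if a stack overflow was caught and recovered from,
   0 if no overflow happened, -1 if the setup failed. */
int recover_from_overflow(void)
{
    static char alt_stack[64 * 1024];   /* size is an arbitrary choice */
    stack_t ss;
    struct sigaction sa;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sa.sa_flags = SA_ONSTACK;           /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)  /* some systems raise SIGBUS */
        return -1;

    /* savemask = 1: siglongjmp restores the signal mask, so SIGSEGV is
       not left blocked after leaving the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        recurse(1ULL << 40, NULL);
        return 0;
    }
    return overflow_seen ? 1 : 0;
}
```

On a system where the kernel maintains the stack guard itself, nothing
more is needed after the siglongjmp. On Windows the guard pages are
consumed by the overflow, which is why the stack has to be reset before
a second overflow can be handled.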

