Hi,

On 2019-10-03 08:23:49 -0700, Andres Freund wrote:
> On 2019-10-03 08:18:42 -0700, Andres Freund wrote:
> > This is around where an error is thrown:
> > -- badly formatted interval
> > INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted interval');
> > -ERROR: invalid input syntax for type interval: "badly formatted interval"
> > -LINE 1: INSERT INTO INTERVAL_TBL (f1) VALUES ('badly formatted inter...
> > - ^
> >
> > and the error is stack related. So I suspect that setjmp/longjmp might
> > be to blame here, and somehow don't save/restore the stack into a proper
> > state. I don't know enough about mingw/msys/windows to know whether that
> > uses a self-written setjmp or relies on the MS implementation.
> >
> > If you could gather a backtrace it might help us. It's possible that the
> > stack is "just" misaligned or something, we had problems with that
> > before (IIRC valgrind didn't always align stacks correctly for processes
> > that forked from within a signal handler, which then crashed when using
> > instructions with alignment requirements, but only sometimes, because
> > the stack could be aligned).
>
> It seems we're not the only ones hitting this:
> https://rt.perl.org/Public/Bug/Display.html?id=133603
>
> Doesn't look like they've really narrowed it down that much yet.

A few notes:
* As an experiment, it could be worthwhile to try to redefine
sigsetjmp/longjmp/sigjmp_buf in terms of what
https://gcc.gnu.org/onlinedocs/gcc/Nonlocal-Gotos.html
provides. That's apparently a separate implementation from the MS CRT one.
* Arguably
"Do not use longjmp to transfer control from a callback routine
invoked directly or indirectly by Windows code."
and
"Do not use longjmp to transfer control out of an interrupt-handling
routine unless the interrupt is caused by a floating-point
exception. In this case, a program may return from an interrupt
handler via longjmp if it first reinitializes the floating-point math
package by calling _fpreset."
from
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/longjmp?view=vs-2019
might be violated by our signal emulation on Windows. But I've
not looked into that in detail.
* Any chance you could get the pre-processed source for postgres.c or
such? I'm kinda wondering if the definition of setjmp() that we get
includes the returns_twice attribute that gcc wants to see, and
whether we're picking up the mingw version of longjmp, or the windows
one.
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/844cb490ab2cc32ac3df5914700564b2e40739d8/tree/mingw-w64-headers/crt/setjmp.h#l31
* It's certainly curious that the failures so far have only happened as
part of pg_upgradeCheck, rather than the plain regression tests.
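
To make the first note a bit more concrete, here is a rough sketch of what
such an experiment could look like. It is untested on mingw, and the
pg_builtin_* names and the little catch/throw pair are made up purely for
illustration. It assumes gcc's __builtin_setjmp/__builtin_longjmp as
described on the Nonlocal-Gotos page: a buffer of five pointer-sized words,
and a jump value that has to be 1. As far as I can tell the builtins don't
save or restore a signal mask, and the function doing the longjmp must not
be the one that did the setjmp, hence the noinline.

```c
/* Hypothetical names, for illustration only; not existing PostgreSQL code. */
typedef void *pg_builtin_jmp_buf[5];    /* gcc documents a five-word buffer */

#define pg_builtin_setjmp(buf)   __builtin_setjmp(buf)
#define pg_builtin_longjmp(buf)  __builtin_longjmp((buf), 1)  /* value must be 1 */

static pg_builtin_jmp_buf error_context;

/*
 * Stand-in for an error being thrown.  Kept out of line because the
 * builtin longjmp may not be used in the function that called the
 * builtin setjmp.
 */
static __attribute__((noinline)) void
throw_error(void)
{
    pg_builtin_longjmp(error_context);
}

/* Returns 1 if control came back via the longjmp, 0 otherwise. */
int
catch_error(void)
{
    if (pg_builtin_setjmp(error_context) == 0)
    {
        /* direct return from setjmp: run the code that may "throw" */
        throw_error();
        return 0;               /* not reached */
    }

    /* second return from setjmp, reached through the longjmp */
    return 1;
}
```

If the stack-related crash goes away with something like this wired in
instead of the CRT's setjmp/longjmp, that would point fairly strongly at the
CRT implementation (or at how we end up calling it).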

Greetings,
Andres Freund