Re: gs_group_1 crashing on 13beta2/s390x

2020-10-16 Thread Christoph Berg
Re: Andres Freund > I had a successful check-world run with maximum jittery on s390x. But I > did hit the issue in different places than you did, so it'd be cool if > you could re-enable JIT for s390x - I think you have a package tracking > HEAD? Cool, thanks! I'm tracking PG14 head with apt.post

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-15 Thread Andres Freund
Hi, On 2020-10-15 17:12:54 -0700, Andres Freund wrote: > On 2020-10-15 15:37:01 -0700, Andres Freund wrote: > It's a bug that was fixed in LLVM 4, but too late to be backported to > 3.9. > > The easiest seems to be to just use a wrapper function that does the > necessary pre-checks. Something lik

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-15 Thread Andres Freund
Hi, On 2020-10-15 15:37:01 -0700, Andres Freund wrote: > On 2020-10-15 15:29:24 -0700, Andres Freund wrote: > > Pushed now to 11-master. > > Ugh - there's a failure with an old LLVM version (3.9): > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-10-15%2022%3A24%3A04 > >

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-15 Thread Andres Freund
Hi, On 2020-10-15 15:29:24 -0700, Andres Freund wrote: > Pushed now to 11-master. Ugh - there's a failure with an old LLVM version (3.9): https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-10-15%2022%3A24%3A04 Need to rebuild that locally to reproduce. Greetings, Andres F

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-15 Thread Andres Freund
Hi, On 2020-10-15 01:32:46 -0700, Andres Freund wrote: > I have a fix for this, but I've just stared at s390 assembly code for > ~10h, never having done so before. So that'll have to wait for tomorrow. > > It's quite possible that that fix would also help on other > architectures... Pushed now to

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-15 Thread Andres Freund
Hi, On 2020-10-14 17:56:16 -0700, Andres Freund wrote: > Oh dear. It's not as simple as that. The issue indeed are relocations, > but we don't hit those errors. The issue rather is that the systemz > specific relative redirection code thought that the only relative > symbols are functions. So it c

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-14 Thread Andres Freund
Hi, On 2020-10-14 14:58:35 -0700, Andres Freund wrote: > I suspect that building with LDFLAGS="-Wl,-z,relro -Wl,-z,now" - which > is what I think the debian package does - creates the types of > relocations that LLVM doesn't handle for elf + s390. > > 10 release branch: > > void RuntimeDyldELF::

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-14 Thread Andres Freund
Hi, Christoph helped me to get access to an s390x machine - I wasn't able to reproduce exactly the error he hit. Initially all tests passed, but after recompiling with build flags more similar to Christoph's I was able to hit another instance of what I assume to be the same bug. I am fairly sure th

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-13 Thread Christoph Berg
Re: Andres Freund > > clang-10 [!alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 > > !m68k !powerpc !riscv64 !s390x !sh4 !sparc64 !x32], > > !powerpc doesn't exclude ppc64, I assume? That's direct matches only, there's no architecture-family logic in there. > > [*] apparently pgbenc

Re: gs_group_1 crashing on 13beta2/s390x

2020-10-13 Thread Andres Freund
Hi, On 2020-09-28 14:22:01 +0200, Christoph Berg wrote: > Re: Andres Freund > > > > Ok, but given that Debian is currently targeting 22 architectures, I > > > > doubt the PostgreSQL buildfarm covers all of them with the extra JIT > > > > option, so I should probably make sure to do that here whe

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-28 Thread Christoph Berg
Re: Andres Freund > > > Ok, but given that Debian is currently targeting 22 architectures, I > > > doubt the PostgreSQL buildfarm covers all of them with the extra JIT > > > option, so I should probably make sure to do that here when running tests. > > > > +1. I rather doubt our farm is running

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Andres Freund
Hi, On 2020-09-25 14:11:46 -0400, Tom Lane wrote: > Christoph Berg writes: > > Ok, but given that Debian is currently targeting 22 architectures, I doubt > > the PostgreSQL buildfarm covers all of them with the extra JIT option, so I > > should probably make sure to do that here when running te

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Andres Freund
Hi, On 2020-09-25 19:05:52 +0200, Christoph Berg wrote: > On 25 September 2020 18:42:04 CEST, Andres Freund wrote: > >> * jit is not exercised enough by "make installcheck" > > > >So far we've exercised it more widely by setting up machines that use > >it > >for all queries (by setting the confi

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Tom Lane
Christoph Berg writes: > Ok, but given that Debian is currently targeting 22 architectures, I doubt > the PostgreSQL buildfarm covers all of them with the extra JIT option, so I > should probably make sure to do that here when running tests. +1. I rather doubt our farm is running this type of

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Ranier Vilela
On Fri, Sep 25, 2020 at 14:36, Ranier Vilela wrote: > On Fri, Sep 25, 2020 at 11:30, Christoph Berg > wrote: > >> Re: Tom Lane >> > > Tom> It's hardly surprising that datumCopy would segfault when given >> a >> > > Tom> null "value" and told it is pass-by-reference. However

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Ranier Vilela
On Fri, Sep 25, 2020 at 11:30, Christoph Berg wrote: > Re: Tom Lane > > > Tom> It's hardly surprising that datumCopy would segfault when given a > > > Tom> null "value" and told it is pass-by-reference. However, to get to > > > Tom> the datumCopy call, we must have passed the MemoryC

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Christoph Berg
On 25 September 2020 18:42:04 CEST, Andres Freund wrote: >> * jit is not exercised enough by "make installcheck" > >So far we've exercised it more widely by setting up machines that use >it >for all queries (by setting the config option). I'm doubtful it's worth >doing differently. Ok, but given

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Andres Freund
Hi, On 2020-09-25 17:29:07 +0200, Christoph Berg wrote: > I guess that suggests two things: > * jit is not ready for prime time on s390x and I should disable it I don't know how good LLVM's support for s390x JITing is, and given that it's unrealistic for people to get access to s390x... > * jit

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Christoph Berg
Re: To Tom Lane > I poked around with the SET in the offending tests, and the crash is > only present if `set jit_above_cost = 0;` is present. Removing that > makes it pass. Removing work_mem or enable_hashagg does not make a > difference. llvm version is 10.0.1. I put jit_above_cost=0 into postgr

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Christoph Berg
I poked around with the SET in the offending tests, and the crash is only present if `set jit_above_cost = 0;` is present. Removing that makes it pass. Removing work_mem or enable_hashagg does not make a difference. llvm version is 10.0.1. Test file: -- -- Compare results between plans using sor
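
For readers following along, here is a minimal sketch of forcing JIT compilation for every query in a session, which is the effect of `set jit_above_cost = 0;` in the test above. The two additional cost thresholds and the sample query are assumptions added for illustration and are not taken from the thread; the sketch also assumes a server built with LLVM support.

```sql
-- Enable JIT and drop the planner cost threshold to zero, so that any
-- query with a non-zero estimated cost is JIT-compiled.
SET jit = on;
SET jit_above_cost = 0;

-- Assumption (not from the thread): also force inlining and the
-- optimization passes, to exercise more of the LLVM code paths.
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;

-- Hypothetical sample query; when JIT was actually used, the EXPLAIN
-- ANALYZE output includes a "JIT:" section.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM generate_series(1, 1000) AS g(i) WHERE i % 2 = 0;
```

Putting the same settings into postgresql.conf applies them to all sessions, which is how a full regression run can be made to go through JIT.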

Re: gs_group_1 crashing on 13beta2/s390x

2020-09-25 Thread Christoph Berg
Re: Tom Lane > > Tom> It's hardly surprising that datumCopy would segfault when given a > > Tom> null "value" and told it is pass-by-reference. However, to get to > > Tom> the datumCopy call, we must have passed the MemoryContextContains > > Tom> check on that very same pointer value, and that

Re: gs_group_1 crashing on 13beta2/s390x

2020-07-16 Thread Tom Lane
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> It's hardly surprising that datumCopy would segfault when given a > Tom> null "value" and told it is pass-by-reference. However, to get to > Tom> the datumCopy call, we must have passed the MemoryContextContains > Tom> check on that ver

Re: gs_group_1 crashing on 13beta2/s390x

2020-07-16 Thread Christoph Berg
Re: Tom Lane > Given the apparently-can't-happen situation at the call site, > and the fact that we're not seeing similar failures reported > elsewhere (and note that every line shown above is at least > five years old), I'm kind of forced to the conclusion that this > is a compiler bug. Does adju

Re: gs_group_1 crashing on 13beta2/s390x

2020-07-15 Thread Andrew Gierth
> "Tom" == Tom Lane writes: Tom> It's hardly surprising that datumCopy would segfault when given a Tom> null "value" and told it is pass-by-reference. However, to get to Tom> the datumCopy call, we must have passed the MemoryContextContains Tom> check on that very same pointer value, and

Re: gs_group_1 crashing on 13beta2/s390x

2020-07-15 Thread Tom Lane
Christoph Berg writes: >> On the Debian s390x buildd, the 13beta2 build is crashing: > I wired gdb into the build process and got this backtrace: > #0 datumCopy (typByVal=false, typLen=-1, value=0) at > ./build/../src/backend/utils/adt/datum.c:142 > vl = 0x0 > res = >

Re: gs_group_1 crashing on 13beta2/s390x

2020-07-15 Thread Christoph Berg
Re: To PostgreSQL Hackers > On the Debian s390x buildd, the 13beta2 build is crashing: > > 2020-07-15 01:19:59.149 UTC [859] LOG: server process (PID 1415) was > terminated by signal 11: Segmentation fault > 2020-07-15 01:19:59.149 UTC [859] DETAIL: Failed process was running: create > table g