Re: segfault in geqo on experimental gcc animal

2019-11-18 Thread Fabien COELHO



Hello Martin,


> The issue is resolved now and tests are fine for me.


I recompiled gcc trunk and the moonjelly is back to green.

Thanks!

--
Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-18 Thread Martin Liška
Hello.

The issue is resolved now and tests are fine for me.

Martin

On Fri, 15 Nov 2019 at 13:11, Martin Liška  wrote:
>
> Heh, it's me who now breaks postgresql build:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92529
>
> Martin
>
> On Fri, 15 Nov 2019 at 13:01, Fabien COELHO
>  wrote:
> >
> >
> > > Yes, after the revision I see other failing tests like:
> >
> > Indeed, I can confirm there are still 18/195 fails with the updated gcc.
> >
> > > I'm going to investigate that and will inform you guys.
> >
> > Great, thanks!
> >
> > --
> > Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-15 Thread Martin Liška
Heh, it's me who now breaks postgresql build:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92529

Martin

On Fri, 15 Nov 2019 at 13:01, Fabien COELHO
 wrote:
>
>
> > Yes, after the revision I see other failing tests like:
>
> Indeed, I can confirm there are still 18/195 fails with the updated gcc.
>
> > I'm going to investigate that and will inform you guys.
>
> Great, thanks!
>
> --
> Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-15 Thread Martin Liška
Yes, after the revision I see other failing tests like:
...
 select_having... ok   16 ms
 subselect... FAILED   92 ms
 union... FAILED   77 ms
 case ... ok   32 ms
 join ... FAILED  239 ms
 aggregates   ... FAILED  136 ms
 transactions ... ok   59 ms
...

I'm going to investigate that and will inform you guys.

Martin

On Fri, 15 Nov 2019 at 11:56, Fabien COELHO  wrote:
>
>
> > Yep, I periodically build the PostgreSQL package in openSUSE with the
> > latest GCC, which is how I identified the problem and isolated it to a
> > simple test case. I would expect a fix today or tomorrow.
>
> Indeed, the gcc issue reported seems fixed by gcc r278259. I'm updating
> moonjelly gcc to check if this solves pg compilation woes.
>
> --
> Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-15 Thread Fabien COELHO




> Yes, after the revision I see other failing tests like:

Indeed, I can confirm there are still 18/195 fails with the updated gcc.

> I'm going to investigate that and will inform you guys.


Great, thanks!

--
Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-14 Thread Fabien COELHO



> Yep, I periodically build the PostgreSQL package in openSUSE with the
> latest GCC, which is how I identified the problem and isolated it to a
> simple test case. I would expect a fix today or tomorrow.


Indeed, the gcc issue reported seems fixed by gcc r278259. I'm updating 
moonjelly gcc to check if this solves pg compilation woes.


--
Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-14 Thread Martin Liška
Hi.

Yep, I periodically build the PostgreSQL package in openSUSE with the
latest GCC, which is how I identified the problem and isolated it to a
simple test case. I would expect a fix today or tomorrow.

See you,
Martin

On Thu, 14 Nov 2019 at 16:46, Fabien COELHO  wrote:
>
>
> Hello,
>
> I did a (slow) bisection of the gcc sources, which determined that gcc
> r277979 was the culprit. I then started a bug report, which showed that
> the issue had already been reported this morning by Martin Liška,
> including a nice example isolated from the sources. See:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92506
>
> --
> Fabien.




Re: segfault in geqo on experimental gcc animal

2019-11-14 Thread Fabien COELHO


Hello,

I did a (slow) bisection of the gcc sources, which determined that gcc
r277979 was the culprit. I then started a bug report, which showed that
the issue had already been reported this morning by Martin Liška,
including a nice example isolated from the sources. See:


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92506

--
Fabien.
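
The search over gcc revisions described in the message above is a binary
search between a known-good and a known-bad revision. A minimal sketch in
Python, assuming revisions are plain integers and that every revision from
the culprit onwards is broken. The names first_bad_revision and is_bad and
the range endpoints 277700 and 278200 are hypothetical; in practice is_bad
would build gcc at that revision and run the PostgreSQL regression tests,
which is what makes the search slow.

```python
from typing import Callable, List


def first_bad_revision(good: int, bad: int, is_bad: Callable[[int], bool]) -> int:
    """Return the first failing revision in (good, bad].

    Assumes `good` passes, `bad` fails, and there is a single
    pass-to-fail transition somewhere in between.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # the transition is at mid or earlier
        else:
            good = mid   # the transition is after mid
    return bad


# Stand-in predicate for illustration only: pretend r277979 (the revision
# named in the thread) introduced the breakage. The endpoints are made up.
probed: List[int] = []


def is_bad(revision: int) -> bool:
    probed.append(revision)
    return revision >= 277979


found = first_bad_revision(277700, 278200, is_bad)
print(found, "found after", len(probed), "probes")
```

Each probe halves the candidate range, so a range of 500 revisions needs
at most 9 builds.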

Re: segfault in geqo on experimental gcc animal

2019-11-13 Thread Fabien COELHO



> > so it sure looks like a gcc upgrade caused the failure. But it's not
> > clear whether it's a compiler bug, or some undefined behaviour that
> > triggers the bug.
> >
> > Fabien, any chance to either bisect or get a bit more information on
> > the backtrace?
>
> There is a promising "keep_error_builds" option in buildfarm settings,
> but it does not seem to be used anywhere in the scripts. Well, I can
> probably relaunch by hand.
>
> However, given the experimental nature of the setup, I think that the
> most probable cause is a newly introduced gcc bug, so I'd suggest
> waiting to see whether the issue persists before spending time on that
> and, if it does, investigating further in order to report a bug to
> either gcc or pg, as appropriate.
>
> Also, I'll recompile gcc before the next weekly builds.


I did some manual testing.

All the versions I tested failed miserably (master, 12, 11, 10, 9.6…).
There is a high probability that it is a newly introduced gcc bug; however,
pg is not a nice self-contained test case to submit to gcc for debugging :-(


I suggest ignoring it for the time being; if the problem persists, I'll
try to identify which gcc commit caused the regression.


--
Fabien.

Re: segfault in geqo on experimental gcc animal

2019-11-10 Thread Fabien COELHO



Hello Andres,


> I don't think there's been any relevant code changes since the last
> success.
>
> last success:
> 2019-11-09 09:20:28.346 CET [28785:1] LOG:  starting PostgreSQL 13devel on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.0.0 20191102 (experimental),
> 64-bit
>
> first failure:
> 2019-11-09 11:19:36.277 CET [42512:1] LOG:  starting PostgreSQL 13devel on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.0.0 20191109 (experimental),
> 64-bit
>
> so it sure looks like a gcc upgrade caused the failure. But it's not
> clear whether it's a compiler bug, or some undefined behaviour that
> triggers the bug.
>
> Fabien, any chance to either bisect or get a bit more information on the
> backtrace?


There is a promising "keep_error_builds" option in buildfarm settings, but 
it does not seem to be used anywhere in the scripts. Well, I can probably 
relaunch by hand.


However, given the experimental nature of the setup, I think that the most
probable cause is a newly introduced gcc bug, so I'd suggest waiting to see
whether the issue persists before spending time on that and, if it does,
investigating further in order to report a bug to either gcc or pg, as
appropriate.


Also, I'll recompile gcc before the next weekly builds.

--
Fabien.




segfault in geqo on experimental gcc animal

2019-11-09 Thread Andres Freund
Hi,

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=moonjelly&dt=2019-11-09%2010%3A17%3A06

shows a failure, including a backtrace:

==-=-== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==-=-==
[New LWP 42902]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: fabien regression [local] SELECT    '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x006d962b in gimme_tour (root=root@entry=0x1cfb4b0, edge_table=edge_table@entry=0x1d3afc0, new_gene=<optimized out>, num_gene=5) at geqo_erx.c:209
209	remove_gene(root, new_gene[i - 1], edge_table[(int) new_gene[i - 1]], edge_table);
#0  0x006d962b in gimme_tour (root=root@entry=0x1cfb4b0, edge_table=edge_table@entry=0x1d3afc0, new_gene=<optimized out>, num_gene=5) at geqo_erx.c:209
#1  0x006da0a8 in geqo (root=0x1cfb4b0, number_of_rels=<optimized out>, initial_rels=<optimized out>) at geqo_main.c:190
#2  0x006de084 in make_one_rel (root=root@entry=0x1cfb4b0, joinlist=joinlist@entry=0x1d0a868) at allpaths.c:227
#3  0x00701d19 in query_planner (root=root@entry=0x1cfb4b0, qp_callback=qp_callback@entry=0x702300 , qp_extra=qp_extra@entry=0x7ffd46b55a60) at planmain.c:269
#4  0x00706844 in grouping_planner () at planner.c:2054
#5  0x007093c7 in subquery_planner (glob=glob@entry=0x1cfb418, parse=parse@entry=0x1cd77b8, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1014
#6  0x0070a803 in standard_planner (parse=0x1cd77b8, cursorOptions=256, boundParams=<optimized out>) at planner.c:406
#7  0x007cb1dc in pg_plan_query (querytree=0x1cd77b8, cursorOptions=256, boundParams=0x0) at postgres.c:873
#8  0x007cb2be in pg_plan_queries (querytrees=0x1cfb3c0, cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0) at postgres.c:963
#9  0x007cb618 in exec_simple_query () at postgres.c:1154
#10 0x007cd384 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1c23058, dbname=<optimized out>, username=<optimized out>) at postgres.c:4278
#11 0x0074b574 in BackendRun (port=0x1c1c650) at postmaster.c:4498
#12 BackendStartup (port=0x1c1c650) at postmaster.c:4189
#13 ServerLoop () at postmaster.c:1727
#14 0x0074c34d in PostmasterMain (argc=argc@entry=8, argv=argv@entry=0x1bf35b0) at postmaster.c:1400
#15 0x00491f41 in main (argc=8, argv=0x1bf35b0) at main.c:210
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad = {30650304, 
-12, 0 }, _kill = {si_pid = 30650304, si_uid = 4294967284}, 
_timer = {si_tid = 30650304, si_overrun = -12, si_sigval = {sival_int = 0, 
sival_ptr = 0x0}}, _rt = {si_pid = 30650304, si_uid = 4294967284, si_sigval = 
{sival_int = 0, sival_ptr = 0x0}}, _sigchld = {si_pid = 30650304, si_uid = 
4294967284, si_status = 0, si_utime = 0, si_stime = 0}, _sigfault = {si_addr = 
0xfff401d3afc0, _addr_lsb = 0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, 
_sigpoll = {si_band = -51508957248, si_fd = 0}}}
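
A reading aid for the siginfo dump above, sketched in Python. gdb prints
the same bytes of the siginfo union through every member, so on this
x86_64 target the 64-bit fault address can be rebuilt from the two 32-bit
halves shown as si_pid and si_uid. The helper name to_signed is made up
for this sketch; all constants are copied from the dump and from frame #0.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# Values copied from the gdb output above.
si_pid = 30650304        # first 32-bit word of the union
si_uid = 4294967284      # second 32-bit word of the union
si_band = -51508957248   # the same 8 bytes viewed as a signed 64-bit integer
edge_table = 0x1D3AFC0   # edge_table argument of gimme_tour in frame #0

# Little-endian layout: si_pid is the low half, si_uid the high half.
fault_addr = (si_uid << 32) | si_pid

print(hex(fault_addr))                    # 0xfffffff401d3afc0
print(to_signed(si_uid, 32))              # -12
print(to_signed(fault_addr, 64) == si_band)
print((fault_addr & 0xFFFFFFFF) == edge_table)
```

The low half of the fault address equals the edge_table pointer from
frame #0, and the high half is -12 when read as a signed 32-bit integer.
That is only an observation about the numbers, not a diagnosis.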

I don't think there's been any relevant code changes since the last
success.

last success:
2019-11-09 09:20:28.346 CET [28785:1] LOG:  starting PostgreSQL 13devel on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.0.0 20191102 (experimental), 
64-bit

first failure:
2019-11-09 11:19:36.277 CET [42512:1] LOG:  starting PostgreSQL 13devel on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.0.0 20191109 (experimental), 
64-bit


so it sure looks like a gcc upgrade caused the failure. But it's not
clear whether it's a compiler bug, or some undefined behaviour that
triggers the bug.

Fabien, any chance to either bisect or get a bit more information on the
backtrace?


Greetings,

Andres Freund