[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #28 from Jakub Jelinek  ---
(In reply to Dmitrii Pasechnik from comment #26)
> We have megabytes of code calling libraries where setjmp/longjmp is used,
> in github.com/sagemath/sage/ (most of it in Cython).
> 
> It looks like a huge hassle to put volatiles there. 
> Looks like we would need ways to disable particular optimisations which lead
> to these sorts of errors:-(

You mean turn off all optimizations then?  -O0.
Even that doesn't guarantee it will work if some such variables are declared
with register keyword.

It isn't just POSIX which says this, e.g. C99 also says:
"All accessible objects have values, and all other components of the abstract
machine have state, as of the time the longjmp function was called, except that
the values of objects of automatic storage duration that are local to the
function containing the invocation of the corresponding setjmp macro that do
not have volatile-qualified type and have been changed between the setjmp
invocation and longjmp call are indeterminate."
C89 said pretty much the same.
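
To make the quoted rule concrete, here is a minimal sketch. The names `count_attempts` and `fail` are made up for illustration and do not come from the sage sources. The counter is modified between the setjmp() invocation and the longjmp() call, so it is declared volatile.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical helper: unwinds back to the setjmp() in count_attempts(). */
static void fail(void) {
    longjmp(env, 1);
}

/* Retries three times and returns the number of attempts made. */
int count_attempts(void) {
    /* Changed between setjmp() and longjmp(), so it must be volatile;
       without the qualifier its value after the jump is indeterminate. */
    volatile int attempts = 0;

    if (setjmp(env) != 0) {
        if (attempts >= 3)
            return attempts;
    }

    attempts++;
    fail();     /* does not return */
    return -1;  /* not reached */
}
```

Dropping the volatile qualifier here would make the read of `attempts` after the jump fall under the "indeterminate" clause, whatever a particular compiler happens to do with it.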

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56512
         Resolution|---                         |INVALID

--- Comment #27 from Andrew Pinski  ---
https://pubs.opengroup.org/onlinepubs/7908799/xsh/longjmp.html


The requirement of using volatile has been there a long time before your code
was written ...

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread dima.pasechnik at cs dot ox.ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #26 from Dmitrii Pasechnik  ---
We have megabytes of code calling libraries where setjmp/longjmp is used,
in github.com/sagemath/sage/ (most of it in Cython).

It looks like a huge hassle to put volatiles there. 
Looks like we would need ways to disable particular optimisations which lead to
these sorts of errors:-(

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #25 from Sergei Trofimovich  ---
(In reply to Richard Biener from comment #24)
> (In reply to Sergei Trofimovich from comment #23)
> [...]
> > Why did `gcc` generate unconditional NULL dereference here? I suspect it
> > somehow inferred that `__pyx_t_6 = NULL;` in that branch, but not before
> > comparison.
> 
> That's what happens if we isolate an unreachable path because of a NULL
> dereference (like if exposed by jump-threading).  We make the NULL
> dereference volatile so it stays but DCE/DSE can cleanup code on the path
> leading to it.
> 
> If you run into such a path then this might suggest that jump-threading
> triggered a problem with the setjmp/longjmp, so it's then likely some
> condition that's
> evaluated in a wrong way after the longjmp, either because a dependent
> value wasn't properly preserved or by GCC breaking that.  Seeing stack memory
> arguments used on a call in a previous comment

Yeah, that makes sense. Having stared a bit more at
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() I
think I get the problem now. We deal with the code similar to the following:

__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() {
    __pyx_t_6 = NULL;

    // the loop is not very important, but it forces `__pyx_t_6`
    // initialization before `setjmp()`
    for (;;) {
        __pyx_t_6 = something();
        int done = use_pre_setjmp(__pyx_t_6);
        __pyx_t_6 = NULL;
        if (done) break;
    }

    int mode = setjmp(jb);

    switch (mode) {
      case 1: // longjmp() case
        break;
      case 0: // regular case (`case 3:` in real code)
        __pyx_t_6 = something_else(); // set __pyx_t_6 to non-zero
        int done = use_post_setjmp(__pyx_t_6); // call longjmp(jb, 1) here
        __pyx_t_6 = NULL;
        break;
    }

    // get here via longjmp()
    if (__pyx_t_6 != NULL) deref(__pyx_t_6);
}

AFAIU `gcc` is smart enough to see that all paths to `deref()` reach it with
`__pyx_t_6 = NULL`, but it does not eliminate the `deref()` entirely
and uses `if (__pyx_t_6 != NULL) deref(NULL);` as Richard explained above.

Now due to `longjmp()` `__pyx_t_6 = NULL;` does not get executed (even
though it's present in assembly code as `movq $0, -200(%rbp)` in all the
places where it's present in C code).

As a result after the `longjmp()` `__pyx_t_6` is not `NULL` and we get
to `deref(NULL)` and SIGSEGV.

Thus it's a matter of missing `volatile __pyx_t_6`. Sounds about right?
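
A runnable sketch of the same shape with the qualifier added may help. All names (`call_with_cleanup`, `release`, `object`) are hypothetical stand-ins, not the Cython-generated code. Note the qualifier placement: the pointer variable itself is what changes between setjmp() and longjmp(), so it is `int *volatile`, not `volatile int *`.

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf jb;
static int released;  /* counts cleanup calls, for illustration only */

static void release(int *p) { (void)p; released++; }

/* Hypothetical stand-in for the call that ends in longjmp(jb, 1). */
static void use_post_setjmp(int *p) { (void)p; longjmp(jb, 1); }

int call_with_cleanup(void) {
    static int object = 42;
    /* The pointer itself is volatile ("int *volatile"), not the pointee
       ("volatile int *"): the local variable is what gets modified
       between setjmp() and longjmp(). */
    int *volatile tmp = NULL;

    released = 0;
    if (setjmp(jb) == 0) {
        tmp = &object;         /* modified after setjmp() */
        use_post_setjmp(tmp);  /* jumps away; the reset below is skipped */
        tmp = NULL;
    }

    /* Reached via longjmp(): tmp still holds &object, so this check
       has to remain a real check. */
    if (tmp != NULL)
        release(tmp);
    return released;
}
```

With the volatile qualifier the compiler may no longer assume that `tmp` is NULL at the final check, which is the assumption that produced the isolated `deref(NULL)` path described above.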

> I wondered if POSIX suggests
> that even non-register variables need to be made volatile and thus whether
> SRA or FRE might impose problems with code using setjmp/longjmp.

That matches my understanding as well. Would it be fair to say that sprinkling
`volatile` has to be done for every single local variable in the function to
prevent possible stack reuse? And that rule should extend to the functions that
could host an inline variant of the function using setjmp()/longjmp() and not
just the immediate caller of setjmp()/longjmp()?
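
One possible way to keep the number of affected locals small, offered only as a sketch and not as something proposed in this thread, is to confine setjmp() to a tiny wrapper function that has no locals of its own changing after the call. The rule quoted from the standard only covers objects local to the function containing the setjmp invocation. The names `run_guarded`, `guarded_fail` and `struct job` are invented for this example, and the single global jmp_buf makes it non-reentrant.

```c
#include <setjmp.h>

static jmp_buf guard_env;

/* Hypothetical error exit used by code running under run_guarded(). */
void guarded_fail(void) { longjmp(guard_env, 1); }

/* The only function that calls setjmp(). None of its locals change
   after setjmp(), so nothing here needs volatile. Returns 0 on normal
   completion and 1 if body() bailed out through guarded_fail(). */
int run_guarded(void (*body)(void *), void *arg) {
    if (setjmp(guard_env) != 0)
        return 1;
    body(arg);
    return 0;
}

struct job { int steps_done; };

static void do_job(void *p) {
    struct job *j = p;
    j->steps_done = 1;
    j->steps_done = 2;
    guarded_fail();
    j->steps_done = 3;  /* not reached */
}

int guarded_demo(void) {
    /* Local to guarded_demo(), not to the function containing setjmp(),
       and only modified through a pointer. */
    struct job j = { 0 };
    int rc = run_guarded(do_job, &j);
    return rc * 10 + j.steps_done;
}
```

Whether this helps for the inlining question depends on the compiler; my understanding, stated as an assumption, is that GCC refuses to inline functions that call setjmp(), which would keep the wrapper's frame separate.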

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #24 from Richard Biener  ---
(In reply to Sergei Trofimovich from comment #23)
[...]
> Why did `gcc` generate unconditional NULL dereference here? I suspect it
> somehow inferred that `__pyx_t_6 = NULL;` in that branch, but not before
> comparison.

That's what happens if we isolate an unreachable path because of a NULL
dereference (like if exposed by jump-threading).  We make the NULL
dereference volatile so it stays but DCE/DSE can cleanup code on the path
leading to it.

If you run into such a path then this might suggest that jump-threading triggered
a problem with the setjmp/longjmp, so it's then likely some condition that's
evaluated in a wrong way after the longjmp, either because a dependent
value wasn't properly preserved or by GCC breaking that.  Seeing stack memory
arguments used on a call in a previous comment I wondered if POSIX suggests
that even non-register variables need to be made volatile and thus whether
SRA or FRE might impose problems with code using setjmp/longjmp.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-08 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #23 from Sergei Trofimovich  ---
At the SIGSEGV site the code is an unconditional NULL dereference: it
dereferences the result of `xor %esi,%esi`. From `gdb`:

797 if (op != _Py_NULL) {
   0x7f940c871563 <+2563>:  cmpq   $0x0,-0xc8(%rbp)
   0x7f940c87156b <+2571>:  je 0x7f940c871583
<__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__+2595>

242 return _Py_CAST(PY_INT32_T, op->ob_refcnt) < 0;
   0x7f940c87156d <+2573>:  xor    %esi,%esi
=> 0x7f940c87156f <+2575>:  mov    (%rsi),%rax

In `element-verbose.S` it is:

# /usr/include/python3.12/object.h:797: if (op != _Py_NULL) {
.loc 5 797 8 is_stmt 0 view .LVU65876
cmpq $0, -200(%rbp) #, %sfp
je .L12727 #,
.loc 5 798 9 is_stmt 1 view .LVU65877
.LVL15705:
.LBB49946:
.LBI49946:
.loc 5 696 37 view .LVU65878
.LBB49947:
.loc 5 700 5 view .LVU65879
.LBB49948:
.LBI49948:
.loc 5 239 36 view .LVU65880
.LBB49949:
.loc 5 242 5 view .LVU65881
# /usr/include/python3.12/object.h:242: return _Py_CAST(PY_INT32_T, op->ob_refcnt) < 0;
.loc 5 242 12 is_stmt 0 view .LVU65882
xorl %esi, %esi # r
movq (%rsi), %rax # __pyx_t_6_208(ab)->D.11083.ob_refcnt, _991

For comparison, other sites in `element-verbose.S` do load and use the contents
of `-0xc8(%rbp)` (that is, `-200(%rbp)`):

# /usr/include/python3.12/object.h:797: if (op != _Py_NULL) {
.loc 5 797 8 is_stmt 0 view .LVU66162
cmpq $0, -200(%rbp) #, %sfp
je .L12782 #,
.loc 5 798 9 is_stmt 1 view .LVU66163
.LVL15760:
.LBB50093:
.LBI50093:
.loc 5 696 37 view .LVU66164
.LBB50094:
.loc 5 700 5 view .LVU66165
.LBB50095:
.LBI50095:
.loc 5 239 36 view .LVU66166
.LBB50096:
.loc 5 242 5 view .LVU66167
# /usr/include/python3.12/object.h:242: return _Py_CAST(PY_INT32_T, op->ob_refcnt) < 0;
.loc 5 242 12 is_stmt 0 view .LVU66168
movq -200(%rbp), %rdx # %sfp, r
movq (%rdx), %rax # __pyx_t_6_10(ab)->D.11083.ob_refcnt, _1070

Thus my guess is that something clobbered `-200(%rbp)` value across
setjmp()/longjmp().

Trying to trace:

$ gdb -p `pgrep sage-ipython`
(gdb) break __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(gdb) continue

# trigger break with `libgap.AbelianGroup(0,0,0)`

(gdb) disassemble
Dump of assembler code for function
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__:
=> 0x7f4ed9981b60 <+0>: push   %rbp
   0x7f4ed9981b61 <+1>: mov    %rsp,%rbp

# Populating `%rbp`:

(gdb) nexti
(gdb) nexti
(gdb) disassemble
Dump of assembler code for function
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__:
   0x7f4ed9981b60 <+0>: push   %rbp
   0x7f4ed9981b61 <+1>: mov    %rsp,%rbp
=> 0x7f4ed9981b64 <+4>: push   %r15

(gdb) print $rbp-200
$2 = (void *) 0x7ffd2824c5e8

(gdb) watch *(int*)(void *) 0x7ffd2824c5e8
Hardware watchpoint 2: *(int*)(void *) 0x7ffd2824c5e8

(gdb) continue
Continuing.

Thread 1 "sage-ipython" hit Hardware watchpoint 2: *(int*)(void *)
0x7ffd2824c5e8

Old value = 673498624
New value = 0
0x7f98e609d2a8 in
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__ (
__pyx_v_self=__pyx_v_self@entry=0x7f98dfe70dc0,
__pyx_v_args=__pyx_v_args@entry=(, ,
))
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26192
26192   __pyx_t_6 = NULL;

NULL store.

(gdb) continue
Continuing.

Thread 1 "sage-ipython" hit Hardware watchpoint 2: *(int*)(void *)
0x7ffd2824c5e8

Old value = 0
New value = -538669696
__Pyx_GetItemInt_List_Fast (wraparound=0, boundscheck=1, i=2,
o=[,
,
])
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:38070
38070   Py_INCREF(r);

Create an object?

(gdb) continue
Continuing.

Thread 1 "sage-ipython" received signal SIGABRT, Aborted.
0x7f99428617a7 in __GI_kill () at ../sysdeps/unix/syscall-template.S:120
120 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Abort.

(gdb) continue
Continuing.

Thread 1 "sage-ipython" received signal SIGSEGV, Segmentation fault.
0x7f98e609c56f in _Py_IsImmortal (op=0x0) at
/usr/include/python3.12/object.h:242
242 return _Py_CAST(PY_INT32_T, op->ob_refcnt) < 0;

SIGSEGV.

Note that both memory references happen before longjmp() (the ABORT).

Why did `gcc` generate unconditional NULL dereference here? I suspect it
somehow inferred that `__pyx_t_6 = NULL;` in that branch, but not before
comparison.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-07 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #22 from Sergei Trofimovich  ---
Trying again to catch a more precise place for the SIGABRT.

Beginning at the start of the possibly aborting function:

(gdb) break __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(gdb) continue

Running step by step to get the rough location:

(gdb) continue # many times

(gdb)
26459 __pyx_t_6 = __Pyx_GetItemInt_List(__pyx_v_a, 2, long, 1,
__Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_6)) __PYX_ERR(0, 2528,
__pyx_L14_error)
(gdb)
26469 __pyx_v_result =
GAP_CallFunc3Args(__pyx_v_self->__pyx_base.value, ((struct
__pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_5)->value, ((struct
__pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_4)->value, ((struct
__pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_6)->value);
(gdb)

Thread 1 "sage-ipython" received signal SIGABRT, Aborted.
0x7f73e4a617a7 in __GI_kill () at ../sysdeps/unix/syscall-template.S:120
120 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Don't know why backtrace does not point at element.c:26459. It is this snippet:

  /* "sage/libs/gap/element.pyx":2525
 *(a[1]).value)
 * elif n == 3:
 * result = GAP_CallFunc3Args(self.value, #
<<
 *(a[0]).value,
 *(a[1]).value,
 */
  __pyx_v_result = GAP_CallFunc3Args(__pyx_v_self->__pyx_base.value,
((struct __pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_5)->value,
((struct __pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_4)->value,
((struct __pyx_obj_4sage_4libs_3gap_7element_GapElement *)__pyx_t_6)->value);

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-07 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #21 from Sergei Trofimovich  ---
Good point! I wonder if I'm looking at the backtrace too late (or at the wrong
one). I'll retry again this evening and will extract more context.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #20 from Jakub Jelinek  ---
(In reply to Sergei Trofimovich from comment #18)
> > 2) ideally show a gdb session with the important events, which setjmp was 
> > it (I see _setjmp and __sigsetjmp calls in the function), which exact 
> > function called from the function ended up aborting/doing longjmp in the 
> > signal handler and where is the crash

Strange.  If it is the _setjmp call on line 26315 followed by kill SIGABRT from
sig_error(); on the same line, then I don't see any local vars modified in
between, except that int t on line 26315 inside of sig_GAP_Enter macro.
There is then __sigsetjmp on line 26324, but no further kill calls and I think
this isn't in a loop.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread dima.pasechnik at cs dot ox.ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #19 from Dmitrii Pasechnik  ---
Declaring the last argument in the call to GAP_CallFunc3Args() volatile appears
to fix the issue. Namely, apply

diff --git a/src/sage/libs/gap/element.pyx b/src/sage/libs/gap/element.pyx
index f1482997b8..7ca4a666ab 100644
--- a/src/sage/libs/gap/element.pyx
+++ b/src/sage/libs/gap/element.pyx
@@ -2504,6 +2504,7 @@ cdef class GapElement_Function(GapElement):
 cdef Obj result = NULL
 cdef Obj arg_list
 cdef int n = len(args)
+cdef volatile Obj v2

 if n > 0 and n <= 3:
 libgap = self.parent()
@@ -2522,10 +2523,11 @@ cdef class GapElement_Function(GapElement):
(a[0]).value,
(a[1]).value)
 elif n == 3:
+v2 = (a[2]).value
 result = GAP_CallFunc3Args(self.value,
(a[0]).value,
(a[1]).value,
-   (a[2]).value)
+   v2)
 else:
 arg_list = make_gap_list(args)
 result = GAP_CallFuncList(self.value, arg_list)

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #18 from Sergei Trofimovich  ---
> 2) ideally show a gdb session with the important events, which setjmp was it 
> (I see _setjmp and __sigsetjmp calls in the function), which exact function 
> called from the function ended up aborting/doing longjmp in the signal 
> handler and where is the crash

# gdb --quiet -p 1180766

Attaching to a running `sage` interactive process.
In sage repl typing:

  libgap.AbelianGroup(0,0,0)

Breakpoint happens. SIGABRT (immediate longjmp trigger) backtrace:

Thread 1 "sage-ipython" received signal SIGABRT, Aborted.
0x7f53f8e617a7 in __GI_kill () at ../sysdeps/unix/syscall-template.S:120
120 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x7f53f8e617a7 in __GI_kill () at
../sysdeps/unix/syscall-template.S:120
#1  0x7f539c581edf in sig_error () at
/usr/lib/python3.12/site-packages/cysignals/macros.h:298
#2  __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(__pyx_v_self=__pyx_v_self@entry=0x7f539635bc40,
__pyx_v_args=__pyx_v_args@entry=(, ,
))
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26315
#3  0x7f539c5834e7 in
__pyx_pw_4sage_4libs_3gap_7element_19GapElement_Function_3__call__
(__pyx_v_self=,
__pyx_args=(,
,
), __pyx_kwds=)
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26105
#4  0x7f53f916496b in _PyObject_MakeTpCall (tstate=0x7f53f9670d08
<_PyRuntime+459656>,
callable=callable@entry=,
args=args@entry=0x7f53f96b5480, nargs=3, keywords=0x0) at
Objects/call.c:240
...

SIGSEGV backtrace (for completeness):

Thread 1 "sage-ipython" received signal SIGSEGV, Segmentation fault.
0x7f539c58256f in _Py_IsImmortal (op=0x0) at
/usr/include/python3.12/object.h:242
242 return _Py_CAST(PY_INT32_T, op->ob_refcnt) < 0;
(gdb) bt
#0  0x7f539c58256f in _Py_IsImmortal (op=0x0) at
/usr/include/python3.12/object.h:242
#1  Py_DECREF (op=0x0) at /usr/include/python3.12/object.h:700
#2  Py_XDECREF (op=0x0) at /usr/include/python3.12/object.h:798
#3  __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(__pyx_v_self=__pyx_v_self@entry=0x7f539635bc40,
__pyx_v_args=__pyx_v_args@entry=(, ,
))
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26535
#4  0x7f539c5834e7 in
__pyx_pw_4sage_4libs_3gap_7element_19GapElement_Function_3__call__
(__pyx_v_self=,
__pyx_args=(,
,
), __pyx_kwds=)
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26105
#5  0x7f53f916496b in _PyObject_MakeTpCall (tstate=0x7f53f9670d08
<_PyRuntime+459656>,
callable=callable@entry=,
args=args@entry=0x7f53f96b5480, nargs=3, keywords=0x0) at
Objects/call.c:240


Catching `*jmp` flavours:

(gdb) break __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(gdb) continue
...
(gdb) break __GI___sigsetjmp
(gdb) break longjmp
(gdb) break siglongjmp
(gdb) continue # a lot of them

(gdb) bt
#0  __GI___sigsetjmp () at ../sysdeps/x86_64/setjmp.S:33
#1  0x7fc83e6e6ea4 in
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(__pyx_v_self=__pyx_v_self@entry=0x7fc83844f2c0,
__pyx_v_args=__pyx_v_args@entry=(, ,
))
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26315

(gdb) fr 1
#1  0x7fc83e6e6ea4 in
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__
(__pyx_v_self=__pyx_v_self@entry=0x7fc83844f2c0,
__pyx_v_args=__pyx_v_args@entry=(, ,
))
at
/usr/src/debug/sci-mathematics/sagemath-standard-10.3/sagemath-standard-10.3-python3_12/build/cythonized/sage/libs/gap/element.c:26315
26315   sig_GAP_Enter();
(gdb) disassemble
Dump of assembler code for function
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__:

   0x7fc83e6e6e9f <+831>:   call   0x7fc83e6b54e0 <_setjmp@plt>
=> 0x7fc83e6e6ea4 <+836>:   mov    $0x1,%ebx

That is a _setjmp at element.c:26315. ABORT happens after it.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #17 from Sergei Trofimovich  ---
> 1) attach your *.s file and state which exact compiler you used (revision)

Generated code first:

https://slyfox.uni.cx/b/gcc/PR114872/d.tar.gz (4MB, does not fit on bugzilla's
1MB limit)

is the archive of source files from sage-10.3:
- element.c: original unpreprocessed file
- element.c.c: preprocessed file
- element.S: assembly file (-S)
- element-verbose.S: assembly file (-S -fverbose-asm, has compiler options)

Compiler:

# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/13/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/dev/shm/portage/sys-devel/gcc-13.2.1_p20240503/work/gcc-13-20240503/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/13
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/13/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/13/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=release
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo
13.2.1_p20240503 p15' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-multilib
--with-multilib-list=m32,m64 --disable-fixed-point --enable-targets=all
--enable-libgomp --disable-libssp --disable-libada --disable-cet
--disable-systemtap --disable-valgrind-annotations --disable-vtable-verify
--disable-libvtv --without-zstd --without-isl --enable-default-pie
--enable-default-ssp --disable-fixincludes
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.1 20240503 (Gentoo 13.2.1_p20240503 p15)

Note: it's a weekly snapshot of 20240503 with --enable-default-ssp. I think
it's a r13-8684-g704b15e277a879.

The compiler also has a few distro-specific patches on top:
https://gitweb.gentoo.org/proj/gcc-patches.git/tree/13.2.0/gentoo. Most of them
should not affect code generation too much, but some like
22_all_default_ssp-buffer-size.patch and
84_all_x86_PR110792-Early-clobber-issues-with-rot32di2.patch might.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #16 from Jakub Jelinek  ---
(In reply to Sergei Trofimovich from comment #14)
> I reproduced the `SIGSEGV` on Gentoo ~amd64 and ::sage-on-gentoo overlay
> against sci-mathematics/sagemath-standard package.
> 
> One of the unusual properties of
> __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() is that
> it raises 2 signals while it gets executed:
> 
> - SIGABRT handler uses longjmp() to return to the ~beginning of a function
> - and then SIGSEGV happens at cleanup when an attempt to dereference the
> pointer happens.
> 
> I see no `volatile` annotations anywhere in the
> __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__().
> 
> My wild guess would be that:
> 1. `PyObject *__pyx_t_4 = ((void *)0);` gets saved in setjmp() with one
> value (probably NULL)
> 2. updated at some point later in the same function to non-NULL that `gcc`
> can infer and throw away all later `NULL` checks
> 3. then SIGABRT returns with longjmp() by accidentally resetting
> 
> I would expect `__pyx_t_4` to require volatile annotation for such an
> `element.i` definition. Or `longjmp()` should be called from a `((noipa))`
> function to force register spill/reload on stack.
> 
> To cite `man setjmp`:
> 
> """
> CAVEATS
>    The compiler may optimize variables into registers, and longjmp() may
>    restore the values of other registers in addition to the stack pointer and
>    program counter.  Consequently, the values of automatic variables are
>    unspecified after a call to longjmp() if they meet all the following
>    criteria:
>    •  they are local to the function that made the corresponding setjmp()
>       call;
>    •  their values are changed between the calls to setjmp() and longjmp();
>       and
>    •  they are not declared as volatile.
>    Analogous remarks apply for siglongjmp().
> """
> 
> Sounds plausible?

So, if you can reproduce it, can you:
1) attach your *.s file and state which exact compiler you used (revision)
2) ideally show a gdb session with the important events, which setjmp was it (I
see _setjmp and __sigsetjmp calls in the function), which exact function called
from the function ended up aborting/doing longjmp in the signal handler and
where is the crash
3) is it __pyx_t_6, __pyx_t_4 or some other pointer that triggers it (from the
line numbers in #c0 my guess was __pyx_t_6, but you talk about __pyx_t_4)

Yes, there are no volatile keywords on any of the vars, but without knowing
which setjmp call it is and from where longjmp jumps to it, it is hard to know
if the variables have been modified in between (then volatile would be
required) or if they are only modified before the setjmp call or after the call
that calls longjmp (then volatile might not be required).
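
The distinction drawn here can be illustrated with a small sketch (hypothetical names `classify` and `bail`; not taken from element.c). One local is written only before setjmp() and therefore keeps a well-defined value, the other is written between setjmp() and longjmp() and therefore needs the qualifier.

```c
#include <setjmp.h>

static jmp_buf env2;

/* Hypothetical helper that jumps back to the setjmp() in classify(). */
static void bail(void) { longjmp(env2, 1); }

int classify(void) {
    /* Written only before setjmp(): not changed between setjmp() and
       longjmp(), so volatile is not required for it. */
    int before = 10;

    /* Written between setjmp() and longjmp(): volatile is required. */
    volatile int between = 0;

    if (setjmp(env2) == 0) {
        between = 5;
        bail();  /* does not return */
    }
    return before + between;
}
```

Deciding which of the two cases applies to a given `__pyx_t_N` temporary is exactly why the identity of the setjmp call and the origin of the longjmp matter.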

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #15 from Jakub Jelinek  ---
(In reply to Dmitrii Pasechnik from comment #12)
> A colleague disassembled, using ghidra (https://ghidra-sre.org/), the
> results of the compilations with, respectively, -O2 and with -O0 flags.
> Comparing the results, in the broken (built with -O2) case one sees a
> miscompilation of a call to GAP_CallFunc3Args - it is called with one
> argument less than it should, three instead of four!
> 
> broken (-O2):
> 
> >   plVar9 = (long *)GAP_CallFunc3Args(*(undefined8 *)(param_1 + 
> > 0x20),local_a0[4],
> >  local_a8[4]);
> 
> vs. good (-O0):
> 
> < LAB_0013cbd9:
> < plVar10 = (long *)GAP_CallFunc3Args(*(undefined8 *)(param_1 +
> 0x20),local_a8[4],
> < local_a0[4],plVar16[4]);
> 
> And this is despite the prototype for calling GAP_CallFunc3Args() is
> found in "gap/libgap-api.h", which is included in example.c as #include
> "gap/libgap-api.h",  meant to be respected during the compilation. 
> 
> I hope this helps in chasing down the obvious compiler bug. Perhaps it can
> be also seen without disassembling, simply on the intermediate data
> generated by the compiler.

Both calls pass 4 arguments, both in optimized -O2 -fPIC GCC 13.2.1 dump and in
assembly:
        movq    -88(%rbp), %rdx
        movq    %r12, %rcx
        movq    %r13, %rsi
        movq    %rax, %rdi
        call    GAP_CallFunc3Args@PLT
...
        movq    -88(%rbp), %rdx
        movq    %r12, %rcx
        movq    %r13, %rsi
        movq    %rax, %rdi
        call    GAP_CallFunc3Args@PLT
and
  GAP_CallFunc3Args (_53, _47, __pyx_v_func_54(D), _49);
...
  _441 = GAP_CallFunc3Args (_98, _97, _96, _95);
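
For readers matching the assembly to the source: on x86-64 Linux (System V AMD64 calling convention) the first four integer or pointer arguments travel in %rdi, %rsi, %rdx and %rcx, which are the four registers loaded before each call above. The sketch below uses a hypothetical stand-in, `call_func_3args`, with the same shape as GAP_CallFunc3Args() (a function object plus three arguments); it is not the libgap declaration.

```c
/* Hypothetical stand-in with the same shape as GAP_CallFunc3Args():
   one function object plus three arguments, i.e. four parameters. */
typedef void *Obj;

static int seen;  /* number of non-null arguments observed by the callee */

Obj call_func_3args(Obj func, Obj a1, Obj a2, Obj a3) {
    /* System V AMD64: func -> %rdi, a1 -> %rsi, a2 -> %rdx, a3 -> %rcx */
    seen = (func != 0) + (a1 != 0) + (a2 != 0) + (a3 != 0);
    return a3;
}

int count_args_passed(void) {
    static int f, x, y, z;
    call_func_3args(&f, &x, &y, &z);
    return seen;
}
```

A decompiler that shows only three arguments has most likely lost track of one register; the generated code itself sets up all four.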

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-06 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

Sergei Trofimovich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |slyfox at gcc dot gnu.org

--- Comment #14 from Sergei Trofimovich  ---
I reproduced the `SIGSEGV` on Gentoo ~amd64 and ::sage-on-gentoo overlay
against sci-mathematics/sagemath-standard package.

One of the unusual properties of
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() is that it
raises 2 signals while it gets executed:

- SIGABRT handler uses longjmp() to return to the ~beginning of a function
- and then SIGSEGV happens at cleanup when an attempt to dereference the
pointer happens.

I see no `volatile` annotations anywhere in the
__pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__().

My wild guess would be that:
1. `PyObject *__pyx_t_4 = ((void *)0);` gets saved in setjmp() with one value
(probably NULL)
2. updated at some point later in the same function to non-NULL that `gcc` can
infer and throw away all later `NULL` checks
3. then SIGABRT returns with longjmp() by accidentally resetting

I would expect `__pyx_t_4` to require volatile annotation for such an
`element.i` definition. Or `longjmp()` should be called from a `((noipa))`
function to force register spill/reload on stack.

To cite `man setjmp`:

"""
CAVEATS
   The compiler may optimize variables into registers, and longjmp() may
   restore the values of other registers in addition to the stack pointer and
   program counter.  Consequently, the values of automatic variables are
   unspecified after a call to longjmp() if they meet all the following
   criteria:
   •  they are local to the function that made the corresponding setjmp()
      call;
   •  their values are changed between the calls to setjmp() and longjmp();
      and
   •  they are not declared as volatile.
   Analogous remarks apply for siglongjmp().
"""

Sounds plausible?
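
The control flow described in this comment (a SIGABRT handler that returns into the function via a long jump) can be reproduced in miniature. This is a sketch under assumptions: it uses sigsetjmp()/siglongjmp() and raise() directly, and the names `run_until_abort` and `on_abort` are invented; it does not reproduce how cysignals is actually implemented.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf abort_env;

/* Handler jumps back to the sigsetjmp() in run_until_abort(). */
static void on_abort(int sig) {
    (void)sig;
    siglongjmp(abort_env, 1);
}

int run_until_abort(void) {
    /* Changed between sigsetjmp() and siglongjmp(), hence volatile. */
    volatile int progress = 0;

    struct sigaction sa;
    sa.sa_handler = on_abort;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGABRT, &sa, NULL);

    /* savemask = 1, so the signal mask is restored on the jump back. */
    if (sigsetjmp(abort_env, 1) == 0) {
        progress = 1;
        raise(SIGABRT);  /* handler runs and jumps back */
        progress = 2;    /* not reached */
    }
    return progress;
}
```

The three criteria from the man page all apply to `progress`: it is local to the function calling sigsetjmp(), it changes before the jump, and so it has to be volatile for the final read to be reliable.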

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-03 Thread dima.pasechnik at cs dot ox.ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #13 from Dmitrii Pasechnik  ---
Created attachment 58099
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58099&action=edit
data for comment 12 - decompiled things

data for comment 12 - decompiled .so's, .so's themselves, original C file.

BROKEN* are for -O2 option, and FIXED* are for -O0 option.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-05-03 Thread dima.pasechnik at cs dot ox.ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

Dmitrii Pasechnik  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dima.pasechnik at cs dot ox.ac.uk

--- Comment #12 from Dmitrii Pasechnik  ---
A colleague disassembled, using ghidra (https://ghidra-sre.org/), the results
of the compilations with, respectively, -O2 and with -O0 flags.
Comparing the results, in the broken (built with -O2) case one sees a
miscompilation of a call to GAP_CallFunc3Args - it is called with one argument
less than it should, three instead of four!

broken (-O2):

>   plVar9 = (long *)GAP_CallFunc3Args(*(undefined8 *)(param_1 + 
> 0x20),local_a0[4],
>  local_a8[4]);

vs. good (-O0):

< LAB_0013cbd9:
< plVar10 = (long *)GAP_CallFunc3Args(*(undefined8 *)(param_1 +
0x20),local_a8[4],
< local_a0[4],plVar16[4]);

And this is despite the fact that the prototype for GAP_CallFunc3Args() is
found in "gap/libgap-api.h", which is included in example.c as #include
"gap/libgap-api.h" and is meant to be respected during the compilation.

I hope this helps in chasing down the obvious compiler bug. Perhaps it can be
also seen without disassembling, simply on the intermediate data generated by
the compiler.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #11 from Sam James  ---
also, do asan and ubsan give anything?

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #10 from Sam James  ---
also maybe obvious, but if you can find something from the cython testsuite, or
at least some other heavy use of cython, which fails, that may be a good
direction.

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #9 from Sam James  ---
unfortunately*

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-30 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #8 from Sam James  ---
element.i is unfortunatley huge. It's hard to analyse things without a
standalone testcase, but it's even harder without _something_ one can run.

I'd suggest:
1) giving instructions to reproduce the crash assuming someone knows nothing
about sage and cython;
2) taking sage and cannibalising it (first to directly call
`libgap.AbelianGroup(0,0,0)`, then you can cut things down with lots of
removals + gdb so that the important caller of Py_XDECREF gets passed with the
same args; it's harder if there's a lot of state involved though, of course)
3) build element.i with -fdump-tree-all -fdump-unnumbered -fdump-noaddr (can
also try e.g. -da, see
https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html) at the bad commit
and before, and diff the produced dumps, and show us the *first* dump which
differs between the two

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-29 Thread arojas at archlinux dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #7 from Antonio Rojas  ---
(In reply to Sam James from comment #5)
> Some ideas:
> * Could you maybe give a reproducer for the runtime crash?

In sage, calling any libgap function with exactly 3 parameters triggers this.
For instance:

sage: libgap.AbelianGroup(0,0,0)

Note that you need Python 3.12 to get the crash (which happens when trying to
dereference the null pointer in the newly introduced _Py_IsImmortal function).

> * Any chance you'd be willing to try bisect element.i with pragmas to
> disable/enable optimisation for chunks of it, to find the miscompiled
> function?

I have done that. As expected, the problem is in Py_XDECREF. This makes the
problem disappear:

#pragma GCC push_options
#pragma GCC optimize ("O0")

static inline void Py_XDECREF(PyObject *op)
{
    if (op !=
# 797 "/usr/include/python3.12/object.h" 3 4
       ((void *)0)
# 797 "/usr/include/python3.12/object.h"
       ) {
        Py_DECREF(((PyObject*)((op))));
    }
}

#pragma GCC pop_options
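
The bisection technique above can be sketched on a self-contained file. This is only an illustration, not code from element.i: obj_t, obj_dealloc, obj_xdecref and release_pair are hypothetical names standing in for the CPython types and functions, and only the three #pragma lines are taken from the comment. The idea is to bracket one suspect function at a time with push_options/pop_options and check whether the failure goes away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted object (not CPython's PyObject). */
typedef struct {
    long refcnt;
} obj_t;

/* Counts how many objects reached a reference count of zero. */
static int dealloc_calls = 0;

static void obj_dealloc(obj_t *op)
{
    (void)op;
    dealloc_calls++;
}

/* GCC compiles the functions defined between push_options and pop_options
   with the "O0" setting instead of the -O level from the command line. */
#pragma GCC push_options
#pragma GCC optimize ("O0")

static void obj_xdecref(obj_t *op)
{
    if (op != NULL) {
        assert(op->refcnt > 0);
        if (--op->refcnt == 0) {
            obj_dealloc(op);
        }
    }
}

#pragma GCC pop_options

/* Code after pop_options uses the command-line options again. Returns the
   number of deallocations triggered by releasing both (possibly NULL)
   pointers. */
int release_pair(obj_t *a, obj_t *b)
{
    int before = dealloc_calls;
    obj_xdecref(a);
    obj_xdecref(b);
    return dealloc_calls - before;
}
```

Moving the pragma pair from one function to the next narrows down where the behaviour changes. A function that "fixes" the crash when built at -O0 is not necessarily the one being miscompiled, so the result is a hint rather than a diagnosis.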

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #6 from Jakub Jelinek  ---
I have looked at the IL and I don't see how it could crash that way based on
the -fdump-tree-optimized-lineno dump, neither on branch nor on the trunk.
On the branch, I see
   [local count: 462566023]:
  [element.c:26539:57 discrim 1] __pyx_t_5_522(ab) = 0B;
  [/usr/include/python3.12/object.h:797:8] if (__pyx_t_6_211(ab) != 0B)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 462566023]:
  [element.c:26540:57 discrim 1] __pyx_t_6_524(ab) = 0B;
  [/usr/include/python3.12/object.h:797:8] if (__pyx_t_8_1031(ab) != 0B)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 323796219]:
  [/usr/include/python3.12/object.h:242:25] _1105 =
[/usr/include/python3.12/object.h:242:25]
[/usr/include/python3.12/object.h:242:25] __pyx_t_6_211(ab)->D.10046.ob_refcnt;
  [/usr/include/python3.12/object.h:242:13] _1106 = (int) _1105;
  [/usr/include/python3.12/object.h:700:8 discrim 1] if (_1106 < 0)
goto ; [26.36%]
  else
goto ; [73.64%]

   [local count: 238443538]:
  [/usr/include/python3.12/object.h:704:9] _1107 = _1105 + -1;
  [/usr/include/python3.12/object.h:704:8]
[/usr/include/python3.12/object.h:704:13]
[/usr/include/python3.12/object.h:704:13] __pyx_t_6_211(ab)->D.10046.ob_refcnt
= _1107;
  [/usr/include/python3.12/object.h:704:8] if (_1107 == 0)
goto ; [33.00%]
  else
goto ; [67.00%]

   [local count: 78686364]:
  [/usr/include/python3.12/object.h:705:9] _Py_Dealloc (__pyx_t_6_211(ab));
  goto ; [100.00%]

so the 26540 line case checks whether the pointer later passed to _Py_Dealloc
is non-NULL, and _Py_Dealloc is not called when that pointer is NULL.
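
For readers less used to GIMPLE, the control flow in the dump can be paraphrased in C roughly as follows. This is a sketch under assumptions: rc_obj, rc_dealloc and xdecref_sketch are hypothetical names, the reference count is modelled as a 64-bit field, and the "(int) _1105 < 0" test is read as the immortal-object check from object.h. It is not the actual CPython source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the object whose ob_refcnt the dump accesses. */
typedef struct {
    int64_t refcnt;
} rc_obj;

/* Counts calls that correspond to _Py_Dealloc in the dump. */
int rc_dealloc_calls = 0;

static void rc_dealloc(rc_obj *op)
{
    (void)op;
    rc_dealloc_calls++;
}

void xdecref_sketch(rc_obj *op)
{
    /* object.h:797: the NULL check guarding everything below. */
    if (op == NULL) {
        return;
    }

    int64_t count = op->refcnt;          /* _1105 */

    /* object.h:700: low 32 bits viewed as a signed int (_1106). The
       narrowing conversion is implementation-defined in C; GCC wraps. */
    if ((int32_t)count < 0) {
        return;                          /* treated as immortal, untouched */
    }

    count = count - 1;                   /* _1107, object.h:704 */
    op->refcnt = count;

    if (count == 0) {
        rc_dealloc(op);                  /* object.h:705 */
    }
}
```

In this reading the deallocation call is reachable only through the non-NULL branch, which is the point the comment makes about the dump.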

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

--- Comment #5 from Sam James  ---
Some ideas:
* Could you maybe give a reproducer for the runtime crash?
* Any chance you'd be willing to try bisect element.i with pragmas to
disable/enable optimisation for chunks of it, to find the miscompiled function?

[Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037

2024-04-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

Andrew Pinski  changed:

           What    |Removed                     |Added

   Target Milestone|---                         |13.3
            Summary|Miscompilation with -O2     |[13/14/15 Regression]
                   |after commit r13-8037       |Miscompilation with -O2
                   |                            |after commit r13-8037
      Known to fail|                            |13.2.1
      Known to work|                            |13.2.0