[Bug c/104011] New: s390: r12 is not setup for _mcount call

2022-01-13 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104011

Bug ID: 104011
   Summary: s390: r12 is not setup for _mcount call
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stli at linux dot ibm.com
  Target Milestone: ---

On 31-bit, r12 is not set up before brasl _mcount@plt, so we jump to a
different function.
Note that the PIE plt-slot uses r12.
In the debugging case, e.g. __libc_calloc is called.
In a different glibc test case, "gmon/tst-gmon-pie", we jump to another function,
which leads to a segfault.

This happens with, e.g.:
- gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)
- gcc 11.2.0

Steps to reproduce:
$ cat tst-pie-mcount.c
#include <stdio.h>
#include <stdlib.h>

int
main (void)
{
  puts ("Hello world");
  return EXIT_SUCCESS;
}

$ gcc -o tst-pie-mcount -g -m31 -fpie -pg -pie tst-pie-mcount.c
$ objdump -d tst-pie-mcount
...
05c8 <_mcount@plt>:
 5c8:   58 10 c0 20 l   %r1,32(%r12)
 5cc:   07 f1   br  %r1
 5ce:   00 00 00 00 .long   0x00000000
 5d2:   00 00 0d 10 .long   0x0d10
 5d6:   58 10 10 0e l   %r1,14(%r1)
 5da:   a7 f4 ff 97 j   508 <.plt>
...
 5e6:   00 3c   .short  0x003c

...

0860 <main>:
 860:   50 e0 f0 04 st  %r14,4(%r15)
 864:   c0 10 00 00 0b f2   larl    %r1,2048 <__data_start+0x4>

Here we jump to the plt-slot, which uses r12, but r12 is only loaded later:
 86a:   c0 e5 ff ff fe af   brasl   %r14,5c8 <_mcount@plt>

 870:   58 e0 f0 04 l   %r14,4(%r15)
 874:   90 bf f0 2c stm %r11,%r15,44(%r15)
 878:   a7 fa ff a0 ahi %r15,-96
 87c:   18 bf   lr  %r11,%r15

GOT-Pointer is loaded here for puts:
 87e:   c0 c0 00 00 0b c1   larl    %r12,2000 <_GLOBAL_OFFSET_TABLE_>
 884:   c0 20 00 00 00 6c   larl    %r2,95c <_IO_stdin_used+0x4>
 88a:   c0 e5 ff ff fe 7f   brasl   %r14,588 <puts@plt>

 890:   a7 18 00 00 lhi %r1,0
 894:   18 21   lr  %r2,%r1
 896:   98 bf b0 8c lm  %r11,%r15,140(%r11)
 89a:   07 fe   br  %r14
 89c:   07 07   nopr    %r7
 89e:   07 07   nopr    %r7
 */

[Bug c/99134] S390x: pfpo instructions are not used for dfp[128|64|32] to/from long double conversions

2021-02-22 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99134

stli at linux dot ibm.com  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from stli at linux dot ibm.com  ---
I've just retested libdfp with gcc-head:
$ git log --oneline
60b99ee3bc0 (HEAD -> master, origin/master, origin/HEAD) Daily bump.
...
b6e446cb581 IBM Z: Fix long double <-> DFP conversions
a974b8a592e IBM Z: Improve FPRX2 <-> TF conversions


Now all the long double <-> _Decimal data-type conversions are using the pfpo
instruction.

Thanks.

[Bug c/99134] New: S390x: pfpo instructions are not used for dfp[128|64|32] to/from long double conversions

2021-02-17 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99134

Bug ID: 99134
   Summary: S390x: pfpo instructions are not used for
dfp[128|64|32] to/from long double conversions
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 50212
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50212&action=edit
Test which runs the dfpXYZ <-> long double conversions which are not performed
via pfpo instruction, but by calling __dpd_[trunc|extend] functions.

See libdfp-issue "s390x: 3 test failures on Fedora Rawhide #160"
https://github.com/libdfp/libdfp/issues/160
(Notice that Rawhide is using GCC 11 now.)

Reproduced the issues with gcc commit 78a6d0e30d7950216dc0c5be5d65d0cbed13924c
You have to configure gcc with --enable-decimal-float

All decimal-floating-point[128|64|32] <-> binary-floating-point[128|64|32]
conversions should emit the
pfpo (PERFORM FLOATING-POINT OPERATION) instruction as used in previous GCC
versions.

GCC 11 does not use the pfpo instruction if bfp128 (long double) is involved
in the conversion. In the libdfp implementation of the dpd_extend/trunc functions,
this leads to a recursive call to the function itself, which segfaults once it
runs out of stack:
- bfp128 -> dfp128 (do__dpd_extendtftd(): brasl %r14,<__dpd_extendtftd>)
- bfp128 -> dfp64  (do_bfp128_to_dfp64(): brasl %r14,<__dpd_trunctfdd>)
- bfp128 -> dfp32  (do_bfp128_to_dfp32(): brasl %r14,<__dpd_trunctfsd>)

- dfp128 -> bfp128 (do__dpd_trunctdtf(): brasl %r14,<__dpd_trunctdtf>)
- dfp64  -> bfp128 (do_dfp64_to_bfp128(): brasl %r14,<__dpd_extendddtf>)
- dfp32  -> bfp128 (do_dfp32_to_bfp128(): brasl %r14,<__dpd_extendsdtf>)

[Bug c/98269] gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow

2020-12-17 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

--- Comment #5 from stli at linux dot ibm.com  ---
Just as information,
I've just committed this glibc patch:
"s390x: Require GCC 7.1 or later to build glibc."
https://sourceware.org/git/?p=glibc.git;a=commit;h=844b4d8b4b937fe6943d2c0c80ce7d871cdb1eb5

[Bug c/98269] gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow

2020-12-14 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

stli at linux dot ibm.com  changed:

   What|Removed |Added

 Target||s390x
  Known to work||10.1.0, 5.4.0, 5.5.0,
   ||7.1.0, 8.1.0, 9.1.0
 CC||stli at linux dot ibm.com
  Known to fail||6.3.0, 6.4.0, 6.5.0

--- Comment #2 from stli at linux dot ibm.com  ---
That's okay for me. But I wanted to document it. Currently glibc is requiring
gcc 6.2 as minimum. For s390x, I will post a patch which requires gcc 7.1 as
minimum.

[Bug c/98269] New: gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow

2020-12-14 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

Bug ID: 98269
   Summary: gcc 6.5.0 __builtin_add_overflow() with small uint32_t
values incorrectly detects overflow
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 49756
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49756&action=edit
Build this tst-gcc-addoverflow.c with gcc 6.5.0 to see the ERROR

If built on s390x (I had no chance to test it on other architectures) with gcc
6.5.0, the attached testcase with small uint32_t input values for
__builtin_add_overflow() detects an overflow and fails:
  else if (__builtin_add_overflow (previous->offset,
                                   previous->length + 1,
                                   &current->offset))
    {
      printf ("ERROR: __builtin_add_overflow() OVERFLOWED: "
              "previous->offset=%" PRIu32 " + "
              "(previous->length=%" PRIu32 " + 1)"
              " => current->offset=%" PRIu32 "\n",
              previous->offset, previous->length, current->offset);
      return EXIT_FAILURE;
    }

=>
ERROR: __builtin_add_overflow() OVERFLOWED: previous->offset=7 +
(previous->length=3 + 1) => current->offset=11

I have not observed this issue with gcc 7.1 and later.
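
For reference, the expected behaviour with the values from the error message can be sketched as follows. This is not the attached tst-gcc-addoverflow.c; the struct entry type and the place_after() helper are hypothetical names used only for illustration. With offset=7 and length=3 the sum 7 + (3 + 1) fits easily into a uint32_t, so the builtin should report no overflow and store 11.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the entries used in the attached test.  */
struct entry
{
  uint32_t offset;
  uint32_t length;
};

/* Compute the offset of CURRENT directly behind PREVIOUS (plus one
   extra byte) and return true if the addition overflowed uint32_t.  */
bool
place_after (const struct entry *previous, struct entry *current)
{
  return __builtin_add_overflow (previous->offset,
                                 previous->length + 1,
                                 &current->offset);
}
```

Calling place_after() with previous = { 7, 3 } should return false and leave current.offset at 11; only sums that do not fit into 32 bits, e.g. an offset of UINT32_MAX, should return true.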

The original issue was found when glibc was built with gcc 6.5.0:
__builtin_add_overflow is used in
/elf/stringtable.c:stringtable_finalize()
(https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/stringtable.c;h=099347d73ee70b8ffa4b4a91c493e0bba147ffa2;hb=HEAD#l185)
which leads to ldconfig failing with "String table is too large". This is
also visible in the following glibc tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable

Please also have a look at attached tst-gcc-addoverflow.c for some more details
from my gdb session showing the add and jump instruction.

[Bug middle-end/98070] [11 Regression] errno is not re-evaluated after clearing errno and calling realloc(ptr, SIZE_MAX)

2020-12-01 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98070

--- Comment #5 from stli at linux dot ibm.com  ---
I've just built and run the attached test on s390x/x86_64 with your fix.
Now errno is re-evaluated after realloc.

I've also rebuilt glibc on s390x and the original glibc-test
/malloc/tst-malloc-too-large.c is now also passing.

Many thanks.

[Bug c/98070] New: errno is not re-evaluated after clearing errno and calling realloc(ptr, SIZE_MAX)

2020-11-30 Thread stli at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98070

Bug ID: 98070
   Summary: errno is not re-evaluated after clearing errno and
calling realloc(ptr, SIZE_MAX)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 49652
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49652&action=edit
Testcase reproducing the issue with gcc-head

Hi,

After setting errno=0 and calling realloc with a too large size, which sets
errno to ENOMEM, a subsequent "if (errno == ENOMEM)" is not evaluated as true.
Instead gcc assumes that errno has not changed and is directly executing the
else-path without testing errno again.
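
The pattern described above looks roughly like the following sketch. It is not the attached testcase; the function name realloc_too_large_sets_enomem() is hypothetical, and the volatile size is only an assumption made here to keep the compiler from reasoning about the constant argument. It also assumes a C library whose realloc sets errno to ENOMEM on failure, as glibc does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Return true if a too-large realloc fails and errno reads back as
   ENOMEM afterwards.  */
bool
realloc_too_large_sets_enomem (void)
{
  /* volatile: the compiler must not assume anything about the size.  */
  volatile size_t too_large = SIZE_MAX;
  void *ptr = malloc (16);
  if (ptr == NULL)
    return false;

  errno = 0;
  void *ptr_realloc = realloc (ptr, too_large);
  /* errno has to be re-read here; realloc may have changed it.  */
  int errno_after = errno;

  if (ptr_realloc != NULL)
    {
      /* Unexpected success: the old block now belongs to realloc.  */
      free (ptr_realloc);
      return false;
    }

  /* On failure the original block is untouched and still owned by us.  */
  free (ptr);
  return errno_after == ENOMEM;
}
```

With a correct compiler this should return true; the bug described here corresponds to the value 0 stored before the call being reused instead of errno being read again.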

This happens in the glibc-testcase:
/malloc/tst-malloc-too-large.c test
(see
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/tst-malloc-too-large.c;h=b5ad7eb7e7bf764fe57ceff5a810e3c211ca05e0;hb=refs/heads/master)
on at least x86_64 and s390x with gcc-head.

The attached small reproducer fails with gcc-head, but not with earlier
releases (gcc 10, 9):
/* Output with gcc 11:
   $ ./tst-errno-realloc (built with >= -O1)
   47: errno == 0 (Cannot allocate memory). We are in the else-part of 'if
(errno == ENOMEM)'. Does errno correspond to %m or the line below or to '(gdb)
p errno'?!
   dump_errno(48, compare to line above!): errno == 12 (Cannot allocate memory)
vs main_errno=0

   On s390x:
   $ gcc -v
   Using built-in specs.
   COLLECT_GCC=./install-s390x-head/bin/gcc
  
COLLECT_LTO_WRAPPER=/home/stli/gccDir/install-s390x-head/libexec/gcc/s390x-ibm-linux-gnu/11.0.0/lto-wrapper
   Target: s390x-ibm-linux-gnu
   Configured with: /home/stli/gccDir/gcc-head/configure
--prefix=/home/stli/gccDir/install-s390x-head/ --enable-shared
--with-system-zlib --enable-threads=posix --enable-__cxa_atexit
--enable-checking --enable-gnu-indirect-function --enable-languages=c,c++
--with-arch=zEC12 --with-tune=z13 --disable-bootstrap --with-long-double-128
--enable-decimal-float
   Thread model: posix
   Supported LTO compression algorithms: zlib
   gcc version 11.0.0 20201127 (experimental) (GCC)
   $ git log --oneline
   5e9f814d754 (HEAD -> master, origin/master, origin/HEAD) rs6000: Change
rs6000_expand_vector_set param

   Also on x86_64:
   $ gcc -v
   Using built-in specs.
   COLLECT_GCC=/home/stli/gccDir/install-x86_64-head/bin/gcc
  
COLLECT_LTO_WRAPPER=/home/stli/gccDir/install-x86_64-head/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
   Target: x86_64-pc-linux-gnu
   Configured with: /home/stli/gccDir/gcc-head/configure
--prefix=/home/stli/gccDir/install-x86_64-head/ --enable-shared
--with-system-zlib --enable-threads=posix --enable-__cxa_atexit
--enable-checking --enable-gnu-indirect-function --enable-languages=c,c++
--with-tune=generic --with-arch_32=x86-64 --disable-bootstrap
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin
--enable-initfini-array --disable-libgcj --disable-multilib
   Thread model: posix
   Supported LTO compression algorithms: zlib zstd
   gcc version 11.0.0 20201130 (experimental) (GCC)
   $ git log --oneline
   a5ad5d5c478 (HEAD -> master, origin/master, origin/HEAD) RISC-V: Always
define MULTILIB_DEFAULTS
*/

[Bug d/91628] libdruntime uses glibc internal symbol on s390

2020-04-08 Thread stli at linux dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628

--- Comment #19 from stli at linux dot ibm.com  ---
Fixed with gcc commit "S/390: Fix PR91628"
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=88e508f9f112acd07d0c49c53589160db8c85fcd

If somebody is backporting this fix, please also backport
gcc commit "S/390: Fix layout of struct sigaction_t"
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=434fe1a4092e12e5b518ef0716dc5b315e06118d

Otherwise you'll still see tls testsuite FAILs.

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-11-05 Thread stli at linux dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #16 from stli at linux dot ibm.com  ---
Just as information, this glibc commit will be first available with glibc 2.31
release:
"S390: Fp comparison are now raising FE_INVALID with gcc 10."
https://sourceware.org/git/?p=glibc.git;a=commit;h=64bca76f42a82e6a9ea2b0166deab7aa2b7efbea

[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.

2019-10-28 Thread stli at linux dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

stli at linux dot ibm.com  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from stli at linux dot ibm.com  ---
I've tested this patch with the help of the glibc testsuite.
For that, I've disabled the current workaround:
/sysdeps/s390/fpu/fix-fp-int-compare-invalid.h:
#define FIX_COMPARE_INVALID 0

All tests passed.

For information: without this patch there were failures like:
math/test-ldouble-iseqsig.out:
testing long double (without inline functions)
Failure: iseqsig (-0, qNaN): Exception "Invalid operation" not set
Failure: iseqsig (-0, -qNaN): Exception "Invalid operation" not set
...

As soon as gcc 10 is released, I will post a glibc-patch which conditionally
disables the current workaround.

Thanks.

[Bug target/80080] S390: Issues with emitted cs-instructions for __atomic builtins.

2018-08-06 Thread stli at linux dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #11 from stli at linux dot ibm.com  ---
Hi,
I've retested the samples with gcc 7, 8 and head from 2018-07-20, but there are
still issues:
The examples foo1 and foo2 are okay.

The issue in example foo3 is still present (see description of the bug-report):

00a0 <foo3>:
  a0:   a7 18 00 05 lhi %r1,5
  a4:   c4 2d 00 00 00 00   lrl %r2,a4 <foo3+0x4>
a6: R_390_PC32DBL   foo3_mem+0x2

  aa:   c0 30 00 00 00 00   larl    %r3,aa <foo3+0xa>
ac: R_390_PC32DBL   foo3_mem+0x2
  b0:   ba 21 30 00 cs  %r2,%r1,0(%r3)
  b4:   a7 74 ff fb jne aa <foo3+0xa>

The address of the global variable is still reloaded within the loop. If the
value was not swapped with cs, the jne can jump directly to the cs instruction
instead of the larl-instruction.

  b8:   b9 14 00 22 lgfr    %r2,%r2
  bc:   07 fe   br  %r14
  be:   07 07   nopr    %r7
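
To illustrate why the retry can go straight back to the cs instruction, here is a minimal C sketch of a compare-and-swap retry loop. The source of foo3 is not part of this comment, so exchange_via_cas() is a hypothetical function and only an assumption about the kind of loop involved. The point is that a failed compare-exchange already writes the current memory value back into the expected value, so nothing has to be recomputed before retrying; in particular the address stays the same.

```c
#include <stdbool.h>

/* Store NEWVAL into *MEM and return the previous value, implemented as
   a compare-and-swap retry loop.  */
int
exchange_via_cas (int *mem, int newval)
{
  int oldval = __atomic_load_n (mem, __ATOMIC_RELAXED);

  /* On failure the builtin refreshes oldval with the current value of
     *mem, so the loop body is empty and only the swap is retried.  */
  while (!__atomic_compare_exchange_n (mem, &oldval, newval, false,
                                       __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED))
    ;

  return oldval;
}
```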

I've found a further issue which is observable with the following two examples.
See the questions in the disassembly:

void foo4(int *mem)
{
  int oldval = 0;
  if (!__atomic_compare_exchange_n (mem, (void *) &oldval, 1,
                                    1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    {
      bar (mem);
    }
  /*
0000 <foo4>:
   0:   e3 10 20 00 00 12   lt  %r1,0(%r2)
   6:   a7 74 00 06 jne 12 <foo4+0x12>

Why do we need to jump to 0x12 first instead of directly jumping to 0x18?

   a:   a7 38 00 01 lhi %r3,1
   e:   ba 13 20 00 cs  %r1,%r3,0(%r2)
  12:   a7 74 00 03 jne 18 <foo4+0x18>
  16:   07 fe   br  %r14
  18:   c0 f4 00 00 00 00   jg  18 <foo4+0x18>
1a: R_390_PC32DBL   bar+0x2
  1e:   07 07   nopr%r7
  */
}


void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1,
                               1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
    bar (mem);
  /*
0040 <foo5>:
  40:   e3 10 20 00 00 12   lt  %r1,0(%r2)
  46:   a7 74 00 06 jne 52 <foo5+0x12>

This is similar to foo4, but the variable oldval is compared against zero
instead of using the return value of __atomic_compare_exchange_n.
Can't we jump directly to 0x5a instead of 0x52?

  4a:   a7 38 00 01 lhi %r3,1
  4e:   ba 13 20 00 cs  %r1,%r3,0(%r2)
  52:   12 11   ltr %r1,%r1
  54:   a7 74 00 03 jne 5a <foo5+0x1a>
  58:   07 fe   br  %r14
  5a:   c0 f4 00 00 00 00   jg  5a <foo5+0x1a>
5c: R_390_PC32DBL   bar+0x2
   */
}
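
The two ways of detecting a failed swap in foo4 and foo5 can be compared in a small self-contained sketch. The function names below are hypothetical, and instead of calling bar() each function returns whether bar() would have been called. One deliberate difference from the examples above: the sketch passes false (strong) for the weak parameter, an assumption made here so that the result does not depend on spurious failures on targets where a weak compare-exchange may fail spuriously.

```c
#include <stdbool.h>

/* As in foo4: decide based on the return value of the builtin.  */
bool
swap_failed_by_return (int *mem)
{
  int oldval = 0;
  return !__atomic_compare_exchange_n (mem, &oldval, 1, false,
                                       __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED);
}

/* As in foo5: decide based on the value written back to oldval.  */
bool
swap_failed_by_oldval (int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, &oldval, 1, false,
                               __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  return oldval != 0;
}
```

With the strong variant both functions agree: starting from *mem == 0 the swap succeeds and neither reports a failure, while for any non-zero *mem both report a failure and leave *mem unchanged.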