[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

Florian Weimer  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #18 from Florian Weimer  ---
This is a glibc bug.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #17 from Florian Weimer  ---
Jakub's glibc test failures were due to --prefix=/usr/local, so that glibc
wouldn't find the installed system libgcc_s in /usr/lib64.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #16 from Jakub Jelinek  ---
In the glibc bugzilla or on private IRC I've suggested either:
diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index a08bf972de..ccb5f24d92 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -180,11 +180,16 @@ _Static_assert (offsetof (tcbhead_t, __glibc_unused2) ==
0x80,
assignments like
pthread_descr self = thread_self();
do not get optimized away.  */
-# define THREAD_SELF \
+# if __GNUC_PREREQ (6, 0)
+#  define THREAD_SELF \
+  (*(struct pthread *__seg_fs *) offsetof (struct pthread, header.self))
+# else
+#  define THREAD_SELF \
   ({ struct pthread *__self; \
  asm ("mov %%fs:%c1,%0" : "=r" (__self)  \
  : "i" (offsetof (struct pthread, header.self)));\
  __self;})
+# endif

 /* Magic for libthread_db to know how to do THREAD_SELF.  */
 # define DB_THREAD_SELF_INCLUDE   /* For the FS constant.  */
or
diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index a08bf972de..f3cbae9cba 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -180,11 +180,21 @@ _Static_assert (offsetof (tcbhead_t, __glibc_unused2) ==
0x80,
assignments like
pthread_descr self = thread_self();
do not get optimized away.  */
-# define THREAD_SELF \
+# if __GNUC_PREREQ (6, 0)
+#  define THREAD_SELF \
+  ({ struct pthread *__self; \
+ asm ("mov %%fs:%c1,%0" : "=r" (__self)  \
+ : "i" (offsetof (struct pthread, header.self)), \
+   "m" (*(struct pthread *__seg_fs *)\
+offsetof (struct pthread, header.self)));\
+ __self;})
+# else
+#  define THREAD_SELF \
   ({ struct pthread *__self; \
  asm ("mov %%fs:%c1,%0" : "=r" (__self)  \
  : "i" (offsetof (struct pthread, header.self)));\
  __self;})
+# endif

 /* Magic for libthread_db to know how to do THREAD_SELF.  */
 # define DB_THREAD_SELF_INCLUDE   /* For the FS constant.  */

With make check I'm seeing 153 glibc FAILs with trunk with the first patch,
about to make check the second one, but maybe I'm doing something wrong or
there are other gcc or glibc bugs unrelated to this one.  I think it would be
good to try these patches with GCC 10 first to see if those patches don't break
stuff with known working compilers and only then look at the failed tests if
any.
Florian said he'll take it over on the glibc side.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #15 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #13)
> And no, the asm isn't marked volatile, that would have prevented it too:
> # define THREAD_SELF \
>   ({ struct pthread *__self;
> \
>  asm ("mov %%fs:%c1,%0" : "=r" (__self) 
> \
>   : "i" (offsetof (struct pthread, header.self)));  
> \
>  __self;})
> This all boils down to:
> void
> foo (int *p)
> {
>   for (int i = 0; i < 64; i++)
> {
>   if (p[i])
>   {
> int *q;
> asm ("mov %%fs:%c1,%0" : "=r" (q) : "i" (16));
> q[0]++;
>   }
> }
> }
> which would hoist the inline asm to the function prologue even in GCC 4.1.

I suppose we'd need to mark it possibly trapping (which it actually does),
but I think there's no good way to do this.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #14 from Jakub Jelinek  ---
I guess with GCC 6 and later one can use:
void
foo (int *p)
{
  for (int i = 0; i < 64; i++)
{
  if (p[i])
{
  int *q;
//asm ("mov %%fs:%c1,%0" : "=r" (q) : "i" (16));
  q = *(int *__seg_fs *) 16;
  q[0]++;
}
}
}
instead (but I haven't done any testing beyond this testcase), for older GCC I
guess you can keep using what it did because we've been lucky (dunno why).

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #13 from Jakub Jelinek  ---
And no, the asm isn't marked volatile, that would have prevented it too:
# define THREAD_SELF \
  ({ struct pthread *__self;  \
 asm ("mov %%fs:%c1,%0" : "=r" (__self)   \
  : "i" (offsetof (struct pthread, header.self)));\
 __self;})
This all boils down to:
void
foo (int *p)
{
  for (int i = 0; i < 64; i++)
{
  if (p[i])
{
  int *q;
  asm ("mov %%fs:%c1,%0" : "=r" (q) : "i" (16));
  q[0]++;
}
}
}
which would hoist the inline asm to the function prologue even in GCC 4.1.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #12 from Florian Weimer  ---
(In reply to Jakub Jelinek from comment #11)
> Ah, RTL loop_invariant.  Perhaps because the inline asm is buggy?
> asm ("mov %%fs:%c1,%0" : "=r" (__self) : "i" (__builtin_offsetof (struct
> pthread, header.self)));
> The only input of the asm is the constant, so really nothing tells the
> optimizers it can't move it arbitrarily.  I think it needs to have some
> memory input or clobber to properly model that it reads from unknown memory.

Ahh, do you mean the THREAD_SELF macro?

# define THREAD_SELF \
  ({ struct pthread *__self;  \
 asm ("mov %%fs:%c1,%0" : "=r" (__self)   \
  : "i" (offsetof (struct pthread, header.self)));\
 __self;})

I had only looked at the other macros.

glibc relies on GCC optimizing away THREAD_SELF reads on x86-64 because most of
the TCB access does not need it.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #11 from Jakub Jelinek  ---
Ah, RTL loop_invariant.  Perhaps because the inline asm is buggy?
asm ("mov %%fs:%c1,%0" : "=r" (__self) : "i" (__builtin_offsetof (struct
pthread, header.self)));
The only input of the asm is the constant, so really nothing tells the
optimizers it can't move it arbitrarily.  I think it needs to have some memory
input or clobber to properly model that it reads from unknown memory.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #10 from Jakub Jelinek  ---
Looking at *.optimized dump, mov %%fs:%c1,%0 appears in there only bb 64,
guarded by flags_90(D) & 4 check, so exactly the check that appears in the
source code
- flags & DL_LOOKUP_GSCOPE_LOCK (among various other earlier checks).
But in the assembly it is unconditional at the start of the function, so
something must have hoisted that asm insn to the start of the function.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #9 from Florian Weimer  ---
And we should not end up in the add_dependency part, either because l_type
won't be lt_loaded and the DL_LOOKUP_ADD_DEPENDENCY flag hasn't been set,
either.

The inline asm is marked as volatile, and that should prevent it from being
moved across these checks.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #8 from Florian Weimer  ---
(In reply to Jakub Jelinek from comment #6)
> on the mov %fs:0x10,%rax perhaps %fs isn't initialized yet?

Yes, that's why the access is guarded by flags & DL_LOOKUP_GSCOPE_LOCK. During
initial relocation, _dl_lookup_symbol_x is called without this flag.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #7 from Martin Liška  ---
Hm, adding -fno-ipa-sra does not help while -O1 does in order to remove the
miscompilation.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #6 from Jakub Jelinek  ---
The crash seems to be:
=> 0x77fdbe20 <_dl_lookup_symbol_x+0>:  push   %r15
   0x77fdbe22 <_dl_lookup_symbol_x+2>:  push   %r14
   0x77fdbe24 <_dl_lookup_symbol_x+4>:  push   %r13
   0x77fdbe26 <_dl_lookup_symbol_x+6>:  push   %r12
   0x77fdbe28 <_dl_lookup_symbol_x+8>:  mov%rdi,%r12
   0x77fdbe2b <_dl_lookup_symbol_x+11>: push   %rbp
   0x77fdbe2c <_dl_lookup_symbol_x+12>: mov%rdx,%rbp
   0x77fdbe2f <_dl_lookup_symbol_x+15>: push   %rbx
   0x77fdbe30 <_dl_lookup_symbol_x+16>: mov%fs:0x10,%rax
   0x77fdbe39 <_dl_lookup_symbol_x+25>: sub$0xa8,%rsp
End of assembler dump.
(gdb) bt
#0  _dl_lookup_symbol_x (undef_name=0x77ff414a "__vdso_clock_gettime",
undef_map=0x77ffe850, ref=0x7fffd598, 
symbol_scope=0x77ffebe8, version=0x7fffd5c0, type_class=0, flags=0,
skip_map=0x0) at dl-lookup.c:835
#1  0x77fd463f in dl_vdso_vsym (name=0x77ff414a
"__vdso_clock_gettime") at ../sysdeps/unix/sysv/linux/dl-vdso.h:52
#2  setup_vdso_pointers () at ../sysdeps/unix/sysv/linux/dl-vdso-setup.h:30
#3  dl_main (phdr=, phnum=12, user_entry=,
auxv=0x7fffdac8) at rtld.c:1620
#4  0x77feb1e7 in _dl_sysdep_start
(start_argptr=start_argptr@entry=0x7fffd8f0,
dl_main=dl_main@entry=0x77fd34e0 )
at ../elf/dl-sysdep.c:252
#5  0x77fd3015 in _dl_start_final (arg=0x7fffd8f0) at rtld.c:485
#6  _dl_start (arg=0x7fffd8f0) at rtld.c:578
#7  0x77fd2058 in _start ()

on the mov %fs:0x10,%rax perhaps %fs isn't initialized yet?

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #5 from Jakub Jelinek  ---
(In reply to Martin Liška from comment #3)
> Do we have a simple reproducer for the miscompiled Glibc library?

All one needs to do is with glibc git trunk
CC=trunk-gcc CXX=trunk-g++ ../configure --disable-sanity-check
make -jN
elf/ld.so ./libc.so.6
Segmentation fault (core dumped)

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

--- Comment #4 from Jakub Jelinek  ---
_value is passed to do_lookup_x call which does modify it in some
cases, e.g.
   result->s = sym;
   result->m = (struct link_map *) map;
or passes it to other function, do_lookup_unique, which can perform similar
changes.
But it happens only early:
  for (size_t start = i; *scope != 
# 854 "dl-lookup.c" 3 4
  ((void *)0)
# 854 "dl-lookup.c"
  ; start = 0, ++scope)
if (do_lookup_x (undef_name, new_hash, _hash, *ref,
   _value, *scope, start, version, flags,
   skip_map, type_class, undef_map) != 0)
  break;
and then current_value is no longer modified, so unless I've missed do_lookup_x
remembering the address of current_value somewhere, that optimization seems to
be ok.

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška  ---
Do we have a simple reproducer for the miscompiled Glibc library?

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I see first real code changes between r11-5028 and r11-5029 during fre3 which
optimizes away some rereads of current_value.m in _dl_lookup_symbol_x, which
passes its address to other functions.
   # DEBUG BEGIN_STMT
   # PT = nonlocal escaped null 
   _47 = current_value.m;
   _48 = _47->l_used;
   _49 = _48 == 0;
...
   _335 = val.m;
-  # PT = nonlocal escaped null 
-  _336 = current_value.m;
-  if (_335 != _336)
+  if (_47 != _335)
...
[local count: 3147328]:
-  # PT = nonlocal escaped null 
-  _383 = current_value.m;
-  iftmp.45_384 = _383->l_map_start;
+  iftmp.45_384 = _47->l_map_start;
...
-  # PT = nonlocal escaped null 
-  _120 = current_value.m;

[local count: 114862414]:
   # PT = nonlocal escaped null 
-  # _66 = PHI <0B(30), _116(153), _120(216)>
+  # _66 = PHI <0B(30), _116(153), _47(216)>

(plus of course since the modref pass points to info changes).

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-02 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

Jakub Jelinek  changed:

   What|Removed |Added

 CC||romain.geissler at amadeus dot 
com

--- Comment #1 from Jakub Jelinek  ---
*** Bug 98106 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-02 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |11.0

[Bug tree-optimization/98110] [11 Regression] dl-lookup.c in glibc is miscompiled by r11-5029

2020-12-02 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98110

H.J. Lu  changed:

   What|Removed |Added

   Priority|P3  |P1