[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-07-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #17 from Richard Biener  ---
(In reply to Alexander Monakov from comment #16)
> In C11 and C++11 the issue of compiler-introduced racing loads is discussed
> as follows (5.1.2.4 Multi-threaded executions and data races in C11):
> 
> 28 NOTE 14 Transformations that introduce a speculative read of a
> potentially shared memory location may not preserve the semantics of the
> program as defined in this standard, since they potentially introduce a data
> race. However, they are typically valid in the context of an optimizing
> compiler that targets a specific machine with well-defined semantics for
> data races. They would be invalid for a hypothetical machine that is not
> tolerant of races or provides hardware race detection.
> 
> 
> So for TSan we'd allow hoisting only after TSan instrumentation, and for
> Helgrind we'd ask them to handle load-load-cmov combo as only consuming one
> of the loads?

Yes.  I think for testing optimized code (aka production code) Helgrind
is more useful while TSan could as well be used with -O0.  Note I've
never had the need to use either of them.

> I think the other optimizations from comment #8 introduce racing loads more
> rarely in practice, because they are limited to non-trapping accesses.

Yes, that's true.

Generally with the above note from the standard I wonder if we want to
have an option to control speculating loads (of "global" memory, aka
memory which address possibly escapes to another thread?).

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #16 from Alexander Monakov  ---
In C11 and C++11 the issue of compiler-introduced racing loads is discussed as
follows (5.1.2.4 Multi-threaded executions and data races in C11):

28 NOTE 14 Transformations that introduce a speculative read of a potentially
shared memory location may not preserve the semantics of the program as defined
in this standard, since they potentially introduce a data race. However, they
are typically valid in the context of an optimizing compiler that targets a
specific machine with well-defined semantics for data races. They would be
invalid for a hypothetical machine that is not tolerant of races or provides
hardware race detection.


So for TSan we'd allow hoisting only after TSan instrumentation, and for
Helgrind we'd ask them to handle load-load-cmov combo as only consuming one of
the loads?


I think the other optimizations from comment #8 introduce racing loads more
rarely in practice, because they are limited to non-trapping accesses.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #15 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Richard Biener
:

https://gcc.gnu.org/g:064a4ec9d7107c8278a759cb61dc161548c846ee

commit r13-7612-g064a4ec9d7107c8278a759cb61dc161548c846ee
Author: Richard Biener 
Date:   Wed Jul 26 09:28:10 2023 +0200

tree-optimization/110799 - fix bug in code hoisting

Code hoisting part of GIMPLE PRE failed to adjust the TBAA behavior
of common loads in the case the alias set of the ref was the same
but the base alias set was not.  It also failed to adjust the
base behavior, assuming it would match.  The following plugs this
hole.

PR tree-optimization/110799
* tree-ssa-pre.cc (compute_avail): More thoroughly match
up TBAA behavior of redundant loads.

* gcc.dg/torture/pr110799.c: New testcase.

(cherry picked from commit 565e0e80660dac24e5193873ca07542b99cd8bc3)

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #14 from Richard Biener  ---
The PRE/hoisting bug is fixed now.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:565e0e80660dac24e5193873ca07542b99cd8bc3

commit r14-2783-g565e0e80660dac24e5193873ca07542b99cd8bc3
Author: Richard Biener 
Date:   Wed Jul 26 09:28:10 2023 +0200

tree-optimization/110799 - fix bug in code hoisting

Code hoisting part of GIMPLE PRE failed to adjust the TBAA behavior
of common loads in the case the alias set of the ref was the same
but the base alias set was not.  It also failed to adjust the
base behavior, assuming it would match.  The following plugs this
hole.

PR tree-optimization/110799
* tree-ssa-pre.cc (compute_avail): More thoroughly match
up TBAA behavior of redundant loads.

* gcc.dg/torture/pr110799.c: New testcase.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #12 from Jakub Jelinek  ---
For TSAN, any optimization that hoists loads like that after the tsan/tsan0
passes doesn't really matter and for those that do happen before the
possibilities are either to disable those optimizations when being asked for
SANITIZE_THREAD instrumentation, or mark such loads some special way so that
the tsan/tsan0 passes would consider them as speculative.  Disabling those
optimizations seems far easier IMHO.
For valgrind, I think it is valgrind that when seeing these load vs. store
races actually checks if the load is actually used.  It certainly wouldn't be
the first place
where it does something like that, we already have e.g. vectorization which can
load from some bytes after an end of malloced area or other ends of an object
and it matters if bits from those bytes are ever used later or not.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #11 from rguenther at suse dot de  ---
On Wed, 26 Jul 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
> 
> --- Comment #9 from Alexander Monakov  ---
> (In reply to Tom de Vries from comment #7)
> > Can you elaborate on what you consider a correct approach?
> 
> I think this optimization is incorrect and should be active only under -Ofast.
> 
> I can offer two arguments. First, even without considering correctness,
> breaking TSan and Helgrind is a substantial QoI issue and we should consider
> shielding -O2 users from that (otherwise they'll discover it the hard way,
> curse at us, stick -fno-hoist-adjacent-loads in their build system and 
> consider
> switching to another compiler).
> 
> Second, I can upgrade the initial example to an actual miscompilation. The
> upgrade is based on two considerations: the optimization works on
> possibly-trapping accesses, and relies on types of memory references to decide
> if it's safe, but it runs late where the types are not what they were in the C
> source. Hence, the following example:
> 
> struct S {
> int a;
> };
> struct M {
> int a, b;
> };
> 
> int f(struct S *p, int c, int d)
> {
> int r;
> if (c)
> if (d)
> r = p->a;
> else
> r = ((struct M*)p)->a;
> else
> r = ((struct M*)p)->b;
> return r;
> }
> 
> is miscompiled to
> 
> f:
> mov eax, DWORD PTR [rdi+4]
> testesi, esi
> cmovne  eax, DWORD PTR [rdi]
> ret
> 
> even though the original program never accesses beyond struct S if 'c && d'.
> Phi-opt incorrectly performs hoisting after PRE collapses 'if (d) ... else
> ...'.

It looks like a bug in PRE (as well) though.  It shouldn't be an
argument to ditch the adjacent load hoisting alltogether.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #10 from rguenther at suse dot de  ---
On Tue, 25 Jul 2023, vries at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
> 
> --- Comment #7 from Tom de Vries  ---
> (In reply to Alexander Monakov from comment #5)
> > This trips Valgrind's data race detector (valgrind --tool=helgrind) too. So
> > I don't think checking SANITIZE_THREAD is the correct approach.
> 
> Can you elaborate on what you consider a correct approach?

If we decide that speculating loads is a no-go we should instead go
for a flag similar to -fallow-store-data-races.  Maybe
-fno-speculate-loads or so.  I do wonder whether TSAN instrumentation
can be made more clever here, like instrument the load together
with the condition under which the loaded value will be used?

The only downside of speculating loads is possible cachline/locking
conflicts across CPU cores and thus impacted runtime performance.
So this isn't really a wrong-code issue.

How do helgrind / TSAN handle masked loads (and stores) in this
context?  I think those still possibly have the runtime impact,
on the masked store side I'm not sure how those interact with
concurrent atomic operations on the masked parts.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #9 from Alexander Monakov  ---
(In reply to Tom de Vries from comment #7)
> Can you elaborate on what you consider a correct approach?

I think this optimization is incorrect and should be active only under -Ofast.

I can offer two arguments. First, even without considering correctness,
breaking TSan and Helgrind is a substantial QoI issue and we should consider
shielding -O2 users from that (otherwise they'll discover it the hard way,
curse at us, stick -fno-hoist-adjacent-loads in their build system and consider
switching to another compiler).

Second, I can upgrade the initial example to an actual miscompilation. The
upgrade is based on two considerations: the optimization works on
possibly-trapping accesses, and relies on types of memory references to decide
if it's safe, but it runs late where the types are not what they were in the C
source. Hence, the following example:

struct S {
int a;
};
struct M {
int a, b;
};

int f(struct S *p, int c, int d)
{
int r;
if (c)
if (d)
r = p->a;
else
r = ((struct M*)p)->a;
else
r = ((struct M*)p)->b;
return r;
}

is miscompiled to

f:
mov eax, DWORD PTR [rdi+4]
testesi, esi
cmovne  eax, DWORD PTR [rdi]
ret

even though the original program never accesses beyond struct S if 'c && d'.
Phi-opt incorrectly performs hoisting after PRE collapses 'if (d) ... else
...'.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #8 from rguenther at suse dot de  ---
On Tue, 25 Jul 2023, amonakov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
> 
> Alexander Monakov  changed:
> 
>What|Removed |Added
> 
>  CC||amonakov at gcc dot gnu.org
> 
> --- Comment #5 from Alexander Monakov  ---
> (In reply to Richard Biener from comment #1)
> > We consider introducing load data races OK, what's the difference here? 
> > There are other passes that would do similar things but in practice the
> > loads would be considered to possibly trap so the real-world impact might be
> > limited?
> 
> What are the examples of other transforms that can introduce data races?

Off-head it would be loop invariant motion and partial-PRE, loop
if-conversion and if-combine.  All of those could speculate loads
when there's no trapping possibility but the values wouldn't be used
when not used without the transform.

> This trips Valgrind's data race detector (valgrind --tool=helgrind) too. So I
> don't think checking SANITIZE_THREAD is the correct approach.

I can see that it's difficult for those tools to avoid those false
positives but eventually valgrind might be able to see the values loaded
are not used (hopefully the register content is overwritten "soon").

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #7 from Tom de Vries  ---
(In reply to Alexander Monakov from comment #5)
> This trips Valgrind's data race detector (valgrind --tool=helgrind) too. So
> I don't think checking SANITIZE_THREAD is the correct approach.

Can you elaborate on what you consider a correct approach?

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #6 from Tom de Vries  ---
(In reply to rguent...@suse.de from comment #4)
> I'm suggesting to not fix it ;) 

Can you explain why ?

It doesn't look difficult to fix to me, and I don't see any downsides.

> That said, is TSAN a useful vehicle?

Well, false positives aside, yes.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #5 from Alexander Monakov  ---
(In reply to Richard Biener from comment #1)
> We consider introducing load data races OK, what's the difference here? 
> There are other passes that would do similar things but in practice the
> loads would be considered to possibly trap so the real-world impact might be
> limited?

What are the examples of other transforms that can introduce data races?

This trips Valgrind's data race detector (valgrind --tool=helgrind) too. So I
don't think checking SANITIZE_THREAD is the correct approach.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #4 from rguenther at suse dot de  ---
On Tue, 25 Jul 2023, vries at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
> 
> --- Comment #2 from Tom de Vries  ---
> (In reply to Richard Biener from comment #1)
> > We consider introducing load data races OK, what's the difference here? 
> 
> This is a load vs. store data race.
> 
> > There are other passes that would do similar things but in practice the
> > loads would be considered to possibly trap so the real-world impact might be
> > limited?
> 
> If you're suggesting to fix this in a more generic way, I'm all for it.

I'm suggesting to not fix it ;)  That said, is TSAN a useful vehicle?

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #3 from Jakub Jelinek  ---
The data race is harmless as we don't use the value.
&& (flag_sanitize & SANITIZE_THREAD) == 0
is not right, it should be
&& !sanitize_flags_p (SANITIZE_THREAD)
or so.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #2 from Tom de Vries  ---
(In reply to Richard Biener from comment #1)
> We consider introducing load data races OK, what's the difference here? 

This is a load vs. store data race.

> There are other passes that would do similar things but in practice the
> loads would be considered to possibly trap so the real-world impact might be
> limited?

If you're suggesting to fix this in a more generic way, I'm all for it.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
We consider introducing load data races OK, what's the difference here?  There
are other passes that would do similar things but in practice the loads would
be considered to possibly trap so the real-world impact might be limited?