[Bug target/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

--- Comment #15 from Alexander Cherepanov  ---
(In reply to rguent...@suse.de from comment #14)
> From a language Pov that's the same.
> But memcpy and friends work on any dynamic type so have to copy all bytes.
Sorry, I don't understand. Bug 61872, comment 1, and bug 92486, comment 9, look
the same -- a memset followed by an assignment.  But the former is with long
double while the latter is with a struct.

You mean that assignments of structs are equivalent to memcpy while assignments
of long doubles aren't? Why the difference?

[Bug target/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

--- Comment #13 from Alexander Cherepanov  ---
(In reply to rguent...@suse.de from comment #11)
> I think if the user writes a long double store then padding becomes
> undefined so the testcase in comment#1 in PR61872 is technically
> undefined IMHO.

Sure, but why not the same for structs?

[Bug target/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

--- Comment #10 from Alexander Cherepanov  ---
The case of assignment+memcpy -- testcases in comment 0, in pr92824 and similar
-- is fixed.

But the case of memset+assignment -- pr93270 and pr61872 (these seem to be
dups) -- is not fixed. Is it supposed to be fixed?

Before, I've seen somewhat contradictory approaches: bug 92486, comment 12,
says that memset+assignment should set padding in structs, while bug 93270,
comment 4, implies that memset+assignment shouldn't set padding in long double.
I'm in no way trying to imply that memset+assignment should or shouldn't be
fixed, just wondering whether there is a difference between the two cases.
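For concreteness, a minimal side-by-side sketch of the two patterns being
compared (my own illustrative code, not the testcases from those bugs):

--
#include <string.h>

struct s { char c; int i; };

void struct_case(struct s *q)
{
    struct s w;
    memset(&w, 0, sizeof w);  // bug 92486, comment 12: padding expected to stay zeroed ...
    w = *q;                   // ... even across this assignment
}

void long_double_case(long double y)
{
    long double x;
    memset(&x, 0, sizeof x);  // bug 93270, comment 4: the assignment below may leave ...
    x = y;                    // ... the padding undefined again
}
--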

[Bug tree-optimization/92486] Wrong optimization: padding in structs is not copied even with memcpy

2020-03-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92486

--- Comment #18 from Alexander Cherepanov  ---
Adding a `memset` makes trunk fail too:

--
#include <stdio.h>
#include <string.h>

struct s {
char c;
int i;
};

__attribute__((noinline,noclone))
void f(struct s *p, struct s *q)
{
struct s w;

memset(&w, 0, sizeof(struct s));
w = *q;
memcpy(&w, q, sizeof(struct s));

*p = w;
memcpy(p, &w, sizeof(struct s));
}

int main()
{
struct s x;
memset(&x, 1, sizeof(struct s));

struct s y;
memset(&y, 2, sizeof(struct s));

f(&x, &y);

for (unsigned char *p = (unsigned char *)&x; p < (unsigned char *)&x +
sizeof(struct s); p++)
printf("%d", *p);
printf("\n");
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out

$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1222
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200320 (experimental)
--

[Bug middle-end/94206] New: Wrong optimization: memset of n-bit integer types (from bit-fields) is truncated to n bits (instead of sizeof)

2020-03-17 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94206

Bug ID: 94206
   Summary: Wrong optimization: memset of n-bit integer types
(from bit-fields) is truncated to n bits (instead of
sizeof)
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Bit-fields of different widths have their own types in gcc. And `memset` seems
to be optimized strangely for them. Even at `-O0`!

This is not standard C due to the use of `typeof` but this topic is starting to
be more interesting in light of N2472 "Adding Fundamental Type for N-bit
Integers".

--
#include <stdio.h>
#include <string.h>

struct {
unsigned long x:33;
} s;
typedef __typeof__(s.x + 0) uint33;

int main()
{
uint33 x;
printf("sizeof = %zu\n", sizeof x);

memset(&x, -1, sizeof x);

unsigned long u;
memcpy(&u, &x, sizeof u);
printf("%lx\n", u);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
sizeof = 8
1ffffffff
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
sizeof = 8
1ffffffff
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200317 (experimental)
--

The right result is `ffffffffffffffff`.

[Bug target/61872] Assigning to "long double" causes memset to be improperly elided

2020-03-15 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61872

Alexander Cherepanov  changed:

   What|Removed |Added

 CC||ch3root at openwall dot com

--- Comment #4 from Alexander Cherepanov  ---
This looks like a dup of pr93270, so:
- this is invalid because assignments can change padding;
- this is probably fixed together with pr94103.

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-03-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #45 from Alexander Cherepanov  ---
(In reply to Vincent Lefèvre from comment #44)
> (In reply to Alexander Cherepanov from comment #43)
> > GCC on x86-64 uses the binary encoding for the significand.
> 
> In general, yes. This includes the 32-bit ABI under Linux. But it seems to
> be different under MS-Windows, at least with MinGW using the 32-bit ABI:
> according to my tests of MPFR,
> 
> MPFR config.status 4.1.0-dev
> configured by ./configure, generated by GNU Autoconf 2.69,
>   with options "'--host=i686-w64-mingw32' '--disable-shared'
> '--with-gmp=/usr/local/gmp-6.1.2-mingw32' '--enable-assert=full'
> '--enable-thread-safe' 'host_alias=i686-w64-mingw32'"
> [...]
> CC='i686-w64-mingw32-gcc'
> [...]
> [tversion] Compiler: GCC 8.3-win32 20191201
> [...]
> [tversion] TLS = yes, float128 = yes, decimal = yes (DPD), GMP internals = no
> 
> i.e. GCC uses DPD instead of the usual BID.

Strange, I tried mingw from stable Debian on x86-64 and see it behaving the
same way as the native gcc:

$ echo 'int main() { return (union { _Decimal32 d; int i; }){0.df}.i; }' >
test.c

$ gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ gcc --version | head -n 1
gcc (Debian 8.3.0-6) 8.3.0

$ x86_64-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h
return test.out
  return 847249408;
$ x86_64-w64-mingw32-gcc --version | head -n 1
x86_64-w64-mingw32-gcc (GCC) 8.3-win32 20190406

$ i686-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h
return test.out
  return 847249408;
$ i686-w64-mingw32-gcc --version | head -n 1
i686-w64-mingw32-gcc (GCC) 8.3-win32 20190406

Plus some other cross-compilers:

$ powerpc64-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h
return test.out
  return 575668224;
$ powerpc64-linux-gnu-gcc --version | head -n 1
powerpc64-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ powerpc64le-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep
-h return test.out
  return 575668224;
$ powerpc64le-linux-gnu-gcc --version | head -n 1
powerpc64le-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ s390x-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h
return test.out
  return 575668224;
$ s390x-linux-gnu-gcc --version | head -n 1
s390x-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

AIUI the value 847249408 (= 0x32800000) is right for 0.df with BID and
575668224 (= 0x22500000) is right with DPD.

> > So the first question: does any platform (that gcc supports) use the decimal
> > encoding for the significand (aka densely packed decimal encoding)?
> 
> DPD is also used on PowerPC (at least the 64-bit ABI), as these processors
> now have hardware decimal support.

Oh, this means that cohorts differ by platform.

> > Then, the rules about (non)propagation of some encodings blur the boundary
> > between values and representations in C. In particular this means that
> > different encodings are _not_ equivalent. Take for example the optimization
> > `x == C ? C + 0 : x` -> `x` for a constant C that is the unique member of
> > its cohort and that has non-canonical encodings (C is an infinity according
> > to the above analysis). Not sure about encoding of literals but the result
> > of addition `C + 0` is required to have canonical encoding. If `x` has
> > non-canonical encoding then the optimization is invalid.
> 
> In C, it is valid to choose any possible encoding. Concerning the IEEE 754
> conformance, this depends on the bindings. But IEEE 754 does not define the
> ternary operator. It depends whether C considers encodings before or
> possibly after optimizations (in the C specification, this does not matter,
> but when IEEE 754 is taken into account, there may be more restrictions).

The ternary operator is not important, let's replace it with `if`:

--
#include <math.h>

_Decimal32 f(_Decimal32 x)
{
_Decimal32 inf = (_Decimal32)INFINITY + 0;

if (x == inf)
return inf;
else
return x;
}
--

This is optimized into just `return x;`.

> > While at it, convertFormat is required to return canonical encodings, so
> > after `_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have
> > canonical encoding? But these casts are nop in gcc now.
> 
> A question is whether casts are regarded as explicit convertFormat
> operations 

N2478, a recent draft of C2x, lists bindings in F.3 and "convertFormat -
different formats" corresponds to "cast and implicit conversions". Is this
enough?

BTW "convertFormat - same format" corresponds to "canonicalize", so I guess a
cast to the same type is not required to canonicalize.

> or whether simplification is allowed as it does not affect the
> value, in which case the canonicalize() function would be needed here. 

Not sure what this means.

> And
> in any case, when 

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-03-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #43 from Alexander Cherepanov  ---
Joseph, Vincent, thanks a lot for the crash course in decimal floating-point.
Indeed, quite interesting types. Findings so far: bug 94035, comment 5, bug
94111, bug 94122.

Let me try to summarize the understanding and to get closer to the C
terminology. Please correct me if I'm wrong.

Different representations in the IEEE 754-2019 speak are different values in
the C speak, e.g. 1.0DF and 1.00DF are different values in _Decimal32 (in
particular, the assignment operator is required to preserve the difference).
The set of values corresponding to the same number is a cohort. Cohorts of
non-zero non-inf non-nan values in _Decimal32 have from 1 to 7 elements. Both
infinities have only 1 element in their cohorts. Both zeros have many more
elements in their cohorts (with all possible quantum exponents -- 192 for
_Decimal32).

Some values admit several representations (in the C speak; it's encodings in
the IEEE speak). GCC on x86-64 uses the binary encoding for the significand.
Hence, non-zero non-inf non-nan values have only one representation each.
Significands exceeding the maximum are treated as zero which gives many
(non-canonical) representations for each of many zero values. Inf and nan
values have many representations too (due to ignored trailing exponent and/or
significand).

So the first question: does any platform (that gcc supports) use the decimal
encoding for the significand (aka densely packed decimal encoding)?

Then, the rules about (non)propagation of some encodings blur the boundary
between values and representations in C. In particular this means that
different encodings are _not_ equivalent. Take for example the optimization `x
== C ? C + 0 : x` -> `x` for a constant C that is the unique member of its
cohort and that has non-canonical encodings (C is an infinity according to the
above analysis). Not sure about encoding of literals but the result of addition
`C + 0` is required to have canonical encoding. If `x` has non-canonical
encoding then the optimization is invalid.

While at it, convertFormat is required to return canonical encodings, so after
`_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have canonical
encoding? But these casts are nop in gcc now.

[Bug target/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

Alexander Cherepanov  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=94122

--- Comment #5 from Alexander Cherepanov  ---
(In reply to Richard Biener from comment #4)
> I agree that the decimal float variant is an entirely different bug,
> maybe you can open a new one for this?

Sure -- pr94122.

[Bug middle-end/94122] New: Wrong optimization: reading value of a decimal FP variable changes its representation for optimizer

2020-03-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94122

Bug ID: 94122
   Summary: Wrong optimization: reading value of a decimal FP
variable changes its representation for optimizer
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Split from bug 94103, comment 1.

It seems the optimizer sometimes computes the representation of a variable from
its value instead of tracking it directly. This is wrong when the value admits
different representations.
(Given that the value is used, the representation should be valid (non-trap).)

Example with decimal floating-point:

--
#include <stdio.h>
#include <string.h>

int main()
{
_Decimal32 x = 9999999; // maximum significand
unsigned u;

memcpy(&u, &x, sizeof u);
printf("%#08x\n", u);

++*(unsigned char *)&x; // create non-canonical representation of 0
(void)-x;

memcpy(&u, &x, sizeof u);
printf("%#08x\n", u);
}
--
$ gcc -std=c2x -pedantic -Wall -Wextra test.c && ./a.out
0x6cb8967f
0x6cb89680
$ gcc -std=c2x -pedantic -Wall -Wextra -O3 test.c && ./a.out
0x6cb8967f
0x32800000
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200305 (experimental)
--

Unoptimized results are right, optimized -- wrong.

[Bug middle-end/92824] Wrong optimization: representation of long doubles not copied even with memcpy

2020-03-09 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92824

Alexander Cherepanov  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #5 from Alexander Cherepanov  ---
I thought that the memcpy was ignored after the assignment here but the problem remains
even after replacing `y = x;` with `-x;`. After fre1 we get a normalized value:

  MEM <__int128 unsigned> [(char * {ref-all})] = 0x18000;

So this is an exact dup of pr94103.

*** This bug has been marked as a duplicate of bug 94103 ***

[Bug middle-end/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-09 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

--- Comment #3 from Alexander Cherepanov  ---
*** Bug 92824 has been marked as a duplicate of this bug. ***

[Bug middle-end/94111] New: Wrong optimization: decimal floating-point infinity casted to double -> zero

2020-03-09 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94111

Bug ID: 94111
   Summary: Wrong optimization: decimal floating-point infinity
casted to double -> zero
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Cast to double of a decimal floating-point infinity gives zero:

--
#include <math.h>
#include <stdio.h>
#include <string.h>

int main()
{
_Decimal32 d = (_Decimal32)INFINITY;

unsigned u;
memcpy(&u, &d, sizeof u);
printf("repr: %08x\n", u);

printf("cast: %g\n", (double)d);
}
--
$ gcc -std=c2x -pedantic -Wall -Wextra test.c && ./a.out
repr: 78000000
cast: inf
$ gcc -std=c2x -pedantic -Wall -Wextra -O3 test.c && ./a.out
repr: 78000000
cast: 0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200305 (experimental)
--

The representation is right for infinity in _Decimal32.

[Bug middle-end/94103] Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-09 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

--- Comment #1 from Alexander Cherepanov  ---
Example with decimal floating-point:

--
#include <stdio.h>
#include <string.h>

int main()
{
_Decimal32 x = 9999999; // maximum significand
unsigned u;

memcpy(&u, &x, sizeof u);
printf("%#08x\n", u);

++*(unsigned char *)&x; // create non-canonical representation of 0
(void)-x;

memcpy(&u, &x, sizeof u);
printf("%#08x\n", u);
}
--
$ gcc -std=c2x -pedantic -Wall -Wextra test.c && ./a.out
0x6cb8967f
0x6cb89680
$ gcc -std=c2x -pedantic -Wall -Wextra -O3 test.c && ./a.out
0x6cb8967f
0x32800000
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200305 (experimental)
--

Unoptimized results are right, optimized -- wrong.

[Bug middle-end/94103] New: Wrong optimization: reading value of a variable changes its representation for optimizer

2020-03-09 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94103

Bug ID: 94103
   Summary: Wrong optimization: reading value of a variable
changes its representation for optimizer
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

It seems the optimizer sometimes computes the representation of a variable from
its value instead of tracking it directly. This is wrong when the value admits
different representations.
(Given that the value is used, the representation should be valid (non-trap).)

Example with lost padding in x86-64 long double:

--
#include <stdio.h>
#include <string.h>

int main()
{
long double x;

// fill x including the padding
unsigned long u[2] = {0x, 0x};
memcpy(&x, &u, sizeof x);

// print the representation of x (initial)
memcpy(&u, &x, sizeof u);
printf("%016lX %016lX\n", u[1], u[0]);

// change the representation of x a bit
++*(unsigned char *)&x;
(void)-x; // use the value of x but don't write it

// print the representation of x (resulting)
memcpy(&u, &x, sizeof u);
printf("%016lX %016lX\n", u[1], u[0]);
}
--
$ gcc -std=c2x -pedantic -Wall -Wextra test.c && ./a.out
 
 EEEF
$ gcc -std=c2x -pedantic -Wall -Wextra -O3 test.c && ./a.out
 
 EEEF
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200305 (experimental)
--

Zeros in the last output line are wrong.

[Bug middle-end/94035] Wrong optimization: conditional equivalence vs. values with several representations

2020-03-05 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94035

--- Comment #5 from Alexander Cherepanov  ---
I see. But the problem with decimal floating-point formats remains...

Based on bug 93806, comment 41, here is an example with equal but different
values:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static _Decimal32 opaque(_Decimal32 d) { return d; }

int main()
{
_Decimal32 x = opaque(1.0DF);
unsigned char *p = (unsigned char *)&x;

if (x == 1.00DF)
printf("%d\n", p[0]);
printf("%d\n", p[0]);
}
--
$ gcc -std=c2x -pedantic -Wall -Wextra test.c && ./a.out
10
10
$ gcc -std=c2x -pedantic -Wall -Wextra -O3 test.c && ./a.out
100
10
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200305 (experimental)
--

And then there are non-canonical representations for DFP values...

[Bug middle-end/94035] Wrong optimization: conditional equivalence vs. values with several representations

2020-03-05 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94035

--- Comment #3 from Alexander Cherepanov  ---
(In reply to jos...@codesourcery.com from comment #2)
> I think pseudo-denormals should be considered trap representations.
Cool!

What about IBM extended double (double-double)? All cases where (double)(hi +
lo) != hi are trap representations too?

But there is one case which you once mentioned as valid -- when the low part is
zero, this zero could be of any sign.

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static long double opaque(long double d) { return d; }

int main()
{
long double x = -opaque(1);
unsigned char *px = (unsigned char *)&x;

if (x == -1)
printf("px[8] = %d\n", px[8]);
printf("px[8] = %d\n", px[8]);
}
--
$ powerpc64-linux-gnu-gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c &&
qemu-ppc64 /usr/powerpc-linux-gnu/lib64/ld64.so.1 --library-path
/usr/powerpc-linux-gnu/lib64 ./a.out
px[8] = 0
px[8] = 128
--
gcc x86-64 version: powerpc64-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0
--

Here the low part of x is -0., and px[8] is a byte containing the sign bit of the
low part.

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-03-04 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #40 from Alexander Cherepanov  ---
(In reply to Vincent Lefèvre from comment #35)
> > You seem to say that either Annex F is fully there or not at all but why?
> > -fno-signed-zeros breaks Annex F but only parts of it. Isn't it possible to
> > retain the other parts of it? Maybe it's impossible or maybe it's impossible
> > to retain division by zero, I don't know. What is your logic here?
> 
> This issue is that the nice property x == y implies f(x) == f(y), in
> particular, x == y implies 1 / x == 1 / y is no longer valid with signed
> zeros. Thus one intent of -fno-signed-zeros could be to enable optimizations
> based on this property. But this means that division by zero becomes
> undefined behavior (like in C without Annex F). Major parts of Annex F would
> still remain valid.

I agree that the intent is to enable optimization based on the property "x == y
implies f(x) == f(y)". But I'm not sure what follows from this.

Sure, one possibility is to make undefined any program that uses f(x) where x
could be a zero and f(x) differs for the two zeros. But this approach makes printf
and memory accesses undefined too. Sorry, I don't see how you could undefine
division by zero while not undefining printing of zero.

Another approach is to say that we don't care which of the two possible values f(x)
returns when x is zero. That is, we don't care whether 1/0. is +inf or -inf and we
don't care whether printf("%g", 0.) outputs 0 or -0.
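For reference, a minimal sketch of the two observations in question (plain
Annex F semantics, nothing GCC-specific assumed):

--
#include <stdio.h>

int main(void)
{
    double pz = 0.0, nz = -0.0;          // equal, but zeros of different signs

    printf("%g %g\n", 1 / pz, 1 / nz);   // inf -inf: the sign of zero picks the sign of the infinity
    printf("%g %g\n", pz, nz);           // 0 -0: printf also exposes the sign of zero
}
--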

> > This means that you cannot implement you own printf: if you analyze sign bit
> > of your value to decide whether you need to print '-', the sign of zero is
> > significant in your code.
> 
> If you want to implement a printf that takes care of the sign of 0, you must
> not use -fno-signed-zeros.

So if I use ordinary printf from a libc with -fno-signed-zeros it's fine but if
I copy its implementation into my own program it's not fine?

> > IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?
> 
> printf is not supposed to trigger undefined behavior. Part of its output is
> unspecified, but that's all.

Why couldn't the same be said about division? Division by zero is not supposed
to trigger undefined behavior. Part of its result (the sign of the infinity) is
unspecified, but that's all.

> > > * Memory analysis. Again, the sign does not matter, but for instance,
> > > reading an object twice as a byte sequence while the object has not been
> > > changed by the code must give the same result. I doubt that this is 
> > > affected
> > > by optimization.
> > 
> > Working with objects on byte level is often optimized too:
> 
> Indeed, there could be invalid optimization... But I would have thought that
> in such a case, the same kind of issue could also occur without
> -fno-signed-zeros. Indeed, if x == y, then this does not mean that x and y
> have the same memory representation. Where does -fno-signed-zeros introduce
> a difference?

Right. But it's well known that x == y doesn't imply that x and y have the same
value. And the only such case is zeros of different signs (right?). So
compilers deal with this case in a special way. (E.g., the optimization `if (x
== C) use(x)` -> `if (x == C) use(C)` is normally done only for non-zero FP
constant `C`. -fno-signed-zeros changes this.)

The idea that one value could have different representations is not widely
known. I didn't manage to construct a testcase for this yesterday but I
succeeded today -- see pr94035 (affects clang too).

The next level -- the same value, the same representation, different meaning.
E.g., pointers of different provenance. But that's another story:-)

> Note: There's also the case of IEEE 754 decimal floating-point formats (such
> as _Decimal64), for instance, due to the "cohorts", where two identical
> values can have different memory representations. Is GCC always correct here?

I have used pseudo-denormals in long double (x86_fp80) for this so far. Are
decimal floating-point formats more interesting?

[Bug middle-end/94035] Wrong optimization: conditional equivalence vs. values with several representations

2020-03-04 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94035

--- Comment #1 from Alexander Cherepanov  ---
clang bug -- https://bugs.llvm.org/show_bug.cgi?id=45101

[Bug middle-end/94035] New: Wrong optimization: conditional equivalence vs. values with several representations

2020-03-04 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94035

Bug ID: 94035
   Summary: Wrong optimization: conditional equivalence vs. values
with several representations
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

The problem happens when:
- conditional equivalence propagation replaces an expression with a variable or
a constant that has the same value but a different representation, and
- this happens in a computation where representation is accessed.

Example with a pseudo-denormal in long double:

--
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
long double x, y;

unsigned char *px = (unsigned char *)&x;
unsigned char *py = (unsigned char *)&y;

// make x pseudo-denormal
x = 0;
px[7] = 0x80;
opaque(&x); // hide it from the optimizer

y = x;

if (y == 0x1p-16382l)
printf("py[8] = %d\n", py[8]);
printf("py[8] = %d\n", py[8]);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
py[8] = 1
py[8] = 0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200304 (experimental)
--

The value 0x1p-16382l admits two representations:

00 00 80 00 00 00 00 00 00 00  pseudo-denormal
00 01 80 00 00 00 00 00 00 00  normalized value

So both 0 and 1 for py[8] are fine but the testcase should print the same value
both times, i.e. the representation of y should be stable.

DR 260 Q1 allows for unstable representation but IMO this is wrong.

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-03-03 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #34 from Alexander Cherepanov  ---
(In reply to Vincent Lefèvre from comment #13)
> > Since C without Annex F allows arbitrarily awful floating point results,
> 
> In C without Annex F, division by 0 is undefined behavior (really undefined
> behavior, not an unspecified result, which would be very different).
> 
> With the examples using divisions by 0, you need to assume that Annex F
> applies, but at the same time, with your interpretation, -fno-signed-zeros
> breaks Annex F in some cases, e.g. if you have floating-point divisions by
> 0. So I don't follow you...

You seem to say that either Annex F is fully there or not at all but why?
-fno-signed-zeros breaks Annex F but only parts of it. Isn't it possible to
retain the other parts of it? Maybe it's impossible or maybe it's impossible to
retain division by zero, I don't know. What is your logic here?

(In reply to Vincent Lefèvre from comment #15)
> Note that there are very few ways to be able to distinguish the sign of
> zero. The main one is division by zero. Other ones are:
> 
> * Conversion to a character string, e.g. via printf(). But in this case, if
> -fno-signed-zeros is used, whether "0" or "-0" is output (even in a way that
> seems to be inconsistent) doesn't matter since the user does not care about
> the sign of 0, i.e. "0" and "-0" are regarded as equivalent (IIRC, this
> would be a bit like NaN, which has a sign bit in IEEE 754, but the output
> does not need to match its sign bit).

This means that you cannot implement your own printf: if you analyze the sign bit
of your value to decide whether you need to print '-', the sign of zero is
significant in your code.
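As a sketch of what "analyze the sign bit" means here -- a hypothetical fragment
of a user-written formatter, not actual libc code:

--
#include <math.h>
#include <stdio.h>

// Decides whether to print '-' from the sign bit, so for a zero its
// output depends on which of the two zeros the compiler thinks we have.
static void print_num(double x)
{
    if (signbit(x))
        putchar('-');
    printf("%g\n", fabs(x));
}
--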

IOW why do you think that printf is fine while "1 / x == 1 / 0." is not?

> * Memory analysis. Again, the sign does not matter, but for instance,
> reading an object twice as a byte sequence while the object has not been
> changed by the code must give the same result. I doubt that this is affected
> by optimization.

Working with objects at the byte level is often optimized too:

--
#include <stdio.h>
#include <string.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
int zero = opaque(0);

double x = opaque(-0.);
long l;
memcpy(&l, &x, sizeof l);
int a = l == 0;
// or just this:
//int a = (union { double d; long l; }){x}.l == 0;

printf("zero = %d\n", zero);

opaque(a);
if (zero == a) {
opaque(0);
if (x == 0) {
opaque(0);
if (a) {
opaque(0);
if (zero == 1)
printf("zero = %d\n", zero);
}
}
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -fno-signed-zeros -O3 test.c && ./a.out
zero = 0
zero = 1
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200303 (experimental)
--

Bonus: bare -fno-signed-zeros is used here, without -fno-trapping-math.

[Bug middle-end/93848] missing -Warray-bounds warning for array subscript 1 is outside array bounds

2020-03-01 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93848

--- Comment #11 from Alexander Cherepanov  ---
(In reply to Martin Sebor from comment #10)
> An array is implicitly converted to a pointer; 

Sometimes converted, sometimes -- not.

> it's not an lvalue.  

Why not? Take for example p[1] that we discussed. It's equivalent to *(p+1) and
the unary * operator is explicitly documented to yield an lvalue (C11,
6.5.3.2p4):

"If the operand points to a function, the result is a function designator; if
it points to an object, the result is an lvalue designating the object."

> But I
> think we're splitting hairs.  I agree we want a warning for passing
> past-the-end pointers to functions that might inadvertently dereference it;
> I plan to implement it for GCC 11.

This is something more general and I'm not sure we want such a warning. You are
right that passing past-the-end pointers to functions is perfectly legal so
there would be false alarms. I cannot readily assess whether it's worth it.

OTOH the specific construction that started this PR is not legal and should
always give a warning, whether it's passed to an unknown function, printed
without additional dereferences or used in a computation.

> The reference in
> 
> int a[1][4];
> printf("%p\n", (void *)&a[1][1]);
> 
> is of course undefined, but when the warning sees the address-of operator it
> allows off-by-one indices.  That's necessary only for the rightmost index
> but not otherwise.  The missing warning here is the subject of pr84079.  I
> have a simple fix that handles this case.

Great, here we agree. But does it cover &a[1][0]?

[Bug middle-end/93848] missing -Warray-bounds warning for array subscript 1 is outside array bounds

2020-02-26 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93848

--- Comment #9 from Alexander Cherepanov  ---
(In reply to Martin Sebor from comment #8)
> In
> 
>   int i[4];
>   int (*p)[4] = &i;
>   bar_aux (p[1]);
> 
> p[0] points to i and p[1] to (char*)&i + sizeof (i) (which is the same as
> &i[4]).

It seems we start to differ here. p[1] itself doesn't point anywhere. It's an
array, it has type int[4], its size is 4*sizeof(int). (That's easy to check.)
It's an lvalue that "potentially designates an object". Given that there is no
real object here, its evaluation is UB (C11, 6.3.2.1p1).

IOW there is no pointer here, UB happens before we get a pointer.

>  The latter is a pointer just past the end of i.  Evaluating
> past-the-end pointers is well-defined, as is all pointer arithmetic on them,
> just as long as the result is also a valid pointer to the same object (or
> just past its end).  The only way past-the-end pointers differ from others
> is that the former cannot be dereferenced (by any means, either the *
> operator, or [] or ->).

Right, but this is applicable to p+1, not to p[1]. Indeed, p+1 is a
past-the-end pointer, it points past the end of the array consisting of one
element (i). In particular, you can add 0 and -1 to it to roam over the array.
And you are right that you cannot apply * to it. That is, there is *(p+0) (it's
i) but there is no *(p+1). And p[1] is *(p+1) by definition, so evaluating it
is UB.
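In code, the distinction drawn above looks like this (a minimal sketch; the
commented-out line is the evaluation claimed to be UB):

--
void demo(void)
{
    int i[4];
    int (*p)[4] = &i;

    int (*q)[4] = p + 1;   // ok: past-the-end pointer
    int (*r)[4] = q - 1;   // ok: points back to i, same as p
    // int *s = p[1];      // = *(p + 1); evaluating this lvalue is the UB in question
    (void)q; (void)r;
}
--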

> In
> int a[1][4];
> printf("%p\n", (void *)&a[1][1]);
> 
> on the other hand, the reference to a[1][1] is undefined because a[1] is not
> a reference to an element in a but rather just past the end of a, and so it
> can be neither dereferenced nor used to obtain pointers other than to a or
> just past it.

Sorry I don't quite understand. Are you saying that a[1]-1 is valid? And the
result is a?

> &a[1] alone is valid (it's equivalent to (char*)&a + sizeof
> a) and points just past the end of a, but &a[1][1] is equivalent to
> (char*)&a + sizeof a + 1 which is not valid.

So we at least agree that &a[1][1] is UB and should give a warning and that
-Warray-bounds doesn't work as intended here?

[Bug middle-end/93848] missing -Warray-bounds warning for array subscript 1 is outside array bounds

2020-02-25 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93848

Alexander Cherepanov  changed:

   What|Removed |Added

 CC||ch3root at openwall dot com

--- Comment #7 from Alexander Cherepanov  ---
I agree that the original example exhibits UB. One of the violated norms that
was not yet mentioned is C11, 6.3.2.1p1:

"if an lvalue does not designate an object when it is evaluated, the behavior
is undefined"

To go further in this direction, let's compare arrays and structs:

char (*p)[2] = malloc(1); ... use (*p)[0]
struct { char x, y; } *q = malloc(1); ... use (*q).x

Are these valid? Do structs differ? DR 073[1], items A, B, C, F, G, H, says
that the . operator requires a complete structure as its left operand but fails
to address the issue with an array directly. IMHO arrays should not differ.

[1] http://open-std.org/jtc1/sc22/wg14/www/docs/dr_073.html

OTOH suppose that p[1] is not UB per se in the original example. What is the
result of its decay? C11, 6.3.2.1p3 says that it "points to the initial element
of the array object". But there is no array object here. Then, which operations
are allowed for this pointer? p[1]+0 is ok? Writing it as &p[1][0] is ok? What
about p[1]+1 or &p[1][1]? gcc doesn't warn about it:

--
#include <stdio.h>

int main()
{
int a[1][4];
printf("%p\n", (void *)&a[1][1]);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Warray-bounds=2 -O3 test.c && ./a.out
0x7ffc5904aa04
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200225 (experimental)
--

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-21 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #24 from Alexander Cherepanov  ---
(In reply to Vincent Lefèvre from comment #11)
> But what does "internal consistency" mean?
That's a good question. Here we talk about cases (like
-funsafe-math-optimizations) that are not covered by any standards. Other PRs
(like pr61502 or pr93301) discuss possible changes to the standard. So we need
some basic rules to decide what is good and what is bad.

pr61502 taught me that discussing which value is the right result for a
particular computation is very interesting but not very conclusive. So now I'm
looking for contradictions. If you can derive a contradiction then you can
derive any statement, so it's an ultimate goal. How to apply this to a
compiler? I thought the following is supposed to always hold: if you explicitly
write a value into a variable (of any type) then you should get back the same
value at every future read no matter how the results of other reads are used or
what control flow happens (without other writes to the variable, of course).
That is, after `int x = 0;` all `printf("x = %d", x);` should output the same
no matter how many `if (x == ...)` there are in-between -- either `printf`
doesn't fire at all or it prints `x = 0`. If we demonstrated that it's broken
then we demonstrated a contradiction (nonsense). And I hoped that it would be
uncontroversial:-(
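Stated as code, the invariant assumed above is just this (a trivial sketch, not
a failing testcase):

--
#include <stdio.h>

int main(void)
{
    int x = 0;               // value written explicitly, never written again

    printf("x = %d\n", x);   // must print x = 0
    if (x == 123)            // no matter how other reads of x are used ...
        puts("unreachable");
    printf("x = %d\n", x);   // ... this must also print x = 0
}
--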

Sometimes it's possible to raise the bar even higher and to construct a
testcase where `if` connecting the problematic part with the "independent"
variable is hidden in non-executed code in such a way that loop unswitching
will move it back into executable part (see bug 93301, comment 6 for an
example).

OTOH I think the bar should be lowered in gcc and I hope it would be possible
to come to an agreement that all integers in gcc should be stable. That is, in
this PR the testcase in comment 0 should be enough to demonstrate the problem,
without any need for a testcase in comment 1. It's quite easy to get the latter
from the former so this agreement doesn't seem very important. Much more
important to agree on the general principle described above.

It's always possible that any particular testcase is broken itself. Then some
undefined behavior should be pointed out. So I totally support how you assessed
my testcase from comment 8. We can disagree whether it's UB (I'll get to this a
bit later) but we agree that it's either UB or the program should print
something sane. What I don't understand is what is happening with my initial
testcases.

(In reply to Richard Biener from comment #3)
> But you asked for that.  So no "wrong-code" here, just another case
> of "instabilities" or how you call that via conditional equivalence
> propagation.
Just to be sure: you are saying that everything works as intended -- the
testcase doesn't contain UB and the result it prints is one of the allowed? (It
could also be read as implying that this pr is a dup of another bug about
conditional equivalence propagation or something.) Then we just disagree.

Discussion went to specifics of particular optimizations. IMHO it's not
important at all for deciding whether comment 1 demonstrates a bug or not.
Again, IMHO either the testcase contains UB or it shouldn't print nonsense (in
the sense described above). And it doesn't matter which options are used,
whether it's a standards compliant mode, etc.

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #16 from Alexander Cherepanov  ---
On 21/02/2020 03.40, vincent-gcc at vinc17 dot net wrote:
> Concerning -fno-signed-zeros, your code has undefined behavior since the sign
> of zero is significant in "1 / x == 1 / 0.", i.e. changing the sign of 0
> changes the result. If you use this option, you are telling GCC that this
> cannot be the case. Thus IMHO, this is not a bug.

If I understand correctly what you are saying, then one cannot even do
`printf("%f", 0);` because changing the sign of 0 will add a minus in front of
0. This seems kinda restrictive.

> I would say that -fno-trapping-math should have no effect because there are no
> traps by default (for floating point).

With -fno-trapping-math gcc simplifies `1 / x == 1 / 0.` into `1 / x >
1.79769313486231570814527423731704356798070567525844996599e+308`.

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #8 from Alexander Cherepanov  ---
A similar problem happens with -fno-signed-zeros -fno-trapping-math. Not sure
if a separate PR should be filed...

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
int zero = opaque(0);

double x = opaque(-0.);
int a = 1 / x == 1 / 0.;

printf("zero = %d\n", zero);

opaque(a);
if (zero == a) {
opaque(0);
if (x == 0) {
opaque(0);
if (a) {
opaque(0);
if (zero == 1)
printf("zero = %d\n", zero);
}
}
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -fno-signed-zeros -fno-trapping-math -O3
test.c && ./a.out
zero = 0
zero = 1
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200220 (experimental)
--

[Bug middle-end/90248] [8/9/10 Regression] larger than 0 compare fails with -ffinite-math-only -funsafe-math-optimizations

2020-02-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90248

Alexander Cherepanov  changed:

   What|Removed |Added

 CC||ch3root at openwall dot com

--- Comment #13 from Alexander Cherepanov  ---
(In reply to Andrew Pinski from comment #7)
> I copied an optimization from LLVM so I
> think they also mess up a similar way (though differently).
I looked into reporting this problem to llvm but I don't see it there. In the
current llvm sources I can only find this:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2348

  // If needed, negate the value that will be the sign argument of the
copysign:
  // (bitcast X) <  0 ? -TC :  TC --> copysign(TC,  X)
  // (bitcast X) <  0 ?  TC : -TC --> copysign(TC, -X)
  // (bitcast X) >= 0 ? -TC :  TC --> copysign(TC, -X)
  // (bitcast X) >= 0 ?  TC : -TC --> copysign(TC,  X)

AIUI `bitcast` here means a bitcast to an integer type. For example, this:

--
union u { double d; long l; };

double f(double x)
{
return (union u){x}.l >= 0 ? 2.3 : -2.3;
}
--

is optimized into this:

--
; Function Attrs: nounwind readnone uwtable
define dso_local double @f(double %0) local_unnamed_addr #0 {
  %2 = tail call double @llvm.copysign.f64(double 2.30e+00, double %0)
  ret double %2
}
--

Did I miss something?

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-19 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #6 from Alexander Cherepanov  ---
I agree that every separate optimization here is quite natural. At the same
time, the end result is quite unnatural.

The following is a look at the situation from an outsider POV.

-funsafe-math-optimizations makes floating-point values unpredictable and
unstable. That's fine. And gcc is ready for this. For example, it doesn't
propagate conditional equivalence for them (signed zeros would complicate it
but -funsafe-math-optimizations enables -fno-signed-zeros).

OTOH gcc assumes that integers are stable. And conditional equivalence
propagation is fine for them.

Nonsense results start to appear when the boundary between FPs and integers
blurs. So an evident solution would be to stop replacing integers by FP
computations. E.g., in this particular case, don't substitute FP equality in
place of non-first instances of `a`; all instances of `a` should use the result
of the same FP computation.

This approach will also solve some problems with x87 (pr85957, pr93681) and
with -mpc64 (pr93682). It will not make them conformant but at least it will
fix some egregious issues.

[Bug c/85957] i686: Integers appear to be different, but compare as equal

2020-02-18 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957

--- Comment #28 from Alexander Cherepanov  ---
The -funsafe-math-optimizations option has a similar problem (on all
processors, I assume) -- I've just filed pr93806 for it. I guess unstable FP
results are essential for this mode but integers computed from FPs should be
somehow guarded from instability.

I don't know if anybody cares about -funsafe-math-optimizations but if it's
fixed then x87 will get this improvement for free:-)

[Bug middle-end/93806] Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-18 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #1 from Alexander Cherepanov  ---
And instability of integers then easily taints surrounding code:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
int one = opaque(1);

int x = opaque(1);
int a = 1 + opaque(0x1p-60) == x;

printf("one = %d\n", one);

opaque(a);
if (one == a) {
opaque(0);
if (x == 1) {
opaque(0);
if (a == 0) {
opaque(0);
if (one == 0)
printf("one = %d\n", one);
}
}
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -funsafe-math-optimizations -O3 test.c
&& ./a.out
one = 1
one = 0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200218 (experimental)
--

[Bug middle-end/93806] New: Wrong optimization: instability of floating-point results with -funsafe-math-optimizations leads to nonsense

2020-02-18 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

Bug ID: 93806
   Summary: Wrong optimization: instability of floating-point
results with -funsafe-math-optimizations leads to
nonsense
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

With -funsafe-math-optimizations, floating-point results are effectively
unstable, this instability can taint everything around and lead to nonsense.
(Similar to pr93681 and pr93682.)

Instability is not limited to FP numbers, it extends to integers too:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
int x = opaque(1);
int a = 1 + opaque(0x1p-60) != x;

printf("a = %d\n", a);
if (x == 1) {
opaque(0);
if (a)
printf("a is 1\n");
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -funsafe-math-optimizations -O3 test.c
&& ./a.out
a = 0
a is 1
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200218 (experimental)
--

[Bug tree-optimization/93681] Wrong optimization: instability of x87 floating-point results leads to nonsense

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681

Alexander Cherepanov  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=323,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=85957,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=93682

--- Comment #3 from Alexander Cherepanov  ---
Bug 85957 is definitely similar. I've added it and a couple of others to "See
Also".

Some differences:
- testcases here use comparisons to jump from FPs to integers while bug 85957
uses casts;
- the exploited discrepancy here is between in-register and in-memory
(run-time) results of type `double` while in bug 85957 it's between
compile-time and run-time results;
- it's probably possible to fix bug 85957 by enhancing the dom2 pass while it
will probably not help here.
Whether it warrants a separate bug report is hard for me to say.

[Bug c/85957] i686: Integers appear to be different, but compare as equal

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957

--- Comment #23 from Alexander Cherepanov  ---
(In reply to Alexander Monakov from comment #10)
> Also note that both the original and the reduced testcase can be tweaked to
> exhibit the surprising transformation even when -fexcess-precision=standard
> is enabled. A "lazy" way is via -mpc64

I think this is another problem. I filed bug 93682 for it.

[Bug c/85957] i686: Integers appear to be different, but compare as equal

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957

--- Comment #22 from Alexander Cherepanov  ---
(In reply to jos...@codesourcery.com from comment #11)
> Yes, I agree that any particular conversion to integer executed in the 
> abstract machine must produce some definite integer value for each 
> execution.
The idea that floating-point variables could be unstable but integer variables
have to be stable seems like an arbitrary boundary. But I guess this is deeply
ingrained in gcc: the optimizer just assumes that integers are stable (e.g.,
optimizes `if (x != y && y == z) use(x == z);` for integers to `if (x != y && y
== z) use(0);`) but it's ready for instability of FPs (e.g., doesn't do the
same optimization for FPs).

When the stability of integers is violated everything blows up. This bug report
shows that instability of floating-point values extends to integers via casts.
Another way is via comparisons -- I've just filed bug 93681 for it. There is
also a testcase there that shows how such an instability can taint surrounding
code.

So, yeah, it seems integers have to be stable. OTOH, now that there is sse and
there is -fexcess-precision=standard floating-point values are mostly stable
too. Perhaps various optimizations done for integers could be enabled for FPs
too? Or the situation is more complicated?

[Bug tree-optimization/93682] Wrong optimization: on x87 -fexcess-precision=standard is incompatible with -mpc64

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93682

--- Comment #1 from Alexander Cherepanov  ---
The instability can also taint surrounding code which leads to nonsense:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static long double opaque(long double d) { return d; }

int main()
{
int one = opaque(1);
long double x = opaque(1);
int a = x + 0x1p-60 == 1;

opaque(a);
if (one == a) {
opaque(0);
if (x == 1) {
opaque(0);
if (a == 0) {
opaque(0);
if (one == 0)
printf("one = %d\n", one);
}
}
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -m32 -march=i686 -mpc64 -O3 test.c &&
./a.out
one = 0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200211 (experimental)
--

[Bug tree-optimization/93682] New: Wrong optimization: on x87 -fexcess-precision=standard is incompatible with -mpc64

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93682

Bug ID: 93682
   Summary: Wrong optimization: on x87 -fexcess-precision=standard
is incompatible with -mpc64
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Spotted by Alexander Monakov in bug 85957, comment 10...

Not sure if -mpc64 is supposed to be compatible with
-fexcess-precision=standard (and -std=c11 which enables it) but I don't see any
warnings in the description of these options.

-mpc64 changes hardware precision but the optimizer seems to be unaffected
which leads to discrepancies. For example, AIUI -fexcess-precision=standard is
supposed to give predictable results but the results of the following testcase
depend on the optimization level:

--
#include <stdio.h>

int main()
{
int x = 1;

printf("%a\n", 0x1p-60 + x - 1);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -m32 -march=i686 -mpc64 test.c &&
./a.out
0x0p+0
$ gcc -std=c11 -pedantic -Wall -Wextra -m32 -march=i686 -mpc64 -O3 test.c &&
./a.out
0x1p-60
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200211 (experimental)
--

[Bug c/85957] i686: Integers appear to be different, but compare as equal

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957

--- Comment #21 from Alexander Cherepanov  ---
The following variation works with the trunk:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int opaque(int i) { return i; }

int main()
{
static int a = 0;
int d = opaque(1);

if (opaque(0))
puts("ignore");
// need the next `if` to be at the start of a BB

if (d == 1)
a = 1;

int i = d - 0x1p-60;

if (i == 1)
printf("i = %d\n", i);

printf("i = %d\n", i);

opaque(a);
}
--
$ gcc -std=gnu11 -pedantic -Wall -Wextra -m32 -march=i686 -O3 test.c && ./a.out
i = 1
i = 0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200211 (experimental)
--

All the same but the computation of `i` is hoisted from the `if` in the
133t.pre pass so dom3 doesn't have a chance to fold it.

Another interesting aspect: there are no comparisons of floating-point numbers
in this example; all FP operations are limited to basic arithmetic and a
conversion.

[Bug c/85957] i686: Integers appear to be different, but compare as equal

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957

Alexander Cherepanov  changed:

   What|Removed |Added

 CC||ch3root at openwall dot com

--- Comment #20 from Alexander Cherepanov  ---
Minimized testcase that should still be quite close to the original:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static double opaque(double d) { return d; }

int main()
{
double d;
do {
d = opaque(1e6);
} while (opaque(0));

if (d == 1e6)
printf("yes\n");

int i = d * 1e-6;
printf("i = %d\n", i);

if (i == 1)
printf("equal to 1\n");
}
--
$ gcc -std=gnu11 -pedantic -Wall -Wextra -m32 -march=i686 -O3 test.c && ./a.out
yes
i = 0
equal to 1
--

According to https://godbolt.org/z/AmkmS5 , this happens for gcc versions
8.1--9.2 but not for trunk (I haven't tried earlier versions).

With gcc 8.3.0 from the stable Debian it works like this:
- (as described in comment 7) 120t.dom2 merges two `if`s, in particular
deducing that `i == 1` is true if `d == 1e6` is true but not substituting `i`
in `printf`;
- 142t.loopinit introduces `# d_4 = PHI ` between the loop and the
first `if`;
- 181t.dom3 would fold computation of `i` in the `d == 1e6` branch but the
introduced `PHI` seems to prevent this.

With gcc from the trunk a new pass 180t.fre4 removes that `PHI` and 182t.dom3
then does its work. (The numeration of passes changed slightly since gcc
8.3.0.)

[Bug tree-optimization/93681] Wrong optimization: instability of x87 floating-point results leads to nonsense

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681

--- Comment #1 from Alexander Cherepanov  ---
And instability of integers then easily taints surrounding code:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int opaque(int i) { return i; }

int main()
{
int z = opaque(0);
int a = opaque(1) + 0x1p-60 == 1;

printf("z = %d\n", z);
opaque(a);
if (z == a) {
opaque(0);
if (a)
printf("z = %d\n", z);
}
}
--
$ gcc -std=gnu11 -pedantic -Wall -Wextra -m32 -march=i686 -O3 test.c && ./a.out
z = 0
z = 1
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200211 (experimental)
--

clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44873

[Bug tree-optimization/93681] New: Wrong optimization: instability of x87 floating-point results leads to nonsense

2020-02-11 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681

Bug ID: 93681
   Summary: Wrong optimization: instability of x87 floating-point
results leads to nonsense
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

x87 floating-point results are effectively unstable due to possible excess
precision. Without -fexcess-precision=standard, this instability can taint
everything around and lead to nonsense.

Instability is not limited to FP numbers, it extends to integers too:

--
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static int opaque(int i) { return i; }

int main()
{
int z = opaque(1) + 0x1p-60 == 1;

printf("z = %d\n", z);
if (z) 
puts("z is one");
}
--
$ gcc -std=gnu11 -pedantic -Wall -Wextra -Wno-attributes -m32 -march=i686 -O3
test.c && ./a.out
z = 0
z is one
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200211 (experimental)
--

[Bug tree-optimization/93491] Wrong optimization: const-function moved over control flow leading to crashes

2020-01-29 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93491

--- Comment #2 from Alexander Cherepanov  ---
(In reply to Andrew Pinski from comment #1)
> Const functions by definition dont trap or throw.

They don't trap in all cases when they are really called. But this doesn't mean
they must have a sane behavior for other combinations of arguments. At least
that's what I got from the definition[1] of this attribute. In particular, it
reads:

"Declaring such functions with the const attribute allows GCC to avoid emitting
some calls in repeated invocations of the function with the same argument
values."

That is, it allows gcc to eliminate some calls, but not to introduce new ones.
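
A minimal sketch (not from this report; `f` is a hypothetical const function
declared in another TU) of the distinction being drawn here:

--
__attribute__((const)) extern int f(int);

int twice(int n)
{
    return f(n) + f(n); /* eliminating the second call is the permitted case */
}

int guarded(int n)
{
    if (n != 0)
        return f(n);    /* hoisting this call above the guard would introduce
                           a call the source never performs */
    return 0;
}
--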

But this is a gcc extension, so it could be assigned any meaning that seems
beneficial to gcc. In such a case this pr should be considered a
documentation clarification request.

[1]
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute

> So adding const to a function that traps makes the testcase undefined.

Let me describe the testcase like this: the function `g` is defined only for
non-zero arguments, conforms to the rules of the const attribute for non-zero
arguments, and is called only with non-zero arguments, but the optimizer
introduces a call to it with the zero argument.

Perhaps the minimized example that I posted looks weird. The situation could be
more subtle in real life. For instance, the bad call could be due to an
uninitialized variable used as the argument. I mean this variable could be
uninitialized only during speculative execution but perfectly initialized
during ordinary execution.

Or it could be something like this:

--
#include <stdio.h>
#include <stdlib.h>

__attribute__((noipa))
void check_value(int x)
{
if (x == 0) {
puts("Fatal error: got invalid value. Shutting down cleanly!");
exit(0);
}
}

__attribute__((const,noipa))
int calc(int x)
{
return 1 / x;
}

int main(int c, char **v)
{
(void)v;

int x = 0; // invalid value by default
// some complex initialization of x that doesn't always succeed
if (c > 5)
x = 123;
// etc.

for (int i = 0; i < 10; i++) {
check_value(x);
printf("result = %d\n", calc(x));
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
Fatal error: got invalid value. Shutting down cleanly!
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
Floating point exception
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200129 (experimental)
--

Again, this doesn't mean that gcc has to support such use cases.

> Do you have a testcase were gcc does this optimize without the user adding
> const and still traps?

No. I'll file a separate bug if I stumble upon one, so please disregard this
possibility for now.

[Bug tree-optimization/93491] New: Wrong optimization: const-function moved over control flow leading to crashes

2020-01-29 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93491

Bug ID: 93491
   Summary: Wrong optimization: const-function moved over control
flow leading to crashes
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

In the following example, the call to the function `g` is guarded by the `f(0)`
call and is never evaluated but the optimizer moved it over the guard while
hoisting it from the loop:

--
#include <stdlib.h>

__attribute__((noipa))
void f(int i)
{
exit(i);
}

__attribute__((const,noipa))
int g(int i)
{
return 1 / i;
}

int main()
{
while (1) {
f(0);

f(g(0));
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
Floating point exception
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200129 (experimental)
--

[Bug tree-optimization/93301] Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-28 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

--- Comment #12 from Alexander Cherepanov  ---
(In reply to rguent...@suse.de from comment #9)
> This example simply shows that we're propagating the equivalence
> y == x - 1 in the way that the controlled x - 1 becomes y (that's
> possibly a QOI issue in this particular case where we _do_ see
> that y is not initialized - but as said in general we cannot know)
> and that we then optimize condition ? uninit : 5 as 5 (we're
> optimistically treating uninit equal to 5 here which is OK).

Right, nothing new here, just a simpler example. (Building something upon my
first example was not fun. This one proved to be much nicer.)

> Note the (void)&y doesn't do anything but it really looks like
> it is required to make the testcase conforming (otherwise you
> wouldn't use it?).

This is just a ritual dance dedicated to Itanium. C11, 6.3.2.1p2, has the
following text added for DR 338[1]:

"If the lvalue designates an object of automatic storage duration that could
have been declared with the register storage class (never had its address
taken), and that object is uninitialized (not declared with an initializer and
no assignment to it has been performed prior to use), the behavior is
undefined."
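
A minimal sketch (mine, not from the report) of the "ritual dance" this wording
enables:

--
int read_uninit(void)
{
    int y;
    (void)&y;  /* y's address is taken, so it could not have been declared
                  register and C11, 6.3.2.1p2 no longer applies */
    return y;  /* still an indeterminate value, but not automatic UB */
}
--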

OTOH maybe it's worth spending a minute on this DR given we are discussing
speculative execution. [2] has a nice explanation of the topic. IIUC you can
try to hoist a dereference of a pointer from a loop -- you will just get a NaT
(Not-a-Thing) value in a register instead of a crash if the pointer is bad. But
if your speculatively executed hoisted code has to write to memory a value
from a register that corresponds to an uninitialized variable and this register
contains a NaT, you get your crash back:-)

According to [3] the whole thing is to be EOLed next year, so I'm not sure how
relevant this is.

[1] http://open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
[2] https://devblogs.microsoft.com/oldnewthing/?p=41003
[3] https://en.wikipedia.org/wiki/IA-64#End_of_life:_2021

[Bug tree-optimization/93301] Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-26 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

--- Comment #6 from Alexander Cherepanov  ---
(In reply to Alexander Cherepanov from comment #2)
> But my guess is that the C++ rules will not help. The problem is the
> internal inconsistency so everything will blow up independently of any
> external rules.

Well, the following example should illustrate what I mean. The uninitialized
variable `y` is only used in dead code:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int opaque(int i) { return i; }

int main()
{
short x = opaque(1);
short y;

opaque(x - 1);

while (opaque(1)) {
printf("x = %d;  x - 1 = %d\n", x, opaque(1) ? x - 1 : 5);

if (opaque(1))
break;

if (x - 1 == y)
opaque(y);
}
}
--
$ gcc -std=c11 test.c && ./a.out
x = 1;  x - 1 = 0
$ gcc -std=c11 -O3 test.c && ./a.out
x = 1;  x - 1 = 5
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200126 (experimental)
--

There is some defence in tree-ssa-loop-unswitch.c against loop unswitching on
undefined values but, as you wrote in comment 1, it's probably not that easy.

[Bug tree-optimization/93301] Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-26 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

--- Comment #5 from Alexander Cherepanov  ---
(In reply to rguent...@suse.de from comment #3)
> But the first use of the undefined value in the comparison makes
> everything wobbly.

So `if (x == y)` is UB. Or is `x == y` already UB?

[Bug tree-optimization/93301] Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-26 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

--- Comment #4 from Alexander Cherepanov  ---
(In reply to Richard Biener from comment #1)
> guess DOM would also happily propagate equivalences

Yeah, this gives a simpler testcase:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int opaque(int i) { return i; }

int main()
{
unsigned char x = opaque(1);
unsigned char y;
(void)&y;

if (x - 1 == y) {
printf("x = %d;  x - 1 = %d\n", x, opaque(1) ? x - 1 : 5);
opaque(y);
}
}
--
$ gcc -std=c11 test.c && ./a.out
x = 1;  x - 1 = 0
$ gcc -std=c11 -O3 test.c && ./a.out
x = 1;  x - 1 = 5
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200126 (experimental)
--

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2020-01-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #43 from Alexander Cherepanov  ---
The following example demonstrates that the instability taints the surrounding
code even if it's in dead code itself. In particular, this shows that even
making comparisons like `&y + 1 == &x` undefined will not help.

--
#include <stdio.h>
#include <stdlib.h>

__attribute__((noipa)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

__attribute__((noipa)) // imagine it in a separate TU
static void g(int a)
{
printf("%d\n", a);
exit(0);
}

static void f(int c, void *p, void *q, void *r)
{
while (c) {
g(p == r);

if (p != q && q == r)
puts("unreachable");
}
}

int main(int c, char **v)
{
(void)v;

int x[5];
int y[2];

void *p = &x;
void *q = &y + 1;

opaque(q); // escaped
void *r = opaque(p); // hide the provenance of p

f(c, p, q, r);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200120 (experimental)
--

[Bug tree-optimization/93301] Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-17 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

--- Comment #2 from Alexander Cherepanov  ---
The problem is much more serious. It's not that C has some guarantees about two
values of `y` while gcc doesn't provide them. It's that one part of gcc assumes
there are some guarantees about two values of `y` while another part of gcc
doesn't provide such guarantees.

There were many discussions about "wobbly" values (DR 260, DR 451 etc.) and I
don't expect gcc to give the same values for two `y`s at all. The second value
is taken in the `else` branch, which we don't care about (at least while this
branch doesn't invoke UB). Given that `b` is `1` we know that the `then` branch
is taken.

It's surely possible to make this testcase undefined. If I understand them
correctly, llvm folks claim that "real compilers" (gcc included I assume)
follow rules that are more like C++ than C (see the linked clang bug report).
The C++ rules are much more strict and make `x == y` undefined if `y` is
uninitialized.

But my guess is that the C++ rules will not help. The problem is the internal
inconsistency so everything will blow up independently of any external rules.

[Bug tree-optimization/93301] New: Wrong optimization: instability of uninitialized variables leads to nonsense

2020-01-17 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301

Bug ID: 93301
   Summary: Wrong optimization: instability of uninitialized
variables leads to nonsense
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Instability is inconsistency, which leads to logical contradictions, which
leads to total chaos. Similar to bug 61502, comment 42, but with uninitialized
variables:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int c = 1;
opaque(&c);

unsigned char x = 0;
opaque(&x);
unsigned char y; // no trap representation possible
(void)&y; // disarm C11, 6.3.2.1p2

unsigned char z;
int b;
if (x == y) {
b = 1;
z = x;
} else {
b = 0;
z = y;
}

opaque(&z);
if (b)
printf("b = %d  c = %d  x = %d  e = %d\n", b, c, x, c ? z : 5);
}
--
$ gcc -std=c11 -O3 test.c && ./a.out
b = 1  c = 1  x = 0  e = 5
--
gcc x86-64 version: gcc (GCC) 10.0.1 20200117 (experimental)
--

Given that the printf has fired, `b` is `1`, hence `z` is the same as `x` and
`e = 0` should be printed.

According to my reading of C11 this program doesn't invoke UB. (And I thought
that most proposals about "wobbly" values wouldn't change this but I'm not sure
anymore:-)

Even if this particular example is deemed undefined by gcc, I guess
inconsistencies could blow everything up even without any help from a
programmer.

clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44512.

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2020-01-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #42 from Alexander Cherepanov  ---
I've recently stumbled upon a straightforward description (from Hal Finkel, in
https://bugs.llvm.org/show_bug.cgi?id=34548#c77) for the thing that bothered me
in the second part of comment 17. Roughly speaking: instability is
inconsistency, which leads to logical contradictions, which leads to total
chaos. Instability taints everything around it and you cannot trust anything in
the end. A small modification to the example in comment 18 will hopefully
illustrate it:

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

static void f(void *p, void *q, void *r)
{
if (p != q && q == r)
printf("%d\n", p == r);
}

int main()
{
int x[5];
int y[2];

void *p = &x;
void *q = &y + 1;

opaque(q); // escaped
void *r = opaque(p); // hide the provenance of p

f(p, q, r);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -fno-partial-inlining -O3 test.c &&
./a.out
0
--
gcc x86-64 version: gcc (GCC) 10.0.0 20200110 (experimental)
--

Here `r` has the same value as `p` but the optimizer cannot see this. Comparing
them to `q` gives different results -- `p` is non-equal to `q` (at compile
time) and `r` is equal to `q` (at run time). Then, given these results, we ask
the optimizer to compare `p` and `r` and it happily concludes that they are
non-equal which is nonsense.

This example could be explained by conditional propagation of wrong provenance
but I see the optimization happening during the einline pass so it's probably
not it. (-fno-partial-inlining simplifies the analysis but doesn't affect the
result.) Even if this particular example triggers some other bug(s) in gcc, the
logic in the previous paragraph is probably nice to have in the optimizer. But
it cannot be added until the instability is fixed.

I guess this settles the question for me FWIW. Unless there is a magic way to
contain logical contradictions I think the right way is like this:
- the C standard could be changed to make comparison of the form `&x == &y + 1`
unspecified or not -- not that important;
- all (non-UB) things in practice should have stable behavior;
- comparison of the form `&x == &y + 1` in practice should give results
according to a naive, literal reading of C11.

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2020-01-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #41 from Alexander Cherepanov  ---
(In reply to Richard Biener from comment #38)
> (In reply to Alexander Cherepanov from comment #37)
> > On 30/12/2019 10.51, rguenther at suse dot de wrote:
> > >> Obviously, it could be used to fold `a + i == b` to `0` if `a` and `b`
> > >> are two different known arrays and `i` is unknown
> > > 
> > > That's indeed the main thing. Basically it allows points-to analysis work 
> > > at
> > > all in the presence of non-constant offsets.
> > 
> > But what is PTA used for? Examples that I've seen all deal with
> > dereferenceable pointers. And current gcc behaviour is just fine in that
> > case. The problem is with non-dereferenceable pointers. So is PTA important
> > for cases where dereferenceability is unknown (or known to be false) or it's
> > just too complicated to take dereferenceability into account?
> 
> points-to analysis doesn't care about whether a pointer is dereferenced or
> not when computing its points-to set.  You can very well add a dereference
> to your testcase and that shouldn't affect its outcome, no?

No, I mean dereferences of both pointers, these cannot be added to my testcase.
To exclude any ambiguity, I'm talking about my last testcase (comment 35) in
this bug report. (Or the original testcase from Peter Sewell in comment 0.) One
pointer is ``, fine, dereferenceable. The other one is ` + 1`, just past
the end of the object `y`, non-dereferenceable (C11, 6.5.6p8).

So the rough idea is to do it like this: if both pointers are known to be
dereferenceable at the point of check (e.g., we want to move `a[i] = 1;` over
`b[j] = 2;`) then the results of the PTA could be used, otherwise (e.g., we
want to fold `a + i == b + j` only knowing that `a` and `b` are different
arrays) pass the comparison to run time.
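
A sketch (my wording, assuming `a` and `b` are known distinct arrays) of the two
situations this rule would distinguish:

--
static int a[10], b[10];

/* Both accesses dereference the pointers, so points-to results may be used:
   the two stores cannot alias and can be reordered. */
void stores(int i, int j)
{
    a[i] = 1;
    b[j] = 2;
}

/* No dereference here: a + i may be the past-the-end pointer &a[10], which can
   compare equal to &b[0], so the comparison would be left for run time instead
   of being folded from the points-to sets alone. */
int compare(int i, int j)
{
    return a + i == b + j;
}
--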

A nice thing about this approach is that it treats pointers and integers in the
same way. In particular, it will also solve bug 93010.
And it is applicable to dynamic memory which is handled somewhat differently
now.
It even almost works for `restrict`! (It should be possible to optimize `int
f(int *restrict p, int *q) { *p = 1; *q; return p == q; }` but not with `*p =
1;` replaced with just `*p;`.)

In some sense this approach delegates a part of the work to the programmer. If
they put a dereference somewhere they effectively assert that the pointer (even
if it's cast from an integer) is good at this particular moment -- that
result of malloc was not null, that the memory was not free'd, delete'd or
reused via placement new since, or that the local variable is still in scope,
or that the pointer is not past the end, or that the storage is not of zero
size. This of course depends on the programmer respecting the provenances but
that's not news:-)

What is a dereference in this context is a somewhat tricky question.
Dereferencing a pointer to an empty struct should not count. But calling a
static method in a non-empty class probably should (need to check the C++ std).
Then, while analyzing `p == q` it's not necessary to require dereferenceability
of `p` and `q` themselves, dereferenceability of `p + k` and `q + l` is enough
if `k` and `l` are both nonnegative or strictly negative.

Many more improvements are possible too. E.g., `a + i == b + j` could be folded
to `0` if the addresses are not exposed or only exposed in ways that don't
allow one to reason about the arrangement of the objects. IIUC llvm does some
of this, e.g., https://reviews.llvm.org/rL249490.

> And yes, GCC uses points-to analysis results to optimize pointer equality
> compares like p == q to false if the points-to sets do not intersect (for
> a set of cases, but that's current implementation detail).  That helps
> surprisingly often for abstraction coming from the C++ standard library
> container iterators.

Isn't it mainly dynamic memory? Then it's already handled a bit differently,
even `new int == new int` is not optimized right now (unlike in clang).
AIUI this bug report is relevant to non-dynamic memory only (but the fix could
improve the case of dynamic memory too).

> I do agree that we have bugs in GCC but AFAICS those come from conditional
> equivalences being propagated and from the very old RTL alias analysis issue
> involving base_alias_check.  Once we dealt with the latter I'm happily
> exploring fixes for the former - but the latter will happily nullify fixes
> of the former.

IMHO those two are quite different problems -- this bug is about wrong results
at compile-time and conditional equivalence propagation depends on the results
not being available at the optimization time. I could be wrong but it seems to
me that it's better to deal with them separately.

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2020-01-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #40 from Alexander Cherepanov  ---
Example with a flexible array member:

--
#include <stdio.h>
#include <string.h>

__attribute__((noipa)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int x[5];
struct { int i, j, a[]; } y;

printf("eq1: %d\n", x == y.a);

int *p = x;
int *q = y.a;
printf("eq2: %d\n", p == q);
printf("diff: %d\n", memcmp(, , sizeof(int *)));

opaque(q); // escaped
opaque(&p); // move the next comparison to run-time
printf("eq3: %d\n", p == q);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
eq1: 0
eq2: 1
diff: 0
eq3: 1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
eq1: 0
eq2: 0
diff: 0
eq3: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20200110 (experimental)
--

This example should be standard-compliant.

eq1 is wrong even without optimization; eq2 is folded by fre1.

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2020-01-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #39 from Alexander Cherepanov  ---
For diversity, a couple of examples with zero-sized objects. Even though they
don't have pointer arithmetic at all they could be classified as being about
past-the-end pointers:-) Please let me know if it's better to move them into a
separate bug (or bugs).

--
#include <stdio.h>

int main()
{
struct {} x, y, *p = &x, *q = &y;

printf("eq1: %d\n",  == );
printf("eq2: %d\n", p == q);
}
--
$ gcc -std=c11 -Wall -Wextra test.c && ./a.out
eq1: 0
eq2: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20200110 (experimental)
--

Empty structs are a gcc extension (they are UB according to C11, 6.7.2.1p8) but
IMHO the demonstrated instability is not good.

Happens only without optimizations (optimizations add some padding between `x`
and `y`).

Similar clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44508.

[Bug tree-optimization/93153] Wrong optimization while devirtualizing after placement new over local var

2020-01-10 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93153

Alexander Cherepanov  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|DUPLICATE   |---

--- Comment #2 from Alexander Cherepanov  ---
I agree that this is a dup of bug 45734, but I don't think the mentioned C++
defect report covers this testcase. I don't have the rights to reopen bug 45734,
so I'm reopening this one.

[Bug tree-optimization/45734] [DR 1116] Devirtualization results in wrong-code

2020-01-05 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45734

--- Comment #9 from Alexander Cherepanov  ---
DR 1116 is said to be resolved by P0137R1[1]. By looking through it, I don't
see how it covers testcases from this pr where the "right" pointer is used (like
the example in comment 0). And it even introduces std::launder for using
"wrong" pointers.

Even better, the C++ standard contains a whole paragraph ([basic.life]p9 -- see
[2]) describing finer details of killing non-dynamic variables. And the example
there features exactly placement new over a local variable:

--
class T { };
struct B {
   ~B();
};

void h() {
   B b;
   new (&b) T;
}   // undefined behavior at block exit
--

The examples in this pr don't have classes with non-trivial destructors so they
don't hit such UB.

[1] http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0137r1.html
[2] https://eel.is/c++draft/basic.life#9

[Bug tree-optimization/93153] New: Wrong optimization while devirtualizing after placement new over local var

2020-01-04 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93153

Bug ID: 93153
   Summary: Wrong optimization while devirtualizing after
placement new over local var
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

It seems gcc doesn't account for a possible type change of local variables due
to storage reuse while devirtualizing method calls (seems to happen in ccp1):

--
#include <cstdio>
#include <new>

struct Y {
virtual void foo() { puts("Y"); }
};

struct X : Y {
virtual void foo() { puts("X"); }
};

static_assert(sizeof(X) == sizeof(Y));

int main()
{
Y y;
Y *p = new (&y) X;
p->foo();
}
--
$ g++ -std=c++2a -pedantic -Wall -Wextra test.cc && ./a.out
X
$ g++ -std=c++2a -pedantic -Wall -Wextra -O3 test.cc && ./a.out
Y
--
gcc x86-64 version: g++ (GCC) 10.0.0 20200104 (experimental)
--

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2019-12-30 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #37 from Alexander Cherepanov  ---
On 30/12/2019 10.51, rguenther at suse dot de wrote:
>> Obviously, it could be used to fold `a + i == b` to `0` if `a` and `b`
>> are two different known arrays and `i` is unknown
> 
> That's indeed the main thing. Basically it allows points-to analysis work at
> all in the presence of non-constant offsets.

But what is PTA used for? Examples that I've seen all deal with dereferenceable
pointers. And current gcc behaviour is just fine in that case. The problem is
with non-dereferenceable pointers. So is PTA important for cases where
dereferenceability is unknown (or known to be false) or it's just too
complicated to take dereferenceability into account?

[Bug rtl-optimization/49330] Integer arithmetic on addresses optimised with pointer arithmetic rules

2019-12-30 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330

--- Comment #30 from Alexander Cherepanov  ---
Sure, I've filed pr93105. Thanks for the analysis!

[Bug rtl-optimization/93105] New: Wrong optimization for pointers: provenance of `p + (q1 - q2)` is treated as `q` when the provenance of `p` is unknown

2019-12-30 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93105

Bug ID: 93105
   Summary: Wrong optimization for pointers: provenance of `p +
(q1 - q2)` is treated as `q` when the provenance of
`p` is unknown
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

gcc seems to wrongly infer provenance of a pointer expression of the form `p +
(q1 - q2)` when the following conditions hold:
- the provenance of the pointer `p` couldn't be tracked;
- the provenance of `q1` or `q2` is known;
- `q1 - q2` couldn't be simplified to get rid of pointers.

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int *opaque(int *p) { return p; }

int main()
{
static int x, y;

int *r = opaque(&x) + (opaque(&y) - &y);

x = 1;
*r = 2;
printf("x = %d\n", x);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
x = 2
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
x = 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191230 (experimental)
--

The problem is similar to pr49330. Analysis by Alexander Monakov via bug 49330,
comment 29:

"It's a separate issue, and it's also a regression, gcc-4.7 did not miscompile
this. The responsible pass seems to be RTL DSE."

[Bug tree-optimization/93010] Wrong optimization: provenance affects comparison of saved bits of addresses of dead auto variables

2019-12-30 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93010

--- Comment #2 from Alexander Cherepanov  ---
But gcc seems to be better than clang in arranging compound literals in memory
so here is a gcc-only testcase with them:

--
#include <stdio.h>

int main()
{
unsigned long u, v;

{
u = (unsigned long)&(int){0};
}

{
v = (unsigned long)&(int){0};
}

printf("diff = %lu\n", u - v);
printf("eq = %d\n", u == v);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
diff = 0
eq = 0
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191230 (experimental)
--

[Bug rtl-optimization/49330] Integer arithmetic on addresses optimised with pointer arithmetic rules

2019-12-30 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330

--- Comment #28 from Alexander Cherepanov  ---
I see the same even with pure pointers. I guess RTL doesn't care about such
differences, but it means the problem could bite relatively innocent code.

--
#include <stdio.h>

__attribute__((noipa)) // imagine it in a separate TU
static int *opaque(int *p) { return p; }

int main()
{
static int x, y;

int *r = opaque(&x) + (opaque(&y) - &y);

x = 1;
*r = 2;
printf("x = %d\n", x);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
x = 2
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
x = 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191230 (experimental)
--

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2019-12-29 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #35 from Alexander Cherepanov  ---
What remains in this pr is the original problem.

1. The best way to demonstrate that there is indeed a bug here is probably to
compare representation of pointers directly:

--
#include <stdio.h>
#include <string.h>

__attribute__((noipa)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int x[5];
int y[2];

void *p = &x;
void *q = &y + 1;

printf("val1: %d\n", p == q);
printf("repr: %d\n", memcmp(, , sizeof(void *)) == 0);

opaque(&p); // move the next comparison to runtime
printf("val2: %d\n", p == q);

opaque(q);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
val1: 0
repr: 1
val2: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191229 (experimental)
--

C11, 6.2.6.1p4: "Two values (other than NaNs) with the same object
representation compare equal". Our pointers are not NaNs and have the same
representation so should compare equal.

DR 260 allows one to argue that representation of these pointers could change
right between the checks but IMHO this part of DR 260 is just wrong as it makes
copying objects byte-by-byte impossible. See
https://bugs.llvm.org/show_bug.cgi?id=44188 for a nice illustration.
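
A sketch (mine, not from the LLVM report) of the kind of byte-by-byte copy that
such a reading would break -- if the representation of *src could change between
iterations, *dst could end up as a mix of two representations:

--
#include <stddef.h>

void copy_ptr(void **dst, void * const *src)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    for (size_t i = 0; i < sizeof(void *); i++)
        d[i] = s[i];
}
--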

While at it, the testcase also demonstrates that the comparison `p == q` is
unstable.

I'm not taking sides here, just stating that the standard and the compiler
disagree.

2. C++ at some point made results of the discussed comparison unspecified --
https://eel.is/c++draft/expr.eq#3.1 . According to the DR linked to in comment
27, it's done to make the definition usable at compile time. Perhaps
harmonization of the standards should move in this direction, not vice versa.

3. OTOH clang was fixed to comply with C11.

4. What seems missing in the discussion is a clear description of the benefits of
the current gcc approach. Does it make some optimizations easier to
implement? Does it enable other optimizations?
Obviously, it could be used to fold `a + i == b` to `0` if `a` and `b` are two
different known arrays and `i` is unknown (or known to be exactly the length of
`a`). But this is probably not helpful for aliasing analysis as AA doesn't deal
with past-the-end pointers. And optimization of loops like in comment 19 is
probably not super helpful either:-)

[Bug tree-optimization/93052] Wrong optimizations for pointers: `p == q ? p : q` -> `q`

2019-12-24 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93052

--- Comment #5 from Alexander Cherepanov  ---
1. It should be noted that the idea of problems arising from `p == q ? p : q`
is from Chung-Kil Hur via bug 65752, comment 15.

2. clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44374.

[Bug tree-optimization/93051] Wrong optimizations for pointers: `if (p == q) use p` -> `if (p == q) use q`

2019-12-24 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93051

--- Comment #3 from Alexander Cherepanov  ---
For completeness, an example with a guessed pointer, based on the testcase from
bug 65752, comment 0, gcc-only (dom2):

--
#include <stdio.h>
#include <stdint.h>

__attribute__((noipa)) // imagine it in a separate TU
static uintptr_t opaque_to_int(void *p) { return (uintptr_t)p; }

int main()
{
int x;
int *p = &x;
uintptr_t ip = (uintptr_t)p;

uintptr_t iq = 0;
while (iq < ip)
iq++;

uintptr_t ir = opaque_to_int(p); // hide provenance of p

if (ir == iq) {
*p = 1;
*(int *)ir = 2;
printf("result: %d\n", *p);
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191224 (experimental)
--

[Bug tree-optimization/93052] Wrong optimizations for pointers: `p == q ? p : q` -> `q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93052

--- Comment #4 from Alexander Cherepanov  ---
Could you please provide a bit more specific reference? If you mean various
discussions about C provenance semantics then they are not about these cases.
All examples in pr93051 and in this pr fully respect provenance -- it's the
compiler who changes the provenance. In some sense dealing with these bugs is a
prerequisite for a meaningful discussion of C provenance semantics: it's hard
to reason about possible boundaries of provenance when there are problems with
cases where provenance is definitely right.

[Bug tree-optimization/65752] Too strong optimizations int -> pointer casts

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #60 from Alexander Cherepanov  ---
It seems to me that problems with the optimization `p == q ? p : q` -> `q`
(comment 15, comment 38, comment 56 etc.) are not specific to past-the-end
pointers. So I filed a separate bug for it with various testcases -- see
pr93052.

The same for the optimization `if (p == q) use p` -> `if (p == q) use q`
(comment 49, comment 52) -- see pr93051.

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #34 from Alexander Cherepanov  ---
It seems to me that problems with the optimization `if (p == q) use p` -> `if
(p == q) use q` (comment 4 etc.) are not specific to past-the-end pointers. So
I filed a separate bug for it with various testcases -- see pr93051.

The same for the optimization `p == q ? p : q` -> `q` (comment 30) -- see
pr93052.

[Bug tree-optimization/93052] Wrong optimizations for pointers: `p == q ? p : q` -> `q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93052

--- Comment #2 from Alexander Cherepanov  ---
Example with a dead malloc (phiopt2):

--
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

static int been_there = 0;

static uintptr_t f(uintptr_t ip, uintptr_t iq)
{
if (ip == iq) {
been_there = 1;
return ip;
} else {
been_there = 0;
return iq;
}
}

int main()
{
int *q = malloc(sizeof(int));
opaque(q);
uintptr_t iq = (uintptr_t)(void *)q;
free(q);

int *p = malloc(sizeof(int));
opaque(p);
uintptr_t ip = (uintptr_t)(void *)p;

uintptr_t ir = f(ip, iq);
if (been_there) {
*p = 1;
*(int *)(void *)ir = 2;
printf("result: %d\n", *p);
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

[Bug tree-optimization/93052] New: Wrong optimizations for pointers: `p == q ? p : q` -> `q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93052

Bug ID: 93052
   Summary: Wrong optimizations for pointers: `p == q ? p : q` ->
`q`
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Similar to pr93051.

The optimizer sometimes changes `p == q ? p : q` to `q`. This is wrong when the
actual provenance of `p` differs from that of `q`.
There are two forms -- with the actual conditional operator and with the `if`
statement.

The ideal example would be constructed with the help of restricted pointers, but
it runs into a theoretical problem -- see the first testcase in pr92963.
My other examples require two conditionals to eliminate the possibility of UB.
Comparison of integers should give stable results, hopefully that would be
enough to demonstrate the problem.

Example with the conditional operator and with dead malloc (the wrong
optimization seems to be applied before tree-opt):

--
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int *q = malloc(sizeof(int));
opaque(q);
uintptr_t iq = (uintptr_t)(void *)q;
free(q);

int *p = malloc(sizeof(int));
opaque(p);
uintptr_t ip = (uintptr_t)(void *)p;

uintptr_t ir = ip == iq ? ip : iq;
if (ip == iq) {
*p = 1;
*(int *)(void *)ir = 2;
printf("result: %d\n", *p);
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

[Bug tree-optimization/93052] Wrong optimizations for pointers: `p == q ? p : q` -> `q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93052

--- Comment #1 from Alexander Cherepanov  ---
Example with a past-the-end pointer (vrp1, similar to bug 93051, comment 0, but
this time with PHI):

--
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

static int been_there = 0;

static int *f(int *p, int *q)
{
if (p == q) {
been_there = 1;
return p;
} else {
been_there = 0;
return q;
}
}

int main()
{
int x[5];
int y[1];

int *p = x;
int *q = y + 1;
opaque(q);

int *p1 = opaque(p); // prevents early optimization of x==y+1
int *r = f(p1, q);

if (been_there) {
*p = 1;
*r = 2;
printf("result: %d\n", *p);
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:33:9: warning: array subscript 1 is outside array bounds of ‘int[1]’
[-Warray-bounds]
   33 | *r = 2;
  | ^~
test.c:22:9: note: while referencing ‘y’
   22 | int y[1];
  | ^
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

[Bug tree-optimization/93051] Wrong optimizations for pointers: `if (p == q) use p` -> `if (p == q) use q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93051

--- Comment #2 from Alexander Cherepanov  ---
Example with a dead malloc (not in tree-opt):

--
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int *q = malloc(sizeof(int));
uintptr_t iq = (uintptr_t)(void *)q;
free(q);

int *p = malloc(sizeof(int));

uintptr_t ir = (uintptr_t)(void *)opaque(p); // hide provenance of p

if (ir == iq) {
*p = 1;
*(int *)ir = 2;
printf("result: %d\n", *p);
}
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

[Bug tree-optimization/93051] Wrong optimizations for pointers: `if (p == q) use p` -> `if (p == q) use q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93051

--- Comment #1 from Alexander Cherepanov  ---
Example with a restricted pointer (dom2):

--
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

__attribute__((noipa)) // imagine it in a separate TU
static void f(int *restrict p, int *restrict q)
{
int *r = opaque(p); // hide provenance of p
if (r == q) {
*p = 1;
*r = 2;
printf("result: %d\n", *p);
}

opaque(q);
}

int main()
{
int x;
f(&x, &x);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
test.c: In function ‘main’:
test.c:22:7: warning: passing argument 1 to ‘restrict’-qualified parameter
aliases with argument 2 [-Wrestrict]
   22 |     f(&x, &x);
      |       ^~  ~~
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:22:7: warning: passing argument 1 to ‘restrict’-qualified parameter
aliases with argument 2 [-Wrestrict]
   22 |     f(&x, &x);
      |       ^~  ~~
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

Strictly speaking this example is not about provenance (both pointers have the
same provenance) but, for the optimizer, different restricted pointers probably
play similar roles.

Despite the warning, equal restricted pointers are fine per se -- see, e.g.,
Example 3 in C11, 6.7.3.1p10.
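
Roughly, that example (paraphrased here from memory, so the exact wording may
differ) allows the same array to be passed through two restrict-qualified
parameters as long as it is only read through them:

--
void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
    for (int i = 0; i < n; i++)
        p[i] = q[i] + r[i];
}

/* With disjoint arrays a and b, a call like h(100, a, b, b) is well defined
   because b is never modified inside h. */
--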

[Bug tree-optimization/93051] New: Wrong optimizations for pointers: `if (p == q) use p` -> `if (p == q) use q`

2019-12-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93051

Bug ID: 93051
   Summary: Wrong optimizations for pointers: `if (p == q) use p`
-> `if (p == q) use q`
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

The optimizer sometimes changes `if (p == q) use p` to `if (p == q) use q` if
it can track the provenance of `q` but not of `p`. This is wrong when the
actual provenance of `p` differs from that of `q`.

Examples demonstrate the problem in different cases:
- with integers and with live pointers (to show that the problem is not in
casts to integers);
- with past-the-end pointers and without them (to show that even changing the
standard to make their comparisons UB will not help);
- with two allocations and with only one (to show that it's not related to how
memory is allocated by the compiler/libc).
Plus, all examples behave quite well:
- respect provenance of pointers including via casts to integers (so this bug
is not about (im)possibility to clear provenance by casts to integers or
something);
- use only one comparison (so the question of its stability is not touched).

There is some previous analysis of propagation of conditional equivalences in
other bugs, e.g., pr65752#c52, pr61502#c25.

Somewhat more general clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44313.
Previous lengthy discussion is in https://bugs.llvm.org/show_bug.cgi?id=34548.

Example with a past-the-end pointer (the wrong optimization seems to be applied
in vrp1):

--
#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
int x[5];
int y[1];

int *p = x;
int *q = y + 1;

int *r = opaque(p); // hide provenance of p
if (r == q) {
*p = 1;
*r = 2;
printf("result: %d\n", *p);
}

opaque(q);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes test.c && ./a.out
result: 2
$ gcc -std=c11 -pedantic -Wall -Wextra -Wno-attributes -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:17:9: warning: array subscript 1 is outside array bounds of ‘int[1]’
[-Warray-bounds]
   17 | *r = 2;
  | ^~
test.c:9:9: note: while referencing ‘y’
9 | int y[1];
  | ^
result: 1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191223 (experimental)
--

The warning nicely illustrates the problem:-)

Based on the example from Harald van Dijk in pr61502#c4.

[Bug tree-optimization/93010] Wrong optimization: provenance affects comparison of saved bits of addresses of dead auto variables

2019-12-19 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93010

--- Comment #1 from Alexander Cherepanov  ---
clang bug -- https://bugs.llvm.org/show_bug.cgi?id=44342

There is a second example there, with memcpy/memcmp, but it doesn't trigger the
bug in gcc, so I'm not pasting it here. (Generally gcc seems to be much less
consistent than clang in optimizing integers vs. memcpy/memcmp.)

[Bug tree-optimization/93010] New: Wrong optimization: provenance affects comparison of saved bits of addresses of dead auto variables

2019-12-19 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93010

Bug ID: 93010
   Summary: Wrong optimization: provenance affects comparison of
saved bits of addresses of dead auto variables
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

It's known that the value of a pointer to an object becomes indeterminate after
the object is dead (C11, 6.2.4p2). Whether its representation becomes
indeterminate is up for debate but let's bypass the issue by saving the
representation while the object is still alive. For example, we can cast it to
an integer. And we'll get an ordinary integer, with a stable value etc., not
affected by changes in the life of the original object. Right?

This seems to be broken for the equality operators when the operands are
constructed from addresses of automatic variables and at least one of these
variables is dead at the time of comparison.

--
#include <stdio.h>

int main()
{
unsigned long u, v;

{
int x[5];
u = (unsigned long)x;
}

{
int y[5];
v = (unsigned long)y;
}

printf("u = %#lx\n", u);
printf("v = %#lx\n", v);
printf("diff = %#lx\n", u - v);
printf("eq = %d\n", u == v);
}
--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
u = 0x7ffeb6326180
v = 0x7ffeb6326180
diff = 0
eq = 0
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191219 (experimental)

If "diff == 0" then "eq" should be 1.

[Bug middle-end/92824] Wrong optimization: representation of long doubles not copied even with memcpy

2019-12-17 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92824

--- Comment #4 from Alexander Cherepanov  ---
(In reply to Richard Biener from comment #3)
> shows we're constant folding this to
> 
>   __builtin_printf ("%lf\n",
> 3.36210314311209350626267781732175260259807934484647124011e-4932);

Unfortunately, in this form, it's not clear whether normalization already
happened or not (unlike with clang which prints such numbers as hex).

OTOH I would not expect problems with only one variable after pr71522 was
fixed.

> now.  When we put #pragma GCC unroll 16 before your testcases loop we
> get the following (unrolling and constant folding happens after the
> "bad" transform)
> 
> main ()
> {
>[local count: 82570746]:
>   printf ("%02x ", 128);
>   printf ("%02x ", 128);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 128);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   printf ("%02x ", 0);
>   __builtin_putchar (10);
>   return 0;

After inserting `#pragma GCC unroll 16` in my testcase I get something else:

main ()
{
   [local count: 63136024]:
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 1);
  printf ("%02x ", 128);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  printf ("%02x ", 0);
  __builtin_putchar (10);
  return 0;

}

Please note that 16 values are printed instead of 12, a "1" is printed in the
exponent, and there are zeroes in the padding (exactly as in the description of
this pr). I'm testing on x86-64.

[Bug tree-optimization/92963] Optimization with `restrict`: is `p == q ? p : q` "based" on `p`?

2019-12-16 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92963

--- Comment #2 from Alexander Cherepanov  ---
> p cannot be q as they cannot be based on each other based on my reading of
> 6.7.3.1p3.

Perhaps something like that was intended at some point but I don't see it in
the text. Until you start analyzing actual accesses you cannot draw any
conclusions about values of p and q. Example 3 in the formal definition of
restrict (C11, 6.7.3.1p10) specifically illustrates the case of two equal
restricted pointers.

[Bug tree-optimization/92963] New: Optimization with `restrict`: is `p == q ? p : q` "based" on `p`?

2019-12-16 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92963

Bug ID: 92963
   Summary: Optimization with `restrict`: is `p == q ? p : q`
"based" on `p`?
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

The question: is `p == q ? p : q` "based" on `p` (as in C11, 6.7.3.1p3)?

First, suppose that the answer is "yes". Then the following program should be
strictly conforming and should always print "2":

--
#include <stdio.h>

__attribute__((__noinline__)) // imagine it in a separate TU
static int f(int *restrict p, int *restrict q)
{
*p = 1;

int *r;
if (p == q)
r = p;
else
r = q;

// is r "based" on p?

*r = 2;

return *p;
}

int main()
{
int x;
printf("%d\n", f(, ));
}
--
$ gcc -std=c11 test.c && ./a.out
2
$ gcc -std=c11 -O3 test.c && ./a.out
1
--
gcc x86-64 version: gcc (GCC) 10.0.0 20191216 (experimental)
--

Ok, fair enough, `p == q ? p : q` is always equal to `q`, doesn't change when
`p` changes and, thus, is not "based" on `p`.
Then the following program (3 differences are marked) should be fine according
to the standard and should always print "2":

--
#include <stdio.h>

__attribute__((__noinline__)) // imagine it in a separate TU
static int f(int *restrict p, int *restrict q)
{
*q = 1; // 1) changed p -> q

int *r;
if (p == q)
r = p;
else
r = q;

// is r "based" on p?

if (p == q) // 2) added if
*r = 2;

return *q; // 3) changed p -> q
}

int main()
{
int x;
printf("%d\n", f(, ));
}
--
$ gcc -std=c11 test.c && ./a.out
2
$ gcc -std=c11 -O3 test.c && ./a.out
1
--

Either way, there is a problem...

[Bug c/92824] New: Wrong optimization: representation of long doubles not copied even with memcpy

2019-12-05 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92824

Bug ID: 92824
   Summary: Wrong optimization: representation of long doubles not
copied even with memcpy
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Source code:

--
#include <stdio.h>
#include <string.h>

int main()
{
long double x, y;

unsigned char *px = (unsigned char *)&x;
unsigned char *py = (unsigned char *)&y;

// make x pseudo-denormal
x = 0;
px[7] = 0x80;

// set padding
px[10] = 0x80;
px[11] = 0x80;
px[12] = 0x80;
px[13] = 0x80;
px[14] = 0x80;
px[15] = 0x80;

// uncomment if you don't like pseudo-denormals
//memset(&x, 0x80, sizeof x);

y = x;
memcpy(&y, &x, sizeof y);

for (int i = sizeof y; i-- > 0;)
printf("%02x ", py[i]);
printf("\n");
}

--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
80 80 80 80 80 80 00 00 80 00 00 00 00 00 00 00 
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
00 00 00 00 00 00 00 01 80 00 00 00 00 00 00 00 
--

gcc x86-64 version: gcc (GCC) 10.0.0 20191205 (experimental)


Another variation of pr92486 and pr71475. This time on trunk and without
structs. Feel free to close as dup.

There are two issues here:
- padding is not copied;
- if pseudo-denormals are not considered trap representations in gcc, their
non-padding bytes are not preserved either (the exponent is changed from 0 to 1).
(OTOH it seems sNaN in double on x86-32 is not converted to qNaN in such a
situation.)

[Bug tree-optimization/92486] Wrong optimization: padding in structs is not copied even with memcpy

2019-12-03 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92486

--- Comment #16 from Alexander Cherepanov  ---
BTW this bug combines nicely with pr71460. Possible effects:
- padding in a long double inside a struct is lost on x86-64;
- sNaN is converted to qNaN in a double inside a struct on x86-32.

Both are illustrated at https://godbolt.org/z/VKmGbY . E.g., the former, this:

--
#include <string.h>

struct s1 { long double d; };
void f1(struct s1 *restrict p, struct s1 *restrict q)
{
*p = *q;
memcpy(p, q, sizeof(struct s1));
}
--

is compiled into this:

--
f1:
fldt    (%rsi)
fstpt   (%rdi)
ret
--

[Bug tree-optimization/92486] Wrong optimization: padding in structs is not copied even with memcpy

2019-11-14 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92486

--- Comment #9 from Alexander Cherepanov  ---
> Now as an exercise build a complete testcase for the DSE issue above.

Source code:

--
#include <stdio.h>
#include <string.h>

struct s {
char c;
int i;
};

__attribute__((noinline,noclone))
void f(struct s *p, struct s *q)
{
struct s w;

memset(&w, 0, sizeof(struct s));
w = *q;

memset(p, 0, sizeof(struct s));
*p = w;
}

int main()
{
struct s x;
memset(&x, 1, sizeof(struct s));

struct s y;
memset(&y, 2, sizeof(struct s));

f(&x, &y);

for (unsigned char *p = (unsigned char *)&x; p < (unsigned char *)&x +
sizeof(struct s); p++)
printf("%d", *p);
printf("\n");
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out

$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1222
--

gcc x86-64 version: gcc (GCC) 10.0.0 20191114 (experimental)


But from the C standard POV this case is much clearer: there is no problem,
as stores into a struct make its padding unspecified (C11, 6.2.6.1p6). OTOH
this sample demonstrates the problem on trunk, so it could be more convenient
for testing.

[Bug tree-optimization/92486] New: Wrong optimization: padding in structs is not copied even with memcpy

2019-11-12 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92486

Bug ID: 92486
   Summary: Wrong optimization: padding in structs is not copied
even with memcpy
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Sometimes padding in structs is not copied even with memcpy. AIUI this is
considered a bug in gcc (similar to pr71452 and pr71522). The code:

--

#include <stdio.h>
#include <string.h>

struct s {
char c;
int i;
};

__attribute__((noinline,noclone))
void f(struct s *p, struct s *q)
{
struct s w = *q;
memcpy(&w, q, sizeof(struct s));

*p = w;
memcpy(p, &w, sizeof(struct s));
}

int main()
{
struct s x;
memset(&x, 1, sizeof(struct s));

struct s y;
memset(&y, 2, sizeof(struct s));

f(&x, &y);

for (unsigned char *p = (unsigned char *)&x; p < (unsigned char *)&x +
sizeof(struct s); p++)
printf("%d", *p);
printf("\n");
}

--

According to https://godbolt.org/z/fSEGka, x86-64 gcc, versions from 6.3 to 9.2
(but not trunk), with options `-std=c11 -Wall -Wextra -pedantic -O3` miscompile
the function `f` and the program outputs `1222` instead of ``. I
guess that `memcpy`s are optimized away as they fully duplicate assignments,
then assignments are scalarized and the padding is lost.
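
Roughly what that would mean for `f` (my reconstruction of the transformation,
not actual compiler output), using the `struct s` from the testcase above:

--
struct s { char c; int i; };

/* Once the memcpys are dropped as redundant and the struct assignments are
   scalarized, only the named members are copied and the padding bytes of *p
   are never written. */
void f_scalarized(struct s *p, struct s *q)
{
    char c = q->c;
    int i = q->i;
    p->c = c;
    p->i = i;
}
--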

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-28 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #11 from Alexander Cherepanov  ---
When I was converting the original testcase to malloc I started with a 
simple assignment with a cast. Clang optimized it as well (and I posted 
it to the clang bug) but gcc didn't. Essentially this example:

void f(long long *p)
{
   *(long *)p = *p;
}

tree optimization turns to

f (long long int * p)
{
   long long int _1;

[100.00%] [count: INV]:
   _1 = *p_3(D);
   MEM[(long int *)p_3(D)] = _1;
   return;

}

The same happens with "new (p) long (*p);". So the question: why is
that? If this result is intended, then perhaps the memcpys and the
assignments of union members from this bug report could be converted by
gcc to the same form? That would solve the problem.
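
For concreteness, a hedged sketch of the kind of memcpy-based function I have
in mind (the function name and 'tmp' are mine); the question is whether it
could be lowered to the same MEM[(long int *)p] store as the cast above:

--
#include <string.h>

void g(long long *p)
{
    long tmp;
    memcpy(&tmp, p, sizeof tmp);   /* read the bytes of *p                */
    memcpy(p, &tmp, sizeof tmp);   /* write them back; per C11 6.5p6 this
                                      makes the effective type of *p long */
}
--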

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-27 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

--- Comment #8 from Alexander Cherepanov  ---
On 2017-10-27 12:41, rguenth at gcc dot gnu.org wrote:
>> And with allocated memory (C; add placement new's for C++):
>>
>> --
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> static long test(long *px, long long *py, void *pu)
>> {
>>*px = 0;
>>*py = 1;
>>
>>// change effective type from long long to long
>>long tmp;
>>memcpy(&tmp, pu, sizeof(tmp));
>>memcpy(pu, &tmp, sizeof(tmp));
> 
> I believe this one is invalid - memcpy transfers the dynamic
> type and *pu is currently 'long long'.  So it's either not
> changing the dynamic type because, well, the type transfers
> through 'tmp' or you are accessing 'tmp' with declared type
> long as 'long long'.
AIUI memcpy is always valid if there is enough space, no matter what 
types its source and destination have. It accesses both of them as 
arrays of chars -- C11, 7.24.1p1: "The header <string.h> declares one
type and several functions, and defines one macro useful for
manipulating arrays of character type and other objects treated as 
arrays of character type."

OTOH memcpy transfers effective type from the source to the destination 
but only if the destination has no declared type -- C11, 6.5p6: "If a 
value is copied into an object having no declared type using memcpy or 
memmove, or is copied as an array of character type, then the effective 
type of the modified object for that access and for subsequent accesses 
that do not modify the value is the effective type of the object from 
which the value is copied, if it has one."
Placement new in C++ can change dynamic types of objects with declared 
type but I don't think there are such facilities in C.

So in this example the first memcpy copies the value of *pu to tmp but 
drops its effective type (long long) and the second memcpy copies the 
value and passes the effective type of 'tmp' (long) along.
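
To spell out that reading of 6.5p6, a minimal sketch (all names are mine; it
only assumes sizeof(long) <= sizeof(long long)):

--
#include <stdlib.h>
#include <string.h>

void demo(void)
{
    long src = 1;
    long long decl;                           /* has a declared type  */
    void *alloc = malloc(sizeof(long long));  /* has no declared type */
    if (!alloc)
        return;

    memcpy(&decl, &src, sizeof src);  /* 'decl' keeps its declared type
                                         long long                       */
    memcpy(alloc, &src, sizeof src);  /* the effective type of the copied
                                         bytes in *alloc becomes long    */
    free(alloc);
}
--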

[Bug tree-optimization/81028] GCC miscompiles placement new

2017-10-26 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81028

Alexander Cherepanov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ch3root at openwall dot com

--- Comment #6 from Alexander Cherepanov  ---
Seems to be an exact dup of bug 57359.

Clang bug -- https://bugs.llvm.org/show_bug.cgi?id=21725 .

[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux

2017-10-24 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

Alexander Cherepanov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ch3root at openwall dot com

--- Comment #12 from Alexander Cherepanov  ---
Still reproducible if the number of iterations is changed to 3.

I've also converted the testcase to allocated memory:

--
#include <stdio.h>
#include <stdlib.h>

__attribute__((__noinline__,__noclone__))
void test(int *pi, long *pl, int k, int *pa)
{
  for (int i = 0; i < 3; i++) {
pl[k] = // something that doesn't change but has to be calculated
*pa; // something that potentially can be changed by assignment to *pi
*pi = 0;
  }
}

int main(void)
{
  int *pi = malloc(10);
  int a = 1;

  test(pi, (void *)pi, 0, &a);

  printf("%d\n", *pi);
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
0
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1
--

gcc version: gcc (GCC) 8.0.0 20171024 (experimental)

[Bug tree-optimization/82697] New: Wrong optimization with aliasing and "if"

2017-10-24 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82697

Bug ID: 82697
   Summary: Wrong optimization with aliasing and "if"
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

A testcase is simplified from
https://stackoverflow.com/questions/46592132/is-this-use-of-the-effective-type-rule-strictly-conforming
:

--
#include <stdio.h>
#include <stdlib.h>

__attribute__((__noinline__))
void test(int *pi, long *pl, int f)
{
  *pl = 0;

  *pi = 1;

  if (f)
*pl = 2;
}

int main(void)
{
  void *p = malloc(10);

  test(p, p, 0);

  printf("%d\n", *(int *)p);
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--

gcc version: gcc (GCC) 8.0.0 20171024 (experimental)

[Bug tree-optimization/82224] Strict-aliasing not noticing valid aliasing of two unions with active members

2017-10-23 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224

Alexander Cherepanov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ch3root at openwall dot com

--- Comment #6 from Alexander Cherepanov  ---
Here are simplified testcases. With a union (C and C++):

--
#include <stdio.h>

union u {
  long x;
  long long y;
};

static long test(long *px, long long *py, union u *pu)
{
  pu->x = 0;// make .x active member (for C++)
  *px = 0;  // access .x via pointer

  pu->y = pu->x;// switch active member to .y (for C++)
  *py = 1;  // access .y via pointer

  pu->x = pu->y;// switch active member back to .x
  return *px;   // access .x via pointer
}

int main(void)
{
  union u u;

  printf("%ld\n", test(&u.x, &u.y, &u));
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--

And with allocated memory (C; add placement new's for C++):

--
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static long test(long *px, long long *py, void *pu)
{
  *px = 0;
  *py = 1;

  // change effective type from long long to long
  long tmp;
  memcpy(&tmp, pu, sizeof(tmp));
  memcpy(pu, &tmp, sizeof(tmp));

  return *px;
}

int main(void)
{
  void *p = malloc(10);

  printf("%ld\n", test(p, p, p));
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
1
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
0
--

gcc version: gcc (GCC) 8.0.0 20171023 (experimental)

[Bug c/71766] New: Strange position of "error: request for member ‘...’ in something not a structure or union"

2016-07-05 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71766

Bug ID: 71766
   Summary: Strange position of "error: request for member ‘...’
in something not a structure or union"
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Only the first letter of 'offsetof' is shown in another color for the following
testcase.

Source code:

--
#include <stddef.h>

int main()
{
  +offsetof(int, a);
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
In file included from test.c:1:0:
test.c: In function ‘main’:
test.c:5:4: error: request for member ‘a’ in something not a structure or union
   +offsetof(int, a);
^
--

gcc version: gcc (GCC) 7.0.0 20160704 (experimental)


For comparison:

--
$ clang -std=c11 -Weverything -O3 test.c && ./a.out
test.c:5:4: error: offsetof requires struct, union, or class type, 'int'
invalid
  +offsetof(int, a);
   ^~~~
.../lib/clang/3.9.0/include/stddef.h:120:24: note: expanded from macro
'offsetof'
#define offsetof(t, d) __builtin_offsetof(t, d)
   ^  ~
1 error generated.
--

clang version: clang version 3.9.0 (trunk 274502)

[Bug c/71742] New: Wrong formulation of "error: flexible array member in otherwise empty struct"

2016-07-03 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71742

Bug ID: 71742
   Summary: Wrong formulation of "error: flexible array member in
otherwise empty struct"
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Source code:

--
int main()
{
  struct s {
int :1;
int a[];
  };
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:5:9: error: flexible array member in otherwise empty struct
 int a[];
 ^
--

gcc version: gcc (GCC) 7.0.0 20160627 (experimental)

gcc is right to detect an error in this testcase, but the message is wrong
because the struct is not otherwise empty. The problem is the absence of named
members.

The relevant rule:

C11, 6.7.2.1p18: "As a special case, the last element of a structure with more
than one named member may have an incomplete array type; this is called a
flexible array member."
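
A small illustration of that rule (the tag 's_fixed' and the member 'n' are
mine): adding any named member in front of the flexible array member
satisfies the "more than one named member" requirement, which is why
"absence of named members" describes the error better than "otherwise empty".

--
struct s_fixed {
    int :1;   /* the unnamed bit-field from the testcase            */
    int n;    /* a named member in addition to 'a'                  */
    int a[];  /* now a valid flexible array member (C11, 6.7.2.1p18) */
};
--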

[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

2016-06-27 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #19 from Alexander Cherepanov  ---
(In reply to jos...@codesourcery.com from comment #3)
> Except within a larger object, I'm not aware of any reason the cases of 
> two objects following or not following each other in memory must be 
> mutually exclusive.

Apparently some folks use linker scripts to get a specific arrangement of
objects.

A fresh example is a problem in Linux -- https://lkml.org/lkml/2016/6/25/77 . A
simplified example from http://pastebin.com/4Qc6pUAA :

extern int __start[];
extern int __end[];

extern void bar(int *);

void foo()
{
for (int *x = __start; x != __end; ++x)
bar(x);
}

This is optimized into an infinite loop by gcc 7 at -O.

[Bug c/71613] Useful warnings silenced when macros from system headers are used

2016-06-27 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71613

--- Comment #5 from Alexander Cherepanov  ---
This bug also affects -pedantic-errors, i.e. it's not merely a cosmetic 
bug but also a conformance issue. I don't know how -std=c11 can ignore 
constraint violations but -pedantic-errors is supposed to fix it.

BTW, is the gcc extension "non-int enum constants" not documented?

[Bug sanitizer/71611] UBSan shows type '<unknown>' for enums based on long

2016-06-22 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71611

--- Comment #2 from Alexander Cherepanov  ---
On 06/22/2016 06:08 PM, jakub at gcc dot gnu.org wrote:
> This has nothing to do with enums based on long, but about anonymous enums.
> <unknown> is what is used for all types that don't have a name.
> Would you prefer  instead or ?

If I change LONG_MIN to INT_MIN in my example then the message talks 
about 'int':

test.c:6:5: runtime error: negation of -2147483648 cannot be represented 
in type 'int'; cast to an unsigned type to negate this value to itself

So it definitely depends on the integer type that the enum is based on. 
But maybe this was not intentional?

[Bug c/71613] New: Useful warnings silenced when macros from system headers are used

2016-06-21 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71613

Bug ID: 71613
   Summary: Useful warnings silenced when macros from system
headers are used
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Useful warnings seem to be silenced when macros from system headers are used.
In the example below a warning would be useful for both enums, but only one
warning is shown. The same can probably happen for other warnings.

Source code:

--
#include <limits.h>

int main()
{
  enum { e1 = LLONG_MIN };
  enum { e2 = +LLONG_MIN };
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:6:15: warning: ISO C restricts enumerator values to range of ‘int’
[-Wpedantic]
   enum { e2 = +LLONG_MIN };
   ^
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

[Bug sanitizer/71611] New: UBSan shows type '<unknown>' for enums based on long

2016-06-21 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71611

Bug ID: 71611
   Summary: UBSan shows type '<unknown>' for enums based on long
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

Source code:

--
#include <limits.h>

int main()
{
  enum { c = LONG_MIN } x = c;
  x = -x;
  return x;
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 -fsanitize=undefined test.c &&
./a.out
test.c:6:5: runtime error: negation of -9223372036854775808 cannot be
represented in type '<unknown>'; cast to an unsigned type to negate this value
to itself
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

For comparison:

--
$ clang -std=c11 -Weverything -O3 -fsanitize=undefined test.c && ./a.out
test.c:5:10: warning: ISO C restricts enumerator values to range of 'int'
(-9223372036854775808 is too small) [-Wpedantic]
  enum { c = LONG_MIN } x = c;
 ^   
test.c:7:10: warning: implicit conversion loses integer precision: 'enum
(anonymous enum at test.c:5:3)' to 'int' [-Wshorten-64-to-32]
  return x;
  ~~ ^
2 warnings generated.
test.c:6:7: runtime error: negation of -9223372036854775808 cannot be
represented in type 'long'; cast to an unsigned type to negate this value to
itself
--

clang version: clang version 3.9.0 (trunk 271312)

[Bug c/71610] New: Improve location for "warning: ISO C restricts enumerator values to range of ‘int’ [-Wpedantic]"?

2016-06-21 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71610

Bug ID: 71610
   Summary: Improve location for "warning: ISO C restricts
enumerator values to range of ‘int’ [-Wpedantic]"?
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Only the first token of the offending value is underlined for this warning.
Underlining the whole value (as clang does) would be better.

Source code:

--
int main()
{
  enum { c = -30 };
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
test.c: In function ‘main’:
test.c:3:14: warning: ISO C restricts enumerator values to range of ‘int’
[-Wpedantic]
   enum { c = -30 };
  ^
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

For comparison:

--
$ clang -std=c11 -Weverything -O3 test.c && ./a.out
test.c:3:10: warning: ISO C restricts enumerator values to range of 'int'
(-30 is too small) [-Wpedantic]
  enum { c = -30 };
 ^   ~~~
1 warning generated.
--

clang version: clang version 3.9.0 (trunk 271312)

[Bug c/71598] New: Wrong optimization with aliasing enums

2016-06-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71598

Bug ID: 71598
   Summary: Wrong optimization with aliasing enums
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Source code:

--
extern void abort (void);

enum e1 { c1 };
enum e2 { c2 };

__attribute__((noinline,noclone))
int f(enum e1 *p, enum e2 *q)
{
  *p = 1;
  *q = 2;
  return *p;
}

int main()
{
  unsigned x;

  if (f(&x, &x) != 2)
abort();
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
Aborted
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

In gcc such enums are compatible with unsigned int, hence they can alias. But
gcc seems to (wrongly) assume that *p and *q cannot refer to the same object.
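
A hedged illustration of the compatibility claim (the function and variable
names are mine): if gcc indeed picks unsigned int as the compatible type for
these enums, mixing the pointer types needs no cast and no diagnostic, which
is also why the call in main() above compiles cleanly.

--
enum e3 { k3 };

void h(void)
{
    unsigned x = 0;
    enum e3 *p = &x;  /* accepted without a cast if enum e3 is
                         compatible with unsigned int           */
    *p = k3;          /* a store through *p then aliases x      */
    (void)x;
}
--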

[Bug c/71597] New: Confusing warning for incompatible enums

2016-06-20 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71597

Bug ID: 71597
   Summary: Confusing warning for incompatible enums
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ch3root at openwall dot com
  Target Milestone: ---

Source code:

--
enum { a } x; // (1)
unsigned x; // (2)
enum { b } x; // (3)

int main()
{
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
test.c:3:12: error: conflicting types for ‘x’
 enum { b } x; // (3)
^
test.c:2:10: note: previous declaration of ‘x’ was here
 unsigned x; // (2)
  ^
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

It would be better to display the conflicting declaration (1) instead of just
the previous one (2).

[Bug target/71460] Copying structs can trap (on x86-32) due to SNaN to QNaN

2016-06-15 Thread ch3root at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71460

--- Comment #27 from Alexander Cherepanov  ---
On 06/12/2016 01:09 AM, joseph at codesourcery dot com wrote:
>>> C11 does not
>>> consider sNaNs, and TS 18661 is explicitly stating otherwise for them.
>>
>> You are talking about C11 + TS 18661 which is a different version of C.
>
> The semantics of C11 + TS 18661 are a refinement of those of C11 (with
> sNaNs in C11 being trap representations by virtue of not acting like any
> kind of value considered by C11, so a refinement of C11 semantics can
> apply any semantics it likes to them).

Ok, I see. It seems that it would be fine with C11 to consider sNaN and
qNaN alternative representations of the same value when traps are off,
but that would be incompatible with TS 18661, so it's more convenient to
consider sNaN a trap representation. This makes the question of traps
irrelevant and keeps it compatible with TS 18661.

>> I'm not sure. It would be nice to have such a clear distinction between
>> values and representations but C is tricky. What is the value of a union
>> after memset? Suppose a value stored into an active member of a union
>> admits several representations, is taking inactive member of this union
>> (aka type-punning) is required to give the same result before and after
>> applying lvalue-to-rvalue conversion to the union?
>>
>> Heck, even the example that started this PR is not that easy. If a
>> member of a structure has a trap representation what is the value of
>> this structure? Is copying of this structure required to preserve the
>> exact representation of this member? Can this trap representation be
>> changed to a non-trap one?
>
> All these memory model issues would best be raised directly with WG14,

What is the best way to do it?

> possibly working together with the Cerberus people, not with individual
> implementations.

I'm afraid I don't quite understand their approach/aim and their choice of
questions. OTOH one of the results from the whole work that was most useful
to me was your reply to their questionnaire. Thank you for this!

>> So the idea is that gcc on x86-32 with feenableexcept(FE_INVALID) could
>> be made conforming with C11 + TS 18661? Is there anything there that
>
> As soon as you use feenableexcept you are completely outside the scope of
> any standards.

Isn't feenableexcept what IEC 60559 describes in clause 8.1:
"Language standards should define, and require implementations to
provide, means for the user to associate alternate exception handling
attributes with blocks (see 4.1)."? Why isn't it in TS 18661?

Or do you mean that a mode with traps for the invalid operation exception
is not supported by gcc at all? Then this bug is mostly a non-issue,
because IMHO the main problem is the crashing and not the changes of
representations of NaNs.
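
For concreteness, this is the trap mode I have in mind -- a minimal sketch,
assuming glibc on x86 (feenableexcept is a glibc extension, __builtin_nans
is a gcc builtin; link with -lm):

--
#define _GNU_SOURCE
#include <fenv.h>

int main(void)
{
    feenableexcept(FE_INVALID);      /* unmask the invalid-operation
                                        exception so that it traps    */

    volatile double s = __builtin_nans("");
    volatile double r = s + 1.0;     /* operating on the sNaN raises
                                        FE_INVALID and hence SIGFPE   */
    (void)r;
    return 0;
}
--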

>> allows to trap on a return from a function (when the result of the
>> function is not used)? C11, 6.8.6.4p3, talks about a conversion "as if
>> by assignment" only for a case of differing types and I don't see TS
>> 18661 patching this clause of C11.
>
> The intent is clear, because C11 + Annex F semantics are meant to be a
> refinement of those of base C11 (Annex F being an optional extension to
> base C11), and from F.6 you can tell that such cases must be permitted to
> do a conversion to the same type, so it must be intended to allow such a
> conversion.  But I'll raise this for clarification in TS 18661 anyway.

Yes, this is probably a simple case. I haven't seen F.6 before, but it
surprisingly doesn't change 6.8.6.4p3 much.

I'm less sure about mere lvalue-to-rvalue conversion. Trapping at lvalue
conversion makes it impossible to apply isnan to sNaNs even though
"[...] these macros raise no floating-point exceptions, even if an
argument is a signaling NaN."
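
A tiny example of the conflict (the function name is mine):

--
#include <math.h>

/* TS 18661 and F.3 want isnan to raise no exception even for a signaling
   NaN, but if the lvalue conversion of *p itself could trap, a caller
   holding an sNaN could never even reach the macro. */
int is_nan_ptr(const double *p)
{
    return isnan(*p);
}
--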

>> I mean bug 57484 is more important case than lone loads from volatiles.
>
> 57484 is a closed C++ bug.  I think you mean 56831 as the substantive QoI
> bug for such cases.

No. 56831 is about passing sNaNs to functions as arguments. This 
conforms to TS 18661 and this could be fixed in gcc. The same for 
assignment.

57484 is about returning a value from a function. This is technically 
not conforming and this couldn't be fixed in gcc without breaking ABI. 
If a separate bug for C is required I can file one.
