[Bug middle-end/88971] Branch optimization inconsistency (missed optimization)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88971

Andrew Pinski  changed:

   What|Removed |Added

  Component|libstdc++   |middle-end
   Severity|normal  |enhancement

[Bug tree-optimization/83190] missing strlen optimization of the empty string

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83190

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/82911] missing strlen optimization for strncpy with constant strings and constant bound

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82911

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-09-05
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement

--- Comment #1 from Andrew Pinski  ---
Confirmed.

A related testcase is:
void f1 (char *d, char *e, bool b)
{
  d[2] = 0;

  if (__builtin_strlen (d) > 2)   // not eliminated but could be
    __builtin_abort ();
}

where the range of the strlen result should be [0,2]. Maybe we can add a class
to the ranger for strings and do the optimization that way instead, so that
after the null store to d[2] the range for the string length becomes [0,2].
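
To illustrate the range argument: once d[2] holds a NUL, the first terminator is at index 2 at the latest, so strlen (d) cannot exceed 2. A minimal C++ sketch of that reasoning (the helper name len_after_nul_store is hypothetical and not part of the report):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not from the bug report: stores a NUL at d[2] and
// returns strlen(d).  The caller must pass a buffer of at least 3 chars
// whose first two bytes are initialized.
std::size_t len_after_nul_store(char *d)
{
  assert(d != nullptr);
  d[2] = 0;
  // A terminator now sits at index 2 at the latest, so the result is in [0, 2].
  return std::strlen(d);
}
```

A string-aware range class could record exactly this bound at the store, which would make the `> 2` comparison in the test case foldable to false.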

[Bug rtl-optimization/93525] Left shift and arithmetic shift could be futher simplified in simplify-rtx.c

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93525

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/93539] memmove over self with result of string function not eliminated

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93539

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-05
 Ever confirmed|0   |1
   Severity|normal  |enhancement
 Depends on||82991
 Status|UNCONFIRMED |NEW
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82991

--- Comment #1 from Andrew Pinski  ---
Confirmed, PR 82991 is related and will most likely solve this too.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82991
[Bug 82991] memcpy and strcpy return value can be assumed to be equal to first
argument
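
As a rough illustration of why PR 82991 would help here: strcpy returns its first argument, so a memmove whose source is that return value and whose destination is the same pointer copies a buffer onto itself. The function below is a hypothetical reduction, not the test case from either report:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example: std::strcpy(d, s) returns d, so the memmove below
// moves n bytes from d to d and could in principle be removed entirely.
// The caller must ensure d has room for s and for n bytes.
char *copy_then_self_move(char *d, const char *s, std::size_t n)
{
  assert(d != nullptr && s != nullptr);
  return static_cast<char *>(std::memmove(d, std::strcpy(d, s), n));
}
```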

[Bug tree-optimization/93556] lower mempcpy to memcpy when result is unused

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93556

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2021-09-05
   Severity|normal  |enhancement

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/93560] strstr(s, s) not folded to s

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93560

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-05
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed, LLVM does this.
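
For reference, the identity behind the requested fold can be checked directly: searching a string for itself always matches at offset 0. A minimal C++ sketch (find_self is a hypothetical name):

```cpp
#include <cassert>
#include <cstring>

// strstr(s, s) searches for s inside itself; the match is at offset 0,
// so the call can be folded to s without reading the string at all.
const char *find_self(const char *s)
{
  assert(s != nullptr);
  return std::strstr(s, s);
}
```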

[Bug target/93737] inline memmove for insertion into small arrays

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93737

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-05
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement

--- Comment #5 from Andrew Pinski  ---
Confirmed.

[Bug target/93396] [RX] tail call optimization does not work with indirect call

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93396

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug middle-end/91409] Missed optimization on `labels as values` expression

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91409

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
  Component|target  |middle-end

[Bug rtl-optimization/52082] Memory loads not rematerialized

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52082

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #3 from Andrew Pinski  ---
One thing I noticed that LLVM does to reduce the register pressure is:
(z ? v4 [k] : v3 [k])

Gets pulled out of the loop such that it is:
tmpaddr = z ? v4 : v3;

and then inside the loop it does:
(tmpaddr)[k]

GCC still has (I changed the bb order just so it is easier to see what is going
on):
  if (z_39(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 5427362]:
  _21 = v3.3_18 + _157;
  iftmp.1_40 = *_21;
  goto ; [100.00%]

   [local count: 5427362]:
  _17 = v4.2_14 + _157;
  iftmp.1_41 = *_17;

   [local count: 10854724]:
  # m_8 = PHI 
  if (m_8 != 0B)
goto ; [94.50%]
  else
goto ; [5.50%]

We should be able to do something similar, it seems, and need two fewer
registers: one to hold z and one to hold either v3 or v4.  This won't be enough
for this testcase, but it will be something.
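
The transformation described above can be sketched at the source level as follows. This is a hypothetical reduction that only reuses the names z, v3, v4, k and tmpaddr from the comment; it is not the test case from the report:

```cpp
#include <cassert>

// Select evaluated on every iteration: z, v3 and v4 all stay live in the loop.
long sum_select_in_loop(const long *v3, const long *v4, bool z, int n)
{
  long s = 0;
  for (int k = 0; k < n; ++k)
    s += z ? v4[k] : v3[k];
  return s;
}

// Select hoisted out of the loop: only tmpaddr stays live inside it.
long sum_select_hoisted(const long *v3, const long *v4, bool z, int n)
{
  assert(n >= 0);
  const long *tmpaddr = z ? v4 : v3;
  long s = 0;
  for (int k = 0; k < n; ++k)
    s += tmpaddr[k];
  return s;
}
```

Both functions compute the same sum; the second keeps one pointer live across the loop instead of two pointers and the condition.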

[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug ipa/88231] aligned functions laid down inefficiently

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231

Andrew Pinski  changed:

   What|Removed |Added

   Severity|minor   |enhancement

[Bug tree-optimization/89043] strcat (strcpy (d, a), b) not folded to stpcpy (strcpy (d, a), b)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89043

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/86604] phiopt missed optimization of conditional add

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86604

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org
   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
Maybe something like:
(simplify
 (cond (ne bool@0 integer_zerop) (plus @1 integer_onep) @1)
 (plus (convert @0) @1))

Where bool is defined to be a var that is in the range of [0,1].
This seems like what LLVM does.
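
At the source level, the pattern above corresponds to the equivalence below. This is a sketch with hypothetical function names, assuming b really is a value in [0,1] (here a C++ bool):

```cpp
#include <cassert>
#include <climits>

// Branchy form: conditionally add one.
int cond_add_branchy(bool b, int x)
{
  assert(x < INT_MAX);  // keep x + 1 free of signed overflow
  return b ? x + 1 : x;
}

// Folded form: convert the [0,1] value to int and add it unconditionally.
int cond_add_folded(bool b, int x)
{
  assert(x < INT_MAX);
  return static_cast<int>(b) + x;
}
```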

[Bug tree-optimization/86241] duplicate strlen-like snprintf calls not folded

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86241

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-05

--- Comment #3 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/86339] DOM does not handle RHS COND_EXPRs well

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86339

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/85116] std::min_element does not optimize well with inlined predicate

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85116

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed|2018-03-29 00:00:00 |2021-9-4
  Component|libstdc++   |tree-optimization

[Bug middle-end/86085] I/O built-ins considered argument clobbers

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86085

--- Comment #3 from Andrew Pinski  ---
I thought builtin_fnspec and friends would have optimized this case but no.
In fact starting with GCC 10, f even regresses, starting with r10-2814.

[Bug middle-end/86085] I/O built-ins considered argument clobbers

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86085

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2018-06-13 00:00:00 |2021-9-4
   Severity|normal  |enhancement
  Component|tree-optimization   |middle-end

[Bug target/102205] New: vec + 1 could be done as vec - (-1)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102205

Bug ID: 102205
   Summary: vec + 1 could be done as vec - (-1)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: x86_64

Take:
template <typename T> using V [[gnu::vector_size(16)]] = T;
auto a1(V<  int> b) { return 1 + b; }

 CUT 
Currently GCC produces:
a1(int __vector(4)):
        paddd   .LC0(%rip), %xmm0
        ret
        .cfi_endproc
.LFE0:
        .size   a1(int __vector(4)), .-a1(int __vector(4))
        .section        .rodata.cst16,"aM",@progbits,16
        .align 16
.LC0:
        .long   1
        .long   1
        .long   1
        .long   1


But it might be best if GCC produces (like LLVM):
a1(int __vector(4)):
        pcmpeqd %xmm1, %xmm1
        psubd   %xmm1, %xmm0
        retq
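
The arithmetic behind the LLVM sequence can be sketched with the GNU vector extension (GCC and Clang only). The typedef v4si and both function names are hypothetical; the point is only that subtracting an all-ones vector equals adding one per lane:

```cpp
// GNU vector extension; v4si is a hypothetical typedef name.
typedef int v4si __attribute__((vector_size(16)));

// What the source asks for: add 1 to every lane (needs a constant from memory).
v4si add_one(v4si b) { return b + 1; }

// What the suggested code computes: subtract -1 from every lane.
v4si sub_all_ones(v4si b)
{
  // pcmpeqd of a register with itself yields all-one bits, i.e. -1 per lane,
  // so no constant-pool load is required.
  v4si ones = {-1, -1, -1, -1};
  return b - ones;
}
```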

[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed|2018-04-11 00:00:00 |2021-9-4

[Bug c++/102204] New: OpenMP offload map type restriction

2021-09-04 Thread xw111luoye at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102204

Bug ID: 102204
   Summary: OpenMP offload map type restriction
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xw111luoye at gmail dot com
  Target Milestone: ---

With branch devel/omp/gcc-11
I'm getting
/home/yeluo/opt/qmcpack/build_rtx3060_gcc_offload_real/src/config.h:42:29:
error: array section does not have mappable type in ‘map’ clause
   42 |   #define PRAGMA_OFFLOAD(x) _Pragma(x)
  | ^~~
/home/yeluo/opt/qmcpack/src/Particle/SoaDistanceTableAAOMPTarget.h:84:5: note:
in expansion of macro ‘PRAGMA_OFFLOAD’
   84 | PRAGMA_OFFLOAD("omp target enter data map(to : this[:1])")
  | ^~
In file included from
/home/yeluo/opt/qmcpack/src/Particle/createDistanceTableAAOMPTarget.cpp:19:
/home/yeluo/opt/qmcpack/src/Particle/SoaDistanceTableAAOMPTarget.h:31:8: note:
type ‘qmcplusplus::SoaDistanceTableAAOMPTarget’ with virtual
members is not mappable
   31 | struct SoaDistanceTableAAOMPTarget : public DTD_BConds,
public DistanceTableData
  |^~~

because SoaDistanceTableAAOMPTarget is a derived class and there is virtual
function overriding.
https://github.com/QMCPACK/qmcpack/blob/1a7af8e589726a91da94e5f6ad8b4e8d9e2acd4d/src/Particle/SoaDistanceTableAAOMPTarget.h#L31

In my case, virtual functions are never called in the offload region, and I map
"this[:1]" for easy access to a fixed data set, so I'm expecting just a bitwise
copy to the device.

Please remove this restriction.

[Bug c++/98869] Allowing mapping this in OpenMP target

2021-09-04 Thread xw111luoye at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98869

--- Comment #3 from Ye Luo  ---
This doesn't work with gcc 11.2 but works on devel/omp/gcc-11 branch.

[Bug middle-end/84756] Multiplication done twice just to get upper and lower parts of product

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84756

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-05
  Component|target  |middle-end
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed, we should be able to do part (all?) of this at the gimple level:
  _3 = a_6(D) w* b_7(D);
  _4 = _3 >> 64;
  _5 = (long unsigned int) _4;
  *upper_9(D) = _5;
  _11 = a_6(D) * b_7(D);
  return _11;

(long unsigned int)_3 is the same as _11.
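
In source terms, the observation is that both halves can come from a single widening multiply. A minimal sketch, assuming a 64-bit target with the unsigned __int128 extension (GCC and Clang); mul_wide is a hypothetical name, not the function from the report:

```cpp
#include <cassert>

// One 64x64 -> 128 bit multiply; the upper half goes to *upper and the
// lower half is returned, so no second multiplication is needed.
unsigned long long mul_wide(unsigned long long a, unsigned long long b,
                            unsigned long long *upper)
{
  assert(upper != nullptr);
  unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
  *upper = static_cast<unsigned long long>(p >> 64);
  return static_cast<unsigned long long>(p);  // same value as a * b mod 2^64
}
```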

[Bug ipa/84312] Variadic function without named argument not inlined

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84312

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||9.4.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED
 CC||marxin at gcc dot gnu.org
  Component|tree-optimization   |ipa
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70929
   Severity|normal  |enhancement
  Known to work||10.1.0

--- Comment #2 from Andrew Pinski  ---
Fixed in GCC 10 by r10-.

[Bug tree-optimization/85406] Unnecessary blend when vectorizing short-cutted calculations

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85406

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #7 from Andrew Pinski  ---
I noticed that clang/LLVM does not do this either, and neither does ICC.

[Bug rtl-optimization/80301] Sub-optimal code with an array of structs offsetted inside a struct global on x86/x86_64 at -O2

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80301

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #4 from Andrew Pinski  ---
We are able to do the 2->2 combine now (after r9-2064):
Trying 9 -> 10:
9: {r87:DI=r86:DI+0x2;clobber flags:CC;}
  REG_DEAD r86:DI
  REG_UNUSED flags:CC
   10: flags:CCZ=cmp([r87:DI*0x8+`m'],r83:SI)
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ])
(const_int 8 [0x8]))
(const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2] 
)
(const_int 16 [0x10] [1
mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64])
(reg:SI 83 [  ])))
(set (reg:DI 87)
(plus:DI (reg:DI 86 [ indexD.2442 ])
(const_int 2 [0x2])))
])
Failed to match this instruction:
(parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ])
(const_int 8 [0x8]))
(const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2] 
)
(const_int 16 [0x10] [1
mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64])
(reg:SI 83 [  ])))
(set (reg:DI 87)
(plus:DI (reg:DI 86 [ indexD.2442 ])
(const_int 2 [0x2])))
])
Successfully matched this instruction:
(set (reg:DI 87)
(plus:DI (reg:DI 86 [ indexD.2442 ])
(const_int 2 [0x2])))
Successfully matched this instruction:
(set (reg:CCZ 17 flags)
(compare:CCZ (mem:SI (plus:DI (mult:DI (reg:DI 86 [ indexD.2442 ])
(const_int 8 [0x8]))
(const:DI (plus:DI (symbol_ref:DI ("m") [flags 0x2]  )
(const_int 16 [0x10] [1
mD.2375.sD.2374[index_4(D)].aD.2372+0 S4 A64])
(reg:SI 83 [  ])))
allowing combination of insns 9 and 10
original costs 4 + 13 = 17
replacement costs 4 + 13 = 17
modifying insn i2 9: r87:DI=r86:DI+0x2
deferring rescan insn with uid = 9.
modifying insn i3    10: flags:CCZ=cmp([r86:DI*0x8+const(`m'+0x10)],r83:SI)
  REG_DEAD r86:DI
deferring rescan insn with uid = 10.

But then we don't sink the add into the conditional and do the combine there.

The code we get now is:
func(unsigned int):
        movl    %edi, %edx
        movq    %rdx, %rax
        leaq    2(%rdx), %rcx
        cmpl    %edx, m+16(,%rdx,8)
        je      .L1
        movl    m+4(,%rcx,8), %eax
.L1:
        ret

[Bug target/102203] New: __builtin_memset and __builtin_memcpy could be expanded inline if range is known to be small

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102203

Bug ID: 102203
   Summary: __builtin_memset and __builtin_memcpy could be
expanded inline if range is known to be small
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64*-*-*

Take:
typedef decltype(sizeof(0)) size_t;
void g(size_t a, char *d, char *e)
{
  if (a>16)__builtin_unreachable();
  __builtin_memcpy(d, e, a);
}

- CUT 
This could be inlined like it is on x86_64.

[Bug target/102202] Inefficent expansion of memset when range is [0,1]

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202

--- Comment #2 from Andrew Pinski  ---
I wonder if we could do this expansion at the gimple level ...
Though introducing branches might not make everyone happy.

[Bug target/102202] Inefficent expansion of memset when range is [0,1]

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202

--- Comment #1 from Andrew Pinski  ---
Likewise for memcpy:
typedef decltype(sizeof(0)) size_t;
void g(size_t a, char *d, char *e)
{
  __builtin_memcpy(d, e, a&1);
}

[Bug target/102202] New: Inefficent expansion of memset when range is [0,1]

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202

Bug ID: 102202
   Summary: Inefficent expansion of memset when range is [0,1]
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: x86_64-*-*

Take:
void g(int a, char *d)
{
  if (a < 0 || a > 1) __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

- CUT -
GCC compiles on x86_64 to:
g(int, char*):
        .cfi_startproc
        testl   %edi, %edi
        je      .L1
        xorl    %eax, %eax
.L2:
        movl    %eax, %edx
        addl    $1, %eax
        movb    $0, (%rsi,%rdx)
        cmpl    %edi, %eax
        jb      .L2
.L1:
        ret

This is better than what clang/LLVM/ICC produce, but the loop is not needed, as
a will be either 0 or 1 and we already jump around the loop.

Here is another example not using __builtin_unreachable:
void g(int a, char *d)
{
  __builtin_memset(d, 0, a&1);
}
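
For comparison, the loop-free expansion being asked for amounts to a single conditional byte store. A minimal sketch under the same [0,1] range assumption (clear_zero_or_one is a hypothetical name):

```cpp
#include <cassert>

// Equivalent of __builtin_memset(d, 0, a) when a is known to be 0 or 1:
// at most one byte is written, so a test and one store suffice.
void clear_zero_or_one(int a, char *d)
{
  assert(a == 0 || a == 1);  // the range the compiler is told about
  if (a)
    *d = 0;
}
```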

[Bug tree-optimization/66646] small loop turned into memmove because of tree ldist

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66646

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed|2015-06-24 00:00:00 |2021-9-4

[Bug target/101059] v4sf reduction not optimal

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101059

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/93745] Redundant store not eliminated with intermediate instruction

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93745

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/95410] Failure to optimize compare next to and properly

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95410

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/84011] Optimize switch table with run-time relocation

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84011

Andrew Pinski  changed:

   What|Removed |Added

 CC||jengelh at inai dot de

--- Comment #14 from Andrew Pinski  ---
*** Bug 99383 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/99383] No tree-switch-conversion under PIC

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99383

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #8 from Andrew Pinski  ---
Dup of bug 84011.

*** This bug has been marked as a duplicate of bug 84011 ***

[Bug tree-optimization/99383] No tree-switch-conversion under PIC

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99383

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93326,
   ||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36881

[Bug tree-optimization/93326] switch optimisation of multiple jumptables into a lookup

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93326

--- Comment #6 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> So for the -fPIC case, we don't want to increase the number of runtime
> relocations done.  The number of runtime locations will happen in the
> constable load table.  I think we don't want to change that.

And that is PR 99383.

[Bug tree-optimization/85316] [meta-bug] VRP range propagation missed cases

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
Bug 85316 depends on bug 98357, which changed state.

Bug 98357 Summary: Bounds check not eliminated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98357

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/98357] Bounds check not eliminated

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98357

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |12.0
 Status|NEW |RESOLVED

--- Comment #4 from Andrew Pinski  ---
Fixed on the trunk by some of the improvements to VRP (range).

[Bug tree-optimization/94846] Failure to optimize jnc+inc into adc

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94846

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #5 from Andrew Pinski  ---
After r12-897 (which added a late sink pass), we get the following in
.optimized:
  if (_10 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _2 = _1 + 1;

   [local count: 1073741824]:
  # prephitmp_11 = PHI <_1(2), _2(3)>
  # _13 = PHI <_1(2), _2(3)>
  *p_5(D) = _13;
  return prephitmp_11;

Notice how prephitmp_11 and _13 are the same, but no RTL optimizer handles
that.

[Bug target/98453] aarch64: Missed opportunity for STP for vec_duplicate

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98453

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-09-05
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

Plus these functions too:
typedef double v2df __attribute__((vector_size (16)));
typedef float v2sf __attribute__((vector_size (8)));

void
food (v2df *x, double a)
{
  v2df tmp = {a, a};
  *x = tmp;
}

void
foof (v2sf *x, float a)
{
  v2sf tmp = {a, a};
  *x = tmp;
}

[Bug middle-end/19987] [meta-bug] fold missing optimizations in general

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987
Bug 19987 depends on bug 95527, which changed state.

Bug 95527 Summary: Failure to optimize __builtin_ffs == 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/95527] Failure to optimize __builtin_ffs == 0

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |11.0
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/85316] [meta-bug] VRP range propagation missed cases

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
Bug 85316 depends on bug 85375, which changed state.

Bug 85375 Summary: possible missed optimisation / regression from 6.3 with 
while (__builtin_ffs(x) && x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85375

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/85375] possible missed optimisation / regression from 6.3 with while (__builtin_ffs(x) && x)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85375

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||10.3.0
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95527
   Target Milestone|--- |11.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED
  Known to work||11.1.0

--- Comment #3 from Andrew Pinski  ---
After r11-1080 (PR 95527), __builtin_ffs(x) && x becomes just x != 0 and is
optimized.

So yes, fixed for GCC 11.
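
The simplification relies on __builtin_ffs(x) being nonzero exactly when x is nonzero. A small sketch of the equivalence, using the GCC/Clang builtin (both function names are hypothetical):

```cpp
// Original condition from the bug title.
bool ffs_and_x(int x) { return __builtin_ffs(x) && x; }

// What it reduces to: ffs(x) is 0 only for x == 0, so the conjunction is
// just a test against zero.
bool is_nonzero(int x) { return x != 0; }
```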

[Bug rtl-optimization/94798] Failure to optimize subtraction and 0 literal properly

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94798

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
   Last reconfirmed||2021-09-04

[Bug rtl-optimization/97603] Failure to optimize out compare into reuse of subtraction result

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug middle-end/19987] [meta-bug] fold missing optimizations in general

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987
Bug 19987 depends on bug 95433, which changed state.

Bug 95433 Summary: Failure to completely optimize simple compare after 
operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95433

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/95433] Failure to completely optimize simple compare after operations

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95433

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |11.0
 Status|NEW |RESOLVED

--- Comment #8 from Andrew Pinski  ---
Fixed in GCC 11 by the commits.

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread federico.kircheis at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #30 from Federico Kircheis  ---
It seems to me we are not going to agree, as we tend to repeat ourselves; let's
see if we go around and around in circles or if it is more like a spiral ;)



Your view is more about the compiler: how it interprets the attributes, and
thus why the warning is unneeded. Mine is more about the developers writing
(but, most importantly, reading) the code.


> The only functions GCC can warn about are those that don’t need the
> attributes in the first place. The way any warning would work is to detect
> whether it is pure/const, and then see how the user marked it. So anything
> it can properly detect as right or wrong didn’t need an attribute to begin
> with - the compiler could already tell if it was pure/const


My knowledge of how GCC (or other compilers) works is very limited, but if
the function is implemented in another
  * translation unit
  * library
  * pre-compiled library
  * pre-compiled library created by another compiler
does GCC know it can avoid calling it multiple times?


Whole-program optimization might help in some of those cases (I admit I have no
idea; can the linker remove multiple function calls and replace them with a
variable?), but depending on the project size it might add up to a lot in terms
of compile time.
So even for simple functions, where GCC can clearly determine their purity, it
can be useful to add the attribute.


And even assuming that whole-program optimization helps in most of those cases
(which do not depend on the complexity or length of a function), how does
someone know if adding those attributes to a function that is pure makes sense
or not?

Adding pure to `inline int answer_of_life(){return 42;}` might not make any
difference (both for programmers and the compiler, because of its simplicity and
because it is inline), but where should the line be drawn?

Should I mark my functions (with something else, since as you are suggesting it
might do more harm than good), add dummy tests for all of them, and check in the
generated assembly whether GCC recognizes them as pure and elides the second
call?
There must surely be a better way, but I currently know of no other.


> Rather than tell the user they got it wrong, you might as well tell the
> user to remove the attribute because it isn’t necessary and won’t be
> necessary.

No, removing it as unnecessary would be wrong.
Then you can no longer tell the difference between functions that are pure by
accident and those that are pure by design.
And you can no longer prevent a pure function from becoming non-pure, except by
reading the code.
It is useful for programmers (yes, even they look at the code), even for those
functions where GCC does not need the attribute.

> Giving a bunch of really contrived examples where users may update things
> wrong doesn’t seem like a good motivation to make a warning that can only
> possibly have a really high false positive rate.

Just adding a "printf" statement for debugging, or increasing/decreasing a
global counter invalidates the pure attributes.
Thus by trying to understand/analyze a bug, another is added.
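
A small sketch of that scenario, using the GCC/Clang attribute syntax; the names square, square_debug and g_calls are hypothetical and only serve the illustration:

```cpp
#include <cstdio>

int g_calls = 0;  // hypothetical global counter used for debugging

// Pure by design: the result depends only on the argument.
__attribute__((pure)) int square(int x) { return x * x; }

// Debugging variant: it writes to a global and performs I/O, so it is no
// longer pure.  Keeping __attribute__((pure)) on it would be a bug, because
// the compiler may then merge or drop calls and lose the side effects.
int square_debug(int x)
{
  ++g_calls;
  std::printf("square(%d)\n", x);
  return x * x;
}
```

A diagnostic of the kind discussed here would flag the case where the debugging code is added but the attribute is left in place.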


> It is a tool for experts.

And I see no harm in making it more developer-friendly.
Why would that be a bad idea, as you claimed previously?

Because it is difficult to implement?
I do not know if it is, but that would not make it a bad idea.

Because of false positives?
Developers can handle them, case-by-case by documenting and disabling (or
ignoring) the diagnostic, or globally by not turning the diagnostic on.
Just like any other diagnostic.

Because it adds nothing from a compiler perspective?
I'm still not convinced that it has no added value, especially when interacting
with "extern" code/libraries.

But it definitely has some value for developers.
It's part of the API of a function, just like declaring the member function of
a class const (or the parameter of a function).
Adding const might even prevent some optimizations, and leads to code
duplication when one needs overloads (like for operator[] in container-like
classes), but from a developer perspective it's great. It helps to catch errors.
Of course one could never use it; for the compiler it would be the same.
And it would not invalidate its original use case, thus it would still be
possible to use those attributes like today if someone wants to; they would not
even need to change a thing.

[Bug target/94789] Failure to take advantage of shift operand semantics to turn subtraction into negate

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94789

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug middle-end/92080] Missed CSE of _mm512_set1_epi8(c) with _mm256_set1_epi8(c)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2019-10-14 00:00:00 |2021-9-4
   Severity|normal  |enhancement

--- Comment #5 from Andrew Pinski  ---
This gives good code:
#include <immintrin.h>

__m512i sinkz;
__m256i sinky;
void foo(char c) {
    __m512i a = _mm512_set1_epi8(c);
    sinkz = a;
    sinky = *((__m256i*)&a);
}

[Bug target/93346] gcc does not generate BZHI

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93346

Andrew Pinski  changed:

   What|Removed |Added

 CC||peter at cordes dot ca

--- Comment #8 from Andrew Pinski  ---
*** Bug 82298 has been marked as a duplicate of this bug. ***

[Bug target/82298] x86 BMI: no peephole for BZHI

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82298

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.0
 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
Fixed in GCC 10.  Dup of bug 93346.

*** This bug has been marked as a duplicate of bug 93346 ***

[Bug target/93346] gcc does not generate BZHI

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93346

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.0

[Bug tree-optimization/99082] manual bit-field creation followed by manual extraction does not always produce good code

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99082

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/97286] GCC sometimes uses an extra xmm register for the destination of _mm_blend_ps

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97286

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Keywords||ra

[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473

Andrew Pinski  changed:

   What|Removed |Added

 Blocks||93885

--- Comment #7 from Andrew Pinski  ---
The UNSPEC_MASKOP ones are still there.

PR 93885 is the same issue.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885
[Bug 93885] Spurious instruction kshiftlw issued

[Bug target/95974] AArch64 arm_neon.h stores interfere with gimple optimisations

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95974

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Severity|normal  |enhancement
   Last reconfirmed||2021-09-04

--- Comment #1 from Andrew Pinski  ---
Confirmed, maybe adding some access attributes will help this.

[Bug tree-optimization/89811] uint32_t load is not recognized if shifts are done in a fixed-size loop

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811

Andrew Pinski  changed:

   What|Removed |Added

 CC||gabravier at gmail dot com

--- Comment #3 from Andrew Pinski  ---
*** Bug 94834 has been marked as a duplicate of this bug. ***
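
For readers who have not opened the bug, the pattern named in its title can be sketched roughly as follows. This is a hypothetical illustration, not the reporter's testcase: the names `load_le32_loop` and `load_le32_unrolled` are made up here, and the sketch only shows that the two spellings compute the same value. Whether a given GCC version turns either of them into a single 32-bit load is exactly what the bug is about and is not asserted here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds a little-endian 32-bit value one byte at a
// time in a fixed-size loop (the shape described in the bug title).
std::uint32_t load_le32_loop(const unsigned char *p) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return value;
}

// The same computation with the four shifts written out by hand.
std::uint32_t load_le32_unrolled(const unsigned char *p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Usage sketch: both spellings agree, independent of host endianness.
void check_load_le32() {
    const unsigned char bytes[4] = {0x78, 0x56, 0x34, 0x12};
    assert(load_le32_loop(bytes) == 0x12345678u);
    assert(load_le32_unrolled(bytes) == 0x12345678u);
}
```

Comparing the assembly of the two functions at -O2 is one way to see whether the loop form is recognized.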

[Bug tree-optimization/94834] Failure to optimize loop bswap pattern

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94834

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #3 from Andrew Pinski  ---
This is a dup of bug 89811.

*** This bug has been marked as a duplicate of bug 89811 ***

[Bug target/93885] Spurious instruction kshiftlw issued

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
 Ever confirmed|0   |1
   Last reconfirmed||2021-09-04
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed, this is due to UNSPEC_MASKOP on the shift which most likely can be
removed these days.

[Bug middle-end/91899] Merge constant literals

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91899

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
You need to use -fmerge-all-constants and the linker will merge them.

[Bug target/85539] x86_64: loads are not always narrowed

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85539

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=92180
  Known to fail||10.3.0
 Status|NEW |RESOLVED
  Known to work||11.1.0

--- Comment #3 from Andrew Pinski  ---
Trying 6 -> 7:
6: r86:DI=[r87:DI]
  REG_DEAD r87:DI
7: r85:SI=r86:DI#0
  REG_DEAD r86:DI
Successfully matched this instruction:
(set (reg:SI 85 [ *p_3(D) ])
(mem:SI (reg:DI 87) [1 *p_3(D)+0 S4 A64]))
allowing combination of insns 6 and 7
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 6.
modifying insn i3 7: r85:SI=[r87:DI]
  REG_DEAD r87:DI
deferring rescan insn with uid = 7.
starting the processing of deferred insns
rescanning insn with uid = 7.
ending the processing of deferred insns

This is because cse no longer propagates the subreg into the last move:
(insn 7 6 8 2 (set (reg:SI 85)
(subreg:SI (reg:DI 86) 0)) "/app/example.cpp":7:13 67 {*movsi_internal}
 (nil))
(insn 8 7 12 2 (set (reg:SI 83 [  ])
(reg:SI 85)) "/app/example.cpp":7:13 67 {*movsi_internal}
 (nil))
(insn 12 8 13 2 (set (reg/i:SI 0 ax)
(reg:SI 83 [  ])) "/app/example.cpp":8:1 67 {*movsi_internal}
 (nil))

This was due to the patch that fixed PR 92180, and it was an expected
outcome too.
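
The combine dump above merges a 64-bit load followed by a truncation to 32 bits into a single 32-bit load. A source-level shape that would plausibly produce such RTL is sketched below. It is an assumption for illustration only (the bug's actual testcase is not quoted in this mail), and `low_half` is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical shape: load 64 bits, keep only the low 32.
// A narrowed load reads just 4 bytes from *p instead of 8.
std::uint32_t low_half(const std::uint64_t *p) {
    return static_cast<std::uint32_t>(*p);
}

// Usage sketch: the result is the low 32 bits of the stored value.
void check_low_half() {
    const std::uint64_t value = 0x1122334455667788ull;
    assert(low_half(&value) == 0x55667788u);
}
```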

[Bug middle-end/90424] memcpy into vector builtin not optimized

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90424

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2019-05-13 00:00:00 |2021-9-4
   Severity|normal  |enhancement
  Component|target  |middle-end

--- Comment #8 from Andrew Pinski  ---
Happens on aarch64 also.

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread dberlin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #29 from Daniel Berlin  ---
Let me try to explain a different way:
The only functions GCC can warn about are those that don’t need the
attributes in the first place. The way any warning would work is to detect
whether it is pure/const, and then see how the user marked it. So anything
it can properly detect as right or wrong didn’t need an attribute to begin
with - the compiler could already tell if it was pure/const.

Rather than tell the user they got it wrong, you might as well tell the
user to remove the attribute because it isn’t necessary and won’t be
necessary.

This is precisely why attributes are meant for when you are sure you know
more than the compiler can tell, and *no other time*. It is a tool for
experts.
Giving a bunch of really contrived examples where users may update things
incorrectly doesn’t seem like a good motivation to make a warning that can
only possibly have a really high false positive rate.
The same logic applies to a lot of expert-use-only attributes.  It is
assumed you know what you are doing, because the compiler can’t accurately
tell you that you are wrong.




On Sat, Sep 4, 2021 at 4:40 PM federico.kircheis at gmail dot com <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487
>
> --- Comment #28 from Federico Kircheis ---
> >Edit: sorry, my last comment about what GCC thinks is wrong.
>
> Unless it is going to inline the function call, in that case the
> attributes are
> as-if ignored (at least the case I've tested with GCC 11.2).

[Bug tree-optimization/89811] uint32_t load is not recognized if shifts are done in a fixed-size loop

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed|2019-03-25 00:00:00 |2021-9-4

[Bug tree-optimization/93040] gcc doesn't optimize unaligned accesses to a 16-bit value on the x86 as well as it does a 32-bit value (or clang)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93040

Andrew Pinski  changed:

   What|Removed |Added

 CC||nok.raven at gmail dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 89809 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/89809] movzwl is not utilized when uint16_t is loaded with bit-shifts (while memcpy does)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89809

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||9.4.0
 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED
  Known to work||10.1.0
   Target Milestone|--- |10.0

--- Comment #4 from Andrew Pinski  ---
Fixed for GCC 10.
Dup of bug 93040.

*** This bug has been marked as a duplicate of bug 93040 ***
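
As a rough illustration of the two spellings contrasted in the bug title, here is a hypothetical sketch (the function names are made up and this is not the reporter's code). The shift version is defined to read little-endian data. The memcpy version reads in host byte order, so the two only agree on little-endian targets such as x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: 16-bit little-endian load spelled with shifts.
std::uint16_t load16_shift(const unsigned char *p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Hypothetical helper: 16-bit load in host byte order via memcpy.
std::uint16_t load16_memcpy(const unsigned char *p) {
    std::uint16_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Usage sketch: the shift form is endianness-independent; the memcpy form
// yields one of two values depending on the host.
void check_load16() {
    const unsigned char bytes[2] = {0x34, 0x12};
    assert(load16_shift(bytes) == 0x1234);
    const std::uint16_t host = load16_memcpy(bytes);
    assert(host == 0x1234 || host == 0x3412);
}
```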

[Bug tree-optimization/93040] gcc doesn't optimize unaligned accesses to a 16-bit value on the x86 as well as it does a 32-bit value (or clang)

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93040

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.0

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread federico.kircheis at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #28 from Federico Kircheis  ---
>Edit: sorry, my last comment about what GCC thinks is wrong.

Unless it is going to inline the function call, in which case the attributes
are as good as ignored (at least in the case I've tested with GCC 11.2).

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread federico.kircheis at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #27 from Federico Kircheis  ---
Edit: sorry, my last comment about what GCC thinks is wrong.

GCC seems to follow the gnu::pure/gnu::const directive to the letter: it does
not ignore it when it sees the implementation of the function, so my comment
about the information already being available can be ignored.

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread federico.kircheis at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #26 from Federico Kircheis  ---
As multiple people commented on this ticket, I do not know to whom the last
message was addressed, but I would like to give my opinion on it again, as I
would really like to use those attributes in non-toy projects.

> This seems like a bad idea

I think there are valid use-cases for those warnings.


> and is impossible in general

Let me quote myself:

> ... a warning that even only works for trivial case is much better than 
> nothing, because at least I know I can safely use the attribute for some 
> functions as a contract to the caller, and have it checked.

There are now two possible outcomes if a compiler emits a warning.

1)
I look at the definition, and *gasp*, the compiler is actually right.
The function was pure before, but the last changes made it impure.
Either I did not realize it, or I forgot to change the function declaration.
Thank you GCC for making me aware of the issue, I'll fix it.

2) 
I look at the definition and think that GCC is wrong.
I know better, and the function is pure.
I can either try to simplify the function in such a way that GCC does not
complain anymore (which might be a good idea), or I can use a pragma to ignore
this one warning (and comment why it's ignored), or remove the attribute
altogether, as GCC might call the function multiple times if it thinks it's
impure (see example at the end).
In the first case, I can still benefit from warnings if the function
changes again.
In the second case I can't, but at least I can still grep the entire codebase
and check periodically which warnings have been disabled locally, just like I
do for other warnings.
In the third case, yes, I would probably report a bug with a minimal example.
This would (hopefully) improve GCC's analysis capabilities.


> The whole point of the attributes is to tell the compiler things are 
> pure/const in cases it can't already prove.

That does not mean that it is not useful to let it do the check, *especially if
it can prove that the attribute is used incorrectly*, but even if it can't
prove anything.
Also see the example at the end for why this is not completely true.

> It can already prove a lot, and doesn't need help in most of the simple 
> examples being given (in other bugs). 


But programmers (at least in most of the use cases I've seen) need that type
of support.
I would like to know if a function has side effects.
It's great if the compiler can see it automatically, but when reading and
writing code, especially code not written by me or maintained by multiple
authors, we might want to restrict the functionality of some functions.

For side-effect-free functions, the attributes const and pure are great, but
using them can do more harm than good, because a wrong use introduces UB. Thus

1) they do not really document whether a function is pure, as there is no
tooling checking that the statement is true
2) they introduce bugs that no one can explain (see at the end).

Thus a comment "this function is pure" is, by contrast, much better, as it does
not introduce UB, but we all know that those kinds of comments do not age well.
In the end they get ignored because they are not trustworthy, and one always
needs to look at the implementation.

> You are basically going to warn in the cases the compiler can't prove it [...]

And for many use-cases it is fine.




Also the second example I gave:


// bar.hpp
[[gnu::const]] int get_value();

// bar.cpp
int get_value(){static int i = 0; return ++i;}


// foo.cpp
int foo(){
int i = get_value();
int j = get_value();
return i+j;
}


The compiler will still optimize away the second call to get_value (unless it
is able to see the definition of get_value and see that there are side effects).

Thus, if the function is marked pure, the compiler

* will not call it a second time if it does not see the implementation of
`get_value`
* will call it a second time if it sees the implementation of `get_value` and
notices it is not pure.

This is one of those bugs that no one can explain, as simply moving code
(for example making a function inline, or moving it to another file), or
changing the optimization level, changes the behavior of the program.
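
For contrast with the mis-annotated `get_value` above, here is a minimal sketch of a use of the attribute that is safe regardless of what the optimizer does. All names (`square`, `next_ticket`, and so on) are hypothetical and not part of the original comment. The point it tries to show: for a function that really is const, merging two calls into one is unobservable, while a function with hidden state is simply left unannotated so both calls are guaranteed to happen.

```cpp
#include <cassert>

// Genuinely const: the result depends only on the argument and no global
// state is read or written, so the attribute states a true fact.
[[gnu::const]] int square(int x) { return x * x; }

// Not pure: every call modifies hidden state. It is deliberately left
// without an attribute, so the compiler must perform each call.
int next_ticket() {
    static int counter = 0;
    return ++counter;
}

// The compiler may call square() once or twice; the result is the same.
int sum_of_squares(int x) { return square(x) + square(x); }

// Both calls must happen, in order.
int two_tickets() {
    int first = next_ticket();
    int second = next_ticket();
    return first + second;
}

// Usage sketch; must run before any other call to next_ticket().
void check_attributes() {
    assert(sum_of_squares(3) == 18);
    assert(two_tickets() == 3);  // 1 + 2 on the first use
}
```

Moving `[[gnu::const]]` onto `next_ticket` would recreate the situation described above, where the observed result can depend on inlining and optimization level.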


Thus, given main.cpp


[[gnu::const]] int foo();

// foo.cpp
int main(){
int i = foo();
int j = foo();
return i+j;
}



how many times is GCC going to call foo?

If GCC thinks that the function is pure, then only once.
If it thinks it is not pure, twice.

I have no idea what GCC thinks, because there are no diagnostics for it!
And look, it does not even matter whether foo is pure or not; what matters is
whether GCC thinks it is pure or not.

I can similarly tell GCC to inline functions, but if GCC doesn't, at least it
will tell me that it didn't (warning: 'always_inline' function might not be
inlinable [-Wattributes]).



We can of course say "those attributes are only for those people that really
know better", but as the compiler is 

[Bug target/56309] conditional moves instead of compare and branch result in almost 2x slower code

2021-09-04 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309

--- Comment #37 from Peter Cordes  ---
Correction, PR82666 is that the cmov on the critical path happens even at -O2
(with GCC7 and later).  Not just with -O3 -fno-tree-vectorize.

Anyway, that's related, but probably separate from choosing to do if-conversion
or not after inlining.

[Bug target/56309] conditional moves instead of compare and branch result in almost 2x slower code

2021-09-04 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309

Peter Cordes  changed:

   What|Removed |Added

 CC||peter at cordes dot ca

--- Comment #36 from Peter Cordes  ---
Related:  a similar case of cmov being a worse choice, for a threshold
condition with an array input that happens to already be sorted:

https://stackoverflow.com/questions/28875325/gcc-optimization-flag-o3-makes-code-slower-than-o2

GCC with -fprofile-generate / -fprofile-use does correctly decide to use
branches.

GCC7 and later (including current trunk) with -O3 -fno-tree-vectorize
de-optimizes by putting the CMOV on the critical path, instead of as part of
creating a zero/non-zero input for the ADD. PR82666.  If you do allow full -O3,
then vectorization is effective, though.
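
For orientation, the kind of threshold loop being discussed looks roughly like the sketch below. This is a hypothetical reduction written for this note, not the code from the linked question; `sum_at_least` is a made-up name. The `if` inside the loop is the spot where the compiler chooses between a branch and a conditional move.

```cpp
#include <cassert>
#include <cstddef>

// Sums the elements that are at or above a threshold. With sorted input the
// branch is highly predictable, which is why a cmov on the critical path of
// the running sum can end up slower than a plain branch.
long long sum_at_least(const int *data, std::size_t n, int threshold) {
    long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold)  // branch or cmov, chosen by the compiler
            sum += data[i];
    }
    return sum;
}

// Usage sketch with already-sorted input.
void check_sum_at_least() {
    const int data[5] = {1, 50, 128, 200, 255};
    assert(sum_at_least(data, 5, 128) == 583);  // 128 + 200 + 255
}
```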

[Bug c/29970] mixing ({...}) with VLA leads to massive breakage

2021-09-04 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970

--- Comment #13 from Martin Uecker  ---
The remaining problem with constant index 0 for the patch mentioned above,
appears to be related to fold_binary_loc which transforms (a + (x, 0)) to (x,
a) which breaks if 'x' depends on something in 'a'.

[Bug c++/102201] Accepts invalid C++98 with nested class and sizeof of outer's non-static field

2021-09-04 Thread harald at gigawatt dot nl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102201

Harald van Dijk  changed:

   What|Removed |Added

 CC||harald at gigawatt dot nl

--- Comment #1 from Harald van Dijk  ---
This doesn't need inner classes, a simpler reproducer is:

struct S { int i; };
int j = sizeof S::i;

gcc accepts this in all modes ever since the C++11 rule for non-static members
in unevaluated contexts was implemented (4.4). clang says in C++98 mode:

test.cc:2:19: error: invalid use of non-static data member 'i'
int j = sizeof S::i;
   ~~~^
1 error generated.
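
As a side note, the two spellings can be put next to each other as below. This is a sketch under the assumption that the usual null-pointer idiom is what C++98 code would use instead; the function names are hypothetical. Both operands are unevaluated, so no pointer is ever dereferenced.

```cpp
#include <cassert>
#include <cstddef>

struct S { int i; };

// C++11 and later: a non-static data member may be named directly in an
// unevaluated operand such as sizeof.
std::size_t size_cxx11_style() { return sizeof S::i; }

// Spelling that goes through an object expression instead of naming S::i
// on its own. The operand of sizeof is not evaluated.
std::size_t size_cxx98_style() { return sizeof(static_cast<S *>(0)->i); }

// Usage sketch: both yield the size of the member's type.
void check_member_sizeof() {
    assert(size_cxx11_style() == sizeof(int));
    assert(size_cxx98_style() == sizeof(int));
}
```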

[Bug c++/101355] incorrect `this' in destructor calls when compiling coroutines with ubsan

2021-09-04 Thread daklishch at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101355

Dan Klishch  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Dan Klishch  ---
GCC stopped instrumenting destructors in this particular case, so I guess the
bug is fixed.

https://godbolt.org/z/KGa6aGf5x

[Bug tree-optimization/102196] -Wmaybe-uninitialized: Maybe generate helpful hints?

2021-09-04 Thread jbglaw--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102196

--- Comment #6 from Jan-Benedict Glaw  ---
Calling the compiler again with just adding -fanalyzer doesn't add more
information to the output. Do I need to turn on extra warnings to enable static
analysis for access to possibly uninitialized variables?

[Bug tree-optimization/102200] [12 Regression] ice in put_ref, at pointer-query.cc:1351

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-09-04

--- Comment #1 from Andrew Pinski  ---
(In reply to David Binderman from comment #0)
> The bug first seems to occur sometime between git hash 7a6f40d0452ec76e
> and 9695e1c23be5b5c5. Only 21 commits.

Most likely r12-3300-ece28da924dd

Confirmed.

[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI

2021-09-04 Thread eyalroz1 at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

--- Comment #3 from Eyal Rozenberg  ---
Andrew: What you're saying would be plausible if g++ found the structure
to be incomplete. It does not. The completeness check passes, and that is why
adding the explicit default ctor makes the assertion pass - despite your
rationale applying to that case just as well.

[Bug c++/102201] New: Accepts invalid C++98 with nested class and sizeof of outer's non-static field

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102201

Bug ID: 102201
   Summary: Accepts invalid C++98 with nested class and sizeof of
outer's non-static field
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: accepts-invalid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
struct outer {
struct inner {
 inner() :x(sizeof(y)) { }
unsigned int x;
};
int y;
};

----- CUT -----
The above code is valid C++11 but invalid C++98 because the field y is
non-static.

[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

--- Comment #2 from Andrew Pinski  ---
This is because the following is still valid C++11:
struct outer {
struct inner {
// inner() { }
unsigned int x = y;
};
  static constexpr int y =10;
};

That is, inner is not complete until outer is complete.

[Bug c++/102199] is_default_constructible incorrect for an inner type with NSDMI

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

Andrew Pinski  changed:

   What|Removed |Added

  Component|libstdc++   |c++

--- Comment #1 from Andrew Pinski  ---
This comes down to when the struct is complete.

[Bug tree-optimization/102200] [12 Regression] ice in put_ref, at pointer-query.cc:1351

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Target Milestone|--- |12.0
  Component|c   |tree-optimization
Summary|ice in put_ref, at  |[12 Regression] ice in
   |pointer-query.cc:1351   |put_ref, at
   ||pointer-query.cc:1351

[Bug tree-optimization/18487] Warnings for pure and const functions that are not actually pure or const

2021-09-04 Thread dberlin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18487

--- Comment #25 from Daniel Berlin  ---
This seems like a bad idea, and is impossible in general.

The whole point of the attributes is to tell the compiler things are pure/const
in cases it can't already prove.

It can already prove a lot, and doesn't need help in most of the simple
examples being given (in other bugs). 

You are basically going to warn in the cases the compiler can't prove it (IE
sees something it thinks makes the function not pure/const), and those are
*exactly* the cases the attribute exists for - where the compiler doesn't know,
but you do.

[Bug c/102200] New: ice in put_ref, at pointer-query.cc:1351

2021-09-04 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102200

Bug ID: 102200
   Summary: ice in put_ref, at pointer-query.cc:1351
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

For this C source code:

long try_extension_len;
void try_extension_str() {
  char *curr = try_extension_str;
  char end = sizeof try_extension_str;
  while (try_extension_len) {
if (curr < end)
  *curr = ';';
if (curr > )
  curr = 
  }
}

compiled with recent gcc trunk and compiler flag -O2, does this:

during GIMPLE pass: strlen
bug754.c: In function ‘try_extension_str’:
bug754.c:2:6: internal compiler error: in put_ref, at pointer-query.cc:1351
2 | void try_extension_str() {
  |  ^
0xc7696c pointer_query::put_ref(tree_node*, access_ref const&, int)
../../trunk.git/gcc/pointer-query.cc:1351

The bug first seems to occur sometime between git hash 7a6f40d0452ec76e
and 9695e1c23be5b5c5. Only 21 commits.

[Bug middle-end/32911] Function __attribute__ ((idempotent))

2021-09-04 Thread trass3r at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32911

Trass3r  changed:

   What|Removed |Added

 CC||trass3r at gmail dot com

--- Comment #7 from Trass3r  ---
OpenGL's bind functions are another example.
They don't return anything so can't be marked pure/const but any subsequent
calls with the same arguments are redundant.
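
Without such an attribute, the redundancy has to be removed by hand. The sketch below shows one common way to do that. It is purely illustrative: `Binder`, `bind` and `bind_cached` are hypothetical stand-ins and not OpenGL API, and the counter exists only to make the effect visible.

```cpp
#include <cassert>

// Hypothetical stand-in for a piece of state with an idempotent "bind" call:
// calling bind(id) twice in a row has the same effect as calling it once.
struct Binder {
    int bound = -1;        // currently bound id; -1 means nothing bound yet
    int driver_calls = 0;  // how often the (expensive) call actually ran

    // Stand-in for the real call. It returns nothing, so pure/const do not
    // apply, yet a repeated call with the same argument is redundant.
    void bind(int id) {
        bound = id;
        ++driver_calls;
    }

    // Manual redundancy elimination: skip the call if nothing would change.
    void bind_cached(int id) {
        if (id != bound)
            bind(id);
    }
};

// Usage sketch: three requests, only two real calls.
void check_binder() {
    Binder b;
    b.bind_cached(7);
    b.bind_cached(7);  // redundant, skipped
    b.bind_cached(9);
    assert(b.driver_calls == 2);
    assert(b.bound == 9);
}
```

An idempotent attribute would let the compiler do the equivalent of `bind_cached` at compile time when it can see two back-to-back calls with the same arguments.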

[Bug tree-optimization/93540] Attributes pure and const not working with aggregate return types, even trivial ones

2021-09-04 Thread trass3r at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93540

Trass3r  changed:

   What|Removed |Added

 CC||trass3r at gmail dot com

--- Comment #2 from Trass3r  ---
I also ran into this when trying to optimize trivial but expensive getter
functions, e.g. returning shared_ptr.

struct Foo
{
int operator+(const Foo& f);
int a;
};

[[gnu::const]]
Foo foo(); // int instead of Foo works

auto testfunction()
{
return foo() + foo(); // results in 2 calls
}

[Bug libstdc++/102199] New: is_default_constructible incorrect for an inner type with NSDMI

2021-09-04 Thread eyalroz1 at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102199

Bug ID: 102199
   Summary: is_default_constructible incorrect for an inner type
with NSDMI
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eyalroz1 at gmx dot com
  Target Milestone: ---

Stackoverflow discussion: https://stackoverflow.com/q/69050558/1593077
Related LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=38374
GodBolt: https://godbolt.org/z/snPf7Ks4W

Consider the following program:

#include <type_traits>

struct outer {
struct inner {
// inner() { }
unsigned int x = 0;
};
//static_assert(std::is_default_constructible<inner>::value,
//  "not default ctorable - inside");
};

static_assert(std::is_default_constructible<outer::inner>::value,
"not default ctorable - outside");

It compiles. But if we uncomment the first static_assert - it evaluates to
false. Mind you: Not because struct inner is incomplete; it is simply deemed to
not be default-constructible. But - it _is_ default constructible. And if we
add a method to struct outer which default-constructs an inner, it will work.

Also note that if we uncomment the explicit default ctor in the definition of
struct inner, both asserts pass.


clang++ seems to exhibit this too (also with -stdlib=libc++). I'm not sure
whether this is an actual bug in the library, or whether the standard mandates
this in some freakish way, but - it's just wrong.
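
The remark about adding a method to struct outer can be sketched as follows. This is a hypothetical variant of the program above, with `make` as a made-up name: a member function body is compiled as if the enclosing class were already complete, so default-constructing an inner there is accepted.

```cpp
#include <cassert>
#include <type_traits>

struct outer {
    struct inner {
        unsigned int x = 0;  // the default member initializer (NSDMI)
    };

    // Inside a member function body the enclosing class is treated as
    // complete, so inner's initializer is available here.
    static inner make() { return inner(); }
};

// Outside of outer, the trait reports the expected answer.
static_assert(std::is_default_constructible<outer::inner>::value,
              "not default ctorable - outside");

// Usage sketch: the default member initializer is applied.
void check_inner() {
    assert(outer::make().x == 0u);
}
```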

[Bug target/101933] Unloaded dll with global std::mutex causes exe to crash on exit

2021-09-04 Thread mailnew4ster at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101933

Paul Jackson  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Paul Jackson  ---
I debugged it a bit more, and I found out that:
1. It's happening when exceptions are involved.
2. It's actually a bug of TDM-GCC.

For details, please see my second comment in the GitHub issue:
https://github.com/jmeubank/tdm-gcc/issues/38#issuecomment-912876481

[Bug target/88476] Optimize expressions which uses vector, mask and general purpose registers

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88476

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
   Last reconfirmed||2021-09-04
 Ever confirmed|0   |1
  Component|middle-end  |target

--- Comment #1 from Andrew Pinski  ---
This is a target issue:
Trying 22 -> 33:
   22: {r108:QI=r100:QI|r111:QI;unspec[0] 159;}
  REG_DEAD r100:QI
   33: {r116:QI=r108:QI:QI;unspec[0] 159;}
  REG_DEAD r112:QI
  REG_DEAD r108:QI
Can't combine i2 into i3

I suspect the reasoning behind the unspec has gone away since r11-2796 and
r11-2795.

[Bug target/89984] Extra register move

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89984

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> The problem is here:
> (define_insn_and_split "@xorsign3_1"
>   [(set (match_operand:MODEF 0 "register_operand" "=Yv")
> (unspec:MODEF
>   [(match_operand:MODEF 1 "register_operand" "Yv")
>(match_operand:MODEF 2 "register_operand" "0")
>(match_operand: 3 "nonimmediate_operand" "Yvm")]
>   UNSPEC_XORSIGN))]
>   "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"
>   "#"
>   "&& reload_completed"
>   [(const_int 0)]
>   "ix86_split_xorsign (operands); DONE;")
> 
> 
> for AVX operand 2 does not need to be the same as operand 0.
> Shouldn't be a hard change for someone starting out to improve this.

Right. The way copysign is defined is like this:
(define_insn "@copysign3_var"
  [(set (match_operand:SSEMODEF 0 "register_operand" "=Yv,Yv,Yv,Yv,Yv")
(unspec:SSEMODEF
  [(match_operand:SSEMODEF 2 "register_operand" "Yv,0,0,Yv,Yv")
   (match_operand:SSEMODEF 3 "register_operand" "1,1,Yv,1,Yv")
   (match_operand: 4
 "nonimmediate_operand" "X,Yvm,Yvm,0,0")
   (match_operand: 5
 "nonimmediate_operand" "0,Yvm,1,Yvm,1")]
  UNSPEC_COPYSIGN))
   (clobber (match_scratch: 1 "=Yv,Yv,Yv,Yv,Yv"))]
  "(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
   || (TARGET_SSE && (mode == TFmode))"
  "#")

I suspect xorsign should be improved similarly.

[Bug target/52034] __builtin_copysign optimization suboptimal

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52034

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
This was a scratch register issue which was fixed in r11-4577.

[Bug target/52034] __builtin_copysign optimization suboptimal

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52034

--- Comment #5 from Andrew Pinski  ---
Th

[Bug target/52034] __builtin_copysign optimization suboptimal

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52034

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=89984
   Severity|normal  |enhancement

--- Comment #4 from Andrew Pinski  ---
related to PR 89984.

[Bug target/85819] conversion from __v[48]su to __v[48]sf should use FMA

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85819

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed|2018-05-18 00:00:00 |2021-9-3

--- Comment #2 from Andrew Pinski  ---
ix86_expand_convert_uns_sisf_sse and ix86_expand_vector_convert_uns_vsivsf
should check whether FMA is available and expand directly to it instead of
doing MULT and PLUS separately.
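
A scalar sketch of the idea, under the assumption that the multiply and add in question come from the usual split of the unsigned value into two 16-bit halves (the vector expanders themselves are not reproduced here, and the function names below are hypothetical): each half converts to float exactly, the high half is scaled by 2^16, and the halves are added. The multiply is exact, so a separate multiply and add and a single fused multiply-add round once and give the same result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Unsigned 32-bit to float using only conversions of values below 2^16,
// followed by a separate multiply and add.
float u32_to_float_mul_add(std::uint32_t u) {
    const float hi = static_cast<float>(u >> 16);
    const float lo = static_cast<float>(u & 0xFFFFu);
    return hi * 65536.0f + lo;
}

// The same computation expressed as one fused multiply-add.
float u32_to_float_fma(std::uint32_t u) {
    const float hi = static_cast<float>(u >> 16);
    const float lo = static_cast<float>(u & 0xFFFFu);
    return std::fma(hi, 65536.0f, lo);
}

// Usage sketch: both variants match the direct conversion under the default
// round-to-nearest mode.
void check_u32_to_float() {
    const std::uint32_t samples[] = {0u, 1u, 0xFFFFu, 0x10000u,
                                     0x12345678u, 0xFFFFFFFFu};
    for (std::uint32_t u : samples) {
        assert(u32_to_float_mul_add(u) == static_cast<float>(u));
        assert(u32_to_float_fma(u) == static_cast<float>(u));
    }
}
```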
