[Bug tree-optimization/40170] redundant zero extensions

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40170

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
 Resolution|--- |FIXED
  Component|target  |tree-optimization
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed for GCC 11, in EVRP.  I can't figure out which patch caused it but what
happens is the following:
We figure out the range of _3 to be [0, 255]
 _3 = (int) bit_16;

While processing:
  _4 = _2 >> _3;

We figure out that the range of _4 is still [0, 255]: it is a right shift, so
none of the upper bits can become set.

And then we match and simplify the following:
  _24 = _4 & 255;

to just:
 _24 = _4;
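
The range argument above can be checked exhaustively in plain C. This is only a minimal sketch, not the testcase from the bug: the function names are made up for illustration, and it assumes the shifted value starts out as a byte widened to int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: widen a byte, shift it right, then mask with 255. */
static int shift_then_mask(unsigned char value, unsigned shift)
{
    assert(shift < sizeof(int) * CHAR_BIT); /* larger shifts are undefined */
    int widened = value;             /* range [0, 255] */
    int shifted = widened >> shift;  /* a right shift cannot leave [0, 255] */
    return shifted & 255;            /* so this mask clears nothing */
}

/* The same computation with the redundant mask dropped. */
static int shift_only(unsigned char value, unsigned shift)
{
    assert(shift < sizeof(int) * CHAR_BIT);
    return (int)value >> shift;
}

/* Returns 1 if both forms agree for every byte value and shift amount. */
static int mask_is_redundant(void)
{
    for (unsigned v = 0; v < 256; v++)
        for (unsigned s = 0; s < sizeof(int) * CHAR_BIT; s++)
            if (shift_then_mask((unsigned char)v, s)
                != shift_only((unsigned char)v, s))
                return 0;
    return 1;
}
```

If the two forms agree everywhere, dropping the `& 255` is safe, which is what the match-and-simplify step relies on.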

[Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle

2021-07-25 Thread yumeyao at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

--- Comment #3 from YumeYao  ---
(In reply to Andrew Pinski from comment #2)
> The cast issue is because in GCC 9, it was not producing PERM at the gimple
> level which was fixed correctly in GCC 11.
> 
> clang_shuffle_with_zero can easily be added.

Thanks for your insights.

Do you have any comment on the optimization flag part (gcc <= 8 only needs -O1
to optimize the 'cast' case, but gcc 11 requires -O3)?
Is it due to some change in the default -O1 optimization options between gcc 8
and 11, or is it something deeper?

[Bug target/48986] Missed optimization in atomic decrement on x86/x64

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

Andrew Pinski  changed:

   What|Removed |Added

 CC||bcrl at kvack dot org

--- Comment #7 from Andrew Pinski  ---
*** Bug 25230 has been marked as a duplicate of this bug. ***

[Bug target/25230] __sync_add_and_fetch does not use condition flags from subl

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25230

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE
   Target Milestone|--- |4.7.0

--- Comment #3 from Andrew Pinski  ---
Dup of bug 48986 which was fixed for GCC 4.7.0.

*** This bug has been marked as a duplicate of bug 48986 ***
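
For context, the pattern these two bugs are about looks roughly like the sketch below. It is an assumed illustration rather than the testcase from either report; `release_ref` and the demo function are hypothetical names. Only `__sync_sub_and_fetch` is the real GCC builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-count release: returns 1 when the count hits zero. */
static int release_ref(int *refcount)
{
    assert(refcount != NULL);
    /* __sync_sub_and_fetch returns the new value.  Comparing it with zero
       is the case where, on x86, the flags set by the locked subtract can
       be tested directly instead of recomputing the result. */
    return __sync_sub_and_fetch(refcount, 1) == 0;
}

/* Drops two references from a count of 2 and reports whether only the
   second release saw zero. */
static int release_ref_demo(void)
{
    int count = 2;
    int first = release_ref(&count);   /* 2 -> 1, not zero yet */
    int second = release_ref(&count);  /* 1 -> 0 */
    return first == 0 && second == 1 && count == 0;
}
```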

[Bug tree-optimization/32226] Missed optimization caused by copy loop header (yes a weird case)

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32226

--- Comment #3 from Andrew Pinski  ---
To do this optimization (the reduced testcase works right now), you have to
simulate each statement until the end with "width_5 == 0" (the opposite range
of the initial condition) to see if you get the other phi operand.

  if (width_5(D) != 0)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 105119325]:
  _1 = (long unsigned int) dir_8(D);
  _3 = width_5(D) + 4294967295;
  _14 = (sizetype) _3;
  _6 = _14 + 1;
  _17 = _1 * _6;
  _18 = _17 * 2;
  errorptr_4 = errorptr_7(D) + _18;

   [local count: 118111601]:
  # errorptr_16 = PHI 


I don't know if this optimization is that important; even clang does not do it.
It should most likely only be done if the branch is strongly predicted to take
the longer path.

[Bug tree-optimization/30099] missed value numbering optimization (conditional-based assertions)

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30099

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.0
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Andrew Pinski  ---
Fixed in GCC 8 by r8-1633 .

[Bug tree-optimization/39761] data-flow analysis does not discover constant real/imaginary parts

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39761

--- Comment #14 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #13)
> Fixed in GCC 8, most likely by r8-5346 .  That is DOM is now able to do the
> jump threading even at -Os.

I should say DOM is doing the jump threading now which is why I think r8-5346
fixed this.

[Bug tree-optimization/39761] data-flow analysis does not discover constant real/imaginary parts

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39761

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.0
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Andrew Pinski  ---
Fixed in GCC 8, most likely by r8-5346 .  That is DOM is now able to do the
jump threading even at -Os.

[Bug tree-optimization/37810] Bad store sinking job

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37810

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2009-04-03 12:34:44 |2021-7-25

--- Comment #6 from Andrew Pinski  ---
For the reduced testcase in comment #2 I get now:
4.8.0+:
.L4:
addl    $1, %eax
movl    %eax, (%rbx)
cmpl    4(%rbx), %eax
je  .L8
.L3:
testl   %eax, %eax
jne .L4

4.7.4 and before:
.L3:
testl   %eax, %eax
je  .L8
addl    $1, %eax
cmpl    4(%rbx), %eax
movl    %eax, (%rbx)
jne .L3

Or on the trunk at the gimple level:
   [local count: 1014686025]:
  _1 = prephitmp_10 + 1;
  iter_6(D)->n = _1;
  _2 = iter_6(D)->m;
  if (_1 == _2)
goto ; [5.50%]
  else
goto ; [94.50%]

   [local count: 55807731]:
  g ();

   [local count: 114863530]:
  pretmp_11 = iter_6(D)->n;

   [local count: 1073741824]:
  # prephitmp_10 = PHI 
  if (prephitmp_10 != 0)
goto ; [94.50%]
  else
goto ; [5.50%]

Aka the store still happens inside the loop unconditionally.

[Bug rtl-optimization/35309] Late struct expansion leads to missing PRE

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35309

--- Comment #3 from Andrew Pinski  ---
The original testcase in comment #0 is now fixed but the following is not:
struct A {
  int f[16];
} ag, ag2,ag3;


struct A foo(int n)
{
   if (n)
   {
 ag2 = ag;
   }

   return ag;
}

[Bug target/23813] redundant register assignments not eliminated

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23813

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |5.0

--- Comment #6 from Andrew Pinski  ---
So this has been fixed in GCC 5.0 and above as it is able to detect bswap and
do the correct thing there.

That is it is able to convert:
  REV64_STEP(n,  8, 0x00FF00FF00FF00FFULL); /* bytes */
  REV64_STEP(n, 16, 0xULL); /* halfwords */
  REV64_STEP(n, 32, 0xULL); /* full words */

Into:
n = __builtin_bswap64 (n)
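
A self-contained sketch of that equivalence follows. The REV64_STEP macro is not shown in this comment, so the three steps are written out by hand here; the halfword and word masks are an assumption (the usual alternating patterns), not taken from the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Byte reversal written as three swap steps (assumed masks, see above). */
static uint64_t byte_reverse_by_steps(uint64_t n)
{
    /* swap adjacent bytes */
    n = ((n >> 8) & 0x00FF00FF00FF00FFULL) | ((n & 0x00FF00FF00FF00FFULL) << 8);
    /* swap adjacent halfwords */
    n = ((n >> 16) & 0x0000FFFF0000FFFFULL) | ((n & 0x0000FFFF0000FFFFULL) << 16);
    /* swap the two 32-bit words */
    n = (n >> 32) | (n << 32);
    return n;
}

/* The form the bswap detection is expected to produce. */
static uint64_t byte_reverse_builtin(uint64_t n)
{
    return __builtin_bswap64(n);
}

/* Returns 1 if both forms agree on a sample value. */
static int byte_reverse_forms_agree(void)
{
    const uint64_t sample = 0x0102030405060708ULL;
    uint64_t by_steps = byte_reverse_by_steps(sample);
    assert(by_steps == 0x0807060504030201ULL);
    return by_steps == byte_reverse_builtin(sample);
}
```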

Re: [PATCH][RFC] tree-optimization/100499 - multiple_of_p bad behavior wrt niter analysis

2021-07-25 Thread Bin.Cheng via Gcc-patches
On Thu, Jul 22, 2021 at 6:41 PM Richard Biener  wrote:
>
> This avoids using multiple_of_p in niter analysis when its behavior
Hmm, but this patch actually introduces one more call to
multiple_of_p, also it doesn't touch the below use:
if (niter->control.no_overflow && multiple_of_p (type, c, s))
  {
niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, c, s);
return true;
  }

> to assume the tested expression does not invoke integer overflow
> produces wrong niter analysis results.  For the cases multiple_of_p
> handles power-of-two values of bottom look safe which should also be
> the majority of cases we care about in addition to the constant case
> now handled explicitly.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> I'm unsure how important a "perfect" solution is (rewriting
> multiple_of_p), and wonder whether this solution is preferable
I am still in favor of this approach for now; it only needs to be conservative
rather than perfect, especially if a conservative version of multiple_of_p
does not cause many breakages.
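
To make the power-of-two restriction concrete, here is a small sketch (my own illustration, not code from the patch; the function name is hypothetical). It checks whether "a * factor is a multiple of factor" still holds after the product wraps in an 8-bit unsigned type, which is the kind of overflow multiple_of_p does not account for.

```c
#include <assert.h>

/* Returns 1 if (a * factor) truncated to 8 bits is still a multiple of
   factor for every 8-bit a, and 0 as soon as one counterexample exists. */
static int multiple_survives_wrap(unsigned factor)
{
    assert(factor != 0);
    for (unsigned a = 0; a < 256; a++) {
        unsigned char wrapped = (unsigned char)(a * factor);
        if (wrapped % factor != 0)
            return 0;
    }
    return 1;
}
```

With a factor of 52 (as in pr100499-2.c) the property already fails at a = 5, since 260 wraps to 4. With a power-of-two factor such as 4 it holds for every a, because 256 is itself a multiple of 4.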

> for now (and especially for branches).  I've not yet tried
> to sanitize multiple_of_p plus use range info to prove
> no-overflow where TYPE_OVERFLOW_UNDEFINED doesn't tell us
> immediately.
>
> 2021-07-22  Richard Biener  
>
> PR tree-optimization/100499
> * tree-ssa-loop-niter.c (number_of_iterations_ne): Restrict
> multiple_of_p queries to power-of-two bottoms, handle
> the all constant case inline.
>
> * gcc.dg/torture/pr100499-1.c: New testcase.
> * gcc.dg/torture/pr100499-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/torture/pr100499-1.c | 28 +++
>  gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +
>  gcc/tree-ssa-loop-niter.c |  8 ++-
>  3 files changed, 51 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c 
> b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
> new file mode 100644
> index 000..97ab6051554
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target int32plus } */
> +
> +typedef __UINT16_TYPE__ uint16_t;
> +static uint16_t g_2823 = 0xEC75L;
> +static uint16_t g_116 = 0xBC07L;
> +
> +static uint16_t
> +safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2)
> +{
> +  return ((unsigned int)ui1) * ((unsigned int)ui2);
> +}
> +
> +int main (int argc, char* argv[])
> +{
> +  uint16_t l_2815 = 65535UL;
> +  uint16_t *l_2821 = &g_116;
> +  uint16_t *l_2822 = &g_2823;
> +
> +lbl_2826:
> +  l_2815 &= 0x9DEF1EAEL;
> +  if (+(safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822)))))
> +goto lbl_2826;
> +
> +  if (g_2823 != 32768)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c 
> b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
> new file mode 100644
> index 000..999f931806a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-do run } */
> +
> +unsigned char ag = 55;
> +unsigned i;
> +int main()
> +{
> +  unsigned char c;
> +  unsigned char a = ag;
> +d:
> +  c = a-- * 52;
> +  if (c)
> +goto d;
> +  if (a != 255)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 1b5605c26b8..c6b953c5316 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -1050,7 +1050,13 @@ number_of_iterations_ne (class loop *loop, tree type, 
> affine_iv *iv,
>   Note, for NE_EXPR, base equals to FINAL is a special case, in
>   which the loop exits immediately, and the iv does not overflow.  */
>if (!niter->control.no_overflow
> -  && (integer_onep (s) || multiple_of_p (type, c, s)))
> +  && (integer_onep (s)
> + || (poly_int_tree_p (c)
> + && multiple_p (wi::to_poly_widest (c), wi::to_poly_widest (s)))
> + /* ???  multiple_of_p assumes the expression 'c' does not overflow
> +but that cannot be guaranteed, so we restrict 's' to power of
> +two values where that should not be an issue.  See PR100499.  */
> + || (integer_pow2p (s) && multiple_of_p (type, c, s))))
>  {
>tree t, cond, new_c, relaxed_cond = boolean_false_node;
I (to blame) added this code to special-case situations like pr34114; now I
feel it goes in the wrong direction.  Ideally this code would be unnecessary
and the conditions would be (as they are now) incorporated into
niter->assumptions, which should then simplify to 1/0 accordingly.  The only
problem is that assumptions are not simplified appropriately.
Is it possible to incorporate a more powerful solver (like Z3) in GCC
for such cases, e.g., assumption simplification, multiple_of_p, etc.?
Oh, we don't do SCEV analysis 

[Bug tree-optimization/23855] loop header should also be pulled out of the inner loop too

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855

Andrew Pinski  changed:

   What|Removed |Added

 CC||xinliangli at gmail dot com

--- Comment #33 from Andrew Pinski  ---
*** Bug 35344 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/35344] Loop unswitching to produce perfect loop nest

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35344

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
   Target Milestone|--- |6.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed:

  if (m_23(D) > 0)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 12992276]:
  p.0_1 = p;
  q.1_10 = q;
  if (n_24(D) > 0)
goto ; [89.00%]
  else
goto ; [11.00%]

So yes it is a dup.

*** This bug has been marked as a duplicate of bug 23855 ***

Re: PING^1 [PATCH v2] x86: Check AVX512 without mask instructions

2021-07-25 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 14, 2021 at 8:27 PM H.J. Lu  wrote:
>
> On Fri, Jun 25, 2021 at 5:39 AM H.J. Lu  wrote:
> >
> > On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak  wrote:
> > >
> > > On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu  wrote:
> > > >
> > > > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu  wrote:
> > > > > >
> > > > > > CPUID functions are used to detect CPU features.  If vector ISAs
> > > > > > are enabled, compiler is free to use them in these functions.  Add
> > > > > > __attribute__ ((target("general-regs-only"))) to CPUID functions
> > > > > > to avoid vector instructions.
> > > > >
> > > > > These functions are intended to be inlined, so how does target
> > > > > attribute affect inlining?
> > > > I guess w/ -O0. they may not be inlined, that's why H.J adds those
> > > > attributes to those functions.
> > >
> > > The problem is not with these functions, but with surrounding checks
> > > for cpuid features. These checks are implemented with logic
> > > instructions, and nothing prevents RA from allocating mask registers,
> > > and consequently mask insn is emitted. Regarding mentioned functions,
> > > cpuid insn pattern has four GPR single-reg constraints, so mask
> > > registers can't be allocated here.
> > >
> > > > pr96814.dump:
> > > > 0804aa40 :
> > > >  804aa40: 8d 4c 24 04     lea    0x4(%esp),%ecx
> > > > ...
> > > >  804aa63: 6a 07           push   $0x7
> > > >  804aa65: e8 e0 e7 ff ff  call   804924a <__get_cpuid_count>
> > > >
> > > > Also we need to add a target attribute to avx512f_os_support (), and
> > > > that would be enough to fix the AVX512 part.
> > > >
> > > > Moreover, all check functions in below files may also need to deal with:
> > > > adx-check.h
> > > > aes-avx-check.h
> > > > aes-check.h
> > > > amx-check.h
> > > > attr-nocf-check-1a.c
> > > > attr-nocf-check-3a.c
> > > > avx2-check.h
> > > > avx2-vpop-check.h
> > > > avx512bw-check.h
> > > > avx512-check.h
> > > > avx512dq-check.h
> > > > avx512er-check.h
> > > > avx512f-check.h
> > > > avx512vl-check.h
> > > > avx-check.h
> > > > bmi2-check.h
> > > > bmi-check.h
> > > > cf_check-1.c
> > > > cf_check-2.c
> > > > cf_check-3.c
> > > > cf_check-4.c
> > > > cf_check-5.c
> > > > f16c-check.h
> > > > fma4-check.h
> > > > fma-check.h
> > > > isa-check.h
> > > > lzcnt-check.h
> > > > m128-check.h
> > > > m256-check.h
> > > > m512-check.h
> > > > mmx-3dnow-check.h
> > > > mmx-check.h
> > > > pclmul-avx-check.h
> > > > pclmul-check.h
> > > > pr39315-check.c
> > > > rtm-check.h
> > > > sha-check.h
> > > > spellcheck-options-1.c
> > > > spellcheck-options-2.c
> > > > spellcheck-options-3.c
> > > > spellcheck-options-4.c
> > > > spellcheck-options-5.c
> > > > sse2-check.h
> > > > sse3-check.h
> > > > sse4_1-check.h
> > > > sse4_2-check.h
> > > > sse4a-check.h
> > > > sse-check.h
> > > > ssse3-check.h
> > > > stack-check-11.c
> > > > stack-check-12.c
> > > > stack-check-17.c
> > > > stack-check-18.c
> > > > stack-check-19.c
> > > > xop-check.h
> > >
> > > True, but this would just paper over the real problem. Now, it is
> > > expected that the user decorates the function that checks CPUID
> > > features with the target attribute. I'm not sure if this is OK.
vmovw is enabled by AVX512FP16, and compiling the cpuid check function with
avx512fp16 may result in SIGILL on a non-avx512fp16 target (though we
don't have a testcase yet).
Would that be a sufficient reason to disable avx512 for the cpuid check?
> > >
> > > Uros.
> >
> > CPUID functions are used to detect CPU features.  If mask instructions
> > are enabled, compiler is free to use them in these functions.  Disable
> > AVX512F in AVX512 check with target pragma to avoid mask instructions.
> >
> > OK for master?
> >
>
> PING:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573717.html
>
>
> --
> H.J.



-- 
BR,
Hongtao


[Bug target/28919] IV selection is messed up

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28919

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2006-09-17 22:48:12 |2021-7-25

--- Comment #10 from Andrew Pinski  ---
Still happens.
__builtin_prefetch causes the issue.

[Bug target/18562] SSE constant vector initialization produces dead constant values on stack

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18562

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #14 from Andrew Pinski  ---
Fixed fully in 4.9 and above.

Re: [PATCH 44/62] AVX512FP16: Add scalar/vector bitwise operations, including

2021-07-25 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 23, 2021 at 1:13 PM Hongtao Liu  wrote:
>
> On Thu, Jul 1, 2021 at 2:18 PM liuhongt  wrote:
> >
> > From: "H.J. Lu" 
> >
> > 1. FP16 vector xor/ior/and/andnot/abs/neg
> > 2. FP16 scalar abs/neg/copysign/xorsign
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386-expand.c (ix86_expand_fp_absneg_operator):
> > Handle HFmode.
> > (ix86_expand_copysign): Ditto.
> > (ix86_expand_xorsign): Ditto.
> > * config/i386/i386.c (ix86_build_const_vector): Handle HF vector
> > modes.
> > (ix86_build_signbit_mask): Ditto.
> > (ix86_can_change_mode_class): Ditto.
> > * config/i386/i386.md (SSEMODEF): Add HF mode.
> > (ssevecmodef): Ditto.
> > (2): Use MODEFH.
> > (*2_1): Ditto.
> > (define_split): Ditto.
> > (xorsign3): Ditto.
> > (@xorsign3_1): Ditto.
> As mentioned by Uros, I think these had also better have separate patterns for HF.
I realized there are parameter names in define_insn and
define_insn_and_split, and they will be called by the xorsign/copysign
functions in i386-expand.c.  For simplicity I'd like to keep the
macroization of the HF patterns in this patch.
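
As background for the scalar operations listed in the patch description (abs/neg/copysign/xorsign): each reduces to a bitwise operation against the sign-bit mask, which for a 16-bit half float is 0x8000. The sketch below works on raw IEEE binary16 bit patterns held in uint16_t. It is an illustration only, not code from the patch, and the hf_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HF_SIGN_MASK 0x8000u   /* sign bit of an IEEE binary16 value */

/* abs: clear the sign bit */
static uint16_t hf_abs(uint16_t x)
{
    return (uint16_t)(x & ~HF_SIGN_MASK);
}

/* neg: flip the sign bit */
static uint16_t hf_neg(uint16_t x)
{
    return (uint16_t)(x ^ HF_SIGN_MASK);
}

/* copysign: magnitude of `mag`, sign of `sgn` */
static uint16_t hf_copysign(uint16_t mag, uint16_t sgn)
{
    return (uint16_t)((mag & ~HF_SIGN_MASK) | (sgn & HF_SIGN_MASK));
}

/* xorsign: x * copysign(1, y), i.e. flip x's sign when y is negative */
static uint16_t hf_xorsign(uint16_t x, uint16_t y)
{
    return (uint16_t)(x ^ (y & HF_SIGN_MASK));
}

/* Returns 1 if the helpers behave as expected on a few bit patterns:
   0x3C00 = 1.0, 0xBC00 = -1.0, 0x4000 = 2.0, 0xC000 = -2.0. */
static int hf_sign_ops_demo(void)
{
    assert(hf_abs(0xBC00) == 0x3C00);
    return hf_neg(0x3C00) == 0xBC00
        && hf_copysign(0x4000, 0xBC00) == 0xC000
        && hf_xorsign(0xC000, 0xBC00) == 0x4000;
}
```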

> > * config/i386/sse.md (VFB): New mode iterator.
> > (VFB_128_256): Ditto.
> > (VFB_512): Ditto.
> > (sseintvecmode2): Support HF vector mode.
> > (2): Use new mode iterator.
> > (*2): Ditto.
> > (copysign3): Ditto.
> > (xorsign3): Ditto.
> > (3): Ditto.
> > (3): Ditto.
> > (_andnot3): Adjust for HF vector mode.
> > (_andnot3): Ditto.
> > (*3): Ditto.
> > (*3): Ditto.
> > ---
> >  gcc/config/i386/i386-expand.c |  12 +++-
> >  gcc/config/i386/i386.c|  12 +++-
> >  gcc/config/i386/i386.md   |  40 ++-
> >  gcc/config/i386/sse.md| 128 --
> >  4 files changed, 118 insertions(+), 74 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index 9233c6cd1e8..006f4bec8db 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -1781,6 +1781,8 @@ ix86_expand_fp_absneg_operator (enum rtx_code code, 
> > machine_mode mode,
> > vmode = V4SFmode;
> >else if (mode == DFmode)
> > vmode = V2DFmode;
> > +  else if (mode == HFmode)
> > +   vmode = V8HFmode;
> >  }
> >
> >dst = operands[0];
> > @@ -1918,7 +1920,9 @@ ix86_expand_copysign (rtx operands[])
> >
> >mode = GET_MODE (dest);
> >
> > -  if (mode == SFmode)
> > +  if (mode == HFmode)
> > +vmode = V8HFmode;
> > +  else if (mode == SFmode)
> >  vmode = V4SFmode;
> >else if (mode == DFmode)
> >  vmode = V2DFmode;
> > @@ -1934,7 +1938,7 @@ ix86_expand_copysign (rtx operands[])
> >if (real_isneg (CONST_DOUBLE_REAL_VALUE (op0)))
> > op0 = simplify_unary_operation (ABS, mode, op0, mode);
> >
> > -  if (mode == SFmode || mode == DFmode)
> > +  if (mode == HFmode || mode == SFmode || mode == DFmode)
> > {
> >   if (op0 == CONST0_RTX (mode))
> > op0 = CONST0_RTX (vmode);
> > @@ -2073,7 +2077,9 @@ ix86_expand_xorsign (rtx operands[])
> >
> >mode = GET_MODE (dest);
> >
> > -  if (mode == SFmode)
> > +  if (mode == HFmode)
> > +vmode = V8HFmode;
> > +  else if (mode == SFmode)
> >  vmode = V4SFmode;
> >else if (mode == DFmode)
> >  vmode = V2DFmode;
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index dc0d440061b..17e1b5ea874 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -15374,6 +15374,9 @@ ix86_build_const_vector (machine_mode mode, bool 
> > vect, rtx value)
> >  case E_V2DImode:
> >gcc_assert (vect);
> >/* FALLTHRU */
> > +case E_V8HFmode:
> > +case E_V16HFmode:
> > +case E_V32HFmode:
> >  case E_V16SFmode:
> >  case E_V8SFmode:
> >  case E_V4SFmode:
> > @@ -15412,6 +15415,13 @@ ix86_build_signbit_mask (machine_mode mode, bool 
> > vect, bool invert)
> >
> >switch (mode)
> >  {
> > +case E_V8HFmode:
> > +case E_V16HFmode:
> > +case E_V32HFmode:
> > +  vec_mode = mode;
> > +  imode = HImode;
> > +  break;
> > +
> >  case E_V16SImode:
> >  case E_V16SFmode:
> >  case E_V8SImode:
> > @@ -19198,7 +19208,7 @@ ix86_can_change_mode_class (machine_mode from, 
> > machine_mode to,
> >  disallow a change to these modes, reload will assume it's ok to
> >  drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
> >  the vec_dupv4hi pattern.  */
> > -  if (GET_MODE_SIZE (from) < 4)
> > +  if (GET_MODE_SIZE (from) < 4 && from != E_HFmode)
> > return false;
> >  }
> >
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index 014aba187e1..a85c23d74f1 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -1233,9 

[Bug tree-optimization/21712] missed optimization due with const function and pulling out of loops

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21712

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |4.3.0
 Status|NEW |RESOLVED

--- Comment #26 from Andrew Pinski  ---
Fixed for GCC 4.3.0 and above.  Most likely by r0-86459 .

[Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
The cast issue is because in GCC 9, it was not producing PERM at the gimple
level which was fixed correctly in GCC 11.

clang_shuffle_with_zero can easily be added.

[Bug target/19922] xor is enclosed in loop, and exectuted on each iteration of for statement

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19922

--- Comment #7 from Andrew Pinski  ---
So the question becomes: do we care about this loop if
-fno-tree-loop-distribute-patterns is added?  Anyway, for a while now we have
been able to detect that the loop is a memset and then expand that with no xor
inside the loop.
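
For reference, the kind of loop that pattern detection turns into a memset looks like the sketch below. This is an assumed minimal example, not the testcase from this bug, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A store-zero loop of the shape that can be recognized as a memset. */
static void clear_with_loop(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
}

/* What the loop is equivalent to. */
static void clear_with_memset(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, 0, len);
}

/* Returns 1 if both forms leave identical, fully zeroed buffers. */
static int clear_forms_agree(void)
{
    unsigned char a[64], b[64];
    memset(a, 0xAB, sizeof a);
    memset(b, 0xAB, sizeof b);
    clear_with_loop(a, sizeof a);
    clear_with_memset(b, sizeof b);
    return memcmp(a, b, sizeof a) == 0 && a[0] == 0 && a[63] == 0;
}
```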

Re: 0001-Don-t-skip-prologue-instructions-as-it-could-affect-.patch

2021-07-25 Thread Bin.Cheng via Gcc-patches
On Sat, Jul 24, 2021 at 12:30 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 7/14/2021 3:14 AM, bin.cheng via Gcc-patches wrote:
> > Hi,
> > I ran into a wrong code bug in code with deep template instantiation when 
> > working on sdx::simd.
> > The root cause as described in commit summary is we skip prologue insns in 
> > init_alias_analysis.
> > This simple patch fixes the issue, however, it's hard to reduce a case 
> > because of heavy use of
> > templates.
> > Bootstrap and test on x86_64, is it OK?
> It's a clear correctness improvement, but what's unclear to me is why
> we'd want to skip them in the epilogue either.
I can only guess: there is nothing to initialize for the epilogue because
no code follows it.

Thanks,
bin
>
> Jeff


[Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle

2021-07-25 Thread yumeyao at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

--- Comment #1 from YumeYao  ---
https://gcc.godbolt.org/z/a47Enb9oK

16-bytes (AVX) version added.

[Bug target/18233] extraneous inc/dec pair

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18233

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||94956
 Resolution|--- |FIXED
   Target Milestone|--- |11.2
 Status|NEW |RESOLVED

--- Comment #4 from Andrew Pinski  ---
So this is fixed in a few different ways but fully with r11-194.

For x86 with cmov (!=i386), this was fixed in GCC 4.5.0 where the ffs is
expanded at expand time to use ctz and cmov.

Without cmov, this was only fixed in GCC 11 with r11-194, which changes ffs to
ctz if ctz has a known 0 argument, which x86 has.

So closing as fixed for GCC 11; there is already a testcase for this too:
gcc.target/i386/pr94956.c .
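
The relationship between ffs and ctz used above can be written down directly. This is a sketch with hypothetical function names, not the code from r11-194; only `__builtin_ffs` and `__builtin_ctz` are real GCC builtins.

```c
#include <assert.h>
#include <limits.h>

/* ffs expressed through ctz: the guard is needed because __builtin_ctz
   is undefined for a zero argument. */
static int ffs_via_ctz(unsigned x)
{
    return x ? __builtin_ctz(x) + 1 : 0;
}

/* Returns 1 if the ctz-based form matches __builtin_ffs for every
   possible position of the lowest set bit. */
static int ffs_forms_agree(void)
{
    assert(ffs_via_ctz(0) == 0);
    for (unsigned bit = 0; bit < sizeof(unsigned) * CHAR_BIT; bit++) {
        unsigned x = ~0u << bit;  /* lowest set bit is exactly `bit` */
        if (ffs_via_ctz(x) != __builtin_ffs((int)x))
            return 0;
        if (ffs_via_ctz(x) != (int)bit + 1)
            return 0;
    }
    return 1;
}
```

When the argument is known to be nonzero, the guard (the cmov or branch) drops out and only the ctz plus one remains.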


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94956
[Bug 94956] Unable to remove impossible ffs() test for zero

[Bug tree-optimization/101621] New: gcc cannot optimize int8_t vector assign with subscription to shuffle

2021-07-25 Thread yumeyao at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

Bug ID: 101621
   Summary: gcc cannot optimize int8_t vector assign with
subscription to shuffle
   Product: gcc
   Version: 11.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yumeyao at gmail dot com
  Target Milestone: ---

https://gcc.godbolt.org/z/91cqenf99

typedef char v16b __attribute__((vector_size(16)));

To sum up, regarding optimizing v = { v[n] ...} into a shuffle, targeting
Intel x86 (x86_64):
There is a lack of optimization when there is a zero.
There is a regression starting from gcc 9.
So this might be two issues, but I think a proper fix could resolve both.


* gcc can optimize int8_t vector assign with subscription of the same vector to
shuffle, like this:
v16b gcc_can_shuffle(v16b b) {
return (v16b) {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4], b[8], b[8],
b[8], b[8], b[12], b[12], b[12], b[12]};
}

* However, if there is a zero, gcc can't handle this. Actually this is
supported on Intel x86, with a negative subscription indicating the 'zero
value'.
Clang can do the optimization starting with clang 5.

* Furthermore, there is a regression:
gcc < 8 can always optimize it, but starting with gcc9, if there is a cast,
then the optimization fails:
typedef long v2si64 __attribute__((vector_size(16)));
v16b gcc_cannot_shuffle_with_cast(v2si64 x) {
v16b b = (v16b)x;
v16b b0 = {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4], b[8], b[8],
b[8], b[8], b[12], b[12], b[12], b[12]};
return b0;
}
gcc 11 can optimize it on -O3, but not on -O1 or -O2.
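
The "zero" case mentioned above is not spelled out in this report, so the following is only a guess at its shape, using the same v16b type. The function names are hypothetical and the choice of lanes is an assumption.

```c
#include <assert.h>

typedef char v16b __attribute__((vector_size(16)));

/* Assumed shape of the "with zero" case: every fourth lane is kept and
   all other lanes are set to zero. */
static v16b shuffle_with_zero(v16b b)
{
    return (v16b){b[0], 0, 0, 0, b[4], 0, 0, 0,
                  b[8], 0, 0, 0, b[12], 0, 0, 0};
}

/* Returns 1 if the result has the expected lanes. */
static int shuffle_with_zero_demo(void)
{
    v16b in = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
    v16b out = shuffle_with_zero(in);
    assert(out[0] == 1);
    for (int i = 0; i < 16; i++) {
        char expected = (i % 4 == 0) ? in[i] : 0;
        if (out[i] != expected)
            return 0;
    }
    return 1;
}
```

Whether this compiles to a single byte shuffle depends on the compiler version and flags, as discussed in this report.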

[Bug target/101614] [s390] vec_signed requires z15, docs say z13

2021-07-25 Thread evan--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101614

Evan Nemerson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Evan Nemerson  ---
Never mind; the ARCH in the documentation refers to the same value as __ARCH__,
not -march=zN

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #7 from Andrew Pinski  ---
A few more canonicalization issues that need to be thought of:

"a >>u (bitsize-1)" and "a >s (bitsize-1)" and "-(a
> Thinking about this some more, there is a canonicalization issue. We need to
> decide if we want to canonicalize to just a ? -1 : 1; or expand it out.
> a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".
> 
> Does the current a ? -1 : 0 make sense or just add that to ifcvt.

PR101339 is related to that canonicalization really.

There are others.

Even things like:
(a == 0) + 2
Should that be:
a == 0 ? 3 : 2
On the gimple level
and then do the correct thing on the RTL level?
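
The two spellings in that last example are the same function of a, which is why one of them has to be picked as canonical. A quick sketch (function names are mine, for illustration only):

```c
#include <assert.h>

/* Arithmetic form: the comparison result (0 or 1) is added to 2. */
static int compare_plus_two(int a)
{
    return (a == 0) + 2;
}

/* Conditional form of the same value. */
static int select_three_or_two(int a)
{
    return a == 0 ? 3 : 2;
}

/* Returns 1 if both forms agree over a small range around zero. */
static int compare_forms_agree(void)
{
    assert(compare_plus_two(0) == 3);
    for (int a = -1000; a <= 1000; a++)
        if (compare_plus_two(a) != select_three_or_two(a))
            return 0;
    return 1;
}
```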

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #6 from Andrew Pinski  ---
Thinking about this some more, there is a canonicalization issue. We need to
decide if we want to canonicalize to just a ? -1 : 1; or expand it out.
a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".

Does the current a ? -1 : 0 make sense, or should we just add that to ifcvt?

I am going to take a few days to think of this and such.

There are other issues that deal with this.  Even the existence of a cmov makes
it harder to decide.  For example, even though -(a == 0) can be optimized
nicely on x86, it might not be on other targets.
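
For readers following along, the identities under discussion can be checked in a few lines. This is a sketch assuming a boolean a and two's complement integers; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The form in the bug title: a ? -1 : 1 ... */
static int select_minus_one_or_one(bool a)
{
    return a ? -1 : 1;
}

/* ... and its branch-free equivalent (-(type)a) | 1. */
static int negate_then_or_one(bool a)
{
    return (-(int)a) | 1;
}

/* a ? -1 : 0 is just the negated cast. */
static int select_minus_one_or_zero(bool a)
{
    return a ? -1 : 0;
}

static int negated_cast(bool a)
{
    return -(int)a;
}

/* Returns 1 if each pair agrees for both values of a. */
static int select_forms_agree(void)
{
    assert(negate_then_or_one(true) == -1);
    return select_minus_one_or_one(true) == negate_then_or_one(true)
        && select_minus_one_or_one(false) == negate_then_or_one(false)
        && select_minus_one_or_zero(true) == negated_cast(true)
        && select_minus_one_or_zero(false) == negated_cast(false);
}
```

Which of these forms is cheaper depends on the target, which is the point being made above.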

[Bug d/101490] ICE at convert_expr(tree_node*, Type*, Type*)

2021-07-25 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101490

--- Comment #1 from Iain Buclaw  ---
Reduced test
---
struct test
{
int[0] foo;
}

void main()
{
test* t;
auto a = cast(typeof(t.foo)[0])t.foo;
write(a);
}

void write(S)(S args)
{
foreach (arg; args)
{
}
}

gcc-12-20210725 is now available

2021-07-25 Thread GCC Administrator via Gcc
Snapshot gcc-12-20210725 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20210725/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch master 
revision b454c40956947938c9e274d75cef8a43171f3efa

You'll find:

 gcc-12-20210725.tar.xz   Complete GCC

  SHA256=9f15451e777cdb6da0c21bff5f0bb61593ad4cb48e994f0eb13462c8ca33d2ee
  SHA1=b6f99513d4930afe36c93f1c4294084d93e7417c

Diffs from 12-20210718 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

Andrew Pinski  changed:

   What|Removed |Added

  Attachment #51203|0   |1
is obsolete||

--- Comment #5 from Andrew Pinski  ---
Comment on attachment 51203
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51203
ifcvt patch

This patch is wrong if STORE_FLAG_VALUE == -1.

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #4 from Andrew Pinski  ---
Created attachment 51203
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51203&action=edit
ifcvt patch

Patch which is going into testing.

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #3 from Andrew Pinski  ---
I have the ifcvt.c patch which adds this.

[Bug d/101441] __FUNCTION__ doesn't work in core.stdc.stdio functions without cast

2021-07-25 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101441

--- Comment #1 from Iain Buclaw  ---
Upstream dmd fixed this bug much later than 2.076.

https://github.com/dlang/dmd/pull/9920

[Bug rtl-optimization/67382] RTL combiner is too eager to combine (plus (reg 92) (reg 92)) to (ashift (reg 92) (const_int 1))

2021-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67382

--- Comment #5 from Segher Boessenkool  ---
It turns out that a noop other_insn is fine and is accepted, but the
resulting i3 in this case is not.

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

Andrew Pinski  changed:

   What|Removed |Added

  Component|tree-optimization   |rtl-optimization

--- Comment #2 from Andrew Pinski  ---
I decided that this should really be handled at the RTL level.

[Bug c++/101620] New: gcc incorrectly makes concept checking in incomplete-class context

2021-07-25 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101620

Bug ID: 101620
   Summary: gcc incorrectly makes concept checking in
incomplete-class context
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

Compilation of this program
```
struct A {};

template
concept DerivedOnceFromA = requires(T t) { { static_cast(t) }; };

template
struct B {};

struct C : A
{
B foo();
};
```
must fail, since B is checked in incomplete struct C context:
https://gcc.godbolt.org/z/ajh8MsY4n

[Bug c++/52099] Incorrectly applying conversion when catching pointer-to-members

2021-07-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52099

--- Comment #2 from Jonathan Wakely  ---
From the dup:


 Eric Fiselier 2016-01-20 03:50:56 UTC

Created attachment 37399 [details]
reproducer

I don't see where [except.handle] allows such a conversion.

Comment 1 Jonathan Wakely 2017-01-13 20:36:35 UTC

We're missing a check for cv-qualifiers in
__pointer_to_member_type_info::__pointer_catch that needs to be done before we
compare the pointees. Both pointees have type void() so we need to compare the
cv-quals before that info is lost.

Comment 2 Jonathan Wakely 2017-01-13 20:49:13 UTC

Hmm, we don't seem to have the cv-quals in __flags. That's a problem.

Comment 3 Jonathan Wakely 2017-01-13 21:08:10 UTC

When compiled with clang the pointees are different, so the match fails when
comparing them.

Using Clang:

(gdb) step
__cxxabiv1::__pbase_type_info::__pointer_catch (this=0x401cc0 , thrown_type=0x401d10 ,
thr_obj=0x7fffd220, outer=0)
at
/usr/lib/gcc/x86_64-redhat-linux/6.3.1/../../../../include/c++/6.3.1/cxxabi.h:309
(gdb) step
std::type_info::__do_catch (this=0x401c90 ,
thr_type=0x401cf8 ) at
../../../../libstdc++-v3/libsupc++/tinfo.cc:71
(gdb) p *this
$3 = {_vptr.type_info = 0x6030b0 , __name = 0x401c89  "KFvvE"}
(gdb) p *thr_type
$4 = {_vptr.type_info = 0x6030b0 , __name = 0x401cf0  "FvvE"}
(gdb) 


But using GCC the two pointee types are the same:

(gdb) p *this
$1 = {_vptr.type_info = 0x6030e8 , __name = 0x401c50  "FvvE"}
(gdb) p *thr_type
$2 = {_vptr.type_info = 0x6030e8 , __name = 0x401c50  "FvvE"}

So it looks like the problem is in the front-end where the typeinfo object for
a pointer to cv-qualified member function has the wrong pointee type.

Comment 4 Jonathan Wakely 2017-01-13 23:05:34 UTC

My front-end debugging skills are pitiful, but I've found something suspicious.
ptm_initializer uses TYPE_PTRMEM_POINTED_TO_TYPE to get that pointee type. For
this case that expands to TYPE_PTRMEMFUNC_FN_TYPE which is a call to
cp_build_qualified_type with the qualifiers from cp_type_quals.

But cp_type_quals tries pretty hard to ensure we never get cv-quals for a
function type. For the purposes of RTTI, where we really do care about the
difference between void() and void()const, do we want the memfn quals instead?

Comment 5 Jonathan Wakely 2017-01-13 23:20:33 UTC

For the attached reproducer this condition is never true in
cp_build_qualified_type_real

  /* But preserve any function-cv-quals on a FUNCTION_TYPE.  */
  if (TREE_CODE (type) == FUNCTION_TYPE)
type_quals |= type_memfn_quals (type);

As far as I can tell this is what's supposed to put the cv-quals back onto the
function type, so we'd have a pointee of type void() const not void().
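
To make the type distinction discussed above concrete, here is a minimal sketch showing that a pointer to a const member function and a pointer to a non-const member function are different, non-interconvertible types at compile time, which is why a handler for one should not catch the other. The struct `S`, the aliases and the helper `handlers_must_differ` are hypothetical names for illustration; this is not the attached reproducer and it does not exercise the run-time RTTI path in libsupc++.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class used only for this illustration.
struct S
{
  void f () {}
  void g () const {}
};

using plain_pmf = void (S::*) ();        // pointee is void()
using const_pmf = void (S::*) () const;  // pointee is void() const

// The cv-qualifier is part of the member function type ...
static_assert (!std::is_same<plain_pmf, const_pmf>::value,
               "the two pointer-to-member types are distinct");
// ... and no implicit conversion drops it.
static_assert (!std::is_convertible<const_pmf, plain_pmf>::value,
               "const member function pointer does not convert to non-const");

// Run-time view of the same facts, convenient for a quick check.
bool handlers_must_differ ()
{
  return !std::is_same<plain_pmf, const_pmf>::value
         && !std::is_convertible<const_pmf, plain_pmf>::value;
}
```

If the typeinfo for the pointee loses the `const`, as the gdb output above suggests, this compile-time distinction is no longer visible to the catch-matching code.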

[Bug rtl-optimization/67382] RTL combiner is too eager to combine (plus (reg 92) (reg 92)) to (ashift (reg 92) (const_int 1))

2021-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67382

--- Comment #4 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #3)
> Note combine is able to figure out the jump is unconditional but there is no
> "pattern" to match it:
> Trying 10 -> 17:
>10: r85:QI=0x1
>17: {flags:CCC=cmp(r85:QI-0x1,r85:QI);clobber scratch;}
>   REG_DEAD r85:QI
>   REG_EQUAL cmp(0,0x1)
> Failed to match this instruction:
> (parallel [
> (set (pc)
> (pc))
> (clobber (scratch:QI))
> ])
> Failed to match this instruction:
> (set (pc)
> (pc))

This is an other_insn, namely a cc_use_insn.  We currently use that for
changing the cc mode used.
There is code in combine for handling (set (pc) (pc)) in other_insn, in
fact (see where update_cfg_for_uncondjump is called).

There also is code (in recog_for_combine_1) that should handle noop sets
like this.  It does not print anything if that happens though.

Investigating.

[PATCH v2, Fortran] TS 29113 testsuite

2021-07-25 Thread Sandra Loosemore
Here is an updated version of my TS29113 testsuite.  The last version I 
posted became kind of bit-rotten after Tobias's commit "Fortran: Fix 
bind(C) character length checks" for PR92842, which changed the wording 
of the error message that I'd been catching with dg-bogus in many 
places.  I've also merged some bug fixes to the test cases (most of 
which I'd already posted in conjunction with other patches to fix the 
associated library function bugs), and updated all the 
ISO_Fortran_binding.h #includes on the theory that I will iron out the 
remaining include-path problem with my patch series for PR101305 and get 
that committed before some consensus is reached on what to do about this 
patch.


With this version I'm now getting 263 XFAILs per multilib on x86.  With 
the bug fix patches I have already posted that are still awaiting 
review/committal, 42 of those go away.  And this approved but 
not-yet-committed patch from Jose


https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572725.html

fixes 6 more.

Reducing the exact number of XFAILs probably doesn't matter to the 
meta-discussion about whether it is OK to commit a pile of tests with so 
many things XFAILed, but passing around updated patches like this is 
about the only way to keep a working version of the testsuite handy to 
anyone besides me who might want to help fix some of these bugs and to 
make sure we aren't introducing regressions.  :-(


-Sandra


ts29113-jul25.patch.gz
Description: application/gzip


[Bug tree-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2021-07-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #1 from Andrew Pinski  ---
So it turns out you can make this generic and don't need to handle 1 specially:
diff --git a/gcc/match.pd b/gcc/match.pd
index beb8d27535e..2af987278af 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3805,14 +3805,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (cond @0 INTEGER_CST@1 INTEGER_CST@2)
  (switch
+   /* a ? CST : -1 -> -(!a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
+   (with {
+  tree booltrue = constant_boolean_node (true, boolean_type_node);
+}
+(bit_ior (negate (convert (bit_xor (convert:boolean_type_node @0) {
booltrue; } ))) @2)))
+   /* a ? -1 : CST -> -(a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
+   (with {
+  tree booltrue = constant_boolean_node (true, boolean_type_node);
+}
+(bit_ior (negate (convert (convert:boolean_type_node @0))) @2)))
   (if (integer_zerop (@2))
(switch
 /* a ? 1 : 0 -> a if 0 and 1 are integral types. */
 (if (integer_onep (@1))
  (convert (convert:boolean_type_node @0)))
-/* a ? -1 : 0 -> -a. */
-(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
- (negate (convert (convert:boolean_type_node @0
 /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
 (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1))
  (with {
@@ -3827,9 +3836,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* a ? 0 : 1 -> !a. */
  (if (integer_onep (@2))
   (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
- /* a ? -1 : 0 -> -(!a). */
- (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
-  (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; }

  /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
  (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
   (with {
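
The arithmetic identity behind the two new patterns can be checked in isolation. The following is a small sketch, not part of the patch: the function names are made up for illustration, and it assumes a two's complement `int` (guaranteed since C++20, universal in practice before that), so that `-(int)a` is either 0 or all ones.

```cpp
#include <cassert>

// Branchy forms, as in the bug title.
constexpr int true_is_all_ones (bool a, int cst) { return a ? -1 : cst; }
constexpr int false_is_all_ones (bool a, int cst) { return a ? cst : -1; }

// Branchless forms: -(int)a is 0 or -1, and (x | -1) == -1, (x | 0) == x.
constexpr int true_is_all_ones_branchless (bool a, int cst)
{
  return -static_cast<int> (a) | cst;
}

constexpr int false_is_all_ones_branchless (bool a, int cst)
{
  return -static_cast<int> (!a) | cst;
}

// The special case from the bug title: a ? -1 : 1.
static_assert (true_is_all_ones (true, 1)
               == true_is_all_ones_branchless (true, 1), "");
static_assert (true_is_all_ones (false, 1)
               == true_is_all_ones_branchless (false, 1), "");
```

Note that in both forms the constant that is OR-ed in is the one that is not all ones.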

Function attribute to indicate a likely (or unlikely) return value

2021-07-25 Thread Dominique Pellé via Gcc
Hi

I read https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
but was left wondering: is there a way to annotate a function
to indicate that a return value is likely (or unlikely)?

For example, let's say we have this function:

  // Return OK (=0) in case of success (frequent case)
  // or an error code != 0 in case of failure (rare case).
  int do_something();

If it's unlikely to fail, I wish I could declare the function like
this (pseudo-code!):

  int do_something() __likely_return(OK);

So wherever it's used, the optimizer can optimize branch
prediction and the instruction cache.  In other words, lines
like this:

  if (do_something() == OK)

...  would implicitly be similar to:

  // LIKELY defined as __builtin_expect((x), 1).
  if (LIKELY(do_something() == OK))

The advantage of being able to annotate the declaration,
is that we only need to annotate once in the header, and
all uses of the function can benefit from the optimization
without polluting/modifying all code where the function
is called.

Another example: a function that would be unlikely to
return NULL could be declared as:

  void *foo() __unlikely_returns(NULL);

This last example would be a bit similar to the
__attribute__((malloc)) since I read about it in the doc:

> In addition, the GCC predicts that a function with
> the attribute returns non-null in most cases.

Of course __attribute__((malloc)) gives other guarantees
(return value cannot alias any other pointer) so it's not
equivalent.

Would attribute __likely_return() and  __unlikely_return()
make sense?

Is there already a way to achieve this which I missed in
the doc?

Regards
Dominique
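
One way to approximate the request with what exists today is an inline wrapper in the header that applies `__builtin_expect` to the return value. This is only a sketch under assumptions: `do_something_impl`, `fail_code` and `call_site` are hypothetical names, `__builtin_expect` is a GCC/Clang extension, and whether the hint actually reaches the caller's branch depends on the wrapper being inlined.

```cpp
#include <cassert>

constexpr int OK = 0;

// Stand-in for the real out-of-line function.  `fail_code` is a
// hypothetical knob that exists only to make this sketch testable.
static int fail_code = OK;
int do_something_impl () { return fail_code; }

// Header-level wrapper: the hint is written once, next to the declaration.
// __builtin_expect returns its first argument unchanged.
inline int do_something ()
{
  return static_cast<int> (__builtin_expect (do_something_impl (), OK));
}

// A call site stays free of LIKELY()/UNLIKELY() annotations.
int call_site ()
{
  if (do_something () == OK)
    return 1;   // frequent path
  return 0;     // rare error path
}
```

The wrapper changes only the hint, never the value, so callers see exactly the same return codes as before.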


[Bug fortran/92482] BIND(C) with array-descriptor mishandled for type character

2021-07-25 Thread sandra at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92482

sandra at gcc dot gnu.org changed:

   What|Removed |Added

 CC||sandra at gcc dot gnu.org

--- Comment #4 from sandra at gcc dot gnu.org ---
Tobias's recent commit (which he forgot to tag with this issue) changed the
"must be length 1" messages to something more descriptive, but the
functionality itself still isn't working.

commit b3d4011ba10275fbd5d6ec5a16d5aaebbdfb5d3c
Author: Tobias Burnus 
Date:   Wed Jul 21 09:36:48 2021 +0200

Fortran: Fix bind(C) character length checks

gcc/fortran/ChangeLog:

* decl.c (gfc_verify_c_interop_param): Update for F2008 + F2018
changes; reject unsupported bits with 'Error: Sorry,'.
* trans-expr.c (gfc_conv_procedure_call): Fix condition to
for using CFI descriptor with characters.

gcc/testsuite/ChangeLog:

* gfortran.dg/iso_c_binding_char_1.f90: Update dg-error.
* gfortran.dg/pr32599.f03: Use -std=f2003 + update comment.
* gfortran.dg/bind_c_char_10.f90: New test.
* gfortran.dg/bind_c_char_6.f90: New test.
* gfortran.dg/bind_c_char_7.f90: New test.
* gfortran.dg/bind_c_char_8.f90: New test.
* gfortran.dg/bind_c_char_9.f90: New test.

Re: [patch][version5]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-07-25 Thread Qing Zhao via Gcc-patches



> On Jul 25, 2021, at 10:59 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> Hi,
> 
> This is the 5th version of the patch for the new security feature for GCC.
> 
> I have tested it with bootstrap on both x86 and aarch64, regression testing 
> on both x86 and aarch64.
> Also compile and run CPU2017, without any issue.

NOTE here, for CPU2017 -ftrivial-auto-var-init=pattern, I opened bug 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101586
And then the compilation and running of CPU2017 is done with my 5th patch + the 
patch provided from Jakub for PR101586.

Qing

> Please take a look and let me know your comments and suggestions.
> 
> thanks.
> 
> Qing
> 
> **Compared with the 4th version, the major changes are:
> 
> 1. delete the code for handling "grp_to_be_debug_replaced" since it is not 
> needed per Martin Jambor's suggestion.
> 2. for Pattern init, call __builtin_clear_padding after the call to 
> .DEFERRED_INIT to initialize the paddings to zeroes;
> 3. for partially or fully initialized auto variables, call   
> __builtin_clear_padding before the real initialization to initialize
>the paddings to zeroes.
> 4. Update the documentation with padding initialization to zeroes.
> 5. in order to reuse __builtin_clear_padding for auto init purposes, add one 
> more dummy argument to indicate whether the call is for auto init or not;
>   if it is for auto init, do not emit error messages, to avoid confusing users.
> 6. Add new testing cases to verify padding initializations.
> 7. rename some of the old testing cases so that the file names reflect the 
> testing purpose, per Kees Cook's suggestions.
> 
> **Please see version 4 at:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574642.html
> 
> **ChangeLog is:
> gcc/ChangeLog:
> 
> 2021-07-23  qing zhao  
> 
>* builtins.c (expand_builtin_memset): Make externally visible.
>* builtins.h (expand_builtin_memset): Declare extern.
>* common.opt (ftrivial-auto-var-init=): New option.
>* doc/extend.texi: Document the uninitialized attribute.
>* doc/invoke.texi: Document -ftrivial-auto-var-init.
>* flag-types.h (enum auto_init_type): New enumerated type
>auto_init_type.
>* gimple-fold.c (clear_padding_type): Add one new parameter.
>(clear_padding_union): Likewise.
>(clear_padding_emit_loop): Likewise.
>(clear_type_padding_in_mask): Likewise.
>(gimple_fold_builtin_clear_padding): Handle this new parameter.
>* gimplify.c (gimple_add_init_for_auto_var): New function.
>(maybe_with_size_expr): Forward declaration.
>(build_deferred_init): New function.
>(gimple_add_padding_init_for_auto_var): New function.
>(gimplify_decl_expr): Add initialization to automatic variables per
>users' requests.
>(gimplify_call_expr): Add one new parameter for call to
>__builtin_clear_padding.
>(gimplify_modify_expr_rhs): Add padding initialization before
>gimplify_init_constructor.
>* internal-fn.c (INIT_PATTERN_VALUE): New macro.
>(expand_DEFERRED_INIT): New function.
>* internal-fn.def (DEFERRED_INIT): New internal function.
>* tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT.
>* tree-sra.c (generate_subtree_deferred_init): New function.
>(sra_modify_deferred_init): Likewise.
>(sra_modify_function_body): Handle calls to DEFERRED_INIT specially.
>* tree-ssa-structalias.c (find_func_aliases_for_call): Likewise.
>* tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT
>specially.
>(check_defs): Likewise.
>(warn_uninitialized_vars): Likewise.
>* tree-ssa.c (ssa_undefined_value_p): Likewise.
> 
> gcc/c-family/ChangeLog:
> 
> 2021-07-23  qing zhao  
> 
>* c-attribs.c (handle_uninitialized_attribute): New function.
>(c_common_attribute_table): Add "uninitialized" attribute.
> 
> gcc/testsuite/ChangeLog:
> 
> 
> 2021-07-23  qing zhao  
> 
>* c-c++-common/auto-init-1.c: New test.
>* c-c++-common/auto-init-10.c: New test.
>* c-c++-common/auto-init-11.c: New test.
>* c-c++-common/auto-init-12.c: New test.
>* c-c++-common/auto-init-13.c: New test.
>* c-c++-common/auto-init-14.c: New test.
>* c-c++-common/auto-init-15.c: New test.
>* c-c++-common/auto-init-16.c: New test.
>* c-c++-common/auto-init-2.c: New test.
>* c-c++-common/auto-init-3.c: New test.
>* c-c++-common/auto-init-4.c: New test.
>* c-c++-common/auto-init-5.c: New test.
>* c-c++-common/auto-init-6.c: New test.
>* c-c++-common/auto-init-7.c: New test.
>* c-c++-common/auto-init-8.c: New test.
>* c-c++-common/auto-init-9.c: New test.
>* c-c++-common/auto-init-esra.c: New test.
>* c-c++-common/auto-init-padding-1.c: New test.
>* 

[PATCH] incorrect arguments designated in -Wnonnull for arrays

2021-07-25 Thread Uecker, Martin

Two arguments are switched for -Wnonnull when
warning about array parameters with bounds > 0
and which are NULL.

This patch corrects the mistake.

Martin


2021-07-25  Martin Uecker  

gcc/
 * calls.c (maybe_warn_rdwr_sizes): Correct argument
 numbers in warning that were switched.

gcc/testsuite/
 * gcc.dg/Wnonnull-4.c: Correct argument numbers in warnings.



diff --git a/gcc/calls.c b/gcc/calls.c
index d2413a280cf..c54c57206c7 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -2128,8 +2128,8 @@ maybe_warn_rdwr_sizes (rdwr_map *rwm, tree fndecl, tree 
fntype, tree exp)
  "array %s is null but "
  "the corresponding bound argument "
  "%i value is %s",
- sizidx + 1, argtypestr.c_str (),
- ptridx + 1, sizstr))
+ ptridx + 1, argtypestr.c_str (),
+ sizidx + 1, sizstr))
arg_warned = OPT_Wnonnull;
}
  else if (warning_at (loc, OPT_Wnonnull,
diff --git a/gcc/testsuite/gcc.dg/Wnonnull-4.c 
b/gcc/testsuite/gcc.dg/Wnonnull-4.c
index 180a40d4606..2c1c45a9856 100644
--- a/gcc/testsuite/gcc.dg/Wnonnull-4.c
+++ b/gcc/testsuite/gcc.dg/Wnonnull-4.c
@@ -27,9 +27,9 @@ void test_fca_n (int r_m1)
   T (  0);
 
   // Verify positive bounds.
-  T (  1);  // { dg-warning "argument 1 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 2 value is 1" }
-  T (  9);  // { dg-warning "argument 1 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 2 value is 9" }
-  T (max);  // { dg-warning "argument 1 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 2 value is \\d+" }
+  T (  1);  // { dg-warning "argument 2 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 1 value is 1" }
+  T (  9);  // { dg-warning "argument 2 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 1 value is 9" }
+  T (max);  // { dg-warning "argument 2 of variable length array 
'char\\\[n]' is null but
the corresponding bound argument 1 value is \\d+" }
 }
 
 
@@ -55,9 +55,9 @@ void test_fsa_x_n (int r_m1)
   T (  0);
 
   // Verify positive bounds.
-  T (  1);  // { dg-warning "argument 1 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 2 value is 1" }
-  T (  9);  // { dg-warning "argument 1 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 2 value is 9" }
-  T (max);  // { dg-warning "argument 1 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 2 value is \\d+" }
+  T (  1);  // { dg-warning "argument 2 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 1 value is 1" }
+  T (  9);  // { dg-warning "argument 2 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 1 value is 9" }
+  T (max);  // { dg-warning "argument 2 of variable length array 
'short int\\\[]\\\[n]' is
null but the corresponding bound argument 1 value is \\d+" }
 }
 
 
@@ -83,9 +83,9 @@ void test_fia_1_n (int r_m1)
   T (  0);
 
   // Verify positive bounds.
-  T (  1);  // { dg-warning "argument 1 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 2 value is 1" }
-  T (  9);  // { dg-warning "argument 1 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 2 value is 9" }
-  T (max);  // { dg-warning "argument 1 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 2 value is \\d+" }
+  T (  1);  // { dg-warning "argument 2 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 1 value is 1" }
+  T (  9);  // { dg-warning "argument 2 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 1 value is 9" }
+  T (max);  // { dg-warning "argument 2 of variable length array 
'int\\\[1]\\\[n]' is null
but the corresponding bound argument 1 value is \\d+" }
 }
 
 
@@ -111,9 +111,9 @@ void test_fla_3_n (int r_m1)
   T (  0);
 
   // Verify positive bounds.
-  T (  1);  // { dg-warning "argument 1 of variable length array 'long 
int\\\[3]\\\[n]' is
null but the corresponding bound argument 2 value is 1" }
-  T (  9);  // { dg-warning "argument 1 of variable length array 'long 
int\\\[3]\\\[n]' is
null but the corresponding bound argument 2 value is 9" }
-  T (max);  // { dg-warning "argument 1 of variable length array 'long 
int\\\[3]\\\[n]' is
null but the corresponding bound argument 2 value is \\d+" }
+  T (  

[Bug d/101619] New: d: Change in DotTemplateExp type semantics leading to regression

2021-07-25 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101619

Bug ID: 101619
   Summary: d: Change in DotTemplateExp type semantics leading to
regression
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: ibuclaw at gdcproject dot org
  Target Milestone: ---

A regression found in upstream was included in the fix for PR100999.
---
import std.range.primitives : isInputRange;
struct Slice
{
bool empty() const;
int front() const;
void popFront()()
{
}
}
static assert(isInputRange!(  Slice) == true);
static assert(isInputRange!(const Slice) == false);  // fails since PR100999

Re: [PATCH] Analyzer: Refactor callstring to work with pairs of supernodes.

2021-07-25 Thread Prathamesh Kulkarni via Gcc-patches
On Sun, 25 Jul 2021 at 16:03, Ankur Saini via Gcc-patches
 wrote:
>
> Here is the new patch after fixing all the issues pointed out in the previous 
> version.
Just a nitpick:

+/* call_string::element_t's inequality operator.  */
+bool
+call_string::element_t::operator!= (const call_string::element_t &other) const
+{
+  if (m_caller != other.m_caller || m_callee != other.m_callee)
+return true;
+  return false;
+}

Since you define operator== above, perhaps just implement operator != as:
return !(*this == other) ?

Thanks,
Prathamesh
>
>
>
> —
>
> Question :
>
> 1. The mail id I am using here to send the patch ( 
> arsenic.second...@gmail.com ) and the mail id in the patch ( 
> arse...@sourceware.org ) are different from one and other, will this affect 
> the process in any ways ?
>
>
> Thanks
> - Ankur
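
The nitpick above, spelled out as a self-contained sketch. The struct below is a hypothetical stand-in for `call_string::element_t`: plain ints replace the supernode pointers so the example compiles on its own, and the body of `operator==` is an assumption about what the patch defines.

```cpp
#include <cassert>

// Hypothetical stand-in for call_string::element_t.
struct element_t
{
  int m_caller;
  int m_callee;

  // Equality: both members must match.
  bool operator== (const element_t &other) const
  {
    return m_caller == other.m_caller && m_callee == other.m_callee;
  }

  // Inequality defined in terms of equality, so the two operators
  // cannot drift apart when a member is added later.
  bool operator!= (const element_t &other) const
  {
    return !(*this == other);
  }
};
```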


[PATCH] Support RangeFrom ([x..]) and RangeFromTo ([x..y]) in the parser

2021-07-25 Thread Mark Wielaard
Parsing the .. (DOT_DOT) operator to get a range had two
issues. Trying to compile:

  let block = [1,2,3,4,5];
  let _rf = [1..];
  let _rt = [..3];
  let _rft = [2..4];

range.rs:4:23: error: found unexpected token ‘]’ in null denotation
4 |   let _rf = [1..];
  |   ^
range.rs:4:24: error: expecting ‘]’ but ‘;’ found
4 |   let _rf = [1..];
  |^

Since .. can represent either a range from or a range from-to it can
be followed by an expression or not. We do have a hack in our
pratt-parser so that it is allowed to return a nullptr. But even in
that case it will have swallowed the next token. Add another hack to
the pratt-parser so that if the next token is one that cannot start an
expression and the caller allows a nullptr return then don't skip the
token and return immediately.

After this patch we can parse the above range expressions, but we
still don't handle them fully:

range.rs:4:20: fatal error: Failed to lower expr: [1..]
4 |   let _rf = [1..];
  |^

Ranges are actually syntactic sugar for std::ops::Range[From|To].
---
 gcc/rust/parse/rust-parse-impl.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
index be261715c6c..7b128fff157 100644
--- a/gcc/rust/parse/rust-parse-impl.h
+++ b/gcc/rust/parse/rust-parse-impl.h
@@ -12348,6 +12348,18 @@ Parser::parse_expr (int 
right_binding_power,
ParseRestrictions restrictions)
 {
   const_TokenPtr current_token = lexer.peek_token ();
+  // Special hack because we are allowed to return nullptr, in that case we
+  // don't want to skip the token, since we don't actually parse it. But if
+  // null isn't allowed it indicates an error, and we want to skip past that.
+  // So return early if it is one of the tokens that ends an expression
+  // (or at least cannot start a new expression).
+  if (restrictions.expr_can_be_null)
+{
+  TokenId id = current_token->get_id ();
+  if (id == SEMICOLON || id == RIGHT_PAREN || id == RIGHT_CURLY
+ || id == RIGHT_SQUARE)
+   return nullptr;
+}
   lexer.skip_token ();
 
   // parse null denotation (unary part of expression)
@@ -14028,6 +14040,9 @@ 
Parser::parse_led_range_exclusive_expr (
 {
   // FIXME: this probably parses expressions accidently or whatever
   // try parsing RHS (as tok has already been consumed in parse_expression)
+  // Can be nullptr, in which case it is a RangeFromExpr, otherwise a
+  // RangeFromToExpr.
+  restrictions.expr_can_be_null = true;
   std::unique_ptr right
 = parse_expr (LBP_DOT_DOT, AST::AttrVec (), restrictions);
 
-- 
2.32.0

-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust
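
The early-return idea in the patch above can be illustrated with a toy token stream. This is a simplified sketch, not the gccrs parser: `Tok`, `cannot_start_expr` and `parse_operand` are hypothetical names, and an "expression" here is just a single integer token.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Tok { Int, DotDot, Semicolon, RightParen, RightCurly, RightSquare };

// Tokens that end an expression (or at least cannot start a new one).
bool cannot_start_expr (Tok t)
{
  return t == Tok::Semicolon || t == Tok::RightParen
         || t == Tok::RightCurly || t == Tok::RightSquare;
}

// Tries to parse one operand at toks[pos].  Returns true if an operand was
// parsed.  When a missing operand is allowed and the next token cannot
// start an expression, nothing is consumed, so the caller still sees it.
bool parse_operand (const std::vector<Tok> &toks, std::size_t &pos,
                    bool expr_can_be_null)
{
  if (pos >= toks.size ())
    return false;
  if (expr_can_be_null && cannot_start_expr (toks[pos]))
    return false;              // early return: token left in place
  Tok tok = toks[pos++];       // otherwise the token is consumed
  return tok == Tok::Int;      // false here means a real parse error
}
```

For `[1..]` the right-hand side after `..` is absent and `]` stays available for the array parser; for `[2..4]` the `4` is consumed as usual.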


[Bug bootstrap/100552] [11/12 Regression] configure: 32208: Syntax error: Bad substitution

2021-07-25 Thread ibuclaw at gdcproject dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100552

Iain Buclaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Iain Buclaw  ---
Given the two commits, I'm going to assume this is fixed.

[PATCH] Support byte and byte string literals

2021-07-25 Thread Mark Wielaard
A byte literal is a u8 created from an ASCII char or hex escape,
e.g. b'X'.  A byte string literal is a string created from ASCII or
hex chars.  Bytes are represented as u8 and byte strings as str (with
just ASCII < 256 chars), but it should really be &'static [u8; n].
---
 gcc/rust/backend/rust-compile-expr.h  |  9 -
 gcc/rust/parse/rust-parse-impl.h  |  8 
 gcc/rust/rust-backend.h   |  3 +++
 gcc/rust/rust-gcc.cc  |  9 +
 gcc/rust/typecheck/rust-hir-type-check-expr.h | 19 +++
 .../rust/compile/torture/byte_char_str.rs |  8 
 6 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/rust/compile/torture/byte_char_str.rs

diff --git a/gcc/rust/backend/rust-compile-expr.h 
b/gcc/rust/backend/rust-compile-expr.h
index dff4712e18e..fa6a53991ac 100644
--- a/gcc/rust/backend/rust-compile-expr.h
+++ b/gcc/rust/backend/rust-compile-expr.h
@@ -278,7 +278,14 @@ public:
}
return;
 
-   case HIR::Literal::STRING: {
+   case HIR::Literal::BYTE: {
+ char c = literal_value->as_string ().c_str ()[0];
+ translated = ctx->get_backend ()->char_constant_expression (c);
+   }
+   return;
+
+  case HIR::Literal::STRING:
+   case HIR::Literal::BYTE_STRING: {
  auto base = ctx->get_backend ()->string_constant_expression (
literal_value->as_string ());
  translated
diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
index be261715c6c..73600d22d60 100644
--- a/gcc/rust/parse/rust-parse-impl.h
+++ b/gcc/rust/parse/rust-parse-impl.h
@@ -12545,10 +12545,18 @@ Parser::null_denotation 
(const_TokenPtr tok,
   return std::unique_ptr (
new AST::LiteralExpr (tok->get_str (), AST::Literal::STRING,
  tok->get_type_hint (), {}, tok->get_locus ()));
+case BYTE_STRING_LITERAL:
+  return std::unique_ptr (
+   new AST::LiteralExpr (tok->get_str (), AST::Literal::BYTE_STRING,
+ tok->get_type_hint (), {}, tok->get_locus ()));
 case CHAR_LITERAL:
   return std::unique_ptr (
new AST::LiteralExpr (tok->get_str (), AST::Literal::CHAR,
  tok->get_type_hint (), {}, tok->get_locus ()));
+case BYTE_CHAR_LITERAL:
+  return std::unique_ptr (
+   new AST::LiteralExpr (tok->get_str (), AST::Literal::BYTE,
+ tok->get_type_hint (), {}, tok->get_locus ()));
 case TRUE_LITERAL:
   return std::unique_ptr (
new AST::LiteralExpr ("true", AST::Literal::BOOL, tok->get_type_hint (),
diff --git a/gcc/rust/rust-backend.h b/gcc/rust/rust-backend.h
index 35271b60f43..1dd4aba12ca 100644
--- a/gcc/rust/rust-backend.h
+++ b/gcc/rust/rust-backend.h
@@ -331,6 +331,9 @@ public:
   // Return an expression for the string value VAL.
   virtual Bexpression *string_constant_expression (const std::string ) = 0;
 
+  // Get a char literal
+  virtual Bexpression *char_constant_expression (char c) = 0;
+
   // Get a char literal
   virtual Bexpression *wchar_constant_expression (wchar_t c) = 0;
 
diff --git a/gcc/rust/rust-gcc.cc b/gcc/rust/rust-gcc.cc
index 74a8b5221f1..23a91ad9bcb 100644
--- a/gcc/rust/rust-gcc.cc
+++ b/gcc/rust/rust-gcc.cc
@@ -333,6 +333,8 @@ public:
 
   Bexpression *wchar_constant_expression (wchar_t c);
 
+  Bexpression *char_constant_expression (char c);
+
   Bexpression *boolean_constant_expression (bool val);
 
   Bexpression *real_part_expression (Bexpression *bcomplex, Location);
@@ -1557,6 +1559,13 @@ Gcc_backend::wchar_constant_expression (wchar_t c)
   return this->make_expression (ret);
 }
 
+Bexpression *
+Gcc_backend::char_constant_expression (char c)
+{
+  tree ret = build_int_cst (this->char_type ()->get_tree (), c);
+  return this->make_expression (ret);
+}
+
 // Make a constant boolean expression.
 
 Bexpression *
diff --git a/gcc/rust/typecheck/rust-hir-type-check-expr.h 
b/gcc/rust/typecheck/rust-hir-type-check-expr.h
index 166535acba0..6e5b2312f50 100644
--- a/gcc/rust/typecheck/rust-hir-type-check-expr.h
+++ b/gcc/rust/typecheck/rust-hir-type-check-expr.h
@@ -542,6 +542,12 @@ public:
}
break;
 
+   case HIR::Literal::LitType::BYTE: {
+ auto ok = context->lookup_builtin ("u8", );
+ rust_assert (ok);
+   }
+   break;
+
case HIR::Literal::LitType::STRING: {
  TyTy::BaseType *base = nullptr;
  auto ok = context->lookup_builtin ("str", );
@@ -553,6 +559,19 @@ public:
}
break;
 
+   case HIR::Literal::LitType::BYTE_STRING: {
+ /* We just treat this as a string, but it really is an arraytype of
+u8. It isn't in UTF-8, but really just a byte array.  */
+ TyTy::BaseType *base = nullptr;
+ auto ok = context->lookup_builtin ("str", );
+ rust_assert (ok);
+
+ infered
+   = new 

Re: An asm constraint issue (ARM FPU)

2021-07-25 Thread Marc Glisse

On Sun, 25 Jul 2021, Zoltán Kócsi wrote:


I try to write a one-liner inline function to create a double from
a 64-bit integer, not converting it to a double but the integer
containing the bit pattern for the double (type spoofing).

The compiler is arm-eabi-gcc 8.2.0.
The target is a Cortex-A9, with NEON.

According to the info page the assembler constraint "w" denotes an FPU
double register, d0 - d31.

The code is the following:

double spoof( uint64_t x )
{
double r;

  asm volatile
  (
" vmov.64 %[d],%Q[i],%R[i] \n"


Isn't it supposed to be %P[d] for a double?
(the documentation is very lacking...)


: [d] "=w" (r)
: [i] "q" (x)
  );

  return r;
}

The command line:

arm-eabi-gcc -O0 -c -mcpu=cortex-a9 -mfloat-abi=hard -mfpu=neon-vfpv4 \
test.c

It compiles and the generated object code is this:

 :
   0:   e52db004    push    {fp}            ; (str fp, [sp, #-4]!)
   4:   e28db000    add     fp, sp, #0
   8:   e24dd014    sub     sp, sp, #20
   c:   e14b01f4    strd    r0, [fp, #-20]  ; 0xffec
  10:   e14b21d4    ldrd    r2, [fp, #-20]  ; 0xffec
  14:   ec432b30    vmov    d16, r2, r3
  18:   ed4b0b03    vstr    d16, [fp, #-12]
  1c:   e14b20dc    ldrd    r2, [fp, #-12]
  20:   ec432b30    vmov    d16, r2, r3
  24:   eeb00b60    vmov.f64 d0, d16
  28:   e28bd000    add     sp, fp, #0
  2c:   e49db004    pop     {fp}            ; (ldr fp, [sp], #4)
  30:   e12fff1e    bx      lr

which is not really efficient, but works.

However, if I specify -O1, -O2 or -Os then the compilation fails
because assembler complains. This is the assembly the compiler
generated, (comments and irrelevant stuff removed):

spoof:
  vmov.64 s0,r0,r1
  bx lr

where the problem is that 's0' is a single-precision float register and
it should be 'd0' instead.

Either I'm seriously missing something, in which case I would be most
obliged if someone sent me to the right direction; or it is a compiler
or documentation bug.

Thanks,

Zoltan


--
Marc Glisse
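
For readers following this thread, here is a minimal sketch of two ways to get the bit pattern of a 64-bit integer into a double. The function names spoof_portable and spoof_asm are made up for this illustration. The first variant uses memcpy and does not depend on any asm operand modifier. The second follows the %P[d] suggestion above and is only compiled on ARM targets with double-precision FP; it is untested here, and it uses an "r" constraint for the input, which is an assumption rather than something taken from the original mail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The bit copy below only makes sense if both types are 64 bits wide. */
static_assert(sizeof(double) == sizeof(uint64_t),
              "double must be 64 bits wide");

/* Portable variant: copy the bits with memcpy. Compilers commonly turn
   this into a single register move when optimizing. */
static inline double spoof_portable(uint64_t x)
{
  double r;
  memcpy(&r, &x, sizeof r);
  return r;
}

#if defined(__arm__) && defined(__ARM_FP) && (__ARM_FP & 8)
/* ARM-only variant (assumption: untested sketch). %P asks for the
   double-precision register name of the "w" operand; %Q and %R name
   the low and high core register of the 64-bit input pair. */
static inline double spoof_asm(uint64_t x)
{
  double r;
  __asm__ ("vmov %P[d], %Q[i], %R[i]"
           : [d] "=w" (r)
           : [i] "r" (x));
  return r;
}
#endif
```

On an IEEE 754 target, spoof_portable(0x3FF0000000000000) yields 1.0, which is a quick way to check that the bits arrive unchanged.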


TBAA bug?

2021-07-25 Thread Uecker, Martin


Hi Richard,

here is another case where it seems that TBAA goes
wrong. Since this is not in a loop, it seems this
is something else than what we discussed. Is
this a known issue?

Best,
Martin


#include <stdio.h>
#include 

union u {
  long x;
  long long y;
};

__attribute__((noinline,noclone))
long test(long *px, long long *py, union u *pu)
{
  *px = 0;
  *py = 1;

  long xy = pu->y;
  pu->x = xy;

  return *px;
}

int main(void)
{
  union u u;
  printf("%ld\n", test(&u.x, &u.y, &u));
}

https://godbolt.org/z/a9drezEza
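
For comparison only, here is a sketch of the same sequence of stores and loads with every access going through the union itself, so that no separately typed pointers are involved. This is not the reporter's test case; the name test_via_union is made up for the illustration. GCC documents reading a union member other than the one last written as supported type punning, so this variant does not depend on how TBAA treats the px and py pointers.

```c
#include <assert.h>

union u {
  long x;
  long long y;
};

/* Reading y as a long below assumes long is no wider than long long. */
static_assert(sizeof(long) <= sizeof(long long),
              "long must not be wider than long long");

/* Same store/load order as test() in the mail above, but all accesses
   use the union members directly. */
static long test_via_union(union u *pu)
{
  pu->x = 0;
  pu->y = 1;

  long xy = (long) pu->y;
  pu->x = xy;

  return pu->x;
}
```

Here the last store before the return writes 1 to pu->x, so the function returns 1.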



[Bug sanitizer/101111] xgcc cross-compiler for x86_64-apple-darwin in GCC 11.1 doesn't generate weak symbols, resulting in undefined reference to ___lsan_default_suppressions

2021-07-25 Thread mose at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101111

--- Comment #11 from Mosè Giordano  ---
> This is OK for master and back-ports from the Darwin perspective

Thanks for the review and confirmation!

> (I guess Martin plans to deal with this since he has assigned the PR, but if 
> he does not have time, I can apply this for you if you don't have write 
> access).

Yes, I don't have write access, so someone else will need to apply the patch
:-)

[Bug sanitizer/101111] xgcc cross-compiler for x86_64-apple-darwin in GCC 11.1 doesn't generate weak symbols, resulting in undefined reference to ___lsan_default_suppressions

2021-07-25 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101111

--- Comment #10 from Iain Sandoe  ---
(In reply to Mosè Giordano from comment #6)
> Created attachment 51038 [details]
> Patch to fix the reported issue
> 
> Please find attached a patch to fix the reported issue.  I replaced the
> bashism += with simple string interpolation, to make it compliant with
> strict POSIX shells.

This is OK for master and back-ports from the Darwin perspective (I guess
Martin plans to deal with this since he has assigned the PR, but if he does not
have time, I can apply this for you if you don't have write access).

Re: [PATCH] Analyzer: Refactor callstring to work with pairs of supernodes.

2021-07-25 Thread Ankur Saini via Gcc-patches
Here is the new patch after fixing all the issues pointed out in the previous 
version.



call_string.patch
Description: Binary data


—

Question :

1. The mail id I am using here to send the patch ( arsenic.second...@gmail.com 
) and the mail id in the patch ( arse...@sourceware.org ) are different from 
one another. Will this affect the process in any way?


Thanks 
- Ankur

[Bug gcov-profile/101618] New: [GCOV] Wrong coverage caused by call site in a "for" statement

2021-07-25 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101618

Bug ID: 101618
   Summary: [GCOV] Wrong coverage caused by call site in a "for"
statement
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: njuwy at smail dot nju.edu.cn
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++
-disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC) 

$ cat test.c
#include 
#include 
#include 
#include 
struct obstack {};
struct bitmap_head_def;
typedef struct bitmap_head_def *bitmap;
typedef const struct bitmap_head_def *const_bitmap;
typedef unsigned long BITMAP_WORD;
typedef struct bitmap_obstack {
  struct bitmap_element_def *elements;
  struct bitmap_head_def *heads;
  struct obstack obstack;
} bitmap_obstack;
typedef struct bitmap_element_def {
  struct bitmap_element_def *next;
  struct bitmap_element_def *prev;
  unsigned int indx;
  BITMAP_WORD bits[((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u))];
} bitmap_element;

struct bitmap_descriptor;

typedef struct bitmap_head_def {
  bitmap_element *first;
  bitmap_element *current;
  unsigned int indx;
  bitmap_obstack *obstack;
} bitmap_head;

bitmap_element bitmap_zero_bits;

typedef struct {
  bitmap_element *elt1;
  bitmap_element *elt2;
  unsigned word_no;
  BITMAP_WORD bits;
} bitmap_iterator;

static void __attribute__((noinline))
bmp_iter_set_init(bitmap_iterator *bi, const_bitmap map, unsigned start_bit,
  unsigned *bit_no) {
  bi->elt1 = map->first;
  bi->elt2 = ((void *)0);

  while (1) {
if (!bi->elt1) {
  bi->elt1 = &bitmap_zero_bits;
  break;
}

if (bi->elt1->indx >=
start_bit / (((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)) * (8 * 8 * 1u)))
  break;
bi->elt1 = bi->elt1->next;
  }

  if (bi->elt1->indx !=
  start_bit / (((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)) * (8 * 8 * 1u)))
start_bit = bi->elt1->indx *
(((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u)) * (8 * 8 * 1u));

  bi->word_no =
  start_bit / (8 * 8 * 1u) % ((128 + (8 * 8 * 1u) - 1) / (8 * 8 * 1u));
  bi->bits = bi->elt1->bits[bi->word_no];
  bi->bits >>= start_bit % (8 * 8 * 1u);

  start_bit += !bi->bits;

  *bit_no = start_bit;
}

static void __attribute__((noinline))
bmp_iter_next(bitmap_iterator *bi, unsigned *bit_no) {
  bi->bits >>= 1;
  *bit_no += 1;
}

static unsigned char __attribute__((noinline))
bmp_iter_set_tail(bitmap_iterator *bi, unsigned *bit_no) {
  while (!(bi->bits & 1)) {
bi->bits >>= 1;
*bit_no += 1;
  }
  return 1;
}

static __inline__ unsigned char bmp_iter_set(bitmap_iterator *bi,
 unsigned *bit_no) {
  unsigned bno = *bit_no;
  BITMAP_WORD bits = bi->bits;
  bitmap_element *elt1;

  if (bits) {
while (!(bits & 1)) {
  bits >>= 1;
  bno += 1;
}
*bit_no = bno;
return 1;
  }

  *bit_no = ((bno + 64 - 1) / 64 * 64);
  bi->word_no++;

  elt1 = bi->elt1;
  while (1) {
while (bi->word_no != 2) {
  bi->bits = elt1->bits[bi->word_no];
  if (bi->bits) {
bi->elt1 = elt1;
return bmp_iter_set_tail(bi, bit_no);
  }
  *bit_no += 64;
  bi->word_no++;
}
elt1 = elt1->next;
if (!elt1) {
  bi->elt1 = elt1;
  return 0;
}
*bit_no = elt1->indx * (2 * 64);
bi->word_no = 0;
  }
}

extern void abort(void);

static void __attribute__((noinline)) catchme(int i) {
  if (i != 0 && i != 64)
abort();
}
static void __attribute__((noinline)) foobar(bitmap_head *chain) {
  bitmap_iterator rsi;
  unsigned int regno;
  for (bmp_iter_set_init(&(rsi), (chain), (0), &(regno));
   bmp_iter_set(&(rsi), &(regno)); bmp_iter_next(&(rsi), &(regno)))
catchme(regno);
}

int main() {
  bitmap_element elem = {(void *)0, (void *)0, 0, {1, 1}};
  bitmap_head live_throughout = {&elem, &elem, 0, (void *)0};
  foobar(&live_throughout);
  return 0;
}
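
The loop in foobar() puts a function call in each of the three clauses of the for statement. As a reminder of how often each clause runs, which is what the per-line counts have to reflect, here is a minimal sketch. It is not a reduced test case for this report, and all names in it (loop_init, loop_cond, loop_next, run_loop and the counters) are hypothetical.

```c
#include <assert.h>

/* Hypothetical counters, one per clause of the for statement plus one
   for the loop body. */
static int init_calls, cond_calls, next_calls, body_calls;

static void loop_init(int *i) { init_calls++; *i = 0; }
static int loop_cond(int i, int n) { cond_calls++; return i < n; }
static void loop_next(int *i) { next_calls++; *i += 1; }

/* Runs a loop whose init, condition and increment are all calls,
   like the for statement in foobar(). */
static void run_loop(int n)
{
  int i;

  assert(n >= 0);
  for (loop_init(&i); loop_cond(i, n); loop_next(&i))
    body_calls++;
}
```

For a loop that iterates n times, the init clause runs once, the condition runs n + 1 times, and the increment and the body each run n times.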

$ gcc -O0 --coverage test.c;./a.out;gcov test;cat test.c.gcov
File 'test.c'
Lines executed:80.88% of 68
Creating 'test.c.gcov'

-:0:Source:test.c
-:0:Graph:test.gcno
-:0:Data:test.gcda
-:0:Runs:1
-:1:#include 
-:2:#include 
-:3:#include 
-:4:#include 
-:5:struct obstack {};
-:6:struct bitmap_head_def;
-:7:typedef struct bitmap_head_def *bitmap;
-:8:typedef const struct bitmap_head_def *const_bitmap;
-:9:typedef unsigned long BITMAP_WORD;
-:   10:typedef struct bitmap_obstack {
-:   11:  struct 

[Bug objc/101616] Objective-C frontend should not emit vtable/fixup messages (at least, not by default)

2021-07-25 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101616

Iain Sandoe  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Iain Sandoe  ---
(In reply to Matt Jacobson from comment #0)
> In 10.2.0, the Objective-C frontend (in NeXT v2 ABI mode) emits "fixup"
> messages for all message sends.

Please check 10.3, 11.(1,2rc) and master - I believe this is already fixed (and
back ported to 10.3).

I have not (yet) applied it to 9.x (so that would not appear until 9.5, if
done).

The changes are selective on the target OS version (since fixup messages _are_
emitted by the 'system' [i.e. last usable Xcode] compilers for earlier OS
versions).

So that 
gcc foo.m 
on a recent OS version should omit the fixup versions 
but with -mmacosx-version-min=10.5 the fixups versions should be emitted
(actually, with a few small changes as the OS version changes).

[Bug objc/101616] Objective-C frontend should not emit vtable/fixup messages (at least, not by default)

2021-07-25 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101616

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org,
   ||iains at gcc dot gnu.org,
   ||mikestump at comcast dot net

--- Comment #1 from Eric Gallager  ---
cc-ing ObjectiveC maintainers