date:20220409

[Bug libstdc++/93687] Add mcf thread model to GCC on windows for supporting C++11 std::thread?

2022-04-09 Thread xtemp09 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93687

Evgeniy  changed:

   What|Removed |Added

 CC||xtemp09 at gmail dot com

--- Comment #3 from Evgeniy  ---
Another lite approach:

https://github.com/meganz/mingw-std-threads

[Bug pch/91440] Precompiled headers don't work with ASLR on mingw

2022-04-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91440

Andrew Pinski  changed:

   What|Removed |Added

 CC||malashkin.andrey at gmail dot com

--- Comment #9 from Andrew Pinski  ---
*** Bug 105199 has been marked as a duplicate of this bug. ***

[Bug c++/105199] can't compile glslang on windows

2022-04-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105199

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed on the trunk for GCC 12 as PCH are now relocatable.

*** This bug has been marked as a duplicate of bug 91440 ***

[Bug preprocessor/59782] libcpp does not avoid bug #48326 when compiled by older GCC

2022-04-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59782

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #8 from Andrew Pinski  ---
This is a won't fix as GCC 12+ requires GCC 4.8.0+ to build which has the fix.

[Bug preprocessor/105207] Translation phase 2: splicing physical source lines to form logical source lines may not work

2022-04-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105207

--- Comment #3 from Andrew Pinski  ---
Note this only matters if you are preprocessing the file yourself; that is,
-save-temps works correctly and errors out that there is a stray '#' in
program.

[Bug analyzer/103892] -Wanalyzer-double-free false positive when compiling libpipeline

2022-04-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103892

David Malcolm  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from David Malcolm  ---
Should be fixed by the above patch on trunk for GCC 12.  Backporting the fix to
GCC 11 is probably not feasible.

Marking as resolved.

Thanks again for filing this bug.

[Bug c/105207] Translation phase 2: splicing physical source lines to form logical source lines may not work

2022-04-09 Thread pavel.morozkin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105207

Pavel M  changed:

   What|Removed |Added

Summary|C preprocessor: splicing|Translation phase 2:
   |physical source lines to|splicing physical source
   |form logical source lines   |lines to form logical
   |may not work|source lines may not work

--- Comment #2 from Pavel M  ---
Actually the issue is not related to the preprocessor. It is related to
translation phase 2. Please

[Bug analyzer/103892] -Wanalyzer-double-free false positive when compiling libpipeline

2022-04-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103892

--- Comment #3 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:3d41408c5d28105e7a3ea2eb2529431a70b96369

commit r12-8071-g3d41408c5d28105e7a3ea2eb2529431a70b96369
Author: David Malcolm 
Date:   Sat Apr 9 18:12:57 2022 -0400

analyzer: fix folding of regions involving unknown ptrs [PR103892]

PR analyzer/103892 reports a false positive from -Wanalyzer-double-free.

The root cause is the analyzer failing to properly handle "unknown"
symbolic regions, and thus confusing two different expressions.

Specifically, the analyzer eventually hits the complexity limit for
symbolic values, and starts using an "unknown" svalue for a pointer.
The analyzer uses
  symbolic_region(unknown_svalue([of ptr type]))
i.e.
  (*UNKNOWN_PTR)
in a few places to mean "we have an lvalue, but we're not going to
attempt to track what it is anymore".

"Unknown" should probably be renamed to "unknowable"; in theory, any
operation on such an unknown svalue should be also an unknown svalue.

The issue is that in various places where we create child regions, we
were failing to check for the parent region being (*UNKNOWN_PTR), and so
were erroneously creating regions based on (*UNKNOWN_PTR), such as
*(UNKNOWN_PTR + OFFSET).  The state-machine handling was erroneously
allowing e.g. INITIAL_VALUE (*(UNKNOWN_PTR + OFFSET)) to have state,
and thus we could record that such a value had had "free" called on it,
and thus eventually falsely report a double-free when a different
expression incorrectly "simplified" to the same expression.

This patch fixes things by checking when creating the various kinds of
child region for (*UNKNOWN_PTR) as the parent region, and simply
returning another (*UNKNOWN_PTR) for such child regions (using the
appropriate type).

Doing so fixes the false positive, and also fixes a state explosion on
this testcase, as the states at the program points more rapidly reach
a fixed point where everything is unknown.  I checked for other cases
that no longer needed -Wno-analyzer-too-complex; the only other one
seems to be gcc.dg/analyzer/pr96841.c, but that seems to already have
become redundant at some point before this patch.

gcc/analyzer/ChangeLog:
PR analyzer/103892
* region-model-manager.cc
(region_model_manager::get_unknown_symbolic_region): New,
extracted from...
(region_model_manager::get_field_region): ...here.
(region_model_manager::get_element_region): Use it here.
(region_model_manager::get_offset_region): Likewise.
(region_model_manager::get_sized_region): Likewise.
(region_model_manager::get_cast_region): Likewise.
(region_model_manager::get_bit_range): Likewise.
* region-model.h
(region_model_manager::get_unknown_symbolic_region): New decl.
* region.cc (symbolic_region::symbolic_region): Handle sval_ptr
having NULL type.
(symbolic_region::dump_to_pp): Handle having NULL type.

gcc/testsuite/ChangeLog:
PR analyzer/103892
* gcc.dg/analyzer/pr103892.c: New test.
* gcc.dg/analyzer/pr96841.c: Drop redundant
-Wno-analyzer-too-complex.

Signed-off-by: David Malcolm

[Bug c/105207] C preprocessor: splicing physical source lines to form logical source lines may not work

2022-04-09 Thread pavel.morozkin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105207

--- Comment #1 from Pavel M  ---
The same behavior with:
xxx \
error

Expected:
xxx error

Actual:
xxx
 error

[Bug c/105207] New: C preprocessor: splicing physical source lines to form logical source lines may not work

2022-04-09 Thread pavel.morozkin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105207

Bug ID: 105207
   Summary: C preprocessor: splicing physical source lines to form
logical source lines may not work
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pavel.morozkin at gmail dot com
  Target Milestone: ---

Sample code:
xxx \
#error

Invocation:
$ gcc t6.c -E

Actual output:
xxx
 #error

Expected output:
xxx #error
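
For readers unfamiliar with translation phase 2, here is a minimal sketch of the splicing step the report expects: each backslash immediately followed by a newline is deleted. The helper name `splice_lines` is hypothetical and this is not libcpp code; it only illustrates why "xxx #error" is the expected logical line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not libcpp code): copy SRC to DST, deleting every
   backslash that is immediately followed by a newline, together with that
   newline.  DST must have room for strlen (SRC) + 1 bytes.  Returns the
   length of the spliced text.  */
static size_t
splice_lines (const char *src, char *dst)
{
  char *start = dst;

  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      if (src[0] == '\\' && src[1] == '\n')
        src += 2;               /* Drop the line continuation.  */
      else
        *dst++ = *src++;        /* Keep everything else unchanged.  */
    }
  *dst = '\0';
  return (size_t) (dst - start);
}
```

Calling it on the sample code from the report ("xxx \" followed by "#error") yields the single logical line "xxx #error".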

[Bug tree-optimization/103680] Jump threading and switch corrupts profile

2022-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103680

--- Comment #5 from Jan Hubicka  ---
The cfgcleanup logic is consistent assuming that your profile was consistent on
the input (i.e. read from profile feedback). If you
 1) read profile
 2) do optimization and prove that a given if conditional is true
then you should also have 100% probability on the "true" edge, so doing nothing
in cfgcleanup is correct.

Now of course what can happen is that you guess the profile or
 1) read profile
 2) duplicate code
 3) prove an if conditional always true in one of the copies.
In this case fixing up the profile locally is not possible (since it is also
wrong in the other copy), so we opt for doing nothing, which keeps errors sort
of contained, and we need to live with the profile sometimes being inconsistent.

So cfgcleanup behaviour is by design.

However if you do threading there is a way to update the profile, and the logic
for that is in update_bb_profile_for_threading.  If the guessed profile was
consistent with the thread, it will update the profile well, and it will drop a
message to a dump file otherwise.

Now the problem is that each time profiling code is updated the interface to
this function is lost.  I tried to get it fixed but got lost in the new code.

/* An edge originally destinating BB of COUNT has been proved to
   leave the block by TAKEN_EDGE.  Update profile of BB such that edge E can be
   redirected to destination of TAKEN_EDGE.

   This function may leave the profile inconsistent in the case TAKEN_EDGE
   frequency or count is believed to be lower than COUNT
   respectively.  */
void
update_bb_profile_for_threading (basic_block bb, 
 profile_count count, edge taken_edge)

So the interface is quite simple.  I have to re-read the new updating code
since I no longer recall where I got lost, but perhaps if you are familiar with
it, you can write in the update?

[Bug ipa/103819] [10/11/12 Regression] ICE in redirect_callee, at cgraph.c:1389 with __attribute__((flatten)) and -O2 since r11-7940-ge7fd3b783238d034

2022-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103819

Jan Hubicka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #4 from Jan Hubicka  ---
mine.

[Bug ipa/103378] [12 Regression] ICE: verify_cgraph_node failed (error: semantic interposition mismatch) since r12-5412-g458d2c689963d846

2022-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103378

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Jan Hubicka  ---
Fixed by r:4943b75e9f06f0b64ed541430bb7fbccf55fc552
Sorry for wrong PR marker :( I should have cut

[Bug ipa/103818] [12 Regression] ICE: in insert, at ipa-modref-tree.c:591 since r12-3202-gf5ff3a8ed4ca9173

2022-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103818

--- Comment #4 from Jan Hubicka  ---
We have access list:

  Base 0: alias set 2
Ref 0: alias set 1
  access: Parm 0 param offset:0 offset:-4611686018427387936 size:32 max_size:32
  access: Parm 0 param offset:0 offset:352 size:32 max_size:32
  access: Parm 0 param offset:0 offset:64 size:32 max_size:32
  access: Parm 0 param offset:0 offset:0 size:32 max_size:32
  access: Parm 0 param offset:0 offset:32800 size:32 max_size:32
  access: Parm 0 param offset:0 offset:160 size:32 max_size:32
  access: Parm 0 param offset:0 offset:4629700416936869888 size:32 max_size:32
  access: Parm 0 param offset:0 offset:-96 size:32 max_size:32
  access: Parm 0 param offset:0 offset:1376 size:32 max_size:32
  access: Parm 0 param offset:0 offset:224 size:32 max_size:32
  access: Parm 0 param offset:0 offset:-288 size:32 max_size:32
  access: Parm 0 param offset:0 offset:448 size:32 max_size:32
  access: Parm 0 param offset:0 offset:288 size:32 max_size:32
  access: Parm 0 param offset:0 offset:1568 size:32 max_size:32
  access: Parm 0 param offset:0 offset:640 size:32 max_size:32
  access: Parm 0 param offset:0 offset:2624 size:32 max_size:32

and we want to merge
 Parm 0 param offset:0 offset:-4611686018427387936 size:32 max_size:32
and
 Parm 0 param offset:0 offset:4629700416936869888 size:32 max_size:32
into one entry since we think they have a small difference.

So an overflow issue:
  new_max_size = max_size2 + offset2 - offset1; 
  if (known_le (new_max_size, max_size1))   
new_max_size = max_size1;   
So we need 128bit math here.
I need to look into the proper way to get this right (and the corresponding
overflow that makes the logic choose these two entries as closest to each other).

[Bug middle-end/105206] mis-optimization with -ffast-math and __builtin_powf

2022-04-09 Thread kargl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105206

kargl at gcc dot gnu.org changed:

   What|Removed |Added

   Severity|normal  |minor

--- Comment #1 from kargl at gcc dot gnu.org ---
Not sure if anyone cares.  I don't use -ffast-math, but this might be considered
a mis-optimization with that option.

#include <math.h>

float
foof(float x)
{
   return (powf(10.f,x));
}

double
food(double x)
{
   return (pow(10.,x));
}


-fdump-tree-original shows

;; Function foof (null)
;; enabled by -tree-original

{
  return powf (1.0e+1, x);
}


;; Function food (null)
;; enabled by -tree-original

{
  return pow (1.0e+1, x);
}

Compiling to assembly shows

foof:
.LFB3:
.cfi_startproc
movaps  %xmm0, %xmm1
movss   .LC0(%rip), %xmm0
jmp powf
.cfi_endproc
food:
.LFB4:
.cfi_startproc
mulsd   .LC1(%rip), %xmm0
jmp exp
.cfi_endproc

So, the middle-end is converting pow(10.,x) to exp(x*log(10.0)) where log(10.0)
is reduced, but the same transformation of powf(10.f,x) still yields a call to
powf.

[Bug middle-end/105206] New: mis-optimization with -ffast-math and __builtin_powf

2022-04-09 Thread kargl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105206

Bug ID: 105206
   Summary: mis-optimization with -ffast-math and __builtin_powf
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kargl at gcc dot gnu.org
  Target Milestone: ---

[Bug tree-optimization/103376] [12 Regression] wrong code at -Os and above on x86_64-linux-gnu since r12-5453-ga944b5dec3adb28e

2022-04-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103376

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:4943b75e9f06f0b64ed541430bb7fbccf55fc552

commit r12-8070-g4943b75e9f06f0b64ed541430bb7fbccf55fc552
Author: Jan Hubicka 
Date:   Sat Apr 9 21:22:58 2022 +0200

Update semantic_interposition flag at analysis time

This patch solves problem with FE first finalizing function and then adding
-fno-semantic-interposition flag (by parsing optimization attribute).

gcc/ChangeLog:

2022-04-09  Jan Hubicka  

PR ipa/103376
* cgraphunit.cc (cgraph_node::analyze): update
semantic_interposition
flag.

gcc/testsuite/ChangeLog:

2022-04-09  Jan Hubicka  

PR ipa/103376
* gcc.c-torture/compile/pr103376.c: New test.

[Bug fortran/105205] New: Incorrect assignment of derived type with allocatable, deferred-length character component

2022-04-09 Thread townsend at astro dot wisc.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105205

Bug ID: 105205
   Summary: Incorrect assignment of derived type with allocatable,
deferred-length character component
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: townsend at astro dot wisc.edu
  Target Milestone: ---

I've run into problems with assignment of derived types containing an
allocatable array of deferred-length strings. Example program:

---
program alloc_char_type
   implicit none
   type mytype
  character(:), allocatable :: c(:)
   end type mytype
   type(mytype) :: a
   type(mytype) :: b
   integer :: i
   a%c = ['foo','bar','biz','buz']
   b = a
   do i = 1, size(b%c)
  print *,b%c(i)
   end do
end
---

Running with gfortran 10.2.0 or 11.2.0, I get the output:

>>
 foo



<<

If I hard-code the length of the c component (to, say, 3), I get the expected
output:

>>
 foo
 bar
 biz
 buz
<<

It seems as if only the first element of c is being copied correctly.

cheers,

Rich

[Bug ipa/105160] [12 regression] ipa modref marks functions with asm volatile as const or pure

2022-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Jan Hubicka  ---
Fixed by r:aabb9a261ef060cf24fd626713f1d7d9df81aa57

[Bug libquadmath/105101] incorrect rounding for sqrtq

2022-04-09 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #8 from Steve Kargl  ---
On Sat, Apr 09, 2022 at 10:23:39AM +, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101
> 
> --- Comment #6 from Jakub Jelinek  ---
> (In reply to Thomas Koenig from comment #5)
> > There is another, much worse, problem, reported and analyzed by "Michael S"
> > on comp.arch. The code has
> > 
> > #ifdef HAVE_SQRTL
> >   {
> > long double xl = (long double) x;
> > if (xl <= LDBL_MAX && xl >= LDBL_MIN)
> >   {
> > /* Use long double result as starting point.  */
> > y = (__float128) sqrtl (xl);
> > 
> > /* One Newton iteration.  */
> > y -= 0.5q * (y - x / y);
> > return y;
> >   }
> >   }
> > #endif
> > 
> > which assumes that long double has a higher precision than
> > normal double.  On x86_64, this depends on the settings of the
> > FPU flags, so a number like 0x1.06bc82f7b9d71dfcbddf2358a0eap-1024 
> > is corrected with 32 ULP of error because there is only a single
> > round of Newton iterations if the FPU flags are set to normal precision.
> 
> That is only a problem on OSes that do that, I think mainly BSDs, no?
> On Linux it should be fine (well, still not 0.5ulp precise, but not as bad as
> when sqrtl is just double precision precise).
> 

i686-*-freebsd sets the FPU to have 53 bits of precision for
long double.  It has the usual exponent range of an Intel
80-bit extended double.

[Bug libquadmath/105101] incorrect rounding for sqrtq

2022-04-09 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #7 from Thomas Koenig  ---
(In reply to Jakub Jelinek from comment #6)
> (In reply to Thomas Koenig from comment #5)
> > There is another, much worse, problem, reported and analyzed by "Michael S"
> > on comp.arch. The code has
> > 
> > #ifdef HAVE_SQRTL
> >   {
> > long double xl = (long double) x;
> > if (xl <= LDBL_MAX && xl >= LDBL_MIN)
> >   {
> > /* Use long double result as starting point.  */
> > y = (__float128) sqrtl (xl);
> > 
> > /* One Newton iteration.  */
> > y -= 0.5q * (y - x / y);
> > return y;
> >   }
> >   }
> > #endif
> > 
> > which assumes that long double has a higher precision than
> > normal double.  On x86_64, this depends on the settings of the
> > FPU flags, so a number like 0x1.06bc82f7b9d71dfcbddf2358a0eap-1024 
> > is corrected with 32 ULP of error because there is only a single
> > round of Newton iterations if the FPU flags are set to normal precision.
> 
> That is only a problem on OSes that do that, I think mainly BSDs, no?

Correct.

> On Linux it should be fine (well, still not 0.5ulp precise, but not as bad
> as when sqrtl is just double precision precise).

In this case, it was discovered on some version of WSL.

[Bug libquadmath/105101] incorrect rounding for sqrtq

2022-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #6 from Jakub Jelinek  ---
(In reply to Thomas Koenig from comment #5)
> There is another, much worse, problem, reported and analyzed by "Michael S"
> on comp.arch. The code has
> 
> #ifdef HAVE_SQRTL
>   {
> long double xl = (long double) x;
> if (xl <= LDBL_MAX && xl >= LDBL_MIN)
>   {
> /* Use long double result as starting point.  */
> y = (__float128) sqrtl (xl);
> 
> /* One Newton iteration.  */
> y -= 0.5q * (y - x / y);
> return y;
>   }
>   }
> #endif
> 
> which assumes that long double has a higher precision than
> normal double.  On x86_64, this depends on the settings of the
> FPU flags, so a number like 0x1.06bc82f7b9d71dfcbddf2358a0eap-1024 
> is corrected with 32 ULP of error because there is only a single
> round of Newton iterations if the FPU flags are set to normal precision.

That is only a problem on OSes that do that, I think mainly BSDs, no?
On Linux it should be fine (well, still not 0.5ulp precise, but not as bad as
when sqrtl is just double precision precise).

[Bug libquadmath/105101] incorrect rounding for sqrtq

2022-04-09 Thread tkoenig at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #5 from Thomas Koenig  ---
There is another, much worse, problem, reported and analyzed by "Michael S"
on comp.arch. The code has

#ifdef HAVE_SQRTL
  {
long double xl = (long double) x;
if (xl <= LDBL_MAX && xl >= LDBL_MIN)
  {
/* Use long double result as starting point.  */
y = (__float128) sqrtl (xl);

/* One Newton iteration.  */
y -= 0.5q * (y - x / y);
return y;
  }
  }
#endif

which assumes that long double has a higher precision than
normal double.  On x86_64, this depends on the settings of the
FPU flags, so a number like 0x1.06bc82f7b9d71dfcbddf2358a0eap-1024 
is corrected with 32 ULP of error because there is only a single
round of Newton iterations if the FPU flags are set to normal precision.

I believe we can at least fix that before the gcc 12 release, by
simply removing the code I quoted.
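
As a back-of-the-envelope illustration of why the starting precision matters (my own sketch, not part of libquadmath): if the estimate y has relative error d, one Newton step leaves a relative error of roughly d*d/2. A 53-bit starting point therefore gives about 2^-107, which is short of the 113-bit significand of __float128 and of the same order as the 32 ULP figure mentioned above, whereas a 64-bit starting point would give about 2^-129. The function below applies the same update in plain double, starting from a deliberately coarse estimate; the name `newton_sqrt_step` is hypothetical.

```c
#include <assert.h>

/* One Newton step toward sqrt (x) from the estimate y: the same update as
   in the quoted code, here in double purely for illustration.  */
static double
newton_sqrt_step (double x, double y)
{
  assert (y != 0.0);
  return y - 0.5 * (y - x / y);
}
```

Starting from y = 1.4142 for x = 2 (error about 1.4e-5), a single step lands within about 6.5e-11 of sqrt(2): the number of correct digits roughly doubles, but no more than that.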

[Bug target/82261] x86: missing peephole for SHLD / SHRD

2022-04-09 Thread peter at cordes dot ca via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82261

--- Comment #4 from Peter Cordes  ---
GCC will emit SHLD / SHRD as part of shifting an integer that's two registers
wide.
Hironori Bono proposed the following functions as a workaround for this missed
optimization (https://stackoverflow.com/a/71805063/224132)

#include <stdint.h>

#ifdef __SIZEOF_INT128__
uint64_t shldq_x64(uint64_t low, uint64_t high, uint64_t count) {
  return (uint64_t)(((((unsigned __int128)high << 64) | (unsigned __int128)low)
                     << (count & 63)) >> 64);
}

uint64_t shrdq_x64(uint64_t low, uint64_t high, uint64_t count) {
  return (uint64_t)((((unsigned __int128)high << 64) | (unsigned __int128)low)
                    >> (count & 63));
}
#endif

uint32_t shld_x86(uint32_t low, uint32_t high, uint32_t count) {
  return (uint32_t)(((((uint64_t)high << 32) | (uint64_t)low) << (count & 31))
                    >> 32);
}

uint32_t shrd_x86(uint32_t low, uint32_t high, uint32_t count) {
  return (uint32_t)((((uint64_t)high << 32) | (uint64_t)low) >> (count & 31));
}

---

The uint64_t functions (using __int128) compile cleanly in 64-bit mode
(https://godbolt.org/z/1j94Gcb4o) using 64-bit operand-size shld/shrd

but the uint32_t functions compile to a total mess in 32-bit mode (GCC11.2 -O3
-m32 -mregparm=3) before eventually using shld, including a totally insane 
or  dh, 0

GCC trunk with -O3 -mregparm=3 compiles them cleanly, but without regparm it's
also a slightly different mess.

Ironically, the uint32_t functions compile to quite a few instructions in
64-bit mode, actually doing the operations as written with shifts and ORs, and
having to manually mask the shift count to &31 because it uses a 64-bit
operand-size shift which masks with &63.  32-bit operand-size SHLD would be a
win here, at least for -mtune=intel or a specific Intel uarch.

I haven't looked at whether they still compile ok after inlining into
surrounding code, or whether operations would tend to combine with other things
in preference to becoming an SHLD.
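
For reference, a small sketch checking that the double-wide formulation computes what a 32-bit SHLD is usually described as doing: the high word shifted left, with the vacated low bits filled from the top of the low word. The names `shld32_wide`, `shld32_ref` and `shld32_selfcheck` are hypothetical; the reference form handles a zero count separately because shifting a 32-bit value by 32 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Double-wide formulation, as in the workaround quoted above.  */
static uint32_t
shld32_wide (uint32_t low, uint32_t high, uint32_t count)
{
  return (uint32_t) (((((uint64_t) high << 32) | (uint64_t) low)
                      << (count & 31)) >> 32);
}

/* Hypothetical reference: explicit two-shift form.  */
static uint32_t
shld32_ref (uint32_t low, uint32_t high, uint32_t count)
{
  count &= 31;
  if (count == 0)
    return high;                /* Avoid an undefined shift by 32.  */
  return (high << count) | (low >> (32 - count));
}

/* Compare both forms for every shift count from 0 to 31.  */
static void
shld32_selfcheck (uint32_t low, uint32_t high)
{
  for (uint32_t c = 0; c < 32; c++)
    assert (shld32_wide (low, high, c) == shld32_ref (low, high, c));
}
```

This only checks the arithmetic; whether the compiler turns either form into a single SHLD is exactly the missed optimization discussed here.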

[Bug c++/105130] gcc does not warn about unused return value of last expression of statement expr

2022-04-09 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105130

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #2 from Eric Gallager  ---
Please send your patch to the gcc-patches mailing list for review.

[Bug c/105180] K&R style definition does not evaluate array size

2022-04-09 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105180

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #1 from Eric Gallager  ---
I mean... that's some pretty suspect code right there...
