r that can
alias anything
build_qualified_type to add const (probably useless)
build_aligned_type to specify that it is unaligned
--
Marc Glisse
T
fibonacci_heap.o -MMD -MP -MF ./.deps/fibonacci_heap.TPo
../../gcc/gcc/fibonacci_heap.cc
Note the "-g -O2". Why?
In addition to CFLAGS and BOOT_CFLAGS, you are missing CFLAGS_FOR_TARGET
(plus the same 3 for CXX). I don't know if that's still sufficient, but
that's what I used to set a few years ago.
--
Marc Glisse
id g ()
{
size_type __dnew;
struct string a;
[...]
[local count: 1073741824]:
_26 = a._M_string_length;
if (_26 == 4611686018427387903)
which should not require any interprocedural logic.
--
Marc Glisse
); }
was optimized to just one call to calloc (someone broke that in gcc-10).
Using LTO on libsupc++ is related.
I don't know if we want to define "sane" operators new/delete, or just
have a flag that promises that we won't try to replace the default ones.
--
Marc Glisse
On Tue, 21 Nov 2023, Jonathan Wakely wrote:
CC Marc Glisse who added the relocation support. He might recall why
we use memmove when all uses are for newly-allocated storage, which
cannot overlap the existing storage.
Going back a bit:
https://gcc.gnu.org/pipermail/gcc-patches/2019-April
criterion would be a|(b) when the possible 1 bits of b
are included in the certainly 1 bits of a|c.
--
Marc Glisse
is is? And
is there any solution?
If 'a' is a global variable, how do you know 'printf' doesn't modify its
value? (you could know it for printf, but it really depends on the
function that is called)
--
Marc Glisse
ons give
the same result on negative 0 and NaN.
--
Marc Glisse
call SIMD intrinsics/builtins
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80517 and others.
Makes sense to work around them for now.
--
Marc Glisse
__is_void
becomes unused and can be removed?
--
Marc Glisse
traits where you can save
several instantiations. Now that you have done a couple simple cases to
see how it works, I think you should concentrate on the more complicated
cases.
--
Marc Glisse
e true gcc with a different TMPDIR / -dumpdir each time.
--
Marc Glisse
in Intel syntax.
The doc for -masm=dialect says:
Darwin does not support ‘intel’.
Assuming that's still true, and even with Mac Intel going away, it doesn't
help.
--
Marc Glisse
y shuffles?
--
Marc Glisse
since earlier.
What would be causing the difference? Is this intended? Link
<https://godbolt.org/z/eWxnYsK1z> for details. Thank you!
See LOGICAL_OP_NON_SHORT_CIRCUIT in fold-const.cc (and various discussions
on the topic in mailing lists and bugzilla).
--
Marc Glisse
hich has 0,
and one for sparc, which has several.
--
Marc Glisse
undefined."
To me, this means that for instance INT_MIN<<1 is well defined and
evaluates to 0. But with this patch we turn (INT_MIN<<1)+(INT_MIN<<1) into
(INT_MIN+INT_MIN)<<1, which is UB.
If we decide not to support this extension anymore, I think we need to
change the documentation first.
--
Marc Glisse
-ftrapping-math.
--
Marc Glisse
to
-prevent the warning), even in conjunction with macros. This also
+consider questionable. This also
enables some language-specific warnings described in @ref{C++ Dialect
--
Marc Glisse
On Thu, 31 Mar 2022, Jonathan Wakely wrote:
On Thu, 31 Mar 2022 at 17:03, Marc Glisse via Libstdc++
wrote:
On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:
I like it. But I'd like it even more if we could have
#elif defined _UBSAN
__ubsan_invoke_ub("reached std::unreac
ght subset of) ubsan is enabled sounds like a
good idea.
--
Marc Glisse
tor lowering time.
+/* Push VEC_PERM earlier if that may help FMA perception (PR101895). */
+(for plusminus (plus minus)
+ (simplify
+(plusminus (vec_perm (mult@0 @1 vec_same_elem_p@2) @0 @3) @4)
+(plusminus (mult (vec_perm @1 @1 @3) @2) @4)))
Don't you want :s on mult and vec_perm?
--
Marc Glisse
IFN_FMIN bit_and bit_ior bit_xor)
+ (simplify (reduc (op @0 VECTOR_CST@1))
+(op (reduc:type @0) (reduc:type @1
I wonder if we need to test flag_associative_math for the 'plus' case,
or if the presence of IFN_REDUC_PLUS is enough to justify the possible
loss of precision.
--
Marc Glisse
ion option,
...);
No, curl_easy_setopt is a macro. If you look at the preprocessed code, you
get many statements doing the same wrong operation, and one warning for
each of them.
(wrong list, should be gcc-help, or an issue on bugzilla)
--
Marc Glisse
.
--
Marc Glisse
t? gcd_1.c has only 103
lines in release 6.2.1.
A stack trace (UBSAN_OPTIONS=print_stacktrace=1) would make it easier to
guess where this is coming from.
--
Marc Glisse
about this anyway.
In the other case, it could affect correct
code before the trap.
-fnon-call-exceptions helps with the first testcase but not with the
second one. I don't know if that's by accident, but the flag seems
possibly relevant.
--
Marc Glisse
r the broken thread, I was automatically unsubscribed because
mailman doesn't like greylisting)
--
Marc Glisse
e part of a separate
compiler flag or pragma, like C's FENV_ROUND can affect the rounding mode
in static initializers (of course C++ templates make pragmas less
convenient).
--
Marc Glisse
://gcc.gnu.org/bugs/ says that you should first try compiling your
code with -fsanitize=undefined, which tells you at runtime that your code
is broken.
Apart from that, bug reports should go to https://gcc.gnu.org/bugzilla/
and questions to gcc-h...@gcc.gnu.org.
--
Marc Glisse
this new pattern. I've attached a
patch with my latest changes.
From: Richard Biener
Sent: Wednesday, July 28, 2021 2:59 AM
To: Victor Tong
Cc: Marc Glisse ; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]
ee_nonzero_bits (@4)) == 0)
+ (mult @1
+{ wide_int_to_tree (type, wi::to_wide (@3) + wi::to_wide (@5)); })))
Could you explain how the convert helps exactly?
--
Marc Glisse
oblem is that 's0' is a single-precision float register and
it should be 'd0' instead.
Either I'm seriously missing something, in which case I would be most
obliged if someone pointed me in the right direction; or it is a compiler
or documentation bug.
Thanks,
Zoltan
--
Marc Glisse
of each @i is not
reliable.
--
Marc Glisse
a pattern
in the list of types that work)
--
Marc Glisse
On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:
On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse wrote:
On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:
Hi:
This patch simplifies (view_convert:type ~a) < 0 to
(view_convert:type a) >= 0 when type is signed integer. S
to try and generalize it a bit, say with
(cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)
(scmp (view_convert:XXX @0) (bit_not @1))
(I still believe that it is a bad idea that SSA_NAMEs are strongly typed,
encoding the type in operations would be more convenient, but I think the
time for that choice has long gone)
--
Marc Glisse
On Wed, 26 May 2021, Prathamesh Kulkarni via Gcc-patches wrote:
The attached patch replaces calls to builtins in vmul_n* (a, b) with __a * __b.
I am not familiar with neon, but are __a and __b unsigned here? Otherwise,
is vmul_n already undefined in case of overflow?
--
Marc Glisse
E_P (TREE_TYPE (@0))
+&& !TYPE_UNSIGNED (TREE_TYPE (@0))
+&& TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
Is there a risk that x is signed char (precision 8) and y is a vector with
8 elements?
--
Marc Glisse
On Fri, 14 May 2021, Jakub Jelinek via Gcc-patches wrote:
On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
We can probably do it in 2 steps, first something like
(for cmp (eq ne)
(simplify
(cmp (bit_and:c @0 @1) @0)
(cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0
On Tue, 11 May 2021, Jakub Jelinek via Gcc-patches wrote:
On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
We can probably do it in 2 steps, first something like
(for cmp (eq ne)
(simplify
(cmp (bit_and:c @0 @1) @0)
(cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0
PLE represents types will add an inconvenient cast. And I think VRP
already manages to use the bit test to derive a range.
--
Marc Glisse
ase should be
unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
but is not a single use but two uses. I'll need to pattern match that case
specially.
Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0,
that could be worth doing already in GIMPLE.
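The equivalence mentioned above can be checked by brute force. The following is only a sketch: the function names are made up for illustration, and it says nothing about where GCC performs the rewrite.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: v has no bit set other than bit 0. */
int form_and_eq(int32_t v)
{
    return (v & 1) == v;
}

/* Rewritten form: every bit except bit 0 is clear. */
int form_mask_zero(int32_t v)
{
    return (v & -2) == 0;
}

/* Compare both forms on every 16-bit value (sign-extended to 32 bits).
   Returns 1 when they agree everywhere, 0 otherwise. */
int forms_agree_16bit(void)
{
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
        if (form_and_eq(v) != form_mask_zero(v))
            return 0;
    return 1;
}
```

Both forms are true exactly for the values 0 and 1, which is why a single masked comparison against zero can replace the two uses of the value.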
--
Marc Glisse
for this particular transformation (don't change the patch
because of my comment), we just seem to be getting more and more uses of
single_use in match.pd, maybe at some point we need to revisit the meaning
of :s or introduce a stronger :S.
--
Marc Glisse
] = 1311768467463790320;
_4 = c;
Isn't that a clear violation of strict aliasing?
--
Marc Glisse
directory for
linking should imply using the same directory for includes.
--
Marc Glisse
think clang follows gcc and uses the type of the first operand.
--
Marc Glisse
On Sat, 12 Dec 2020, Jakub Jelinek via Gcc-patches wrote:
On Sat, Dec 12, 2020 at 01:25:39PM +0100, Marc Glisse wrote:
On Sat, 12 Dec 2020, Jakub Jelinek via Gcc-patches wrote:
This patch adds the ~(X - Y) -> ~X + Y simplification requested
in the PR (plus also ~(X + C) -> ~X
gating
(and then extending it to non-constants)?
I wonder if this makes
/* ~(~X - Y) -> X + Y and ~(~X + Y) -> X - Y. */
useless.
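For reference, the identities involved all follow from ~Z == -Z - 1. A small sketch in C, using unsigned operands so that wrapping is well defined (the function name is hypothetical, not something from the patch):

```c
#include <assert.h>

/* Returns 1 if all three identities hold for the given operands.
   Unsigned arithmetic wraps, so no undefined behavior is involved. */
int not_identities_hold(unsigned x, unsigned y)
{
    return ~(x - y) == ~x + y      /* ~(X - Y)  -> ~X + Y */
        && ~(~x - y) == x + y      /* ~(~X - Y) -> X + Y  */
        && ~(~x + y) == x - y;     /* ~(~X + Y) -> X - Y  */
}
```

The second and third identities are what one gets by applying the first kind of rewrite to an operand that is itself a bit_not and then cancelling the double negation.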
--
Marc Glisse
e most common being modular arithmetic: if you know that uint32_t a, b,
c, d are smaller than m (and m!=0), you can compute a*b+c+d in uint64_t,
then use div to compute that modulo m.
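A sketch of that computation in portable C (the function name is made up for illustration; whether the compiler emits a single narrow div for the `%` is not guaranteed and is an assumption to check in the generated code):

```c
#include <assert.h>
#include <stdint.h>

/* Computes (a*b + c + d) mod m, assuming a, b, c, d < m and m != 0.
   Under those assumptions a*b + c + d <= m*m - 1, so the sum fits in
   64 bits and the quotient by m is smaller than m (fits in 32 bits). */
uint32_t muladd_mod(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t m)
{
    assert(m != 0 && a < m && b < m && c < m && d < m);
    uint64_t wide = (uint64_t)a * b + c + d;
    return (uint32_t)(wide % m);
}
```

The bound on the quotient is what makes a 64-by-32-bit division with a 32-bit result safe here.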
--
Marc Glisse
-funsafe-math-optimizations is harder to tell.
--
Marc Glisse
if either overflow
is undefined or if VRP can prove that no overflow is happening.
Of course those are all ideas for later, refactoring belongs in the second or
third patch using a feature, not the first one :-)
--
Marc Glisse
IN(X, Y) -> false
So, the result will be true for GE_EXPR and LE_EXPR and false otherwise.
Is that true even if X is NaN?
It may be hard to hit a situation where this matters though, if we honor
NaN, we don't build MAX_EXPR (which is unspecified).
--
Marc Glisse
he exact meaning of -ftrapping-math, but
don't let that stop you.
--
Marc Glisse
o, can you please point me to an example?
* Otherwise, I'd be interested in advice about providing new infrastructure to
support
this. I'm a relative noob with respect to the configury code, and I'm sure my
initial instincts will be wrong. :)
Does the i386 mm_malloc.h file match your scenario
mmediately and permanently
delete the original and any copies of this email and any attachments thereto.
Could you please get rid of this when posting on public mailing lists?
--
Marc Glisse
On Wed, 2 Sep 2020, Jason Merrill via Gcc-patches wrote:
On 9/1/20 6:13 AM, Marc Glisse wrote:
On Tue, 1 Sep 2020, Jakub Jelinek via Gcc-patches wrote:
As discussed in the PR, fold-const.c punts on floating point constant
evaluation if the result is inexact and -frounding-math is turned
rmation even for 5*X-4*X -> X which does not
increase the number of multiplications. Which is where '!' (or :v here)
comes in.
Or we could decide that the extra multiplication is not that bad if it
saves an addition, simplifies the expression, possibly gains more insn
parallelism, etc, in which case we could just drop the existing hard
single_use check...
--
Marc Glisse
ic.c using V_C_Es looks much safer to me.
If they weren't so rare, we could consider lowering them earlier so they
benefit from more optimizations, but that doesn't seem worth the trouble.
--
Marc Glisse
x1.0p+100 + 0x1.0p-100 == 0x1.0p+100, "");
+}
+
+const double = a;
+const double = b;
Jakub
--
Marc Glisse
l_to (@0, wi::minus_one (TYPE_PRECISION (type))
(mult:v{ !single_use (@3) && !single_use (@4 } (plusminus @1 @2) @0
Indeed, something more flexible than '!' would be nice, but I am not so
sure about this version. If we are going to allow inserting code after
resimplification and before validation, maybe we should go even further
and let people insert arbitrary code there...
--
Marc Glisse
On Sat, 22 Aug 2020, Jonathan Wakely via Gcc-patches wrote:
On Sat, 22 Aug 2020 at 13:13, Jonathan Wakely wrote:
On Sat, 22 Aug 2020 at 10:52, Marc Glisse wrote:
is there a particular reason to handle only __int128 this way, and not all
the non-standard integer types? It looks like it would
int) x_4(D);
_9 = _8 >> 16;
_10 = (int) _9;
_2 = __builtin_bswap16 (_10);
_3 = _1 | _2;
_5 = (int) _3;
return _5;
}
Handling this in the same transformation with a pair of convert12? and
some tests should be doable, but it gets complicated enough that it is
fine to postpone that.
--
Marc Glisse
termediate extension, and 16 bit targets are "rare"? And
BUILT_IN_BSWAP128 because on most platforms intmax_t is only 64 bits and
we don't have a 128-bit version of parity/popcount? (we have an IFN, but
it seldom appears by magic)
--
Marc Glisse
Hello,
is there a particular reason to handle only __int128 this way, and not all
the non-standard integer types? It looks like it would be a bit simpler to
avoid a special case.
--
Marc Glisse
useless bit_and when casting to a smaller
type. We probably get a different pattern on 16-bit targets, but a pattern
they do not match won't hurt them.
--
Marc Glisse
Odd numbers are invertible in Z / 2^n Z, so X * C1 == C2 can be rewritten
as X == C2 * inv(C1) when overflow wraps.
mod_inv should probably be updated to better match the other wide_int
functions, but that's a separate issue.
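To illustrate the arithmetic (this is not GCC's mod_inv, just a standalone sketch for a 32-bit wrapping type, with hypothetical function names), the inverse of an odd constant modulo 2^32 can be computed by Newton iteration:

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative inverse of an odd c modulo 2^32.
   Each Newton step doubles the number of correct low bits. */
uint32_t inverse_mod_2_32(uint32_t c)
{
    assert(c & 1);                 /* only odd numbers are invertible */
    uint32_t inv = c;              /* c*c == 1 (mod 8): 3 correct bits */
    for (int i = 0; i < 4; i++)    /* 3 -> 6 -> 12 -> 24 -> 48 bits */
        inv *= 2u - c * inv;
    return inv;
}

/* Solves x * c1 == c2 (mod 2^32) for odd c1: x == c2 * inv(c1). */
uint32_t solve_mul_eq(uint32_t c1, uint32_t c2)
{
    return c2 * inverse_mod_2_32(c1);
}
```

With these definitions, `solve_mul_eq(3, 15)` gives 5 and `solve_mul_eq(3, 5)` gives 1431655767, since 3 * 1431655767 wraps around to 5 in 32 bits.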
Bootstrap+regtest on x86_64-pc-linux-gnu.
2020-08-10 Marc Glisse
On Fri, 7 Aug 2020, Jakub Jelinek wrote:
On Fri, Aug 07, 2020 at 10:57:54PM +0200, Marc Glisse wrote:
On Fri, 7 Aug 2020, Joern Wolfgang Rennecke wrote:
On 07/08/20 19:21, Marc Glisse wrote:
If we are going to handle the wrapping case, we shouldn't limit to
the non-wrapping meaning
On Fri, 7 Aug 2020, Joern Wolfgang Rennecke wrote:
On 07/08/20 19:21, Marc Glisse wrote:
If we are going to handle the wrapping case, we shouldn't limit to the
non-wrapping meaning of multiplicity. 3*X==5 should become X==1431655767
(for a 32 bit type), etc.
Do we have an extended gcd
on't have to limit to the case where 15 is a multiple of 3. 3*X>7
can be replaced with X>2.
Those are two nice suggestions. Do you intend to write a patch? Otherwise
I'll try to do it eventually (no promise).
--
Marc Glisse
exceptions we need some protection after, so
it may be easier to keep the memory (fenv) read as part of .FENV_PLUS.
Also, caring only about rounding doesn't match any standard #pragma, so
such an option may see very little use in practice...
Sorry for the incoherent brain-dump above ;)
It is great to have someone to discuss this with!
--
Marc Glisse
On Fri, 7 Aug 2020, Richard Biener wrote:
On Fri, Aug 7, 2020 at 10:33 AM Marc Glisse wrote:
On Fri, 7 Aug 2020, Richard Biener wrote:
On Thu, Aug 6, 2020 at 8:07 PM Marc Glisse wrote:
On Thu, 6 Aug 2020, Christophe Lyon wrote:
Was I on the right track configuring with
--target=arm
agma said at that
point in the source code), while flag_finite_math_only is at best per
function.
--
Marc Glisse
On Fri, 7 Aug 2020, Richard Biener wrote:
On Thu, Aug 6, 2020 at 8:07 PM Marc Glisse wrote:
On Thu, 6 Aug 2020, Christophe Lyon wrote:
Was I on the right track configuring with
--target=arm-none-linux-gnueabihf --with-cpu=cortex-a9
--with-fpu=neon-fp16
then compiling without any special
it to be temporary.
Since aarch64 seems to handle the same code just fine, maybe someone who
knows arm could copy the relevant code over?
Does my message make sense, do people have comments?
--
Marc Glisse
On Thu, 6 Aug 2020, Christophe Lyon wrote:
On Thu, 6 Aug 2020 at 11:06, Marc Glisse wrote:
On Thu, 6 Aug 2020, Christophe Lyon wrote:
2020-08-05 Marc Glisse
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e
On Thu, 6 Aug 2020, Richard Biener wrote:
On Thu, Aug 6, 2020 at 10:17 AM Christophe Lyon
wrote:
Hi,
On Wed, 5 Aug 2020 at 16:24, Richard Biener via Gcc-patches
wrote:
On Wed, Aug 5, 2020 at 3:33 PM Marc Glisse wrote:
New version that passed bootstrap+regtest during the night.
When
On Thu, 6 Aug 2020, Christophe Lyon wrote:
2020-08-05 Marc Glisse
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
(op (c ? a : b
he obvious (making sure it bootstraps, running the testsuite,
adding a few tests), what missing pieces do you consider a strict
requirement for this to have a chance to reach master one day as an
experimental option?
--
Marc Glisse
commit 4adb494e88323bf41ee2c0871caa2323fa2aca06
Author: Marc Gl
or this patch.
2020-08-05 Marc Glisse
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
(op (c ? a : b)): Update to match the new tr
h directly exits the
function, instead of just giving up on this particular transformation and
trying the next one. I'll reorder my transformations to work around this,
but it looks like a pre-existing limitation.
--
Marc Glisse
On Mon, 3 Aug 2020, Richard Biener wrote:
On Sat, Aug 1, 2020 at 9:29 AM Marc Glisse wrote:
Hello,
this transformation is quite straightforward, without overflow, 3*X==15 is
the same as X==5 and 3*X==5 cannot happen. Adding a single_use restriction
for the first case didn't seem necessary
+regtest on x86_64-pc-linux-gnu.
2020-08-03 Marc Glisse
PR tree-optimization/95433
* match.pd (X * C1 == C2): New transformation.
* gcc.c-torture/execute/pr23135.c: Add -fwrapv to avoid
undefined behavior.
* gcc.dg/tree-ssa/pr95433.c: New file.
--
Marc
, we
can always extend it later...
--
Marc Glisse
the operands,
which should not be sufficient to enable the transformation.
--
Marc Glisse
On Fri, 31 Jul 2020, Marc Glisse wrote:
On Fri, 31 Jul 2020, Richard Sandiford wrote:
Marc Glisse writes:
On Fri, 31 Jul 2020, Richard Sandiford wrote:
Marc Glisse writes:
+/* (c ? a : b) op (c ? d : e) --> c ? (a op d) : (b op e) */
+ (simplify
+ (op (vec_cond:s @0 @1
On Fri, 31 Jul 2020, Richard Sandiford wrote:
Marc Glisse writes:
On Fri, 31 Jul 2020, Richard Sandiford wrote:
Marc Glisse writes:
+/* (c ? a : b) op (c ? d : e) --> c ? (a op d) : (b op e) */
+ (simplify
+ (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
+ (with
+ {
+ t
On Fri, 31 Jul 2020, Richard Biener wrote:
On Fri, Jul 31, 2020 at 1:39 PM Richard Biener
wrote:
On Fri, Jul 31, 2020 at 1:35 PM Richard Biener
wrote:
On Thu, Jul 30, 2020 at 9:49 AM Marc Glisse wrote:
When vector comparisons were forced to use vec_cond_expr, we lost a number
uter
AVX512 SSE-style vec-cond and you then would get a mismatch.
Ah, I thought the SSE-style vec_cond was impossible in AVX512 mode, at
least I couldn't generate one in a few tests, but I didn't try very hard.
So indeed better add a type compatibility check.
Ok, it can't hurt.
--
Marc Glisse
On Fri, 31 Jul 2020, Richard Sandiford wrote:
Marc Glisse writes:
+/* (c ? a : b) op (c ? d : e) --> c ? (a op d) : (b op e) */
+ (simplify
+ (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
+ (with
+ {
+ tree rhs1, rhs2 = NULL;
+ rhs1 = fold_binary (op, type, @1,
:2 and not :1 (is it a hack so true is 1 and not
-1?), but that doesn't matter for this patch.
Regtest+bootstrap on x86_64-pc-linux-gnu
2020-07-30 Marc Glisse
PR tree-optimization/95906
PR target/70314
* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
(v ? w : 0
On Thu, 23 Jul 2020, Marc Glisse wrote:
On Wed, 22 Jul 2020, Roger Sayle wrote:
Many thanks for the peer review and feedback. I completely agree that
POPCOUNT and PARITY iterators simplify things and handle the IFN_ variants.
Is there a reason why the iterators cannot be used
BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+ BUILT_IN_POPCOUNTIMAX)
+ parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
+BUILT_IN_PARITYIMAX)
+ (simplify
+(bit_and (popcount @0) integer_onep)
+(parity @0)))
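The identity behind this pattern, popcount(x) & 1 == parity(x), can be checked from C with the GCC/Clang builtins. This is a sketch only; the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if (popcount(x) & 1) equals parity(x) for this x.
   __builtin_popcount and __builtin_parity are GCC/Clang builtins. */
int popcount_lsb_is_parity(unsigned x)
{
    return (__builtin_popcount(x) & 1) == __builtin_parity(x);
}

/* Checks the identity for every x in [0, limit). Returns 1 on success. */
int popcount_lsb_is_parity_below(unsigned limit)
{
    for (unsigned x = 0; x < limit; x++)
        if (!popcount_lsb_is_parity(x))
            return 0;
    return 1;
}
```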
--
Marc Glisse
nding the right thing to do?
Will it eventually simplify to ((int64_t)(i128>>64)|i64)>=0
--
Marc Glisse
n possible, not so bad.
--
Marc Glisse
in "#pragma fenv_access off" regions, which
seems to imply that it would be the front-end's responsibility (although
it would need help from the back-end to know the default value to fold
to).
--
Marc Glisse
ss-through on the
arguments and the result.
--
Marc Glisse
to true, because for a boolean
b is the same as b ? true : false
__builtin_constant_p(b ? true : false) would be the same as b ?
__builtin_constant_p(true) : __builtin_constant_p(false), i.e. true.
It is too bad we don't have any optimization pass using ranges between IPA
and thread1, that would have gotten rid of the comparisons, and hence the
temptation to thread. Adding always_inline on atomic_add (or flatten on
the caller) does help: EVRP removes the comparisons.
Do you see a way forward without changing what thread1 does or declaring
the testcase as unsupported?
--
Marc Glisse
s this point supposed to be? If I understood you right,
106t.thread1 is already too late - why is it so?
Small remark: shouldn't __atomic_add_const be marked with the
always_inline attribute, since it isn't usable when it isn't inlined?
--
Marc Glisse
ange can be seen as a
simplification and should be pushed to master. It regtests fine.
2020-06-20 Marc Glisse
* include/bits/stl_algo.h (__includes): Simplify the code.
(as with the patch for std::optional, I still haven't worked on my ssh key
issue and cannot currently push)
--
Marc