certainly don't want to re-fold everything all the time.
VRP is kind of a special case: every variable for which it finds a
new/improved range could be considered changed, since it may trigger
some extra transformation in match.pd (same for CCP and the nonzero
mask).
--
Marc Glisse
plicate
parts of what match.pd does, but certainly doable -- I've even prototyped
it.
Thoughts?
I believe we are supposed to call match.pd from VRP eventually (though
that may not have happened yet), so the test could probably also be done
there, if that looks more convenient and we decide not to remove
single_use completely for this transformation.
--
Marc Glisse
456ULL), something else?
Yes, unless we teach libcpp about larger than 64bit literals.
Teaching libcpp about 128bit literals sounds like a much nicer idea
indeed...
--
Marc Glisse
R tree-optimization/71563
* match.pd: Simplify X << Y into X if Y is known to be 0 or
out of range value - has low bits known to be zero.
Hello,
would it make sense to extend it to rotates later?
Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of a
separate tes
On Tue, 13 Dec 2016, Richard Biener wrote:
On Sat, Dec 10, 2016 at 7:59 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
On Saturday 10 December 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016,
On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
On Saturday 10 December 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
Replaces the definitions of the shift intrinsics with GCC extension
syntax to
allow GCC to reason about
On Sat, 10 Dec 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
Replaces the definitions of the shift intrinsics with GCC extension syntax
to
allow GCC to reason about what the instructions do. Tests are added to
ensure the intrinsics still produce the right
revent from discussing it)
--
Marc Glisse
s for now.
--
Marc Glisse
d), so it already has the
appropriate 1 or 0 in the right place.
--
Marc Glisse
ild_int_cst (integer_type_node, shift); })
+ (convert (rshift @2 { build_int_cst (integer_type_node, -shift); })
What happens if @1 is the sign bit, in a signed type? Do we get an
arithmetic shift right?
--
Marc Glisse
much, in
places where the lack of may_alias might be an issue... Or maybe I am
afraid for no reason and even here the may_alias is unnecessary. Looking
at dumps also makes me wonder if we could simplify
view_convert_expr(bit_field_expr) to just bit_field_expr when it is the
only use.
--
Marc Glisse
= TREE_CODE (@0), cmp_code = TREE_CODE (@0);
Note that you can use cmp directly, it is already equal to LT_EXPR or one
of the others.
--
Marc Glisse
at the testsuite), but you'll
need a real reviewer to approve the patch...
--
Marc Glisse
doesn't work in the latter case.
Attached patch reworks this functionality to detect -j correctly in all cases.
Hello,
I didn't read the patch, but do you think this also fixes PR 53155 ?
--
Marc Glisse
On Fri, 4 Nov 2016, Andrew Pinski wrote:
On Fri, Nov 4, 2016 at 7:08 AM, Marc Glisse <marc.gli...@inria.fr> wrote:
Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html
I think this is obvious.
Ok, I've committed it.
--
Marc Glisse
the patch. Note also
that memcpy already has both an attribute that says that it returns its
first argument, and an attribute that says that said first argument is
nonnull.
(I've heard some noise in C++-land about making memcpy(0,0,0) valid, but
that may have just been noise)
--
Marc Glisse
On Wed, 16 Nov 2016, Michael Matz wrote:
Hi,
On Wed, 16 Nov 2016, Marc Glisse wrote:
The first sentence about ORing the sign bit sounds strange (except for a
sign-magnitude representation). With 2's complement, INT_MIN is -2^31, the
divisors are the 2^k and -(2^k). -2 * 2^30 yields INT_MIN
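The arithmetic in the remark above can be checked with a small C sketch. It assumes a 32-bit two's-complement int and a long long that is wider than int, so that the product is computed without overflow; the function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Multiply in a wider type so the check itself cannot overflow.
   Assumes long long is wider than int. */
static long long wide_product(int a, int b)
{
    return (long long)a * (long long)b;
}

/* Returns 1 when a * b is exactly INT_MIN, i.e. the product fits in
   int but its negation does not. */
static int product_is_int_min(int a, int b)
{
    return wide_product(a, b) == (long long)INT_MIN;
}

/* Self-check for the values mentioned in the message. */
static void check_int_min_divisors(void)
{
    /* -2 * 2^30 is exactly INT_MIN when int is 32 bits wide. */
    assert(wide_product(-2, 1 << 30) == (long long)INT_MIN);
    assert(product_is_int_min(-2, 1 << 30));
    /* 2 * 2^29 is 2^30, which can be negated safely. */
    assert(!product_is_int_min(2, 1 << 29));
}
```

Calling check_int_min_divisors() from a test program exercises the case where negating the product would be the overflowing step.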
On Wed, 16 Nov 2016, Richard Biener wrote:
On Wed, 16 Nov 2016, Marc Glisse wrote:
On Wed, 16 Nov 2016, Richard Biener wrote:
On Wed, 16 Nov 2016, Marc Glisse wrote:
On Wed, 16 Nov 2016, Richard Biener wrote:
I am testing the following to avoid undefined behavior when negating
On Wed, 16 Nov 2016, Richard Biener wrote:
On Wed, 16 Nov 2016, Marc Glisse wrote:
On Wed, 16 Nov 2016, Richard Biener wrote:
I am testing the following to avoid undefined behavior when negating
a multiplication (basically extending a previous fix to properly handle
negative power of two
lus } */
+/* { dg-do run } */
+
+int main ()
+{
+ int a = 2;
+ int b = 1;
+
+ int t = -1 * ( -0x4000 * a / ( -0x2000 + b ) ) / -1;
+
+ if (t != 4) __builtin_abort();
+
+ return 0;
+}
--
Marc Glisse
example)
Is a builtin really needed here? What would happen if you used
return __A & __B;
?
--
Marc Glisse
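As a rough illustration of the suggestion above, the GNU C vector extension lets a plain & operate element-wise on vector types, with no builtin involved. The type v2di and the function and_v2di below are hypothetical stand-ins, not the names used in the patch under review.

```c
#include <assert.h>

/* Hypothetical stand-in for an intrinsic vector type, declared with the
   GNU C vector extension (two 64-bit lanes). */
typedef long long v2di __attribute__((vector_size(16)));

/* Element-wise AND written with the ordinary operator. */
static v2di and_v2di(v2di a, v2di b)
{
    return a & b;
}

/* Self-check on two lanes with known results. */
static void check_and_v2di(void)
{
    v2di a = { 0xF0F0LL, -1LL };
    v2di b = { 0x0FF0LL, 0x1234LL };
    v2di r = and_v2di(a, b);
    assert(r[0] == 0x00F0LL);
    assert(r[1] == 0x1234LL);
}
```

Written this way, the operation is visible to the generic optimizers as an ordinary bitwise AND.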
am not convinced we need the
overflow stuff at all here.
+(for cmp (eq ne gt ge lt le)
(for cmp (simple_comparison)
+ (cmp (convert@0 @1) INTEGER_CST@2)
+ (if (TREE_CODE (@1) == SSA_NAME
(cmp (convert@0 SSA_NAME@1) INTEGER_CST@2)
+ (cmp { @1; } (convert @2))
(cmp @1 (convert @2))
--
Marc Glisse
We have a
number of other optimizations that take advantage of the fact that bool is
in [0, 1], even without looking at VRP, say for instance x != 0 -> x. Are
we supposed to remove all of them?
--
Marc Glisse
On Wed, 9 Nov 2016, Marc Glisse wrote:
On Wed, 9 Nov 2016, Segher Boessenkool wrote:
On Wed, Nov 09, 2016 at 10:54:53PM +0100, Marc Glisse wrote:
match.pd transforms (A&C)|(B&~C) to ((A^B)&C)^B, which is fewer
operations if C is not const (and it is not on simple tests at least,
this trans
On Wed, 9 Nov 2016, Segher Boessenkool wrote:
On Wed, Nov 09, 2016 at 10:54:53PM +0100, Marc Glisse wrote:
match.pd transforms (A&C)|(B&~C) to ((A^B)&C)^B, which is fewer
operations if C is not const (and it is not on simple tests at least,
this transform is done very early already).
Var
because C only
becomes a constant at that stage?
--
Marc Glisse
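A minimal sketch of the operation count being discussed, assuming the transformation in question is the bit-select identity (A&C)|(B&~C) == ((A^B)&C)^B. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Take the bits of a where c is set and the bits of b elsewhere.
   Four operations when c is not a constant: &, ~, &, |. */
static uint32_t select_and_or(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & c) | (b & ~c);
}

/* Same result in three operations: ^, &, ^. */
static uint32_t select_xor(uint32_t a, uint32_t b, uint32_t c)
{
    return ((a ^ b) & c) ^ b;
}

/* The identity is purely bitwise, so checking all 4-bit inputs
   covers every combination of bits in one position. */
static void check_select_identity(void)
{
    for (uint32_t a = 0; a < 16; a++)
        for (uint32_t b = 0; b < 16; b++)
            for (uint32_t c = 0; c < 16; c++)
                assert(select_and_or(a, b, c) == select_xor(a, b, c));
}
```

When c is a constant, ~c folds at compile time and both forms need three operations, which is why the constant case is the interesting one.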
your point.
(and yes, the first half would give a very general (simplify (cmp
SSA_NAME@0 INTEGER_CST@1) ...), that doesn't seem so bad)
--
Marc Glisse
ning stuff is such a pain...)
--
Marc Glisse
that instruction
and clear the following code, which likely gives better code than
replacing 0/0 with 1.
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-11-07 Marc Glisse <marc.gli...@inria.fr>
gcc/
* match.pd (0 / X, X / X, X % X): New simplifications.
gcc/tes
Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html
On Thu, 27 Oct 2016, Marc Glisse wrote:
Hello,
some optimization patch I was working on simplified __TMC_END__ -
__TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it
wasn't written __TMC_END__ == __TMC_LIST__
On Fri, 4 Nov 2016, Richard Biener wrote:
On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
Hello,
this kind of simplification is already handled by fold_comparison,
org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the
testsuite when that patch is in.
2016-11-07 Marc Glisse <marc.gli...@inria.fr>
gcc/
* fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.
* match.pd (A /[ex] B CMP C): New simplifications.
gcc/testsuite/
(double&)c==0; less so...
(and I am assuming that signaling NaNs don't make the whole transformation
impossible, which might be wrong)
--
Marc Glisse
.
(related to PR 68714)
--
Marc Glisse
be cool to do the same for floats (most
likely at the RTL level).
--
Marc Glisse
On Mon, 31 Oct 2016, Richard Biener wrote:
On Fri, 28 Oct 2016, Marc Glisse wrote:
On Wed, 28 Sep 2016, Richard Biener wrote:
The following patch implements patterns to catch x / abs (x)
and x / -x, taking advantage of undefinedness at x == 0 as
opposed to the PR having testcases
specific vector types (that people probably didn't have in mind when
developing the C extension)?
--
Marc Glisse
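For the x / abs (x) and x / -x patterns quoted above, a small C sketch of what the divisions evaluate to for nonzero x may help. It assumes the intended results are the sign of x and -1 respectively; the helper names are hypothetical, and x == 0 and x == INT_MIN are deliberately left out because they are undefined.

```c
#include <assert.h>
#include <stdlib.h>

/* Undefined for x == 0 (division by zero) and x == INT_MIN (abs overflows). */
static int div_by_abs(int x)
{
    return x / abs(x);
}

/* Undefined for x == 0 and x == INT_MIN (negation overflows). */
static int div_by_neg(int x)
{
    return x / -x;
}

/* What the first division reduces to for every defined input. */
static int sign_of(int x)
{
    return x < 0 ? -1 : 1;
}

static void check_division_patterns(void)
{
    for (int x = -100; x <= 100; x++) {
        if (x == 0)
            continue;   /* the simplification relies on this being undefined */
        assert(div_by_abs(x) == sign_of(x));
        assert(div_by_neg(x) == -1);
    }
}
```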
visions by zero. This is the reason why
we don't simplify x / x to 1 or 0 / x to 0. */
Did we give up on preserving divisions by 0? Can we now do the 2
simplifications listed by the comment?
--
Marc Glisse
Hello,
some optimization patch I was working on simplified __TMC_END__ -
__TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it
wasn't written __TMC_END__ == __TMC_LIST__ in the first place).
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-10-27 Marc Glisse
On Wed, 26 Oct 2016, Bin.Cheng wrote:
On Wed, Oct 26, 2016 at 3:10 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
On Wed, Oct 26, 2016 at 3:04 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
On Wed, 26 Oct 2016, Bin.Cheng wrote:
Thanks for reviewing, updated patch attache
d long)(-2),3ul)
seems to satisfy your conditions, but is not the same as
max(-2,3)
--
Marc Glisse
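The counterexample above can be sketched in C, assuming the truncated expression is a max computed on unsigned long operands. The helper names are made up for illustration.

```c
#include <assert.h>

static long max_signed(long a, long b)
{
    return a > b ? a : b;
}

static unsigned long max_unsigned(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

static void check_max_mismatch(void)
{
    /* Signed comparison: 3 is the larger value. */
    assert(max_signed(-2, 3) == 3);
    /* (unsigned long)-2 is ULONG_MAX - 1, which is larger than 3. */
    assert(max_unsigned((unsigned long)-2, 3ul) == (unsigned long)-2);
    /* So the two maxima pick different operands. */
    assert(max_unsigned((unsigned long)-2, 3ul)
           != (unsigned long)max_signed(-2, 3));
}
```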
erations, which
would allow constant folding as well as other optimizations.
--
Marc Glisse
explained in an earlier
message, but we can do that later as the need arises)
--
Marc Glisse
On Thu, 13 Oct 2016, Prathamesh Kulkarni wrote:
On 12 October 2016 at 14:43, Richard Biener <rguent...@suse.de> wrote:
On Wed, 12 Oct 2016, Marc Glisse wrote:
On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote:
I was having a look at PR71636 and added the following pattern to match
absurd
way to go about it ;-)
--
Marc Glisse
d avoid the
transformation, just that we should fix the RA issue (by the way, if you
have time to file a separate PR for the RA issue, that would be great,
otherwise I'll try to do it at some point...).
However it seems andnot isn't a standard pattern name, so am not sure
how to check if target
patch). It is better to write tests
for the gimple version of transformations, i.e. don't write everything as
a single expression, use intermediate variables.
--
Marc Glisse
se (int)(b-a) could be a truncation in which case
multiplying with 4 might not result in the same value as
b-a truncated(?). The comment before the unpatched patterns
said "sign-changing conversions" but nothing actually verified this.
Might be that truncations are indeed ok now that I think
On Fri, 7 Oct 2016, Richard Biener wrote:
On Thu, 6 Oct 2016, Marc Glisse wrote:
On Wed, 5 Oct 2016, Richard Biener wrote:
The following will fix PR77826, the issue that in match.pd matching
up two things uses operand_equal_p which is too lax about the type
of the toplevel entity (at least
nd a >= b in the second).
I guess that the slightly more general:
X >= A && X < B where we know that A <= B
--> (unsigned)X - (unsigned)A < (unsigned)B - (unsigned)A
generates too many operations and does not gain anything? The case where A
and B are constants seems to be handled by ifcombine, currently.
--
Marc Glisse
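The range-check rewrite mentioned above, X >= A && X < B into a single unsigned comparison when A <= B, can be tried out with a short C sketch. The function names are hypothetical, and the check only covers a small window of values.

```c
#include <assert.h>

/* Straightforward form: two comparisons. */
static int in_range_two_compares(int x, int a, int b)
{
    return x >= a && x < b;
}

/* Single unsigned comparison; only valid when a <= b.
   If x < a, the subtraction wraps around to a huge unsigned value,
   which is never below the (small) width b - a. */
static int in_range_one_compare(int x, int a, int b)
{
    return (unsigned)x - (unsigned)a < (unsigned)b - (unsigned)a;
}

static void check_range_forms(void)
{
    for (int a = -8; a <= 8; a++)
        for (int b = a; b <= 8; b++)        /* keeps a <= b */
            for (int x = -20; x <= 20; x++)
                assert(in_range_two_compares(x, a, b)
                       == in_range_one_compare(x, a, b));
}
```

With non-constant A and B the second form costs two subtractions plus a comparison, which is the operation count the message is questioning.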
does?
* match.pd ((X /[ex] A) * A -> X): Properly handle converted
and constant A.
This regressed
int f(int*a,int*b){return 4*(int)(b-a);}
--
Marc Glisse
On Wed, 5 Oct 2016, Jason Merrill wrote:
On Wed, Oct 5, 2016 at 5:29 AM, Marek Polacek <pola...@redhat.com> wrote:
On Wed, Oct 05, 2016 at 08:58:08AM +0200, Marc Glisse wrote:
On Tue, 4 Oct 2016, Jason Merrill wrote:
C++17 adds the ability to omit the template arguments for a class
te
__noexcept_auto__ spelling) and just use that
in most places where we want a conditional noexcept.
--
Marc Glisse
. Is there a macro to test for this feature? I couldn't find it in the
latest sg10 list.
--
Marc Glisse
On Tue, 4 Oct 2016, Richard Biener wrote:
Possibly. Though then for FP we also want - abs (a) -> copysign (a, -1).
I thought this might fix PR 62055, but at least on x86_64, we generate
much worse code for copysign(,-1) than for -abs :-(
--
Marc Glisse
tor_size(16)));
vec f(vec x){
vec y=(x<0)?-x:x;
return x/y;
}
(I wasn't sure if you had added a feature to turn cond into vec_cond
automatically in some cases)
--
Marc Glisse
the C semantics. On the
other hand, the paper mentions speed and convenience as the main selling
points. Maybe LWG could clarify the intent?
--
Marc Glisse
for
errno? Otherwise, I have a hard time believing that 3 multiplications and
2 additions can be slower than 4 multiplications, 2 additions, plus a
bunch of tests and divisions.
--
Marc Glisse
On Mon, 19 Sep 2016, Patrick Palka wrote:
On Wed, Sep 14, 2016 at 1:58 AM, Marc Glisse <marc.gli...@inria.fr> wrote:
On Fri, 19 Aug 2016, Patrick Palka wrote:
On Fri, Aug 19, 2016 at 7:30 PM, Patrick Palka <patr...@parcs.ath.cx>
wrote:
integer_nonzerop() currently uncondition
(align -
1), for instance.
I guess people interested in performance will do for aligned new the same
as for the old new: provide an inline version that skips all the overhead
to forward directly to malloc/aligned_alloc (and avoid questionable calls
in their code).
--
Marc Glisse
entry would also help.
--
Marc Glisse
eger_each_nonzerop vs integer_not_all_zerop?
--
Marc Glisse
time, I agree that gimplifying x+1 and 1+x differently makes
little sense, you could file a PR about that.
--
Marc Glisse
time.
--
Marc Glisse
that just aborts I guess?
--
Marc Glisse
to
gcc/config/i386/gmm_malloc.h)? A windows version with _aligned_malloc /
_aligned_free would also be possible.
--
Marc Glisse
On Thu, 1 Sep 2016, Jonathan Wakely wrote:
+ const __uc_type __comp_range = __swap_range * (__swap_range + 1);
If __swap_range is 3, then __comp_range is 10 and
???
--
Marc Glisse
)
+++ gcc/ChangeLog (working copy)
@@ -1,10 +1,15 @@
+2016-08-31 Marc Glisse <marc.gli...@inria.fr>
+
+ PR tree-optimization/73714
+ * match.pd (a * (1 << b)): Revert change from 2016-05-23.
+
2016-08-31 David Malcolm <dmalc...@redhat.com>
* selftest.c: Move "names
On Mon, 29 Aug 2016, Uros Bizjak wrote:
On Mon, Aug 29, 2016 at 5:22 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
On Mon, 29 Aug 2016, Kirill Yukhin wrote:
On 29.08.2016 14:58, Marc Glisse wrote:
this patch gets rid of a few more builtins (well, I actually kept them,
since Ada use
On Mon, 29 Aug 2016, Kirill Yukhin wrote:
On 29.08.2016 14:58, Marc Glisse wrote:
this patch gets rid of a few more builtins (well, I actually kept them,
since Ada users may still need them). I had to tweak the flags for
pr59539-2.c, otherwise the compiler thinks it is more efficient to split
, (%rax){%k1}
The changes in the signature of functions don't seem to matter, gcc
apparently ignores the aligned attribute for that purpose. The last change
(_mm_load_ps) is for consistency.
Bootstrap+regtest on x86_64-pc-linux-gnu, with only the above regression.
2016-08-29 Marc Glisse
ke powerpc didn't add the macro
when they later gained __float128 support, but that would be easy to fix.
--
Marc Glisse
On Fri, 26 Aug 2016, Richard Biener wrote:
On Thu, Aug 25, 2016 at 9:40 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
Hello,
I was considering changing the implementation of _mm_loadu_pd in x86's
emmintrin.h to avoid a builtin. Here are 3 versions:
typedef double __m128d __attri
lignment-check stuff is not supported by
gcc?
--
Marc Glisse
On Tue, 19 Jul 2016, Bernd Schmidt wrote:
On 07/19/2016 12:09 PM, Richard Biener wrote:
I saw walks over stmts of a BB. IMHO that's a no-go.
Only to find the first or last nondebug one. Is that unacceptable?
Does gsi_start_nondebug_after_labels_bb not fit?
--
Marc Glisse
On Wed, 13 Jul 2016, Marek Polacek wrote:
On Wed, Jul 13, 2016 at 08:39:28PM +0200, Marc Glisse wrote:
On Wed, 13 Jul 2016, Marek Polacek wrote:
Does "__attribute__((fallthrough));" have any advantages over
"__builtin_fallthrough()"?
Not a strong argument, but compi
tribute (and
ignore the comment, which makes both other versions less relevant).
Unrelated question: are there cases where __builtin_fallthrough() has any
impact on code generation?
--
Marc Glisse
on x86_64 (possibly
more depending on the exact definition of largest integer).
--
Marc Glisse
targets you care
about?
I don't have access to Sparc, you want Rainer here (added in Cc:).
(thanks for completing the patch!)
--
Marc Glisse
lds (sadly, the standards make it hard to avoid unsigned types...).
--
Marc Glisse
h. (OTOH, if you decided that the same message
was good enough for both...)
--
Marc Glisse
point is that we are not allowed to transform g into
f below:
char*f(char*p){return p+4;}
char*g(char*p){return (char*)((intptr_t)p+4);}
That makes sense and seems much easier to guarantee than I feared, nice.
(on the other hand, only RTL is able to simplify (long)p+4-(long)(p+4))
--
Marc Glisse
sions.
I think this is an exciting optimization, but I was always too scared to
try anything like this.
--
Marc Glisse
that's good :-) Otherwise, we might want to err on the side of caution.
--
Marc Glisse
ENOPATCH ;-)
(I highly recommend https://gcc.gnu.org/wiki/CompileFarm )
--
Marc Glisse
On Tue, 14 Jun 2016, Kyrill Tkachov wrote:
On 14/06/16 08:04, Marc Glisse wrote:
On Mon, 13 Jun 2016, Kyrill Tkachov wrote:
The new function vect_synth_mult_by_constant that does all the hard work
is very similar in structure to expand_mult_const from expmed.c but it
operates on gimple SSA
overflow from
well-defined code. While this is fine in RTL, I would expect to need a
cast to unsigned_type_for in gimple for some of the algorithms.
(not completely sure, just something to look at)
--
Marc Glisse
nd or vector lowering to turn them to additions (though the estimated
cost might be off). Any idea on the best way to handle SPARC?
--
Marc Glisse
://gcc.gnu.org/ml/gcc-patches/2016-06/msg00881.html
This one seems much better.
--
Marc Glisse
On Mon, 13 Jun 2016, Richard Biener wrote:
On Sun, Jun 12, 2016 at 11:19 AM, Marc Glisse <marc.gli...@inria.fr> wrote:
Hello,
canonicalizing x+x to x*2 made us regress some vectorization tests on sparc.
As suggested by Richard, this lets the vectorizer handle x*2 as x+x if that
helps.
her in later
passes. »
Rainer bootstrapped and regtested the patch on sparc. As a bonus, it now
vectorizes one more loop in gcc.dg/vect/vect-iv-9.c, I'll let someone else
tweak the test (which will temporarily appear as a FAIL).
2016-06-13 Marc Glisse <marc.gli...@inria.fr>
PR
Hello,
this move is pretty straightforward. The transformation would probably
work just fine for any type (floats, vectors), but I didn't want the
headache of checking the behavior for NaN.
Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
2016-06-13 Marc Glisse <marc.gli...@inria
actually be, instead of
!integer_zerop (@1):
flag_non_call_exceptions ? tree_expr_nonzero_p (@1) :
(GIMPLE || !integer_zerop (@1))
(or something similar at a different level in the call chain of course)
(not that this helps in any way for the PR...)
--
Marc Glisse
message about it, and folding might
not be idempotent.
But as long as we don't always perform the simplification, we need some
other way to break the cycle for PR70992.
--
Marc Glisse
hing more generic).
The goal of the experiment is described in PR59159 (for which "+X" is
unlikely to be the right answer, in particular because it is meaningless
for constants). I don't know in what context people use the "X"
constraint, or even better "=X"...
--
Marc Glisse
the -s command line parameter yields
nothing. Have I done something wrong?
"nothing" is not very helpful... Surely it gave some error message.
--
Marc Glisse
n't ask because I was assuming the latter, but I am not 100%
certain)
--
Marc Glisse
is may require an extra check like tree_expr_nonzero_p,
although we are quite inconsistent about this (we don't simplify x/x to 1,
but we do simplify 0%x to 0 if x is not (yet) known to be the constant 0).
We'll see what the reviewers think...
Any plan on optimizing the 'B && ovf' form?
--
Marc Glisse
not,
currently), they will submit a patch to gcc-patc...@gcc.gnu.org, which
will be reviewed. Note that a patch needs to include testcases (see the
files in gcc/testsuite/g++.dg for examples). If you are interested, you
could give it a try...
--
Marc Glisse