Hi,
The testcase gfortran.dg/default_format_denormal_2.f90 has been
reporting XPASS since 4.8 on the powerpc*-unknown-linux-gnu platforms.
This patch removes the XFAIL for powerpc*-*-linux-* from the test. I
believe this pattern doesn't match any other platforms, but please let
me know if I
Hi there,
Ping. I'm seeking approval for this fix on trunk and 4_6-branch.
Thanks!
Bill
On Tue, 2011-09-13 at 17:55 -0500, William J. Schmidt wrote:
Greetings,
The code to build scops (static control parts) for graphite first
rewrites loops into canonical loop-closed SSA form. PR50183
On Thu, 2011-09-29 at 10:03 +0100, Tobias Grosser wrote:
On 09/29/2011 09:58 AM, Richard Guenther wrote:
On Thu, Sep 29, 2011 at 12:10 AM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Hi there,
Ping. I'm seeking approval for this fix on trunk and 4_6-branch.
Thanks!
Ok
This patch addresses the poor code generation in PR46556 for the
following code:
struct x
{
  int a[16];
  int b[16];
  int c[16];
};

extern void foo (int, int, int);

void
f (struct x *p, unsigned int n)
{
  foo (p->a[n], p->c[n], p->b[n]);
}
Prior to the fix for PR32698, gcc calculated the
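To make the addressing pattern in this test case concrete, here is a small sketch (an illustration only, not code from the patch). It assumes the struct layout of the test case, and the helper name elem_addr_from_common_base is hypothetical. It shows that all three accesses can be formed from one shared scaled index, p + n * sizeof (int), plus a constant field offset.

```c
#include <assert.h>
#include <stddef.h>

struct x
{
  int a[16];
  int b[16];
  int c[16];
};

/* Hypothetical helper, for illustration only.  Forms the address of
   FIELD[n] as (p + n * sizeof (int)) + constant field offset, so the
   scaled index is computed once and can be shared by all three
   accesses in the test case.  */
static int *
elem_addr_from_common_base (struct x *p, unsigned int n, size_t field_offset)
{
  assert (n < 16);  /* stay inside the 16-element arrays */
  char *common = (char *) p + (size_t) n * sizeof (int);
  return (int *) (common + field_offset);
}
```

Calling it with offsetof (struct x, a), offsetof (struct x, b) and offsetof (struct x, c) yields the same addresses as &p->a[n], &p->b[n] and &p->c[n].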
On Wed, 2011-10-05 at 18:29 +0200, Steven Bosscher wrote:
On Wed, Oct 5, 2011 at 6:13 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
* tree-ssa-loop-ivopts.c (copy_ref_info): Remove static token.
Rather than this, why not move the function to common code somewhere?
Ciao
On Wed, 2011-10-05 at 18:21 +0200, Paolo Bonzini wrote:
On 10/05/2011 06:13 PM, William J. Schmidt wrote:
One other general question about the pattern-match transformation: Is
this an appropriate transformation for all targets, or should it be
somehow gated on available addressing modes
On Wed, 2011-10-05 at 21:01 +0200, Paolo Bonzini wrote:
On 10/05/2011 07:22 PM, William J. Schmidt wrote:
I don't know off the top of my head -- I'll have to gather that
information. The issue is that the profitability is really
context-sensitive, so just the isolated costs of insns aren't
On Thu, 2011-10-06 at 09:47 +0200, Paolo Bonzini wrote:
And IIUC the other address is based on pseudo 125 as well, but the
combination is (plus (plus (reg 126) (reg 128)) (const_int X)) and
cannot be represented on ppc. I think _this_ is the problem, so I'm
afraid your patch could cause
On Thu, 2011-10-06 at 12:13 +0200, Richard Guenther wrote:
People have already commented on pieces, so I'm looking only
at the tree-ssa-reassoc.c pieces (did you consider piggy-backing
on IVOPTs instead? The idea is to expose additional CSE
opportunities, right? So it's sort-of a
On Thu, 2011-10-06 at 16:16 +0200, Richard Guenther wrote:
snip
Doh, I thought you were matching gimple stmts that do the address
computation. But now I see you are matching the tree returned from
get_inner_reference. So no need to check anything for that case.
But that keeps me
On Thu, 2011-10-06 at 11:35 -0600, Jeff Law wrote:
On 10/06/11 04:13, Richard Guenther wrote:
People have already commented on pieces, so I'm looking only at the
tree-ssa-reassoc.c pieces (did you consider piggy-backing on IVOPTs
On Fri, 2011-10-07 at 11:17 +0200, Paolo Bonzini wrote:
On 10/07/2011 10:00 AM, Richard Guenther wrote:
It's a reasonable plan - you'd have to introduce a late reassoc
pass though. Can you separate out the RTL fwprop changes? So
we can iterate over the tree parts separately.
That's
Greetings,
Here are the revised changes for the tree portions of the patch. I've
attempted to resolve all comments to date on those portions. Per
Steven's comment, I moved copy_ref_info into tree-ssa-address.c; let me
know if there's a better place, or whether you'd prefer to leave it
where it
Hi Richard,
Thanks for the comments -- a few responses below.
On Tue, 2011-10-11 at 13:40 +0200, Richard Guenther wrote:
On Sat, 8 Oct 2011, William J. Schmidt wrote:
snip
+ c4 = uhwi_to_double_int (bitpos / BITS_PER_UNIT);
You don't verify that bitpos % BITS_PER_UNIT is zero
On Tue, 2011-10-11 at 09:12 -0500, William J. Schmidt wrote:
The pattern matching is still very ad-hoc and doesn't consider
statements that feed the base address. There is conceptually
no difference between p->a[n] and *(p + n * 4).
That's true. Since we abandoned the general address
Greetings,
Here is a new revision of the tree portions of this patch. I moved the
pattern recognizer to expand, and added additional logic to look for the
same pattern in gimple form. I added two more tests to verify the new
logic.
I didn't run into any problems with the RTL CSE phases. I
On Fri, 2011-10-21 at 11:26 +0200, Richard Guenther wrote:
On Tue, Oct 18, 2011 at 4:14 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
snip
+
+ /* We don't use get_def_for_expr for S1 because TER doesn't forward
+ S1 in some situations where this transform is useful
OK, I've removed the pointer-arithmetic case from expand, to be handled
later by straight-line strength reduction. Here's the patch to deal
with just the specific pattern of PR46556 (which will also eventually be
handled by strength reduction, but not as quickly).
(FYI, I've been thinking
On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Hi,
This is a re-post of the patch I posted for comments in January to
address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589. The patch
On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com
On Wed, 2012-04-04 at 15:08 +0200, Richard Guenther wrote:
On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote
On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Hi Richard,
I've revised my patch along these lines; see the new version below.
While testing it I realized I could do a better job
There seems to be tacit agreement that the vector tests should use
-fno-common on all targets to avoid the recent spate of failures (see
discussion in 52571 and 52603). This patch (proposed by Dominique
D'Humieures) does just that. I agreed to shepherd the patch through.
I've verified that it
On Thu, 2012-04-05 at 11:30 +0200, Richard Guenther wrote:
On Thu, Apr 5, 2012 at 6:22 AM, Mike Stump mikest...@comcast.net wrote:
On Apr 4, 2012, at 7:56 PM, William J. Schmidt wrote:
There seems to be tacit agreement that the vector tests should use
-fno-common on all targets to avoid
On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Unfortunately this seems to be necessary if I name the two passes
reassoc1 and reassoc2. If I try to name both of them reassoc I
get failures
On Thu, 2012-04-12 at 09:50 -0700, H.J. Lu wrote:
On Thu, Apr 5, 2012 at 6:49 AM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
On Thu, 2012-04-05 at 11:23 +0200, Richard Guenther wrote:
On Wed, Apr 4, 2012 at 9:15 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote
This patch corrects two errors in reassociating expressions with
repeated factors. First, undistribution needs to recognize repeated
factors. For now, repeated factors will be ineligible for this
optimization. In the future, this can be improved. Second, when a
__builtin_powi call is
On Mon, 2012-04-16 at 11:01 +0200, Richard Guenther wrote:
On Sat, Apr 14, 2012 at 7:05 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
This patch corrects two errors in reassociating expressions with
repeated factors. First, undistribution needs to recognize repeated
factors
The emergency reassociation patch for PR52976 disabled un-distribution
in the presence of repeated factors to avoid ICEs in zero_one_operation.
This patch fixes such cases properly by teaching zero_one_operation
about __builtin_pow* calls.
Bootstrapped with no new regressions on powerpc64-linux.
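As a rough illustration of what "repeated factors" means here (a sketch only, not code from the patch; both function names are hypothetical), the same product can be written as a chain of multiplies or with __builtin_powi, the GCC built-in mentioned above:

```c
/* x*x*x*x * y*y written out as a chain of repeated factors.  */
static double
repeated_factors (double x, double y)
{
  return x * x * x * x * y * y;
}

/* The same product written with __builtin_powi, grouping each repeated
   factor into a single integer-power call.  */
static double
with_powi (double x, double y)
{
  return __builtin_powi (x, 4) * __builtin_powi (y, 2);
}
```

For inputs such as 2.0 and 3.0 every intermediate value is exactly representable, so both forms return 144.0. In general the two groupings can round differently, so they should not be assumed to be bit-identical.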
The emergency patch for PR52976 manipulated the operand rank system to
force inserted __builtin_powi calls to occur before uses of the call
results. However, this is generally the wrong approach, as it forces
other computations to move unnecessarily, and extends the lifetimes of
other operands.
This enhances constant folding for division by complex and vector
constants. When -freciprocal-math is present, such divisions are
converted into multiplies by the constant reciprocal. When an exact
reciprocal is available, this is done for vector constants when
optimizing. I did not implement
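A scalar sketch of the "exact reciprocal" distinction (illustration only, not code from the patch; the function names are hypothetical): dividing by a power of two can be replaced by a multiply without changing the result, while dividing by 3.0 cannot in general, because 1.0 / 3.0 is itself rounded.

```c
/* Divisor 4.0 has an exact reciprocal, 0.25, so these two functions
   agree for ordinary inputs (ignoring underflow at the extremes).  */
static double
div_by_4 (double x)
{
  return x / 4.0;
}

static double
mul_by_quarter (double x)
{
  return x * 0.25;
}

/* Divisor 3.0 has no exact binary reciprocal; the multiply form uses a
   rounded constant and may differ from the division in the last bit
   for some inputs.  */
static double
div_by_3 (double x)
{
  return x / 3.0;
}

static double
mul_by_third (double x)
{
  return x * (1.0 / 3.0);
}
```

This is why the second form is only acceptable under a relaxed-math option such as -freciprocal-math, as the description above says.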
On Fri, 2012-04-20 at 10:04 +0200, Richard Guenther wrote:
On Thu, 19 Apr 2012, William J. Schmidt wrote:
This enhances constant folding for division by complex and vector
constants. When -freciprocal-math is present, such divisions are
converted into multiplies by the constant
On Fri, 2012-04-20 at 11:32 -0700, H.J. Lu wrote:
On Thu, Apr 19, 2012 at 6:58 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
This enhances constant folding for division by complex and vector
constants. When -freciprocal-math is present, such divisions are
converted
On Mon, 2012-04-23 at 11:09 +0200, Richard Guenther wrote:
On Fri, 20 Apr 2012, William J. Schmidt wrote:
On Fri, 2012-04-20 at 11:32 -0700, H.J. Lu wrote:
On Thu, Apr 19, 2012 at 6:58 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
This enhances constant folding
This fixes an error wherein a nontrivial expression passed to an Altivec
built-in results in an ICE, following Joseph Myers's suggested approach
in the bugzilla.
Bootstrapped and tested with no new regressions on
powerpc64-unknown-linux-gnu. Ok for trunk?
Thanks,
Bill
gcc:
2012-04-24 Bill
Thought I'd ping http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01225.html
since it's been about six weeks. Any initial feedback would be very
much appreciated!
Thanks,
Bill
On Mon, 2012-04-30 at 20:22 -0700, Andrew Pinski wrote:
Hi,
This patch improves the expansion of COND_EXPR into RTL, directly
using conditional moves.
I had to fix a bug in the x86 backend where emit_conditional_move
could cause a crash as we had a comparison mode of DImode which is not
This patch was posted for comment back in February during stage 4. It
addresses a performance issue noted in the EEMBC routelookup benchmark
on a common idiom:
  if (...)
    x = y->left;
  else
    x = y->right;
If the two loads can be hoisted out of the if/else, the if/else can be
replaced by a
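A source-level sketch of the idiom and of the hoisted form (illustration only, not the patch itself; struct node and both function names are hypothetical). The two fields are adjacent in the same object, which is what makes loading both of them unconditionally plausible:

```c
struct node
{
  struct node *left;
  struct node *right;
};

/* The idiom as written: one load on each arm of the if/else.  */
static struct node *
pick_branchy (struct node *y, int go_left)
{
  struct node *x;
  if (go_left)
    x = y->left;
  else
    x = y->right;
  return x;
}

/* Both adjacent loads hoisted above the condition, leaving only a
   select between two already-loaded values.  */
static struct node *
pick_hoisted (struct node *y, int go_left)
{
  struct node *l = y->left;
  struct node *r = y->right;
  return go_left ? l : r;
}
```

Both functions return the same pointer for any valid y; the second simply has no load left inside the conditional.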
On Thu, 2012-05-03 at 09:40 -0600, Jeff Law wrote:
On 05/03/2012 08:33 AM, William J. Schmidt wrote:
This patch was posted for comment back in February during stage 4. It
addresses a performance issue noted in the EEMBC routelookup benchmark
on a common idiom:
if (...)
x = y
On Thu, 2012-05-03 at 11:44 -0600, Jeff Law wrote:
On 05/03/2012 10:47 AM, William J. Schmidt wrote:
Yes and no. What's important is that you don't want to introduce page
faults (or less urgently, cache misses) by speculating the load. So the
patch is currently extremely constrained
This fixes another statement-placement issue when reassociating
expressions with repeated factors. Multiplies feeding into
__builtin_powi calls were not getting placed properly ahead of them in
some cases.
Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions. I've also
Backporting this patch to 4.7 fixes a problem building Fedora 17.
Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is
the backport OK?
Thanks,
Bill
2012-05-10 Bill Schmidt wschm...@vnet.linux.ibm.com
Backport from trunk:
2012-03-12 Richard Guenther
On Thu, 2012-05-10 at 18:49 +0200, Jakub Jelinek wrote:
On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:
Backporting this patch to 4.7 fixes a problem building Fedora 17.
Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is
the backport OK?
For 4.7
Ping.
Thanks,
Bill
On Tue, 2012-05-08 at 22:04 -0500, William J. Schmidt wrote:
This fixes another statement-placement issue when reassociating
expressions with repeated factors. Multiplies feeding into
__builtin_powi calls were not getting placed properly ahead of them in
some cases
On Wed, 2012-05-16 at 11:45 +0200, Richard Guenther wrote:
On Tue, 15 May 2012, William J. Schmidt wrote:
Ping.
I don't like it too much - but pondering a bit over it I can't find
a nicer solution.
So, ok.
Thanks,
Richard.
Agreed. I'm not fond of it either, and I feel it's a bit
On Wed, 2012-05-16 at 14:05 +0200, Richard Guenther wrote:
On Wed, 16 May 2012, William J. Schmidt wrote:
On Wed, 2012-05-16 at 11:45 +0200, Richard Guenther wrote:
On Tue, 15 May 2012, William J. Schmidt wrote:
Ping.
I don't like it too much - but pondering a bit over it I
Ping.
Thanks,
Bill
On Thu, 2012-05-03 at 09:33 -0500, William J. Schmidt wrote:
This patch was posted for comment back in February during stage 4. It
addresses a performance issue noted in the EEMBC routelookup benchmark
on a common idiom:
  if (...)
    x = y->left;
  else
    x = y
On Tue, 2012-05-15 at 14:17 +0200, Richard Guenther wrote:
This is the first patch to make the generated code for the testcase
in PR53355 better. It teaches VRP about LSHIFT_EXPRs (albeit only
of a very simple form).
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
This
This patch gives up on using the reassociation rank algorithm to
correctly place __builtin_powi calls and their feeding multiplies. In
the end this proved to introduce more complexity than it saved, due in
part to the poor fit of introducing DAG expressions into the
reassociated operand tree.
This repairs the bootstrap issue due to unsafe signed overflow
assumptions. Bootstrapped and tested on powerpc64-unknown-linux-gnu
with no new regressions. Ok for trunk?
Thanks,
Bill
2012-05-18 Bill Schmidt wschm...@linux.vnet.ibm.com
* config/rs6000/rs6000.c (print_operand):
On Mon, 2012-05-21 at 14:17 +0200, Richard Guenther wrote:
On Thu, May 3, 2012 at 4:33 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
This patch was posted for comment back in February during stage 4. It
addresses a performance issue noted in the EEMBC routelookup benchmark
Here's a revision of the hoist-adjacent-loads patch. Besides hopefully
addressing all your comments, I added a gate of at least -O2 for this
transformation. Let me know if you prefer a different minimum opt
level.
I'm still running SPEC tests to make sure there are no regressions when
opening
On Wed, 2012-05-23 at 13:25 +0200, Richard Guenther wrote:
On Tue, 22 May 2012, William J. Schmidt wrote:
Here's a revision of the hoist-adjacent-loads patch. Besides hopefully
addressing all your comments, I added a gate of at least -O2 for this
transformation. Let me know if you
Ping...
On Thu, 2012-06-28 at 16:45 -0500, William J. Schmidt wrote:
Here's a relatively small piece of strength reduction that solves that
pesky addressing bug that got me looking at this in the first place...
The main part of the code is the stuff that was reviewed last year, but
which
On Tue, 2012-07-24 at 10:57 +0200, Richard Guenther wrote:
On Mon, 23 Jul 2012, William J. Schmidt wrote:
This patch completes the conversion of the vectorizer cost model to use
target hooks for recording vectorization information and calculating
costs. Previous work handled the costs
Per Richard Henderson's suggestion
(http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
changes the IVOPTS and straight-line strength reduction passes to make
use of data computed by init_expmed. This required adding a new
convert_cost array in expmed to store the costs of
On Wed, 2012-07-25 at 09:59 -0700, Richard Henderson wrote:
On 07/25/2012 09:13 AM, William J. Schmidt wrote:
Per Richard Henderson's suggestion
(http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01370.html), this patch
changes the IVOPTS and straight-line strength reduction passes to make
use
On Wed, 2012-07-25 at 13:39 -0600, Sandra Loosemore wrote:
On 07/17/2012 05:22 AM, Richard Guenther wrote:
On Wed, Jul 4, 2012 at 6:35 PM, Sandra Loosemore
san...@codesourcery.com wrote:
Ping? Original post with patch is here:
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00319.html
On Fri, 2012-07-27 at 15:40 +0200, Richard Guenther wrote:
On Thu, Jul 26, 2012 at 11:57 AM, Steven Bosscher stevenb@gmail.com
wrote:
On Thu, Jul 26, 2012 at 11:23 AM, Richard Guenther
richard.guent...@gmail.com wrote:
Ok! Thanks for adding this exhaustive documentation.
There's
This fixes the de-canonicalization of commutative GIMPLE operations in
the vectorizer that occurs when processing reductions. A loop_vec_info
is flagged for cleanup when a de-canonicalization has occurred in that
loop, and the cleanup is done when the loop_vec_info is destroyed.
Bootstrapped on
Now that the vectorizer cost model is set up to facilitate per-target
heuristics, I'm revisiting the density heuristic I submitted
previously. This allows the vec_permute and vec_promote_demote costs to
be set to their natural values, but inhibits vectorization in cases like
sphinx3 where
This cleans up terminology in strength reduction. What used to be a
"base SSA name" is now sometimes another tree expression, so the term
"base name" is replaced by "base expression" throughout.
Bootstrapped and tested with no new regressions on
powerpc64-unknown-linux-gnu; committed as obvious.
Change this test case to use the optimized dump so that the unreliable
vect-details dump can't cause different behavior on different targets.
Verified on powerpc64-unknown-linux-gnu, committed as obvious.
Thanks,
Bill
2012-08-03 Bill Schmidt wschm...@linux.ibm.com
*
On Wed, 2012-08-08 at 15:35 -0700, Janis Johnson wrote:
On 08/08/2012 03:27 PM, Andrew Pinski wrote:
On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Greetings,
Thanks
On Wed, 2012-08-08 at 19:22 -0700, Janis Johnson wrote:
On 08/08/2012 06:41 PM, William J. Schmidt wrote:
On Wed, 2012-08-08 at 15:35 -0700, Janis Johnson wrote:
On 08/08/2012 03:27 PM, Andrew Pinski wrote:
On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Wed, Aug 1
Fix a thinko in strength reduction. I was checking the type of the
wrong operand to determine whether address arithmetic should be used in
replacing expressions. This produced a spurious POINTER_PLUS_EXPR when
an address was converted to an unsigned long and back again.
Bootstrapped and tested
As suggested by Janis regarding testsuite/gcc.dg/tree-ssa/slsr-30.c,
this patch adds a new effective target for machines having long and int
of differing sizes.
Tested on powerpc64-unknown-linux-gnu, where the test passes for -m64
and is skipped for -m32. Ok for trunk?
Thanks,
Bill
doc:
Replace the once vacuously true, and now vacuously false, test for
existence of a conditional move instruction for a given mode, with one
that actually checks what it's supposed to. Add a test case so we don't
miss such things in future.
The test is powerpc-specific. It would be good to have an
Thanks, Andrew!
Bill
On Tue, 2012-08-14 at 14:17 -0700, Andrew Pinski wrote:
On Tue, Aug 14, 2012 at 2:15 PM, Andrew Pinski pins...@gmail.com wrote:
On Tue, Aug 14, 2012 at 2:11 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Replace the once vacuously true, and now vacuously
Currently we can insert an initializer that performs a multiply in too
small a type for correctness. For now, detect the problem and avoid
the optimization when this would happen. Eventually I will fix this up
to cause the multiply to be performed in a sufficiently wide type.
Bootstrapped
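The general hazard of multiplying in too narrow a type can be shown with a small sketch (illustration only; this is not the strength-reduction code, and the function names are hypothetical). It assumes a platform where int is 32 bits, so uint32_t arithmetic wraps modulo 2^32:

```c
#include <stdint.h>

/* Multiply in the narrow type, then widen: the product has already
   wrapped modulo 2^32 before the conversion happens.  */
static uint64_t
mul_narrow_then_widen (uint32_t a, uint32_t b)
{
  return (uint64_t) (a * b);
}

/* Widen first, then multiply: the full 64-bit product is kept.  */
static uint64_t
mul_wide (uint32_t a, uint32_t b)
{
  return (uint64_t) a * b;
}
```

With a = b = 0x10000 the first function returns 0 and the second returns 0x100000000; for small operands the two agree, which is why this kind of bug is easy to miss.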
On Thu, 2012-08-23 at 00:53 +0200, Steven Bosscher wrote:
Hello Bill,
This patch plugs a leak in rs6000.c:rs6000_density_test(). You have to
free the array that get_loop_body returns. Noticed while going over
all uses of get_loop_body (it's a common mistake to leak the return
array).
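The ownership rule described here can be sketched outside of GCC as follows (illustration only; collect_ids_sketch is a hypothetical stand-in for a function that, like get_loop_body, returns a heap-allocated array owned by the caller):

```c
#include <stdlib.h>

/* Hypothetical stand-in for a function that returns a freshly
   allocated array which the caller must release.  */
static int *
collect_ids_sketch (unsigned int n)
{
  int *ids = malloc (n * sizeof *ids);
  if (ids == NULL)
    return NULL;
  for (unsigned int i = 0; i < n; i++)
    ids[i] = (int) i;
  return ids;
}

/* Caller: uses the array, then frees it.  Returns -1 if the
   allocation failed.  */
static long
sum_ids_sketch (unsigned int n)
{
  int *ids = collect_ids_sketch (n);
  long sum = 0;
  if (ids == NULL)
    return -1;
  for (unsigned int i = 0; i < n; i++)
    sum += ids[i];
  free (ids);  /* the step that is easy to forget */
  return sum;
}
```

Dropping the free call would not change the returned value, which is exactly why this kind of leak tends to go unnoticed.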
Richard found some N^2 behavior in SLSR that has to be suppressed.
Searching for the best possible basis is overkill when there are
hundreds of thousands of possibilities. This patch constrains the
search to "good enough" in such cases.
Bootstrapped and tested on powerpc64-unknown-linux-gnu with
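The idea of capping a search at "good enough" can be sketched like this (illustration only; this is not the SLSR implementation, and find_basis_bounded, cost and max_scan are hypothetical names). The scan takes the best candidate among the first max_scan entries instead of examining all of them:

```c
#include <stddef.h>

/* Return the index of the lowest-cost entry among at most MAX_SCAN
   candidates examined from the front of COST, or -1 if none were
   examined.  Bounding the scan keeps the per-query work constant even
   when N is very large.  */
static ptrdiff_t
find_basis_bounded (const int *cost, size_t n, size_t max_scan)
{
  ptrdiff_t best = -1;
  size_t limit = n < max_scan ? n : max_scan;
  for (size_t i = 0; i < limit; i++)
    if (best < 0 || cost[i] < cost[best])
      best = (ptrdiff_t) i;
  return best;
}
```

With a small cap the result may not be the global best, which is the trade-off being accepted: the answer is only required to be good enough.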
On Mon, 2012-09-10 at 16:45 +0200, Richard Guenther wrote:
On Mon, 10 Sep 2012, William J. Schmidt wrote:
Richard found some N^2 behavior in SLSR that has to be suppressed.
Searching for the best possible basis is overkill when there are
hundreds of thousands of possibilities. This patch
On Mon, 2012-09-10 at 16:56 +0200, Richard Guenther wrote:
On Mon, 10 Sep 2012, Jakub Jelinek wrote:
On Mon, Sep 10, 2012 at 04:45:24PM +0200, Richard Guenther wrote:
On Mon, 10 Sep 2012, William J. Schmidt wrote:
Richard found some N^2 behavior in SLSR that has to be suppressed
Here's the revised patch with a param. Bootstrapped and tested in the
same manner. Ok for trunk?
Thanks,
Bill
2012-08-10 Bill Schmidt wschm...@linux.vnet.ibm.com
* doc/invoke.texi (max-slsr-cand-scan): New description.
* gimple-ssa-strength-reduction.c
On Wed, 2011-07-06 at 15:16 +0200, Richard Guenther wrote:
On Tue, Jul 5, 2011 at 3:59 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
(Sorry for the late response; yesterday was a holiday here.)
On Mon, 2011-07-04 at 16:21 +0200, Richard Guenther wrote:
On Thu, Jun 30, 2011
On Mon, 2011-07-04 at 17:30 +0200, Michael Matz wrote:
Hi,
On Mon, 4 Jul 2011, Richard Guenther wrote:
I still do not like the implementation of yet another CSE machinery
given that we already have two.
From reading it it really seems to be a normal block-local CSE, without
anything
Ilya, thanks for posting this! This patch is useful also on powerpc64.
Applying it solved a performance degradation with bwaves due to loss of
reassociation somewhere between 4.5 and 4.6 (still tracking it down).
When we apply -ftree-reassoc-width=2 to bwaves, the more optimal code
generation
On Tue, 2011-07-12 at 11:50 -0500, William J. Schmidt wrote:
Ilya, thanks for posting this! This patch is useful also on powerpc64.
Applying it solved a performance degradation with bwaves due to loss of
reassociation somewhere between 4.5 and 4.6 (still tracking it down).
When we apply
I've been distracted by other things, but got back to this today...
On Wed, 2011-07-06 at 16:58 +0200, Richard Guenther wrote:
Ah, so we still have the ARRAY_REFs here. Yeah, well - then the
issue boils down to get_inner_reference canonicalizing the offset
according to what fold-const.c
This patch fixes part of PR tree-optimization/49749. The operand scans
in tree-ssa-reassoc.c:get_rank() can be prematurely halted by two
erroneous conditions, which this patch removes. Patch pre-approved by
IRC communication with Richard Guenther, 7/21/11.
The wider issue of biasing
This is a draft patch that biases the reassociation machinery so that
each iteration of an accumulator pattern in a loop is independent of the
other iterations. This addresses a problem identified as an accidental
side effect of the bug observed in PR tree-optimization/49749. This
patch reverses
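One way to picture the accumulator pattern is the following source-level sketch (an illustration under my own reading of the description, not code from the patch; the function names are hypothetical). In the first form every add in an iteration waits on the loop-carried sum; in the second, the part that does not depend on sum is grouped so it can be computed independently of earlier iterations:

```c
#include <stddef.h>

/* Loop-carried value added first: both adds in each iteration sit on
   the dependence chain through sum.  */
static long
acc_chain (const int *a, const int *b, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = (sum + a[i]) + b[i];
  return sum;
}

/* Loop-carried value added last: a[i] + b[i] does not depend on sum,
   so only one add per iteration is on the dependence chain.  */
static long
acc_biased (const int *a, const int *b, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum = sum + ((long) a[i] + b[i]);
  return sum;
}
```

Both functions compute the same total; only the association of the adds differs.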
I found a handful of degradations with this patch from an earlier test
version, which demonstrate the incorrectness of this comment:
On Wed, 2011-07-27 at 10:11 -0500, William J. Schmidt wrote:
+ However, the rank of a value that depends on the result of a loop-
+ carried phi should still
Here is the final version of the reassociation patch. There are two
differences from the version I published on 7/27. I removed the
function call from within the MAX macro per Michael's comment, and I
changed the propagation of the rank of loop-carried phis to be zero.
This involved a small
Hi Richard,
Here's a revision of the hoist-adjacent-loads patch. I'm sorry for the
delay since the last revision, but my performance testing has been
blocked waiting for a fix to PR53487. I ended up applying a test
version of the patch to 4.7 and ran performance numbers with that
instead, with
On Mon, 2012-06-04 at 08:45 -0500, William J. Schmidt wrote:
Hi Richard,
Here's a revision of the hoist-adjacent-loads patch. I'm sorry for the
delay since the last revision, but my performance testing has been
blocked waiting for a fix to PR53487. I ended up applying a test
version
The fix for PR53331 caused a degradation to 187.facerec on
powerpc64-unknown-linux-gnu. The following simple patch reverses the
degradation without otherwise affecting SPEC cpu2000 or cpu2006.
Bootstrapped and regtested on that platform with no new regressions. Ok
for trunk?
Thanks,
Bill
On Mon, 2012-06-11 at 13:28 +0200, Richard Guenther wrote:
On Mon, Jun 4, 2012 at 3:45 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Hi Richard,
Here's a revision of the hoist-adjacent-loads patch. I'm sorry for the
delay since the last revision, but my performance testing
On Mon, 2012-06-11 at 11:15 +0200, Richard Guenther wrote:
On Sun, Jun 10, 2012 at 5:58 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
The fix for PR53331 caused a degradation to 187.facerec on
powerpc64-unknown-linux-gnu. The following simple patch reverses the
degradation
On Mon, 2012-06-11 at 13:40 +0200, Richard Guenther wrote:
On Fri, 8 Jun 2012, William J. Schmidt wrote:
This patch adds a heuristic to the vectorizer when estimating the
minimum profitable number of iterations. The heuristic is
target-dependent, and is currently disabled for all targets
On Mon, 2012-06-11 at 16:10 +0200, Richard Guenther wrote:
On Mon, 11 Jun 2012, William J. Schmidt wrote:
On Mon, 2012-06-11 at 11:15 +0200, Richard Guenther wrote:
On Sun, Jun 10, 2012 at 5:58 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
The fix for PR53331
On Mon, 2012-06-11 at 16:58 +0200, Richard Guenther wrote:
On Mon, 11 Jun 2012, Richard Guenther wrote:
On Mon, 11 Jun 2012, William J. Schmidt wrote:
On Mon, 2012-06-11 at 13:40 +0200, Richard Guenther wrote:
On Fri, 8 Jun 2012, William J. Schmidt wrote:
This patch adds
On Mon, 2012-06-11 at 11:09 -0400, David Edelsohn wrote:
On Mon, Jun 11, 2012 at 10:55 AM, Richard Guenther rguent...@suse.de wrote:
Well, they are at least magic numbers and heuristics that apply
generally and not only to the single issue in sphinx. And in
fact how it works for sphinx
On Mon, 2012-06-11 at 14:59 +0200, Richard Guenther wrote:
On Mon, 11 Jun 2012, William J. Schmidt wrote:
On Mon, 2012-06-11 at 13:28 +0200, Richard Guenther wrote:
On Mon, Jun 4, 2012 at 3:45 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
Hi Richard,
Here's
On Mon, 2012-06-11 at 12:11 -0500, William J. Schmidt wrote:
I found this parameter that seems to correspond to well-predicted
conditional jumps:
/* When branch is predicted to be taken with probability lower than this
threshold (in percent), then it is considered well predictable
OK, once more with feeling... :)
This patch differs from the previous one in two respects: It disables
the optimization when either the then or else edge is well-predicted;
and it now uses the existing l1-cache-line-size parameter instead of a
new one (with updated commentary).
Bootstraps and
On Tue, 2012-06-12 at 12:59 +0200, Richard Guenther wrote:
Btw, with PR53533 I now have a case where multiplications of v4si are
really expensive on x86 without SSE 4.1. But we only have vect_stmt_cost
and no further subdivision ...
Thus we'd need a tree_code argument to the cost hook.