Re: [PATCH] Handle vectorization of invariant loads (PR46787)

2011-07-03 Thread Ira Rosen
Richard Guenther rguent...@suse.de wrote on 30/06/2011 06:24:50 PM: FYI, I'm testing the following which cures a fallout seen when building SPEC2k6 with the committed patch. It's suboptimal for j != 0 though - is there a way to get to the vectorized stmt of the j == 0 iteration? Yes, I

[patch] Fix PR tree-optimization/49610

2011-07-03 Thread Ira Rosen
Hi, This patch adds a missing check that a basic block exists before using it. Bootstrapped and tested on powerpc64-suse-linux. Committed. Ira ChangeLog: PR tree-optimization/49610 * tree-vect-loop.c (vect_is_slp_reduction): Check that DEF_STMT has a basic block.

[patch, vectorizer] Handle pattern statements with multiple uses

2011-06-29 Thread Ira Rosen
Hi, This is a follow-up patch for http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01205.html. The previous patch added support for widen-mult in an intermediate type, e.g.: char a_t; short a_it; int a_T, prod_T, prod_T'; S1 a_t = ; S3 a_T = (int) a_t; '--

[patch] Another enhancement of widen-mult in the vectorizer

2011-06-16 Thread Ira Rosen
Hi, For unsigned char in[N]; int out[N]; for (i = 0; i < N; i++) out[i] = in[i] * 300; in[i] is first promoted to int and then multiplied by 300. This over-promotion prevents the vectorizer from using the widen-mult pattern here. This patch checks if a constant fits an intermediate type
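For readers who want to try the loop from this message, here is a minimal sketch of it in plain C. The helper fits_in_u16 is hypothetical and only illustrates the idea of a constant fitting an intermediate type; the choice of a 16-bit intermediate type is an assumption, since the snippet is cut off before it says which type the patch uses.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not a GCC function): does the constant C fit in
   an unsigned 16-bit intermediate type?  */
static int fits_in_u16(long c)
{
    return c >= 0 && c <= USHRT_MAX;
}

/* The scalar loop from the message: each in[i] is promoted to int and
   then multiplied by 300.  */
static void scale_by_300(const unsigned char *in, int *out, size_t n)
{
    /* 300 does not fit in unsigned char but does fit in 16 bits.  */
    assert(300 > UCHAR_MAX && fits_in_u16(300));
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 300;
}
```

Since 300 is larger than UCHAR_MAX, a char-by-char multiply is not an option, which is presumably why an intermediate type comes into play.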

Re: [patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318)

2011-06-16 Thread Ira Rosen
On 14 June 2011 15:01, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Jun 14, 2011 at 1:38 PM, Ira Rosen ira.ro...@linaro.org wrote: On 14 June 2011 14:27, Richard Guenther richard.guent...@gmail.com wrote:   /* Mark the stmts that are involved in the pattern

Re: [patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318)

2011-06-16 Thread Ira Rosen
(See attached file: pattern2.txt)Index: ChangeLog === --- ChangeLog (revision 175073) +++ ChangeLog (working copy) @@ -1,3 +1,31 @@ +2011-06-15 Ira Rosen ira.ro...@linaro.org + + * tree-vect-loop-manip.c

Re: [patch] Another enhancement of widen-mult in the vectorizer

2011-06-16 Thread Ira Rosen
+         TREE_CODE (half_type1) == INTEGER_TYPE) +       { +         if (int_fits_type_p (oprnd0, half_type1)) I believe you need to check that oprnd0 is a INTEGER_CST before calling int_fits_type_p. +            { +             /* OPRND0 is a constant of HALF_TYPE1.  */ The whole

Re: [patch, testsuite] Fix vectorizer testsuite failures on ARM

2011-06-15 Thread Ira Rosen
Steve Ellcey s...@cup.hp.com wrote on 15/06/2011 08:15:27 PM: testsuite/ChangeLog: * gcc.dg/vect/vect-16.c: Rename to ... * gcc.dg/vect/no-fast-math-vect16.c: ... this. * gcc.dg/vect/vect-peel-3.c: Adjust misalignment values for double-word vectors.

Re: [patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318)

2011-06-14 Thread Ira Rosen
On 14 June 2011 13:02, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Jun 13, 2011 at 2:43 PM, Ira Rosen ira.ro...@linaro.org wrote: On 10 June 2011 12:14, Richard Guenther richard.guent...@gmail.com wrote: In the end I think we should not generate the pattern stmt during pattern

Re: [patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318)

2011-06-14 Thread Ira Rosen
On 14 June 2011 14:27, Richard Guenther richard.guent...@gmail.com wrote:   /* Mark the stmts that are involved in the pattern. */ -  gsi_insert_before (si, pattern_stmt, GSI_SAME_STMT);   set_vinfo_for_stmt (pattern_stmt,                      new_stmt_vec_info (pattern_stmt, loop_vinfo,

[patch] Fix PR tree-optimization/49352

2011-06-13 Thread Ira Rosen
-linux. Committed. Ira ChangeLog: 2011-06-13 Jakub Jelinek ja...@redhat.com Ira Rosen ira.ro...@linaro.org PR tree-optimization/49352 * tree-vect-loop.c (vect_is_slp_reduction): Don't count debug uses at all, make sure loop_use_stmt after the loop is a def stmt

Re: [patch, testsuite] Fix vectorizer testsuite failures on ARM

2011-06-13 Thread Ira Rosen
On 9 June 2011 13:00, Ira Rosen ira.ro...@linaro.org wrote: Hi, This patch fixes several vectorizer testsuite failures on ARM: - vect-16.c checks that the vectorization fails without -ffast-math, but -ffast-math is a default flag for vector tests on ARM. I renamed the test to no-fast-math

[patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318)

2011-06-13 Thread Ira Rosen
On 10 June 2011 12:14, Richard Guenther richard.guent...@gmail.com wrote: In the end I think we should not generate the pattern stmt during pattern matching but only mark the relevant statements with a pattern kind.  Say, for each pattern we have a main statement that has related stmts

[patch] Improve peeling heuristic in the vectorizer

2011-06-12 Thread Ira Rosen
Hi, gcc.dg/vect/vect-72.c is not expected to use loop peeling for alignment, but it does on ARM with double-word vectors. The loop contains two data-refs of type char: one is aligned and the other is misaligned by 1. When the cost model is disabled the peeling heuristic chooses to peel a number

[patch] Fix PR tree-optimization/49318

2011-06-10 Thread Ira Rosen
copy) @@ -1,3 +1,9 @@ +2011-06-10 Ira Rosen ira.ro...@linaro.org + + PR tree-optimization/49318 + * tree-vect-loop.c (vect_determine_vectorization_factor): Remove + irrelevant pattern statements. + 2011-06-10 Hans-Peter Nilsson h...@axis.com * system.h

Re: [patch] Fix PR tree-optimization/49318

2011-06-10 Thread Ira Rosen
On 10 June 2011 12:14, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Jun 10, 2011 at 9:19 AM, Ira Rosen ira.ro...@linaro.org wrote: Hi, The test in PR 49318 fails because the vectorizer recognizes address computation sequence as a widening-multiplication pattern, while

[patch, testsuite] Fix vectorizer testsuite failures on ARM

2011-06-09 Thread Ira Rosen
Hi, This patch fixes several vectorizer testsuite failures on ARM: - vect-16.c checks that the vectorization fails without -ffast-math, but -ffast-math is a default flag for vector tests on ARM. I renamed the test to no-fast-math-vect-16.c to avoid the use of the flag for it. - vect-peel-3.c and

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Ira Rosen
On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote: On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote: Did you think about moving pass_optimize_widening_mul before loop

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-02 Thread Ira Rosen
On 2 June 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote: On Thu, Jun 2, 2011 at 10:46 AM, Ira Rosen ira.ro...@linaro.org wrote: On 1 June 2011 15:14, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen ira.ro...@linaro.org wrote: On 1

Re: [patch] Fix PR tree-optimization/49038

2011-06-02 Thread Ira Rosen
On 26 May 2011 10:52, Ira Rosen ira.ro...@linaro.org wrote: Hi, The vectorizer supports strided loads with gaps, e.g., when only a[4i] and a[4i+2] are accessed, it generates a vector load a[4i:4i+3], i.e., creating an access to a[4i+3], which doesn't exist in the scalar code. This access

[patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Ira Rosen
Hi, The vectorizer expects widening multiplication pattern to be: type a_t, b_t; TYPE a_T, b_T, prod_T; a_T = (TYPE) a_t; b_T = (TYPE) b_t; prod_T = a_T * b_T; where type 'TYPE' is double the size of type 'type'. This works fine when the types are signed. For the
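The pattern described in this message can be written out as an ordinary C loop. This is only a sketch under the assumption that 'type' is a 16-bit integer and 'TYPE' is a 32-bit integer; the function name widen_mult is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* type = int16_t, TYPE = int32_t (TYPE is double the size of type).  */
static void widen_mult(const int16_t *a, const int16_t *b,
                       int32_t *prod, size_t n)
{
    assert(a && b && prod);
    for (size_t i = 0; i < n; i++) {
        int32_t a_T = (int32_t) a[i];   /* a_T = (TYPE) a_t;   */
        int32_t b_T = (int32_t) b[i];   /* b_T = (TYPE) b_t;   */
        prod[i] = a_T * b_T;            /* prod_T = a_T * b_T; */
    }
}
```

The product of two 16-bit values always fits in 32 bits, so the widened multiply cannot overflow.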

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Ira Rosen
On 1 June 2011 12:42, Richard Guenther richard.guent...@gmail.com wrote: Did you think about moving pass_optimize_widening_mul before loop optimizations?  Does that pass catch the cases you are teaching the pattern recognizer?  I think we should try to expose these more complicated

[patch, testsuite] Fix PR 49239

2011-05-31 Thread Ira Rosen
Hi, gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c randomly fails on Linux/ia32. I think it's because I forgot to initialize the output array. Tested on x86_64-suse-linux. Committed as obvious. Ira testsuite/ChangeLog: PR testsuite/49239 *

[patch] Fix PR tree-optimization/49093

2011-05-31 Thread Ira Rosen
Hi, This patch fails vectorization for volatile data references. Bootstrapped on powerpc64-suse-linux and tested on powerpc64-suse-linux and x86_64-suse-linux. Applied to trunk. OK for 4.6 after testing? Thanks, Ira ChangeLog: PR tree-optimization/49093 * tree-vect-data-refs.c

[patch] Fix PR49199 - ICE with SLP reduction

2011-05-30 Thread Ira Rosen
-tree-scev-cprop. Index: ChangeLog === --- ChangeLog (revision 174424) +++ ChangeLog (working copy) @@ -1,3 +1,10 @@ +2011-05-30 Ira Rosen ira.ro...@linaro.org + + PR tree-optimization/49199 + * tree-vect-loop.c

[patch] Fix PR testsuite/49222

2011-05-29 Thread Ira Rosen
Hi, This patch uses MAP_ANON if MAP_ANONYMOUS is not defined fixing this test's failure on x86_64-apple-darwin10. Tested on x86_64-suse-linux and on x86_64-apple-darwin10 (by Dominique). OK to apply? Thanks, Ira testsuite/ChangeLog: PR testsuite/49222 * gcc.dg/vect/pr49038.c: Use
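The fallback described here is commonly written as a preprocessor guard. The sketch below shows that idiom in a standalone file; it is not a copy of the change to pr49038.c, and the function alloc_anonymous and the _DEFAULT_SOURCE define are assumptions added for illustration.

```c
/* Assumption: on glibc this feature macro exposes MAP_ANON/MAP_ANONYMOUS
   even under strict -std= modes.  */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Some systems only provide the older name MAP_ANON.  */
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: map LEN bytes of zero-filled private memory.
   Returns NULL on failure.  */
static void *alloc_anonymous(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```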

[patch] Fix PR tree-optimization/49038

2011-05-26 Thread Ira Rosen
Hi, The vectorizer supports strided loads with gaps, e.g., when only a[4i] and a[4i+2] are accessed, it generates a vector load a[4i:4i+3], i.e., creating an access to a[4i+3], which doesn't exist in the scalar code. This access may be invalid as described in the PR. This patch creates an
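To make the access pattern concrete, here is a small sketch of a scalar loop with the same gaps, plus two helpers that compute the highest index touched by the scalar code and by a full four-element load. All function names are hypothetical, and the helpers only do index arithmetic; they do not model what the vectorizer emits.

```c
#include <assert.h>
#include <stddef.h>

/* Reads only a[4*i] and a[4*i + 2]; a[4*i + 1] and a[4*i + 3] are gaps.  */
static void sum_strided(const int *a, int *out, size_t groups)
{
    assert(a && out);
    for (size_t i = 0; i < groups; i++)
        out[i] = a[4 * i] + a[4 * i + 2];
}

/* Highest index read by the scalar loop (requires groups > 0).  */
static size_t last_scalar_index(size_t groups)
{
    assert(groups > 0);
    return 4 * (groups - 1) + 2;
}

/* Highest index read if each group is loaded as a[4i:4i+3].  */
static size_t last_vector_index(size_t groups)
{
    assert(groups > 0);
    return 4 * (groups - 1) + 3;
}
```

With two groups, an array of seven elements is enough for the scalar loop, while a full vector load of the last group would also read index 7, one element past the end.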

[patch] Fix PR 49087 (was Re: Fix crash in vect_is_slp_reduction)

2011-05-22 Thread Ira Rosen
No, we shouldn't arrive with a NULL use_stmt here. I think a proper fix will be to fail if there are no uses. I'll prepare a patch on Sunday. Here is the patch. It bails out if LHS has no uses. Bootstrapped and tested on powerpc64-suse-linux. Committed. Ira ChangeLog: PR

Re: Fix crash in vect_is_slp_reduction

2011-05-20 Thread Ira Rosen
gcc-patches-ow...@gcc.gnu.org wrote on 20/05/2011 05:17:47 PM: On Fri, May 20, 2011 at 4:06 PM, Ryan Mansfield rmansfi...@qnx.com wrote: There is a crash in vect_is_slp_reduction where use_stmt doesn't get initialized in the FOR_EACH_IMM_USE_FAST loop. 1718          

[patch] [1/2] Support reduction in loop SLP

2011-05-18 Thread Ira Rosen
Hi, This is the first part of reduction support in loop-aware SLP. The purpose of the patch is to handle unrolled reductions such as: #a1 = phi a0, a5 ... a2 = a1 + x ... a3 = a2 + y ... a5 = a4 + z Such a sequence of statements is gathered into a reduction chain and serves as a root for an SLP
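A source loop that could give rise to such a chain might look like the sketch below: a sum reduction unrolled by hand four times, so that each iteration performs four dependent additions into the same accumulator. The function name and the unroll factor of four are assumptions for illustration; the comments map the statements to the names used in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Sums 4 * groups elements of A with a manually unrolled reduction.  */
static int sum_unrolled4(const int *a, size_t groups)
{
    assert(a);
    int sum = 0;                    /* a0 */
    for (size_t i = 0; i < groups; i++) {
        /* a1 = phi (a0, a5) at the loop header */
        sum += a[4 * i];            /* a2 = a1 + x */
        sum += a[4 * i + 1];        /* a3 = a2 + y */
        sum += a[4 * i + 2];        /* a4 = a3 + ... */
        sum += a[4 * i + 3];        /* a5 = a4 + z */
    }
    return sum;
}
```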

[patch] [2/2] Support reduction in loop SLP

2011-05-18 Thread Ira Rosen
This part adds the actual code for reduction support. Bootstrapped and tested on powerpc64-suse-linux. I am planning to apply it later today. Ira ChangeLog: PR tree-optimization/41881 * tree-vectorizer.h (struct _loop_vec_info): Add new field reduction_chains along with a macro

Re: Ping: Make 128 bits the default vector size for NEON

2011-05-08 Thread Ira Rosen
On 6 May 2011 13:29, Richard Earnshaw rearn...@arm.com wrote: On Thu, 2011-04-21 at 09:02 +0300, Ira Rosen wrote: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02172.html The last version: ChangeLog:      * doc/invoke.texi (preferred-vector-size): Document.      * params.h

Re: Ping: Make 128 bits the default vector size for NEON

2011-05-08 Thread Ira Rosen
On 8 May 2011 15:02, Gerald Pfeifer ger...@pfeifer.com wrote: On Sun, 8 May 2011, Ira Rosen wrote: How about ARM specific flag similar to -mprefer-avx128 (not tested)? If this goes in, please also update gcc-4.7/changes.html. Do you mean that the new flag should be documented? This patch http

Re: Ping: Make 128 bits the default vector size for NEON

2011-05-08 Thread Ira Rosen
Gerald Pfeifer ger...@pfeifer.com wrote on 09/05/2011 01:53:35 AM: On Sun, 8 May 2011, Ira Rosen wrote: If this goes in, please also update gcc-4.7/changes.html. Do you mean that the new flag should be documented? Yes, as we're adding new flags, it's (nearly?) always a good idea

Re: [patch, ARM] Fix PR target/48252

2011-05-01 Thread Ira Rosen
Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote on 07/04/2011 03:16:44 PM: On 07/04/11 08:42, Ira Rosen wrote: Hi, This patch makes both outputs of neon_vzip/vuzp/vtrn_internal explicitly dependent on both inputs, preventing incorrect optimization: for (a,b)- vzip (c

Re: [RFT][patch] Fix PR testsuite/48498

2011-05-01 Thread Ira Rosen
gcc-patches-ow...@gcc.gnu.org wrote on 20/04/2011 02:24:55 PM: Hi, In gcc.dg/vect/slp-3.c and gcc.dg/vect/no-vfa-pr29145.c vectorization is expected to fail on targets vect_no_align. But no realignment is necessary here except for having the array bases aligned. This patch removes xfail

[patch, vectorizer] Fix PR tree-optimization/48765

2011-04-28 Thread Ira Rosen
Hi, Sometimes loop vectorization factor changes during the analysis, while statement analysis depends on it. This patch moves the update of the vectorization before statements, avoiding current difference between the analysis and the transformations phases that caused the problem described in

Re: Move STMT_VINFO_TYPE assignment in vectorizable_reduction

2011-04-28 Thread Ira Rosen
gcc-patches-ow...@gcc.gnu.org wrote on 28/04/2011 05:30:35 PM: When I started looking at PR 48765, I noticed that vectorizable_reduction set STMT_VINFO_TYPE before checking the reduction cost. This probably doesn't matter in practice, and certainly has nothing to do with fixing the PR

Re: [patch, vectorizer] Fix PR tree-optimization/48765

2011-04-28 Thread Ira Rosen
+1,25 @@ +2011-04-28 Ira Rosen ira.ro...@linaro.org + + PR tree-optimization/48765 + * tree-vectorizer.h (vect_make_slp_decision): Return bool. + * tree-vect-loop.c (vect_analyze_loop_operations): Add new argument + to indicate if loop aware SLP is being used. Scan

Ping: Make 128 bits the default vector size for NEON

2011-04-21 Thread Ira Rosen
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02172.html The last version: ChangeLog: * doc/invoke.texi (preferred-vector-size): Document. * params.h (PREFERRED_VECTOR_SIZE): Define. * config/arm/arm.c (arm_preferred_simd_mode): Use param PREFERRED_VECTOR_SIZE instead of

[RFT][patch] Fix PR testsuite/48498

2011-04-20 Thread Ira Rosen
Hi, In gcc.dg/vect/slp-3.c and gcc.dg/vect/no-vfa-pr29145.c vectorization is expected to fail on targets vect_no_align. But no realignment is necessary here except for having the array bases aligned. This patch removes xfail for vect_no_align (and increases a loop bound in slp-3.c to prevent

Re: [PATCH] Fix SLP vectorization of shifts (PR tree-optimization/48616)

2011-04-18 Thread Ira Rosen
Jakub Jelinek ja...@redhat.com wrote on 17/04/2011 05:26:14 PM: On Sun, Apr 17, 2011 at 11:30:31AM +0300, Ira Rosen wrote: We already have this check in vect_build_slp_tree(). It didn't work for the testcase because it doesn't take into account the type of definition. I agree that it's

Re: [patch, ARM] Make 128 bits the default vector size for NEON

2011-04-07 Thread Ira Rosen
On 6 April 2011 16:07, Hans-Peter Nilsson hans-peter.nils...@axis.com wrote: Date: Thu, 31 Mar 2011 13:39:05 +0200 From: Ira Rosen ira.ro...@linaro.org This patch changes NEON's default vector size from 64 to 128 bits. I'm wondering, are there NEON-specific measurements to support

[patch, ARM] Fix PR target/48252

2011-04-07 Thread Ira Rosen
after testing? Thanks, Ira ChangeLog: 2011-04-07 Ulrich Weigand ulrich.weig...@linaro.org Ira Rosen ira.ro...@linaro.org PR target/48252 * config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments to match neon_vzip/vuzp/vtrn_internal. * config/arm

Re: [patch, ARM] Make 128 bits the default vector size for NEON

2011-04-06 Thread Ira Rosen
On 5 April 2011 15:30, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: On 31/03/11 12:39, Ira Rosen wrote: Hi, This patch changes NEON's default vector size from 64 to 128 bits. The patch doesn't touch mvectorize-with-neon-quad, but removes the uses

[wwwdocs][patch] Document NEON default vector size change

2011-04-05 Thread Ira Rosen
Hi, As pointed out here http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02183.html the change of the default vector size has to be documented in changes.html: * htdocs/gcc-4.7/changes.html (targets): Document ARM NEON default vector size change. Index: htdocs/gcc-4.7/changes.html

[patch, ARM] Make 128 bits the default vector size for NEON

2011-03-31 Thread Ira Rosen
Hi, This patch changes NEON's default vector size from 64 to 128 bits. The patch doesn't touch mvectorize-with-neon-quad, but removes the uses of TARGET_NEON_VECTORIZE_QUAD. Following Julian's suggestion I added a param preferred-vector-size for testing and debugging purposes. I tested a

Re: [patch, ARM] Make 128 bits the default vector size for NEON

2011-03-31 Thread Ira Rosen
On 31 March 2011 14:28, Joseph S. Myers jos...@codesourcery.com wrote: On Thu, 31 Mar 2011, Ira Rosen wrote: +Illegal values are ignored.  The default is 128. See the GNU Coding Standards http://www.gnu.org/prep/standards/html_node/GNU-Manuals.html:   Please do not use the term illegal

Re: [patch, ARM] Make 128 bits the default vector size for NEON

2011-03-31 Thread Ira Rosen
On 31 March 2011 15:11, Nathan Froyd froy...@codesourcery.com wrote: On Thu, Mar 31, 2011 at 01:39:05PM +0200, Ira Rosen wrote: This patch changes NEON's default vector size from 64 to 128 bits. No comments about the patch itself, but this change should be noted in changes.html. I'll do

[RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
Hi, With this patch a data-ref is marked as unconditionally read or written also if its adjacent field is read or written unconditionally in the loop. My concern is that this is not safe enough, even though the fields have to be non-pointers and non-aggregates, and this optimization is applied

Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
On 30 March 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen ira.ro...@linaro.org wrote: Hi, With this patch a data-ref is marked as unconditionally read or written also if its adjacent field is read or written unconditionally

Re: [RFC][patch] If-conversion of COMPONENT_REFs

2011-03-30 Thread Ira Rosen
On 30 March 2011 14:41, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen ira.ro...@linaro.org wrote: On 30 March 2011 12:59, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen ira.ro...@linaro.org wrote

[patch] Fix PR tree-optmization/48290

2011-03-29 Thread Ira Rosen
Hi, This patch fixes the vectorizer part of PR tree-optmization/48290 by checking that if we have a phi in outer loop in the basic block after the inner loop, then this phi is really inner loop's exit phi, i.e., its operand is defined in the inner loop. Bootstrapped with vectorization enabled

[patch, testsuite] Another fix for gcc.dg/vect/vect-cselim-1.c

2011-03-26 Thread Ira Rosen
Hi, vect-cselim-1.c contains strided memory accesses and is not vectorizable on targets that do not support such accesses. Tested on powerpc64-suse-linux. Committed as obvious. Ira testsuite/ChangeLog: * gcc.dg/vect/vect-cselim-1.c: Fail on targets that don't support strided

[patch, ARM] Enable auto-detection of vector size for NEON

2011-03-24 Thread Ira Rosen
Hi, This patch implements TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES for ARM NEON. Regtested on arm-linux-gnueabi. OK for trunk? Thanks, Ira ChangeLog: * config/arm/arm.c (arm_autovectorize_vector_sizes): New function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES):

Re: [patch, ARM] Enable auto-detection of vector size for NEON

2011-03-24 Thread Ira Rosen
On 24 March 2011 13:03, Joseph S. Myers jos...@codesourcery.com wrote: On Thu, 24 Mar 2011, Ira Rosen wrote: Hi, This patch implements TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES for ARM NEON. Given the multiple vector sizes support, is there a reason not to enable -mvectorize-with-neon

[patch] Enhance conditional store sinking

2011-03-16 Thread Ira Rosen
Hi, This patch adds support for conditional store sinking for cases with multiple data references in then and else basic blocks. The correctness of the transformation is checked by verifying that there are no read-after-write and write-after-write dependencies. Bootstrapped and tested on
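The effect of this kind of transformation can be illustrated by hand at the source level. The two functions below are a sketch, not output of the GCC pass: the first stores to two arrays inside both branches, the second computes the values in the branches and performs each store once after the join. The function names and the stored constants are made up, and the sketch assumes a and b do not overlap, in line with the dependence check mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Before: each branch contains two stores.  */
static void stores_in_branches(const int *c, int *a, int *b, size_t n)
{
    assert(c && a && b);
    for (size_t i = 0; i < n; i++) {
        if (c[i]) {
            a[i] = 1;
            b[i] = 2;
        } else {
            a[i] = 3;
            b[i] = 4;
        }
    }
}

/* After (hand-written): the stores are sunk below the condition, so
   each array is written unconditionally once per iteration.  */
static void stores_sunk(const int *c, int *a, int *b, size_t n)
{
    assert(c && a && b);
    for (size_t i = 0; i < n; i++) {
        int va, vb;
        if (c[i]) {
            va = 1;
            vb = 2;
        } else {
            va = 3;
            vb = 4;
        }
        a[i] = va;
        b[i] = vb;
    }
}
```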

Re: [patch] Enhance conditional store sinking

2011-03-16 Thread Ira Rosen
On 16 March 2011 12:29, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Mar 16, 2011 at 7:49 AM, Ira Rosen ira.ro...@linaro.org wrote: Hi, This patch adds a support of conditional store sinking for cases with multiple data references in then and else basic blocks. The correctness
