On Mon, Dec 9, 2013 at 12:27 PM, Jakub Jelinek <ja...@redhat.com> wrote: > Hi! > > On Fri, Dec 06, 2013 at 01:49:50PM +0100, Richard Biener wrote: >> > basic_block bb = ifc_bbs[i]; >> > gimple_seq stmts; >> > >> > - if (!is_predicated (bb)) >> > + if (!is_predicated (bb) >> > + || dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) >> >> isn't that redundant now? > > After IRC discussion, moved this dominated_by_p call and another one > from predicate_bbs to add_to_predicate_list, so that we don't change > an always true predicate to something else on a bb that dominates > loop->latch and therefore will be executed unconditionally inside of > the loop body. > >> > + if (TREE_CODE (addr) == SSA_NAME && !SSA_NAME_PTR_INFO (addr)) >> > + copy_ref_info (build2 (MEM_REF, TREE_TYPE (ref), addr, ptr), >> > + ref); >> >> Eh - can you split out a copy_ref_info_to_addr so you can avoid >> creating the MEM_REF? > > Haven't changed this one, because copy_ref_info uses the new_ref in > several places (in addition to the initial few ones). > >> > static bool >> > +version_loop_for_if_conversion (struct loop *loop, bool *do_outer) >> > +{ >> >> What's the do_outer parameter? > > Outer loops aren't versioned anymore. > >> Please add a comment before this. Seems you match what outer loop >> vectorization handles? Thus, best factor out a predicate in >> tree-loop-vect.c that you can use in both places? > > So this went away completely. > >> This needs a comment with explaining what code you create. > > Ditto (and several other spots). > >> Btw I hate that we do update_ssa multiple times per pass per >> function. That makes us possibly worse than O(N^2) as update_ssa computes >> the IDF of the whole function. >> >> This is something your patch introduces (it's only rewriting the >> virtuals, not the incremental SSA update by BB copying). > > This too. > >> See above. And factor this out into a function. Also move this >> to the cleanup loop below. > > Not moved to the cleanup loop, because there is no easy way to find > then if a loop has been vectorized or not. But it is in a separate helper > etc. > > The rest should be in this new version of the patch, bootstrapped/regtested > on x86_64-linux and i686-linux, ok for trunk? > > 2013-12-09 Jakub Jelinek <ja...@redhat.com> > > * tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field. > (LOOP_VINFO_SCALAR_LOOP): Define. > (slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument. > * config/i386/sse.md (maskload<mode>, maskstore<mode>): New expanders. > * tree-data-ref.c (get_references_in_stmt): Handle MASK_LOAD and > MASK_STORE. > * internal-fn.def (LOOP_VECTORIZED, MASK_LOAD, MASK_STORE): New > internal fns. > * tree-if-conv.c: Include expr.h, optabs.h, tree-ssa-loop-ivopts.h and > tree-ssa-address.h. > (release_bb_predicate): New function. > (free_bb_predicate): Use it. > (reset_bb_predicate): Likewise. Don't unallocate bb->aux > just to immediately allocate it again. > (add_to_predicate_list): Add loop argument. If basic blocks that > dominate loop->latch don't insert any predicate. > (add_to_dst_predicate_list): Adjust caller. > (if_convertible_phi_p): Add any_mask_load_store argument, if true, > handle it like flag_tree_loop_if_convert_stores. > (insert_gimplified_predicates): Likewise. > (ifcvt_can_use_mask_load_store): New function. > (if_convertible_gimple_assign_stmt_p): Add any_mask_load_store > argument, check if some conditional loads or stores can't be > converted into MASK_LOAD or MASK_STORE. > (if_convertible_stmt_p): Add any_mask_load_store argument, > pass it down to if_convertible_gimple_assign_stmt_p. > (predicate_bbs): Don't return bool, only check if the last stmt > of a basic block is GIMPLE_COND and handle that. Adjust > add_to_predicate_list caller. > (if_convertible_loop_p_1): Only call predicate_bbs if > flag_tree_loop_if_convert_stores and free_bb_predicate in that case > afterwards, check gimple_code of stmts here. Replace is_predicated > check with dominance check. Add any_mask_load_store argument, > pass it down to if_convertible_stmt_p and if_convertible_phi_p, > call if_convertible_phi_p only after all if_convertible_stmt_p > calls. > (if_convertible_loop_p): Add any_mask_load_store argument, > pass it down to if_convertible_loop_p_1. > (predicate_mem_writes): Emit MASK_LOAD and/or MASK_STORE calls. > (combine_blocks): Add any_mask_load_store argument, pass > it down to insert_gimplified_predicates and call predicate_mem_writes > if it is set. Call predicate_bbs. > (version_loop_for_if_conversion): New function. > (tree_if_conversion): Adjust if_convertible_loop_p and combine_blocks > calls. Return todo flags instead of bool, call > version_loop_for_if_conversion if if-conversion should be just > for the vectorized loops and nothing else. > (main_tree_if_conversion): Adjust caller. Don't call > tree_if_conversion for dont_vectorize loops if if-conversion > isn't explicitly enabled. > * tree-vect-data-refs.c (vect_check_gather): Handle > MASK_LOAD/MASK_STORE. > (vect_analyze_data_refs, vect_supportable_dr_alignment): Likewise. > * gimple.h (gimple_expr_type): Handle MASK_STORE. > * internal-fn.c (expand_LOOP_VECTORIZED, expand_MASK_LOAD, > expand_MASK_STORE): New functions. > * tree-vectorizer.c: Include tree-cfg.h and gimple-fold.h. > (vect_loop_vectorized_call, fold_loop_vectorized_call): New functions. > (vectorize_loops): Don't try to vectorize loops with > loop->dont_vectorize set. Set LOOP_VINFO_SCALAR_LOOP for if-converted > loops, fold LOOP_VECTORIZED internal call depending on if loop > has been vectorized or not. > * tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges): > New function. > (slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument. > If non-NULL, copy basic blocks from scalar_loop instead of loop, but > still to loop's entry or exit edge. > (slpeel_tree_peel_loop_to_edge): Add scalar_loop argument, pass it > down to slpeel_tree_duplicate_loop_to_edge_cfg. > (vect_do_peeling_for_loop_bound, vect_do_peeling_for_loop_alignment): > Adjust callers. > (vect_loop_versioning): If LOOP_VINFO_SCALAR_LOOP, perform loop > versioning from that loop instead of LOOP_VINFO_LOOP, move it to the > right place in the CFG afterwards. > * tree-vect-loop.c (vect_determine_vectorization_factor): Handle > MASK_STORE. > * cfgloop.h (struct loop): Add dont_vectorize field. > * tree-loop-distribution.c (copy_loop_before): Adjust > slpeel_tree_duplicate_loop_to_edge_cfg caller. > * optabs.def (maskload_optab, maskstore_optab): New optabs. > * passes.def: Add a note that pass_vectorize must immediately follow > pass_if_conversion. > * tree-predcom.c (split_data_refs_to_components): Give up if > DR_STMT is a call. > * tree-vect-stmts.c (vect_mark_relevant): Don't crash if lhs > is NULL. > (exist_non_indexing_operands_for_use_p): Handle MASK_LOAD > and MASK_STORE. > (vectorizable_mask_load_store): New function. > (vectorizable_call): Call it for MASK_LOAD or MASK_STORE. > (vect_transform_stmt): Handle MASK_STORE. > * tree-ssa-phiopt.c (cond_if_else_store_replacement): Ignore > DR_STMT where lhs is NULL. > * optabs.h (can_vec_perm_p): Fix up comment typo. > (can_vec_mask_load_store_p): New prototype. > * optabs.c (can_vec_mask_load_store_p): New function. >
This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59523. H.J.