Hi All,

Here is a patch that gives a significant speed-up on Haswell for tests containing masked stores. Its main goal is to avoid a hardware hazard for maskmove instructions by inserting an additional check for a zero mask and moving all masked store statements into a separate block on the false edge. All MASK_STORE statements that have the same mask are put into one block.

Any comments will be appreciated.
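To illustrate the shape of the transformation described above, here is a minimal, portable C sketch. It is not code from the patch: the helper names (mask_store, mask_is_zero, guarded_stores) and the lane count VF are hypothetical, and the masked store is emulated with a scalar loop instead of a real maskmove instruction. It only shows the control flow: one zero-mask check guards a block holding every store that shares the mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VF 8 /* assumed lane count, e.g. 32-bit elements in a 256-bit vector */

/* Scalar stand-in for a masked vector store: only lanes whose mask
   element is nonzero are written.  */
static void mask_store(int32_t *dst, const int32_t *mask, const int32_t *src)
{
    for (size_t i = 0; i < VF; i++)
        if (mask[i])
            dst[i] = src[i];
}

/* Returns nonzero when every lane of the mask is zero.  */
static int mask_is_zero(const int32_t *mask)
{
    int32_t any = 0;
    for (size_t i = 0; i < VF; i++)
        any |= mask[i];
    return any == 0;
}

/* One vector iteration after the transformation.  Both stores use the
   same mask, so they sit together in the block reached when the
   "mask == 0" test is false.  Returns the number of masked stores
   that were executed.  */
static int guarded_stores(int32_t *a, int32_t *b, const int32_t *mask,
                          const int32_t *va, const int32_t *vb)
{
    assert(a && b && mask && va && vb);

    if (mask_is_zero(mask))
        return 0;            /* true edge: skip all masked stores */

    mask_store(a, mask, va); /* false edge: stores sharing the mask */
    mask_store(b, mask, vb);
    return 2;
}
```

When the mask is all zeros, no masked store is executed at all, which is the case the added check is meant to catch.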
ChangeLog:

2015-05-06  Yuri Rumyantsev  <ysrum...@gmail.com>

	* cfgloop.h (has_mask_store): Add new field to struct loop.
	* config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
	(ix86_vectorize_zero_vector): New function.
	(TARGET_VECTORIZE_ZERO_VECTOR): New target macro.
	* doc/tm.texi.in: Add @hook TARGET_VECTORIZE_ZERO_VECTOR.
	* doc/tm.texi: Updated.
	* target.def (zero_vector): New DEFHOOK.
	* tree-if-conv.c (predicate_mem_writes): Set has_mask_store for loop.
	* tree-vect-stmts.c: Include tree-into-ssa.h.
	(optimize_mask_stores): New function.
	* tree-vectorizer.c (vectorize_loops): Zero has_mask_store field for
	non-vectorized loops and invoke optimize_mask_stores function.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.