https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79745
Bug ID: 79745
Summary: vec_init<> expander misses V2TImode with AVX and
V2OImode and V2TImode with AVX512
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 65832
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
The AVX case pessimizes CPUv6 x264 when vectorized with 256-bit vectors. The
vectorizer tries to load the two 128-bit halves, which are separated by a gap,
and build a 256-bit vector from them via
  /* Avoid emitting a constructor of vector elements by performing
     the loads using an integer type of the same size,
     constructing a vector of those and then re-interpreting it
     as the original vector type.  This works around the fact
     that the vec_init optab was only designed for scalar
     element modes and thus expansion goes through memory.
     This avoids a huge runtime penalty due to the general
     inability to perform store forwarding from smaller stores
     to a larger load.  */
  unsigned lsize
    = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
  enum machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
  enum machine_mode vmode = mode_for_vector (elmode,
                                             nunits / group_size);
  /* If we can't construct such a vector fall back to
     element loads of the original vector type.  */
  if (VECTOR_MODE_P (vmode)
      && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
    {
      nloads = nunits / group_size;
      lnel = group_size;
      ltype = build_nonstandard_integer_type (lsize, 1);
      lvectype = build_vector_type (ltype, nloads);
    }
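
To make the affected modes concrete, here is a minimal standalone sketch of
the size arithmetic in the snippet above. It is not GCC code: the names
load_plan, plan_loads, int_mode_name and print_vmode are hypothetical, and the
8-bit element width is an assumption (the report only states that two 128-bit
halves form a 256-bit vector).

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative stand-in for the size computation in the vectorizer
   snippet.  Not GCC code; all names here are hypothetical.  */
struct load_plan
{
  unsigned lsize;   /* width in bits of each integer load */
  unsigned nloads;  /* number of such loads per vector */
};

struct load_plan
plan_loads (unsigned nunits, unsigned elem_bits, unsigned group_size)
{
  struct load_plan p;

  assert (group_size != 0 && nunits % group_size == 0);
  p.lsize = group_size * elem_bits;   /* corresponds to lsize */
  p.nloads = nunits / group_size;     /* corresponds to nloads */
  return p;
}

/* Name of the integer mode with the given bit width, or "?" if this
   sketch does not know one.  */
const char *
int_mode_name (unsigned bits)
{
  switch (bits)
    {
    case 8:   return "QI";
    case 16:  return "HI";
    case 32:  return "SI";
    case 64:  return "DI";
    case 128: return "TI";
    case 256: return "OI";
    default:  return "?";
    }
}

/* Print the vector mode whose vec_init handler would be queried.  */
void
print_vmode (unsigned nunits, unsigned elem_bits, unsigned group_size)
{
  struct load_plan p = plan_loads (nunits, elem_bits, group_size);

  printf ("V%u%smode\n", p.nloads, int_mode_name (p.lsize));
}
```

Under these assumptions, print_vmode (32, 8, 16) prints V2TImode (a 256-bit
vector built from two 128-bit loads, the AVX case) and print_vmode (64, 8, 32)
prints V2OImode (a 512-bit vector built from two 256-bit loads). When the
target has no vec_init handler for that mode, the snippet falls back to
element loads of the original vector type.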
See also PR65832, which is broader.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65832
[Bug 65832] Inefficient vector construction