https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105816

Robin Dapp <rdapp at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rdapp at gcc dot gnu.org

--- Comment #4 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> where we build the vector from scalars (and fail to reject this via costing):
> 
>   _1 = BIT_FIELD_REF <src1_9(D), 32, 0>;
>   _2 = BIT_FIELD_REF <src1_9(D), 32, 32>;
>   _3 = BIT_FIELD_REF <src1_9(D), 32, 64>;
>   _4 = BIT_FIELD_REF <src1_9(D), 32, 96>;
>   _5 = BIT_FIELD_REF <src2_16(D), 32, 0>;
>   _6 = BIT_FIELD_REF <src2_16(D), 32, 32>;
>   _7 = BIT_FIELD_REF <src2_16(D), 32, 64>;
>   _8 = BIT_FIELD_REF <src2_16(D), 32, 96>;
>   _21 = {_1, _2, _3, _4, _5, _6, _7, _8};
>   vectp.4_22 = &BIT_FIELD_REF <*dst_11(D), 32, 0>;
> 
> t.c:6:13: note: Cost model analysis for part in loop 0:
>   Vector cost: 48
>   Scalar cost: 96
> t.c:6:13: note: Basic block will be vectorized using SLP
> 
> Thus re-confirmed.

Unless I'm mistaken, I have a patch for this that extends the
constructor-from-BIT_FIELD_REF optimization to two sources.  Right now we only
handle a single source.

The same pattern also shows up in x264, and after "propagation" it just decays
to a simple permute (or even none at all).  Right now we only see it in
forwprop, when it is already too late.  The patch is intended for stage 1.
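
To illustrate why the quoted constructor can decay to a permute, here is a
minimal scalar sketch in plain C.  It is not GCC code and not the patch itself:
extract_lane, build_from_lanes, permute2 and
constructor_equals_identity_permute are hypothetical names, and arrays stand in
for the vector operands.  It only models the idea that a constructor fed by
lane extracts from two sources selects the same elements as a two-source
permute with a fixed selector.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 4 };  /* 128-bit source with 32-bit elements, as in the dump */

/* Hypothetical model of BIT_FIELD_REF <src, 32, bitpos>: lane = bitpos / 32. */
static int32_t extract_lane(const int32_t src[LANES], unsigned bitpos)
{
    assert(bitpos % 32 == 0 && bitpos / 32 < LANES);
    return src[bitpos / 32];
}

/* Models _21 = {_1, ..., _8}: four lanes from src1, then four from src2. */
static void build_from_lanes(const int32_t src1[LANES],
                             const int32_t src2[LANES],
                             int32_t out[2 * LANES])
{
    for (unsigned i = 0; i < LANES; i++) {
        out[i] = extract_lane(src1, 32 * i);
        out[LANES + i] = extract_lane(src2, 32 * i);
    }
}

/* Two-source permute: a selector index below LANES picks from src1,
   any other index picks from src2.  */
static void permute2(const int32_t src1[LANES], const int32_t src2[LANES],
                     const unsigned sel[2 * LANES], int32_t out[2 * LANES])
{
    for (unsigned i = 0; i < 2 * LANES; i++) {
        assert(sel[i] < 2 * LANES);
        out[i] = sel[i] < LANES ? src1[sel[i]] : src2[sel[i] - LANES];
    }
}

/* Returns 1 if the lane-by-lane constructor matches the permute with the
   identity selector {0, ..., 7}, i.e. a plain concatenation.  */
static int constructor_equals_identity_permute(const int32_t src1[LANES],
                                               const int32_t src2[LANES])
{
    static const unsigned identity[2 * LANES] = {0, 1, 2, 3, 4, 5, 6, 7};
    int32_t a[2 * LANES], b[2 * LANES];

    build_from_lanes(src1, src2, a);
    permute2(src1, src2, identity, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

With the selector from the dump every lane stays in place, so in this model the
permute is just a concatenation of the two sources, which is the "or none even"
case.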
