On 16/12/15 15:01, Richard Biener wrote:

> The following patch adds a heuristic to prefer store/load-lanes
> over SLP when vectorizing.  Compared to the variant attached to
> the PR I made the STMT_VINFO_STRIDED_P behavior explicit (matching
> what you've tested).

Not sure I follow this. Compared to the variant attached to the PR, we will now attempt to use load-lanes if (say) all of the loads are strided, even if we know we don't support load-lanes for any of them. That sounds the wrong way around, and I think it is rather different from what you proposed earlier? (At the least, the debug message "can use load/store lanes" is potentially misleading: that's not necessarily the case!)
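
The concern can be sketched with a toy model in Python. This is not GCC code: `Load` and both function names are hypothetical, and the first function is only one reading of the behaviour described above (strided loads bypassing the load-lanes support check), not a transcription of the patch.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Load:
    """Hypothetical stand-in for one grouped load; not a GCC data structure."""
    strided: bool          # models STMT_VINFO_STRIDED_P for the load
    lanes_supported: bool  # models "the target has load-lanes for this group"


def cancel_slp_skipping_strided(loads: Sequence[Load]) -> bool:
    """Assumed reading of the patch: a strided load never vetoes load-lanes,
    so the support check is only applied to non-strided loads."""
    return all(ld.strided or ld.lanes_supported for ld in loads)


def cancel_slp_requiring_support(loads: Sequence[Load]) -> bool:
    """Alternative: prefer load-lanes only when every load is non-strided
    and the target actually supports load-lanes for it."""
    return all(not ld.strided and ld.lanes_supported for ld in loads)


# All loads strided, none supported by load-lanes.
all_strided_unsupported = [Load(strided=True, lanes_supported=False)] * 2
print(cancel_slp_skipping_strided(all_strided_unsupported))   # True
print(cancel_slp_requiring_support(all_strided_unsupported))  # False
```

Under the first predicate, SLP would be dropped in favour of load-lanes that the target cannot generate, which is the case where a "can use load/store lanes" message would be misleading.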

There are arguments for doing less SLP generally on ARM/AArch64, but I think Wilco's permute cost patch https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01469.html is a better way of achieving that?

Just my gut feeling at this point - I haven't evaluated this version of the patch on any benchmarks etc...

Thanks, Alan
