Hi all,

The aarch64_sqdml<SBINQOPS:as>l patterns are of the form:

  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
        (SBINQOPS:<VWIDE>
          (match_operand:<VWIDE> 1 "register_operand" "0")
          (ss_ashift:<VWIDE>
            (mult:<VWIDE>
              (sign_extend:<VWIDE>
                (match_operand:VSD_HSI 2 "register_operand" "w"))
              (sign_extend:<VWIDE>
                (match_operand:VSD_HSI 3 "register_operand" "w")))
            (const_int 1))))]

where SBINQOPS iterates over ss_plus and ss_minus.

The problem is that for the ss_plus case the RTL is not canonical: the
(match_operand 1) should be the second arm of the PLUS. I've seen this
manifest in combine missing some legitimate simplifications, because it
generates the canonical ss_plus form and then fails to match the pattern.

This patch splits the patterns into separate ss_plus and ss_minus forms,
each written in its canonical form. I've seen this improve my testcase
(which I can't include as it's too large and not easy to test reliably).

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.

Thanks,
Kyrill

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (aarch64_sqdml<SBINQOPS:as>l<mode>):
	Split into...
	(aarch64_sqdmlal<mode>): ... This...
	(aarch64_sqdmlsl<mode>): ... And this.
	(aarch64_sqdml<SBINQOPS:as>l_lane<mode>): Split into...
	(aarch64_sqdmlal_lane<mode>): ... This...
	(aarch64_sqdmlsl_lane<mode>): ... And this.
	(aarch64_sqdml<SBINQOPS:as>l_laneq<mode>): Split into...
	(aarch64_sqdmlsl_laneq<mode>): ... This...
	(aarch64_sqdmlal_laneq<mode>): ... And this.
	(aarch64_sqdml<SBINQOPS:as>l_n<mode>): Split into...
	(aarch64_sqdmlsl_n<mode>): ... This...
	(aarch64_sqdmlal_n<mode>): ... And this.
	(aarch64_sqdml<SBINQOPS:as>l2<mode>_internal): Split into...
	(aarch64_sqdmlal2<mode>_internal): ... This...
	(aarch64_sqdmlsl2<mode>_internal): ... And this.
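
For readers without the attachment to hand, here is a rough sketch of what the canonical ss_plus RTL template for aarch64_sqdmlal<mode> would look like. It assumes the only change from the form quoted at the top is swapping the two arms of the ss_plus, so that the more complex ss_ashift expression comes first and the accumulator register operand comes second. The insn condition, output template and attributes are left out, and the attached patch remains the authoritative version.

```
;; Sketch only: RTL template for the ss_plus (sqdmlal) case, with the
;; accumulator (operand 1) as the second arm of the ss_plus.
[(set (match_operand:<VWIDE> 0 "register_operand" "=w")
      (ss_plus:<VWIDE>
        (ss_ashift:<VWIDE>
          (mult:<VWIDE>
            (sign_extend:<VWIDE>
              (match_operand:VSD_HSI 2 "register_operand" "w"))
            (sign_extend:<VWIDE>
              (match_operand:VSD_HSI 3 "register_operand" "w")))
          (const_int 1))
        (match_operand:<VWIDE> 1 "register_operand" "0")))]
```

The ss_minus (sqdmlsl) case is not commutative, so under the same assumption it would keep the operand order of the original pattern, with operand 1 as the first arm.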
sqdmlal.patch