https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99746
Tamar Christina <tnfchris at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
--- Comment #7 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Reduced to:
      SUBROUTINE CLAREF(A)
      LOGICAL BLOCK
      COMPLEX T1 , V2
      COMPLEX A(LDA, *) , SUM
      LOGICAL LSAME
      IF (LSAME) THEN
         IF (BLOCK) THEN
            DO 130 J = ITMP1, ITMP2
               SUM = T1 * A(J, ICOL1) * A0 +
     $               V2 * A(J, 2)
               A(J, ICOL1) = -SUM
               A(J, 2) = SUM
  130       CONTINUE
         END IF
      END IF
      END
which produces the following SLP tree:
node 0x4e150c0 (max_nunits=2, refcnt=1)
op template: REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
stmt 0 REALPART_EXPR <(*a_29(D))[_12]> = sum$real_60;
stmt 1 IMAGPART_EXPR <(*a_29(D))[_12]> = sum$imag_61;
children 0x4e15720
node 0x4e15720 (max_nunits=2, refcnt=1)
op template: slp_patt_69 = .COMPLEX_FMA (sum$real_60, sum$real_60, sum$real_60);
stmt 0 sum$real_60 = _48 + _58;
stmt 1 sum$imag_61 = _49 + _59;
children 0x4e15500 0x4e15e08 0x4e15ad8
node 0x4e15500 (max_nunits=2, refcnt=1)
op template: _48 = a0_31(D) * _46;
stmt 0 _48 = a0_31(D) * _46;
stmt 1 _49 = a0_31(D) * _47;
children 0x4e15588 0x4e151d0
node (external) 0x4e15588 (max_nunits=1, refcnt=1)
{ a0_31(D), a0_31(D) }
node 0x4e151d0 (max_nunits=2, refcnt=1)
op template: slp_patt_71 = .COMPLEX_MUL (_46, _46);
stmt 0 _46 = _42 - _43;
stmt 1 _47 = _44 + _45;
children 0x4e15038 0x4e15d80
node (external) 0x4e15038 (max_nunits=1, refcnt=1)
{ t1$real_38(D), t1$imag_41(D) }
node 0x4e15d80 (max_nunits=2, refcnt=2)
op template: _17 = REALPART_EXPR <(*a_29(D))[_5]>;
stmt 0 _17 = REALPART_EXPR <(*a_29(D))[_5]>;
stmt 1 _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
load permutation { 0 1 }
node 0x4e15e08 (max_nunits=2, refcnt=2)
op template: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
stmt 0 _50 = REALPART_EXPR <(*a_29(D))[_12]>;
stmt 1 _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
load permutation { 0 1 }
node (external) 0x4e15ad8 (max_nunits=1, refcnt=1)
{ v2$real_52(D), v2$imag_53(D) }
This tree is correct, but vect_detect_hybrid_slp then decides:
marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);
This is a problem, since the patterns are only valid in SLP.
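
For context, the two lanes that the .COMPLEX_MUL node covers can be checked
against an ordinary complex product. The sketch below is only an illustration
in Python, not vectorizer code. The function name and sample values are made
up here, and the real lane is assumed to be the usual
a_re*t_re - a_im*t_im: the dump shows _46 = _42 - _43 but not the definitions
of _42 and _43. The imaginary lane follows _47 = _44 + _45 from the dump.

```python
def complex_mul_lanes(a_re, a_im, t_re, t_im):
    """Lane-wise complex multiply, one result per SLP lane (illustrative)."""
    # Lane 0 (real part). Assumed operand pairing for _46 = _42 - _43,
    # since the dump does not show how _42 and _43 are defined.
    real = a_re * t_re - a_im * t_im
    # Lane 1 (imaginary part), matching the dump:
    #   _44 = _17 * t1$imag,  _45 = _16 * t1$real,  _47 = _44 + _45
    imag = a_re * t_im + a_im * t_re
    return real, imag


# Sample values chosen so every intermediate is exact in binary floating point.
a = complex(1.5, -2.0)
t1 = complex(0.25, 4.0)
lanes = complex_mul_lanes(a.real, a.imag, t1.real, t1.imag)
product = a * t1
assert lanes == (product.real, product.imag)
```

Both lanes depend on both the real and the imaginary loads, so the two
statements only make sense when they are vectorized together as one SLP
group.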
I don't quite see why the sub-tree is hybrid, though. It marks:
marking hybrid: _50 = REALPART_EXPR <(*a_29(D))[_12]>;
marking hybrid: _51 = IMAGPART_EXPR <(*a_29(D))[_12]>;
marking hybrid: _48 = a0_31(D) * _46;
marking hybrid: slp_patt_71 = .COMPLEX_MUL (_46, _46);
marking hybrid: sum$imag_61 = _49 + _59;
marking hybrid: _49 = a0_31(D) * _47;
marking hybrid: _59 = _56 + _57;
marking hybrid: _56 = _50 * v2$imag_53(D);
marking hybrid: _57 = _51 * v2$real_52(D);
marking hybrid: _47 = _44 + _45;
marking hybrid: _44 = _17 * t1$imag_41(D);
marking hybrid: _45 = _16 * t1$real_38(D);
marking hybrid: _16 = IMAGPART_EXPR <(*a_29(D))[_5]>;
marking hybrid: _17 = REALPART_EXPR <(*a_29(D))[_5]>;
So either vect_detect_hybrid_slp is correct, in which case SLP should be
aborted, or it is wrong and this should have been pure SLP.
The problem starts when it marks _50 as hybrid, but I don't see why it thinks
that...
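
To make the question concrete, here is a toy model of how a hybrid marking can
spread. It is a sketch under the assumption that the walk starts from
statements outside the SLP graph and follows use->def chains backwards,
marking every pure-SLP definition it reaches. It is not the code of
vect_detect_hybrid_slp, and the statement names (outside_use, add, mul,
load_re, load_im) are hypothetical labels, not statements from the dump.

```python
from collections import deque


def mark_hybrid(operand_defs, pure_slp, non_slp_roots):
    """Return the pure-SLP statements reached from non-SLP statements.

    operand_defs maps a statement to the statements defining its operands.
    This is a simplified model, not GCC's implementation.
    """
    hybrid = []
    worklist = deque(non_slp_roots)
    while worklist:
        stmt = worklist.popleft()
        for def_stmt in operand_defs.get(stmt, []):
            # A pure-SLP definition used from the worklist becomes hybrid,
            # and its own operands are then visited in turn.
            if def_stmt in pure_slp and def_stmt not in hybrid:
                hybrid.append(def_stmt)
                worklist.append(def_stmt)
    return hybrid


pure_slp = {"add", "mul", "load_re", "load_im"}

# Hypothetical case 1: a statement outside the SLP graph uses the final add.
chain = {"outside_use": ["add"], "add": ["mul"], "mul": ["load_re", "load_im"]}
print(mark_hybrid(chain, pure_slp, ["outside_use"]))

# Hypothetical case 2: the outside statement only uses one load.
leaf = {"outside_use": ["load_re"], "add": ["mul"], "mul": ["load_re", "load_im"]}
print(mark_hybrid(leaf, pure_slp, ["outside_use"]))
```

In this model a single outside use is enough to start the marking, so finding
the statement that uses _50 without being part of the SLP graph would be the
first thing to check.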