On 9/17/23 01:42, Pan Li via Gcc-patches wrote:
From: Pan Li <pan2...@intel.com>
Given the example below for VLS mode:
void
test (vl_t *u)
{
  vl_t t;
  long long *p = (long long *)&t;
  p[0] = p[1] = 2;
  *u = t;
}
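For anyone wanting to try the example outside the testsuite, here is a minimal self-contained sketch. The typedef for vl_t is not shown in the description above, so the two-element GNU vector type below is an assumption, and the name fill_both_lanes is hypothetical (it just mirrors test). It relies on the GCC/Clang vector extension.

```c
/* Assumed definition: a vector of two long long lanes (V2DI-sized).
   The original description does not show how vl_t is declared.  */
typedef long long vl_t __attribute__ ((vector_size (2 * sizeof (long long))));

/* Same body as the example: write 2 into both lanes through a scalar
   pointer, then copy the whole vector out.  */
void
fill_both_lanes (vl_t *u)
{
  vl_t t;
  long long *p = (long long *)&t;
  p[0] = p[1] = 2;
  *u = t;
}
```

With correct code generation both lanes of *u should read back as 2 after the call; the failure described here would show up as a stale or garbage value in lane 1.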
The vec_set expansion simplifies the insn to vmv.s.x when the index is 0,
without a merged operand. That causes problems in DCE, for example:
1: 137[DI] = a0
2: 138[V2DI] = 134[V2DI] // deleted by DCE
3: 139[DI] = #2 // deleted by DCE
4: 140[DI] = #2 // deleted by DCE
5: 141[V2DI] = vec_dup:V2DI (139[DI]) // deleted by DCE
6: 138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
7: 135[V2DI] = 138[V2DI] // deleted by DCE
8: 142[V2DI] = 135[V2DI] // deleted by DCE
9: 143[DI] = #2
10: 142[V2DI] = vec_dup:V2DI (143[DI])
11: (137[DI]) = 142[V2DI]
The upper 64 bits of 142[V2DI] are unknown here, so incorrect code is
generated when it is stored back to memory. This patch fixes the issue
by adding a new SCALAR_MOVE_MERGED_OP for vec_set.
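To make the distinction concrete, here is a rough C model of the two behaviours in play. It is only a sketch: the v2di struct and both function names are hypothetical stand-ins, not GCC or RVV identifiers. It assumes a true broadcast writes every element, while a scalar move such as vmv.s.x writes only element 0 and leaves the remaining elements to come from the merge operand, if there is one.

```c
/* Hypothetical stand-in for a two-element V2DI register.  */
typedef struct
{
  long long e[2];
} v2di;

/* Broadcast semantics: every element receives x.  */
static v2di
model_vec_dup (long long x)
{
  v2di r = { { x, x } };
  return r;
}

/* Scalar move with a merge operand: element 0 receives x and the
   remaining element is carried over from MERGE.  Without a merge
   operand there is no defined source for element 1.  */
static v2di
model_scalar_move_merged (v2di merge, long long x)
{
  v2di r = merge;
  r.e[0] = x;
  return r;
}
```

Under this model, building t as a broadcast for the upper element followed by a merged scalar move for element 0 yields {2, 2}. If the earlier definition is deleted and nothing is merged in, element 1 has no defined value.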
I must be missing something. Doesn't insn 10 broadcast the immediate
0x2 to both elements of r142?!? What am I missing?
Jeff