On 08/17/2017 11:04 AM, Alex Bennée wrote:
> + int32_t *rd = (int32_t *) d;
> + int16_t *rn = (int16_t *) n;
> + int16_t rm = (int16_t) m;
> + int i;
> +
> + #pragma GCC ivdep
> + for (i = 0; i < opr_elt; ++i) {
> + rd[i] = rn[i + doff_elt] * rm;
> + }
You need to run this loop backward to avoid clobbering data when rd == rn.
I thought you'd put m into ADVSIMD_DATA.
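
To illustrate the clobbering concern, here is a minimal sketch of the in-place case, assuming doff_elt == 0 and rd == rn. The vec128 union and the function name are hypothetical stand-ins, not taken from the patch. The union is used so that the overlapping 16-bit and 32-bit views are well defined in C. A nonzero element offset changes the overlap pattern and would need to be checked separately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit register image: s[i] overlaps h[2*i] and h[2*i + 1]. */
typedef union {
    int16_t h[8];
    int32_t s[4];
} vec128;

/*
 * Widening multiply of the low 16-bit elements, in place (doff_elt == 0).
 * Storing s[i] overwrites h[2*i] and h[2*i + 1], so walking i downward
 * means every h[j] with j < i is still intact when it is finally read.
 */
static void widen_mul_inplace(vec128 *v, int16_t m, int opr_elt)
{
    assert(opr_elt >= 0 && opr_elt <= 4);
    for (int i = opr_elt - 1; i >= 0; --i) {
        v->s[i] = (int32_t)v->h[i] * m;
    }
}
```

Run forward instead, the store to s[0] would already have overwritten h[1] before the i == 1 iteration reads it.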
>
> + if (is_q) {
> + simd_info = deposit32(simd_info,
> + ADVSIMD_DOFF_ELT_SHIFT,
> + ADVSIMD_DOFF_ELT_BITS, 4);
> + }
It'd probably be useful to have a macro to clean this up:
#define PUT_SIMD_DATA(t, d) \
deposit32(0, ADVSIMD_ ## t ## _SHIFT, ADVSIMD_ ## t ## _BITS, (d))
simd_info |= PUT_SIMD_DATA(DOFF_ELT, 4);
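
A self-contained sketch of how that macro could be exercised. The deposit32() below is a local stand-in with the same argument order as the QEMU helper, and the SHIFT/BITS values are made up for illustration; the real ones come from the patch. Note that the |= form only gives the same result as the original deposit32() call if the field in simd_info is still zero at that point.

```c
#include <stdint.h>

/* Local stand-in for deposit32(): insert fieldval into value at [start, start+length). */
static inline uint32_t deposit32(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical field layout, chosen only for this example. */
#define ADVSIMD_DOFF_ELT_SHIFT 8
#define ADVSIMD_DOFF_ELT_BITS  4

#define PUT_SIMD_DATA(t, d) \
    deposit32(0, ADVSIMD_ ## t ## _SHIFT, ADVSIMD_ ## t ## _BITS, (d))

/* Hypothetical wrapper mirroring the quoted hunk. */
static uint32_t build_simd_info(uint32_t simd_info, int is_q)
{
    if (is_q) {
        /* Assumes the DOFF_ELT field of simd_info has not been set yet. */
        simd_info |= PUT_SIMD_DATA(DOFF_ELT, 4);
    }
    return simd_info;
}
```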
That said, folding DOFF into the pointer that gets passed in the first place
seems a better solution to me.
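
Roughly what I mean, as a sketch only: the helper takes no offset at all and the caller selects the half by adjusting the source pointer. Both function names and the fixed element count of 4 are hypothetical, and this version assumes d and n do not overlap, so the in-place case still needs its own handling.

```c
#include <stdint.h>

/* Hypothetical helper: no doff_elt parameter, it just walks from n. */
static void widen_mul_s16(int32_t *d, const int16_t *n, int16_t m,
                          int opr_elt)
{
    for (int i = 0; i < opr_elt; ++i) {
        d[i] = (int32_t)n[i] * m;
    }
}

/* Hypothetical call site: the element offset is folded into the pointer. */
static void widen_mul_half(int32_t *d, const int16_t *n, int16_t m, int is_q)
{
    widen_mul_s16(d, n + (is_q ? 4 : 0), m, 4);
}
```

With this shape the DOFF_ELT field in simd_info would no longer be needed.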
r~