On 2/26/19 3:39 AM, David Hildenbrand wrote:
> +    for (dst_idx = 0; dst_idx < NUM_VEC_ELEMENTS(es); dst_idx++) {
> +        src_idx = dst_idx / 2;
> +        if (!high) {
> +            src_idx += NUM_VEC_ELEMENTS(es) / 2;
> +        }
> +        if (dst_idx % 2 == 0) {
> +            read_vec_element_i64(tmp, v2, src_idx, es);
> +        } else {
> +            read_vec_element_i64(tmp, v3, src_idx, es);
> +        }
> +        write_vec_element_i64(tmp, dst_v, dst_idx, es);
> +    }

Note that you do not need a vector temporary here, even when v1 overlaps v2
or v3, so long as you load both source elements before writing either
destination element, and you iterate in the proper direction.

For VMRL, iterate forward as you do now.  The element access order for MO_32:

 read  v2: 2   3
 read  v3:   2   3
 write v1: 0 1 2 3

For VMRH, iterate backward:

 read  v2:   1   0
 read  v3: 1   0
 write v1: 3 2 1 0
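
A rough stand-alone sketch of the two loops, modelling each vector register
as a plain array with element 0 leftmost. The function names and the array
model are made up for illustration; this is not the TCG code from the patch,
only the access order described above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model, not the QEMU helpers: v1 is the destination and may
 * be the same array as v2 or v3.  n is the number of elements per vector.
 */
static void merge_low_inplace(uint32_t *v1, const uint32_t *v2,
                              const uint32_t *v3, size_t n)
{
    assert(n % 2 == 0);
    /* VMRL: the sources are the upper half, so walk forward. */
    for (size_t i = 0; i < n / 2; i++) {
        /* Load both source elements before either store. */
        uint32_t a = v2[n / 2 + i];
        uint32_t b = v3[n / 2 + i];
        v1[2 * i] = a;
        v1[2 * i + 1] = b;
    }
}

static void merge_high_inplace(uint32_t *v1, const uint32_t *v2,
                               const uint32_t *v3, size_t n)
{
    assert(n % 2 == 0);
    /* VMRH: the sources are the lower half, so walk backward. */
    for (size_t i = n / 2; i-- > 0; ) {
        /* Load both source elements before either store. */
        uint32_t a = v2[i];
        uint32_t b = v3[i];
        v1[2 * i + 1] = b;
        v1[2 * i] = a;
    }
}
```

In each direction, a store at step i only lands on source elements that were
already consumed at step i or earlier, which is why the overlap is harmless.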


r~
