Re: [PATCH] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

Robin Dapp Tue, 31 Oct 2023 01:45:45 -0700

Hi Juzhe,

> +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
> +@item @samp{mask_len_strided_load@var{m}@var{n}}
> +Load several separate memory locations into a vector of mode m.
> +Operand 1 is a scalar base address and operand 2 is mode @var{n}
> +specifying each uniform stride between consecutive element.
How about:

"into a destination vector of mode @var{m} (operand 0).  Operand 1
is a scalar base address.  Operand 2 is a scalar stride of mode @var{n}
such that element @var{i} of the destination is loaded from
(operand 1) + @var{i} * (operand 2).  The instruction can be seen
as a special case of @code{mask_len_gather_load@var{m}@var{n}} with
an offset vector that is a @code{vec_series} with (operand 1) as base
and (operand 2) as step."
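
To illustrate the vec_series analogy, here is a small scalar sketch of the
address vector I have in mind.  It is not GCC code: the helper name
model_vec_series and the use of uintptr_t for addresses are made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a vec_series: element I is BASE + I * STEP.
   With BASE = operand 1 and STEP = operand 2 this yields the address
   that element I of a strided access would use.  */
static void
model_vec_series (uintptr_t *addrs, uintptr_t base, ptrdiff_t step,
                  size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    addrs[i] = base + (uintptr_t) ((ptrdiff_t) i * step);
}
```

Feeding such a vector to a gather gives the same accesses as the strided
form, which is why the wording above calls it a special case.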

> +operand 3 is mask operand, operand 4 is length operand and operand 5 is
> +bias operand.  

Maybe: Similar to mask_len_load, operand 3 contains the mask, operand 4
the length and operand 5 the bias.  The instruction loads...
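
For reference, this is roughly how I read the load with all operands in
place.  It is a scalar sketch only, under a few assumptions that are not in
the patch: the element type is fixed to int32_t, the stride is taken to be
in bytes, the mask is one byte per element, and the name
model_strided_load is invented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked, length-limited strided load.
   DEST is operand 0, BASE operand 1, STRIDE operand 2 (assumed bytes),
   MASK operand 3, LEN operand 4 and BIAS operand 5.  Elements that are
   masked off or at/after LEN + BIAS are left untouched here; in the
   real pattern their contents would be undefined.  */
static void
model_strided_load (int32_t *dest, const unsigned char *base,
                    ptrdiff_t stride, const unsigned char *mask,
                    ptrdiff_t len, ptrdiff_t bias, size_t nunits)
{
  ptrdiff_t active = len + bias;

  for (size_t i = 0; i < nunits; i++)
    if ((ptrdiff_t) i < active && mask[i])
      /* memcpy avoids alignment assumptions about BASE.  */
      memcpy (&dest[i], base + (ptrdiff_t) i * stride, sizeof (int32_t));
}
```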

> +@cindex @code{mask_len_strided_store@var{m}@var{n}} instruction pattern
> +@item @samp{mask_len_strided_store@var{m}@var{n}}
> +Store a vector of mode @var{m} into several distinct memory locations.
> +Operand 0 is a scalar base address, operand 2 is the vector to be stored,
> +and operand 1 is mode @var{n} specifying each uniform stride between consecutive element.
> +operand 3 is mask operand, operand 4 is length operand and operand 5 is
> +bias operand.  Similar to mask_len_store, the instruction stores at most
> +(operand 4 + operand 5) elements to memory.  Bit @var{i} of the mask is set
> +if element @var{i} of the result should be storeed.
> +Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.

Same here.
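
The store side, as I understand the quoted text, could be modelled the same
way.  Again this is only a sketch with assumptions of mine: int32_t
elements, a byte stride, a byte-per-element mask, and a made-up name
model_strided_store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked, length-limited strided store.
   BASE is operand 0, STRIDE operand 1 (assumed bytes), SRC operand 2,
   MASK operand 3, LEN operand 4 and BIAS operand 5.  At most
   LEN + BIAS elements are written, and only those whose mask bit is
   set; all other memory is left unchanged.  */
static void
model_strided_store (unsigned char *base, ptrdiff_t stride,
                     const int32_t *src, const unsigned char *mask,
                     ptrdiff_t len, ptrdiff_t bias, size_t nunits)
{
  ptrdiff_t active = len + bias;

  for (size_t i = 0; i < nunits; i++)
    if ((ptrdiff_t) i < active && mask[i])
      memcpy (base + (ptrdiff_t) i * stride, &src[i], sizeof (int32_t));
}
```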

Regards
 Robin
