I'm not aware of a simple way to implement either approach on POWER8. I
recommend using a stack-allocated buffer to handle the leftovers rather
than complicating the assembly. Alternatively, we can use the
POWER9-specific instruction 'lxvll', which loads a vector with the length
passed in a general register; it also works in both endian modes without
any post-loading operations. Another benefit of switching to POWER ISA 3.0
is that we can use 'lxvb16x/stxvb16x' to load/store input and output data
instead of 'lxvd2x/stxvd2x', which eliminates the need for
post-loading/pre-storing permute operations in little-endian mode.
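
To illustrate the stack-buffer idea, here is a rough C sketch. The names
pad_leftover and BLOCK_SIZE are made up for this example and are not
existing Nettle symbols; the real code could do the equivalent in the
assembly prologue or in the C caller. The point is only that the full
16-byte vector load then targets memory we own, so it cannot run past the
end of the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16  /* width of one VSX register in bytes (assumed name) */

/* Hypothetical helper: copy the final `len` (< 16) leftover bytes into a
   zero-filled 16-byte buffer. A full-width vector load from `block` is
   then safe, and the unused bytes are already cleared, so no mask is
   needed. */
static void
pad_leftover(uint8_t block[BLOCK_SIZE], const uint8_t *data, size_t len)
{
  assert(len < BLOCK_SIZE);
  memset(block, 0, BLOCK_SIZE);
  memcpy(block, data, len);
}
```

The caller would declare `uint8_t block[BLOCK_SIZE]` on the stack, call
pad_leftover(block, data, length), and pass `block` to the existing
full-block code path.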

regards,
Mamone

On Sun, Nov 22, 2020 at 11:26 PM Niels Möller <ni...@lysator.liu.se> wrote:

> Maamoun TK <maamoun...@googlemail.com> writes:
>
> > It generates a mask compatible with the length of leftovers, for example
> > if the length is 1 then the mask generated is
> > 0xFF000000000000000000000000000000 then the mask is ANDed with the vector
> > register of leftovers to clear the extra unneeded bytes. It's not exactly
> > like the first approach but it avoids using stack and handles the
> > leftovers inside the assembly implementation, sorry for mixing up.
>
> I see. I'm a bit worried that it may read too far. E.g., assume that
> leftover size to read is 5 bytes, and those 5 bytes start at address
> 1ffffff8. Then the final
>
>    lxvd2x VSR(C0),0,DATA
>
> will read 16 bytes from memory, including a few bytes starting at
> address 20000000, which may result in a segfault. Getting this right
> would need approach 2, "Round the address down to make it aligned, read
> an aligned word and, only if needed, the next word. And shift and mask
> to get the needed bytes."
>
> I would expect that the simplest is to go with approach two: Have a loop
> to read a byte at a time, and shift into a register.
>
> > I made a merge request in git.lysator.liu.se, it ended up easier for me
> > to push patches to the repository in this way, I hope you don't mind
> > dealing with the future patches the same way.
>
> Thanks, that's fine. But you may need to ping me, since I don't look at
> the gitlab web interface that often.
>
> Regards,
> /Niels
>
> --
> Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
> Internet email is subject to wholesale government surveillance.
>
_______________________________________________
nettle-bugs mailing list
nettle-bugs@lists.lysator.liu.se
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs