On Fri, Sep 25, 2020 at 7:43 AM Maamoun TK <[email protected]> wrote:
> ...
> > I'm not sure where it fits under powerpc64. The code doesn't need any
> > cryptographic extensions, but it depends on vector instructions as well
> > as VSX registers (for the unaligned load and store instructions). So I'd
> > need advice both on the directory hierarchy and compile time
> > configuration, and appropriate runtime tests for fat builds.
>
> The VSX instructions are introduced in Power ISA v.2.06 so since you have
> used VSX instructions lxvw4x/stxvw4x the minimum processor you are
> targeting is POWER7
> We can add new config option like "--enable-power-vsx" that enable this
> optimization.

I believe the 64-bit adds (vaddudm) and subtracts (vsubudm) require
POWER8. POWER7 provides vector unsigned long long (and friends) and
the 64-bit loads, but you need POWER8 to do something useful with
them.

Alternatively, the 64-bit adds can be performed manually using vector
unsigned int, with code to manage the carry or borrow. That allows you
to drop back to POWER4, and ChaCha20 is still profitable.

#include <altivec.h>

typedef vector unsigned int uint32x4_p;

inline uint32x4_p VecAdd64(const uint32x4_p vec1, const uint32x4_p vec2)
{
    // The carry mask selects carries for elements 1 and 3 and sets
    // the remaining elements to 0. The result is then shifted so the
    // carried values are added to elements 0 and 2.
    const uint32x4_p zero = {0, 0, 0, 0};
#if defined(NETTLE_BIG_ENDIAN)
    const uint32x4_p mask = {0, 1, 0, 1};
#else
    const uint32x4_p mask = {1, 0, 1, 0};
#endif

    uint32x4_p cy = vec_addc(vec1, vec2);
    uint32x4_p res = vec_add(vec1, vec2);
    cy = vec_and(mask, cy);
    cy = vec_sld (cy, zero, 4);
    return vec_add(res, cy);
}
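
For anyone who wants to check the carry logic without PowerPC hardware, here is a rough scalar model of the same steps in portable C. It is only a sketch: the names lanes64, model_add64, model_add64_u64 and model_add64_selfcheck are made up for illustration, and it models a single 64-bit value as a high and a low 32-bit lane rather than a full vector register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model, not AltiVec code: one 64-bit value is held
   as two 32-bit lanes, the way half of a vector register holds it. */
typedef struct { uint32_t hi, lo; } lanes64;

static lanes64 model_add64(lanes64 a, lanes64 b)
{
    lanes64 r;

    /* vec_add: independent 32-bit adds per lane, wrapping mod 2^32. */
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi;

    /* vec_addc on the low lane: 1 if the add carried out, else 0.
       vec_and with the mask discards the carry out of the high lane. */
    uint32_t cy = (r.lo < a.lo) ? 1u : 0u;

    /* vec_sld moves the carry into the high lane's position, and the
       final vec_add folds it in. */
    r.hi += cy;
    return r;
}

/* Convenience wrapper so the model can be compared with a native add. */
static uint64_t model_add64_u64(uint64_t x, uint64_t y)
{
    lanes64 a = { (uint32_t)(x >> 32), (uint32_t)x };
    lanes64 b = { (uint32_t)(y >> 32), (uint32_t)y };
    lanes64 r = model_add64(a, b);
    return ((uint64_t)r.hi << 32) | r.lo;
}

/* A few spot checks against native 64-bit arithmetic. */
static void model_add64_selfcheck(void)
{
    assert(model_add64_u64(1, 2) == 3);
    assert(model_add64_u64(0xFFFFFFFFu, 1) == 0x100000000ULL);
    assert(model_add64_u64(UINT64_MAX, 1) == 0);
}
```

The point of the model is that the only cross-lane traffic is the single carry bit out of the low word, which is what the mask and the 4-byte vec_sld arrange in the vector version.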
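
Here's the core of a subtract:

    // vec_subc yields 1 where there is no borrow, so vec_andc
    // (mask & ~bw) leaves a 1 only in masked elements that borrowed.
    uint32x4_p bw = vec_subc(vec1, vec2);
    uint32x4_p res = vec_sub(vec1, vec2);
    bw = vec_andc(mask, bw);
    bw = vec_sld (bw, zero, 4);
    return vec_sub(res, bw);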
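
The inverted sense of vec_subc is easy to get wrong, so here is a scalar sketch of the subtract in portable C that mirrors the steps one by one. As before this is only a model under my own assumptions: model_sub64 and model_sub64_selfcheck are hypothetical names, and a single 64-bit value stands in for one half of the vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the 64-bit subtract built from 32-bit
   lanes. Not AltiVec code; each step names the intrinsic it mimics. */
static uint64_t model_sub64(uint64_t x, uint64_t y)
{
    uint32_t xhi = (uint32_t)(x >> 32), xlo = (uint32_t)x;
    uint32_t yhi = (uint32_t)(y >> 32), ylo = (uint32_t)y;

    /* vec_sub: independent 32-bit subtracts, wrapping mod 2^32. */
    uint32_t rlo = xlo - ylo;
    uint32_t rhi = xhi - yhi;

    /* vec_subc on the low lane: 1 when there is NO borrow
       (xlo >= ylo), 0 when the subtract borrowed. */
    uint32_t bw = (xlo >= ylo) ? 1u : 0u;

    /* vec_andc(mask, bw) computes mask & ~bw. With a mask of 1 in the
       low lane this turns "no borrow" into 0 and "borrow" into 1. */
    bw = 1u & ~bw;

    /* vec_sld moves the borrow to the high lane's position, and the
       final vec_sub takes it out of the high word. */
    rhi -= bw;

    return ((uint64_t)rhi << 32) | rlo;
}

/* A few spot checks against native 64-bit arithmetic. */
static void model_sub64_selfcheck(void)
{
    assert(model_sub64(5, 3) == 2);
    assert(model_sub64(0x100000000ULL, 1) == 0xFFFFFFFFULL);
    assert(model_sub64(0, 1) == UINT64_MAX);
}
```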

Jeff
_______________________________________________
nettle-bugs mailing list
[email protected]
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs