> -----Original Message-----
> From: Eelco Chaudron <[email protected]>
> Sent: Thursday, July 14, 2022 2:24 PM
> To: Van Haaren, Harry <[email protected]>
> Cc: [email protected]; [email protected]; Amber, Kumar
> <[email protected]>; Pai G, Sunil <[email protected]>; Finn, Emma
> <[email protected]>; Stokes, Ian <[email protected]>
> Subject: Re: [PATCH v10 09/10] odp-execute: Add ISA implementation of
> set_masked ETH

<snip patch>

> > +    /* Read the content of the key(src) and mask in the respective registers.
> > +     * We only load the src and dest addresses, which is only 96-bits and not
> > +     * 128-bits. */
> > +    __m128i v_src = _mm_maskz_loadu_epi32(0x7,(void *) key);
> > +    __m128i v_mask = _mm_maskz_loadu_epi32(0x7, (void *) mask);
> 
> One question here I asked throughout the various revisions but got not
> answered:
>
> "The second load, loads 128 bits of data, but there are only 12 bytes to
> load. What happens if the memory at the remaining 6 bytes are not mapped in
> memory (i.e. a page does not exist/can't be loaded)? Will we crash!?

AVX512 has a very nice feature for handling scenarios where less than a full
SIMD register is required. It is known as "k-masks" and, in short, allows
"turning off" some lanes of a SIMD instruction so that they have no effect.

In this case, the "maskz" part of the intrinsic name means that a k-mask is
applied, and that lanes which are turned off are zeroed in the result. Every
k-mask intrinsic (_mm_maskz_*) takes an extra parameter which indicates which
lanes to enable/disable. Note that the *size* of each lane is determined by the
suffix of the intrinsic name, so _epi32() indicates 32-bit lanes. A worked
example:

_mm_maskz_loadu_epi32(0x7, (void *) mask);

The kmask is 0x7, or "111" in binary, so the lowest 3 lanes (visualize them on
the right) are active. As the instruction targets 32-bit ints, each lane is
4 bytes, so 3 * 4 = 12 bytes are "active". As a result, only 12 bytes are
loaded from memory here. Even if the next byte was on a new page that is not
mapped into our virtual address range, there would be no crash, because the
masked-off lane is never read and memory faults are suppressed for it.
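
To make the lane arithmetic concrete, here is a rough scalar model in plain C of
what a zero-masked load with 32-bit lanes does. This is only a sketch of the
semantics, not the actual instruction and not code from the patch: the helper
names model_maskz_loadu_epi32() and model_bytes_loaded() are made up for this
mail, and the real intrinsic does the whole thing in a single instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a zero-masked 128-bit load with four 32-bit lanes.
 * Hypothetical helper for illustration only, not part of the patch. */
static void
model_maskz_loadu_epi32(uint32_t dst[4], uint8_t kmask, const void *src)
{
    const uint8_t *p = src;

    assert(dst && src);

    for (int lane = 0; lane < 4; lane++) {
        if (kmask & (1u << lane)) {
            /* Active lane: read exactly 4 bytes at offset lane * 4. */
            memcpy(&dst[lane], p + lane * 4, sizeof dst[lane]);
        } else {
            /* Inactive lane: memory is not touched, the lane is zeroed. */
            dst[lane] = 0;
        }
    }
}

/* Number of bytes read for a given mask: 4 bytes per active lane. */
static unsigned
model_bytes_loaded(uint8_t kmask)
{
    unsigned n = 0;

    for (int lane = 0; lane < 4; lane++) {
        if ((kmask >> lane) & 1u) {
            n += 4;
        }
    }
    return n;
}
```

Calling model_maskz_loadu_epi32(out, 0x7, buf) on a 12-byte buffer copies those
12 bytes into the first three lanes and zeroes the fourth, without reading past
the end of the buffer; model_bytes_loaded(0x7) gives 12.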

<snip more patch>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
