On Mon, Jun 22, 2020 at 9:12 PM <[email protected]> wrote:

> Are SSE instructions allowed in the AMD64 kernel?  Is #ifdef __SSE__
> a sufficient guard?
>
> I have a rasops32 putchar with SSE that is 2x faster.
>

As Bryan and Patrick noted: it's possible, but there are restrictions and
costs.

The main restriction is that the code must not permit a context-switch
between the fpu_kernel_enter() and fpu_kernel_exit() calls.  No taking an
rwlock or calling any of the sleep functions, for example.
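
A minimal sketch of that bracketing pattern, written as a userland mock so
it builds outside the kernel.  The two fpu_kernel_*() functions below are
stand-ins that only track a flag; the real ones are the kernel's.  The
helper fill_span32() and the in_fpu_section flag are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock state: nonzero while we are between enter and exit. */
static int in_fpu_section;

/*
 * Stand-ins for the kernel functions of the same name.  The real ones
 * save and reset FPU state; these only record that the section is open.
 */
static void
fpu_kernel_enter(void)
{
	in_fpu_section = 1;
}

static void
fpu_kernel_exit(void)
{
	assert(in_fpu_section);	/* exit without a matching enter is a bug */
	in_fpu_section = 0;
}

/*
 * Hypothetical helper: fill n 32-bit pixels with one color.
 * Everything between enter and exit must run straight through:
 * no rwlock, no sleep functions, nothing that can context-switch.
 */
static void
fill_span32(uint32_t *dst, size_t n, uint32_t color)
{
	size_t i;

	fpu_kernel_enter();
	/* The SSE loads and stores would go here; plain C keeps the mock portable. */
	for (i = 0; i < n; i++)
		dst[i] = color;
	fpu_kernel_exit();
}
```

Any locking or allocation the routine needs would have to happen before the
enter call or after the exit call, not between them.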

If you're using more than the minimal level of SSE that the kernel already
requires (for lfence, etc.), then you should also check whether the
necessary extension bits are present in curcpu()->ci_feature_* and fall
back to the current code if they are not.
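
One way to structure that fallback, sketched as a userland mock.  The
struct, its field, and the MOCK_FEAT_EXT bit are invented for the example
and only stand in for the real ci_feature_* words; fill_generic(),
fill_wide() and select_fill() are likewise hypothetical.  Picking a
function pointer once is a design choice of this sketch, not something the
kernel requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented feature bit; a real check would test a bit in ci_feature_*. */
#define MOCK_FEAT_EXT	0x1u

/* Stand-in for the per-CPU record that curcpu() returns in the kernel. */
struct mock_cpu_info {
	uint32_t ci_feature_mock;
};

typedef void (*fill_fn)(uint32_t *dst, size_t n, uint32_t color);

/* The existing code path: works on every CPU. */
static void
fill_generic(uint32_t *dst, size_t n, uint32_t color)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = color;
}

/*
 * Placeholder for the path that needs the extension.  It handles four
 * pixels per step (one 128-bit register's worth) and then the tail.
 * In the kernel the body would sit between fpu_kernel_enter() and
 * fpu_kernel_exit().
 */
static void
fill_wide(uint32_t *dst, size_t n, uint32_t color)
{
	size_t i = 0;

	for (; i + 4 <= n; i += 4) {
		dst[i] = color;
		dst[i + 1] = color;
		dst[i + 2] = color;
		dst[i + 3] = color;
	}
	for (; i < n; i++)
		dst[i] = color;
}

/* Use the extended path only when the feature bit is set. */
static fill_fn
select_fill(const struct mock_cpu_info *ci)
{
	if (ci->ci_feature_mock & MOCK_FEAT_EXT)
		return fill_wide;
	return fill_generic;
}
```

Both paths must produce identical output, so the generic one doubles as a
reference when testing the faster one.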

The cost is that if the thread doing this isn't a system thread, then the
first fpu_kernel_enter() call after the userspace->kernel transition has to
save and reset the FPU registers (XSAVEOPT + XRSTOR on newish CPUs).  Every
fpu_kernel_exit(), regardless of thread type, resets them again (XRSTOR).

If the restriction isn't a problem and the cost of those is worth the gain,
then sure, go for it.  We already do it for AES stuff in the kernel, for
example.  See /usr/src/sys/arch/amd64/amd64/aesni.c


Philip Guenther
