Comment #5 from Rich Felker
Of course, fancy memcpy in general is only a win beyond a certain size. As for
DMA, I did not mean that I want to use it for every size beyond gcc's proposed
function-call threshold. Rather, the vdso-provided function would choose what
to do appropriately for the hardware. But on J2 (nommu, no special kernel mode)
I suspect DMA could be a win at sizes as low as 256 bytes, with
spin-to-completion and a lock shared between user (vdso) and kernel rather than
using a syscall (not sure this is justified, though). Using a syscall with
sleep-during-dma would have a significantly larger threshold before it's
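
To make the size-based dispatch described above concrete, here is a rough sketch in C. Everything in it is an assumption for illustration: the names sketch_memcpy, dma_copy_and_spin and DMA_THRESHOLD are hypothetical, the 256-byte cutoff is only the suspected figure and not a measurement, the DMA engine is simulated with a plain memcpy, and falling back to a CPU copy when the lock is held is my own choice rather than anything decided in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff: the size from which DMA is suspected to win on J2. */
#define DMA_THRESHOLD 256

/* Stand-in for a lock word shared between user space (vdso) and the kernel. */
static atomic_flag dma_lock = ATOMIC_FLAG_INIT;

/* Counts how often the DMA path was taken; for demonstration only. */
static unsigned dma_copies;

/* Placeholder for programming the DMA engine and spinning until it reports
 * completion. Simulated here with memcpy, since there is no real engine. */
static void dma_copy_and_spin(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    dma_copies++;
}

/* Chooses a copy strategy by size, as a vdso-provided function might. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    /* Small copies: DMA setup cost would dominate, so use the CPU. */
    if (n < DMA_THRESHOLD)
        return memcpy(dst, src, n);

    /* Engine already in use: fall back to the CPU instead of waiting
     * (an assumed policy, not part of the original proposal). */
    if (atomic_flag_test_and_set_explicit(&dma_lock, memory_order_acquire))
        return memcpy(dst, src, n);

    dma_copy_and_spin(dst, src, n);
    atomic_flag_clear_explicit(&dma_lock, memory_order_release);
    return dst;
}
```

A syscall-based variant that sleeps during the transfer would have the same shape, just with a larger threshold and a system call in place of the lock and spin.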

Regarding how I measured kernel performance increase, I was just looking at
boot timing with printk timestamps enabled. The main time consumer is unpacking

