I think the Cortex-M4 CPU is a lot slower than an Arduino.

Really? That isn't my experience, given that the M4 is running at
120 MHz, the Arduino at 8 MHz, and both are running the same code.

No, I got it wrong; I think I mixed it up with the Raspberry Pi.

The Pi's huge, glaring slow-I/O problem is its internal USB 2.0 hub. AIUI,
only the WiFi and GPIO bypass that bottleneck.

