It dawned on me yesterday that the microcontrollers used by DMA-capable devices lack ECC protection, so the addresses on which they perform DMA operations are vulnerable to bit flips.
One prominent example of DMA-capable hardware that lacks ECC is the Intel 82574L, which is used in motherboards from Supermicro, Apple's Mac Pro workstations and other high-end systems.

http://www.servethehome.com/intel-ethernet-controller-buffer-ecc-comparison/

I am not currently able to detect this phenomenon, but there is some evidence to suggest that it happens. In particular, a recent talk by Robert Stucke at DefCon demonstrated evidence of bit flips on Google's servers, which presumably have ECC memory protection.

http://www.youtube.com/watch?v=ZPbyDSvGasw&t=10m20s

Checksum offload to NICs that lack ECC protection can permit bit flips to alter DNS queries in the manner that Robert described. Knowing that commonly used NICs have no form of ECC protection, I see no reason why these bit flips cannot affect the addresses used in DMA operations. I also see no reason why this is restricted to NICs.

The consequence is that ZFS ARC buffers and other data structures are vulnerable to memory corruption when a bit flip on a DMA-capable device alters an address that the device uses for a DMA operation.

Operating system kernels should be able to protect against DMA operations to incorrect addresses using the IO-MMU (called VT-d extensions by Intel) on recent hardware. However, I have yet to hear of any that take advantage of it.

Has anyone else given any thought to this, or does anyone know of anything being done about it?
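To make the concern concrete, here is a minimal sketch of what a single bit flip does to a DMA target address, and of the kind of range check an IO-MMU mapping could enforce. This is a toy model under stated assumptions, not how any real device or kernel does it: the buffer base, buffer size and the helper names `flip_bit` and `in_window` are hypothetical and chosen only for illustration.

```python
def flip_bit(addr: int, bit: int) -> int:
    """Return addr with one bit inverted, modelling a single bit flip."""
    return addr ^ (1 << bit)


def in_window(addr: int, base: int, size: int) -> bool:
    """Toy IO-MMU-style check: is addr inside the mapped DMA window?"""
    return base <= addr < base + size


# Hypothetical values, for illustration only.
BUF_BASE = 0x1_2340_0000      # page-aligned start of the mapped buffer
BUF_SIZE = 0x1000             # one 4 KiB page
target = BUF_BASE + 0x200     # address the device intends to write to

for bit in (3, 11, 12, 30):
    corrupted = flip_bit(target, bit)
    verdict = "inside" if in_window(corrupted, BUF_BASE, BUF_SIZE) else "OUTSIDE"
    print(f"bit {bit:2d} flipped: {corrupted:#014x} is {verdict} the window")
```

In this model, a flip in bit 12 or above moves the write out of the mapped page, which is the case a range check can reject. A flip in the low 12 bits stays inside the page, so a check at page granularity would not notice it.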
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
