I've updated my patch. The previous one was completely wrong. However, I'm not going to commit it now.
For the record: I cannot actually measure the performance, because of two issues:
- Our PMCs do not support SMP, which is problematic since false sharing requires at least two CPUs to occur. I also can't get the interesting measurement MSRs on my AMD machine.
- Time-based benchmarking does not work for me either; when performing 5x10^8 calls to pmap_page_zero on two CPUs, several driver errors get printed on the screen midway and the system freezes. It seems to be related to my hardware, because I've already seen this kind of thing in the past on that particular machine, regardless of the OS running on it.

Maxime
