We're running a dual-970 blade with a modified 2.6.10 kernel. We have an application that does lots of random data fetches over a fairly large data set (a few GB) held entirely in RAM. The performance guys think we may be wasting time on unnecessary hardware prefetches, and they would like me to give them a mechanism to set the cache-inhibited (I) and guarded (G) bits individually from userspace so that they can try to fine-tune performance.
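To make the request concrete, here is a rough sketch of what the userspace side might look like if this were exposed through mprotect(). It is only an illustration under stated assumptions: PROT_HYP_NOCACHE and PROT_HYP_GUARDED are made-up names with made-up values that no kernel defines, and set_region_attrs() is a hypothetical helper. It falls back to ordinary read/write protection when the kernel rejects the extra bits.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* HYPOTHETICAL: placeholder names and values, not defined by any kernel. */
#define PROT_HYP_NOCACHE 0x100  /* would request the cache-inhibited bit */
#define PROT_HYP_GUARDED 0x200  /* would request the guarded bit */

/*
 * Ask for extra attributes on a page-aligned region.
 * Returns 1 if the kernel accepted the extra bits, 0 if it rejected them
 * with EINVAL and plain read/write protection was applied instead,
 * and -1 on any other failure.
 */
int set_region_attrs(void *addr, size_t len, int extra)
{
    const int base = PROT_READ | PROT_WRITE;
    const long page = sysconf(_SC_PAGESIZE);

    /* mprotect() also reports EINVAL for an unaligned address,
     * so rule that cause out before interpreting EINVAL below. */
    assert(page > 0 && (uintptr_t)addr % (uintptr_t)page == 0);

    if (mprotect(addr, len, base | extra) == 0)
        return 1;               /* extra bits were accepted */
    if (errno != EINVAL)
        return -1;              /* unrelated failure, e.g. ENOMEM */

    /* Extra bits rejected: keep the region usable with normal protection. */
    return mprotect(addr, len, base) == 0 ? 0 : -1;
}
```

The application would mmap() its data set and then call set_region_attrs(buf, len, PROT_HYP_NOCACHE | PROT_HYP_GUARDED). A kernel that rejects unknown protection bits with EINVAL should make this return 0, so the same binary could run with and without the kernel change.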
What's the most logical way for me to do this? Do I extend mprotect() to support additional flags? Has anyone done this before? I didn't find anything on Google. Currently the guarded bit seems to be used only for ioremap() and in __pci_mmap_set_pgprot() when the memory doesn't support write combining.

Thanks,
Chris
