> without patch:
> 1M memory region 120usec
> 16M memory region 1970usec
>
> with patch v2:
> 1M memory region 172usec
> 16M memory region 2030usec
So if I read this correctly, this patch introduces almost a 50% overhead in the 1M case... and probably much worse (as a fraction) in, say, the 64K or 4K case. I wonder if that's acceptable?

Alex's original approach was to try the madvise with the normal page size and then fall back to the huge page size if that failed. But of course that wastes some time on the failed madvise in the hugepage case.

I think it would be interesting to compare timings for registering, say, 4K, 64K, 1M and 16M regions with and without huge page backing, for three possibilities:

 - unpatched libibverbs (will obviously fail on hugepage-backed regions)
 - patched with this v2
 - an alternate patch that tries madvise and only does your /proc/pid/smaps
   parsing if the first madvise fails (rough sketch below)
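Concretely, something like the following is what I have in mind for that third option (completely untested; smaps_page_size() is just a made-up stand-in for your v2 parsing):

/*
 * Untested sketch of the fallback ordering: madvise with normal page
 * alignment first, and only pay for the expensive /proc/<pid>/smaps
 * parsing when that fails with EINVAL, i.e. in the hugepage-backed
 * case.  All names below are made up for illustration.
 */
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Stand-in for the v2 patch's /proc/<pid>/smaps parsing; a real
 * version would return the KernelPageSize of the mapping containing
 * addr.  Hard-coded to 2MB here purely for illustration.
 */
static size_t smaps_page_size(void *addr)
{
	(void) addr;
	return 2 * 1024 * 1024;
}

static int advise_dontfork(void *addr, size_t length)
{
	size_t pg = sysconf(_SC_PAGESIZE);
	uintptr_t start = (uintptr_t) addr & ~(pg - 1);
	uintptr_t end = ((uintptr_t) addr + length + pg - 1) & ~(pg - 1);

	/* Fast path: normal page alignment works for ordinary memory. */
	if (!madvise((void *) start, end - start, MADV_DONTFORK))
		return 0;
	if (errno != EINVAL)
		return -1;

	/* Probably hugepage backing: realign to the real page size. */
	pg = smaps_page_size(addr);
	start = (uintptr_t) addr & ~(pg - 1);
	end = ((uintptr_t) addr + length + pg - 1) & ~(pg - 1);

	return madvise((void *) start, end - start, MADV_DONTFORK);
}

That way the hugepage case pays for one failed madvise, but the common case stays as fast as unpatched libibverbs.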
 - R.

-- 
Roland Dreier <[email protected]> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html