Mike Lambert wrote:
Currently the -L flag is only enabled if HAVE_GETPAGESIZES && HAVE_MEMCNTL. I'm curious what the motivation is for something like that? In our experience, some memcache pools end up fragmenting memory because the repeated 1MB slab allocations get interleaved with all the other hashtable and free-list allocations going on. We know we want to allocate all memory upfront, but can't seem to do that on a Linux system.
The primary motivation was more about not beating up the CPU's TLB when running with large heaps. There are users with large heaps already, so this should help if the underlying OS supports large pages. TLB sizes in CPUs are getting bigger, but virtualization is becoming more common and memory heaps are growing faster still.
I'd like to have some empirical data on how big a difference the -L flag makes, but that assumes a workload profile. I should be able to hack one up and do this with memcachetest, but I've just not done it yet. :)
To put it more concretely, here is a proposed change to make -L do a contiguous preallocation even on machines without getpagesizes tuning. My memcached server doesn't seem to crash, but I'm not sure if that's a proper litmus test. What are the pros/cons of doing something like this?
This feels more related to the -k flag, and it should probably be using madvise() in there somewhere too. It wouldn't necessarily be a bad idea to separate these. I don't know that the day after 1.4.0 is the day to redefine -L though, but it's not necessarily bad. We should wait for Trond's response to see what he thinks about this, since he implemented it. :)
Also, I did some testing with this (-L) some time back (admittedly on OpenSolaris), and the actual behavior will vary based on the memory allocation library you're using and what it does with the OS underneath. I didn't try Linux variations, but that may be worthwhile for you. IIRC, the default malloc would wait for a page fault to do the actual memory allocation, so there'd still be a risk of fragmentation.
- Matt
