I agree that if you have futexes on your platform and you don't contend (i.e. don't even have to call into the kernel) the overhead is small(er). However, there is also the overhead between the producer-consumer contexts, namely the event base and the `-t 1' thread (again, unless I misread the code, in which case I apologize).

I am not sure which paper you mean, but my thesis did deal with the scalability of sockets (raw sockets in particular) and how a single-producer multiple-consumer approach doesn't scale nearly as well as ``one instance per core''. Sure enough, the details are more gory and I won't get into them. I'll try to find some time to test and compare, and get back to you with some numbers.

By the way, I do not see how there would be ``loss when running multiple instances'' since your traffic is disjoint, and if you do have ``loss'' then your OS kernel is broken.

As a matter of fact, I do have a 10Gbps machine, actually two of them, each with a dual-socket Xeon X5570 and 2x Myri-10G NICs, that I was planning on using for tests. Would you be so kind as to tell me if there's any standard performance test suite for memcached that is typically used? Or should I just write my own trivial client? In particular, as you mentioned, I am interested in the scalability of memcached (-t 4 versus proper singlethreaded/multi-process) with respect to the key and/or value size.

Regards,
T

On Mon, Oct 4, 2010 at 4:21 PM, dormando <[email protected]> wrote:
> We took it out for a reason, + if you run with -t 1 you won't really see
> contention. 'Cuz it's running single threaded and using futexes under
> linux. Those don't have much of a performance hit until you do contend.
>
> I know some paper just came out which showed people using multiple
> memcached instances and scaling some kernel locks, along with the whole
> redis "ONE INSTANCE PER CORE IS AWESOME GUYS" thing.
>
> But I'd really love it if you would prove that this is better, and prove
> that there is no loss when running multiple instances. This is all
> conjecture.
>
> I'm slowly chipping away at the 1.6 branch and some lock scaling patches,
> which feels a lot more productive than anecdotally naysaying progress.
>
> memcached -t 4 will run 140,000 sets and 300,000+ gets per second on a box
> of mine. An unrefined patch on an older version from trond gets that to
> 400,000 sets and 630,000 gets. I expect to get that to be a bit higher.
>
> I assume you have some 10GE memcached instances pushing 5gbps+ of traffic
> in order for this patch to be worth your time?
>
> Or are all of your keys 1 byte and you're fetching 1 million of them per
> second?
>
> On Mon, 4 Oct 2010, tudorica wrote:
>
> > The current memcached-1.4.5 version I downloaded appears to always be
> > built with multithreaded support (unless something subtle is happening
> > during configure that I haven't noticed). Would it be OK if I
> > submitted a patch that allows a single-threaded memcached build? Here
> > is the rationale: instead of peppering the code with expensive user-
> > space locking and events (e.g. pthread_mutex_lock, and the producer-
> > consumers), why not just have the alternative to deploy N instances of
> > plain singlethreaded memcached distinct/isolated processes, where N is
> > the number of available CPUs (e.g. each instance on a different port)?
> > Each such memcached process will utilize 1/Nth of the memory that a
> > `memcached -t N' would have otherwise utilized, and there would be no
> > user-space locking (unlike when memcached is launched with `-t 1'),
> > i.e. all locking is performed by the in-kernel network stack when
> > traffic is demuxed onto the N sockets. Sure, this would mean that the
> > clients will have to deal with more memcached instances (albeit
> > virtual), but my impression is that this is already the norm (see the
> > consistent hashing libraries like libketama), and proper hashing (in
> > the client) to choose the target memcached server (ip:port) is already
> > commonplace. The only down-side I may envision is clients utilizing
> > non-uniform hash functions to choose the target memcached server, but
> > that's their problem.
> >
> > Regards,
> > T
> >
>
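
For illustration only, the client-side hashing mentioned in the original post (choosing the target ip:port among N single-threaded instances) could be sketched roughly as below. This is a toy consistent-hash ring, not libketama's actual algorithm: the use of MD5, the 100 points per server, the names `HashRing` and `server_for`, and the ports 11211..11214 on 127.0.0.1 are all assumptions made for the example.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a 32-bit ring position (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Toy consistent-hash ring over (host, port) memcached instances."""

    def __init__(self, servers, replicas=100):
        # Give each server `replicas` points on the ring so keys spread evenly.
        self._ring = sorted(
            (_point(f"{host}:{port}#{i}"), (host, port))
            for host, port in servers
            for i in range(replicas)
        )
        self._points = [p for p, _ in self._ring]

    def server_for(self, key: str):
        # Pick the first ring point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical layout: four single-threaded instances on one host, one per core.
SERVERS = [("127.0.0.1", 11211 + n) for n in range(4)]
ring = HashRing(SERVERS)
```

A client would call `ring.server_for(key)` before every get/set and talk to the returned (host, port). With a ring like this, taking one instance out only remaps the keys that lived on that instance; the rest stay where they were, which is the property that makes the "N instances" deployment tolerable on the client side.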
