Dear Dormando,

Regarding "binding one memcached instance per NUMA node", should we
understand a "NUMA node" to be a core on Intel i3/i5 4-core processors?

So "numactl --cpunodebind=0 ./memcached -m 4000 -t 4" will bind the memcached
instance to a CPU core, right?
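
For reference, here is a minimal sketch of how I would check the node layout on our Linux box. It assumes sysfs is mounted at the usual /sys/devices/system/node location; the function name list_numa_nodes and its optional path argument are my own, added only for illustration.

```shell
# List each NUMA node and the CPUs it contains by reading Linux sysfs.
# The default path is the standard Linux location; the optional argument
# only lets the function be pointed at a different directory.
list_numa_nodes() {
    base="${1:-/sys/devices/system/node}"
    for dir in "$base"/node[0-9]*; do
        # Skip anything that is not a readable node directory
        # (including the unexpanded pattern when no node exists).
        [ -r "$dir/cpulist" ] || continue
        printf '%s: cpus %s\n' "$(basename "$dir")" "$(cat "$dir/cpulist")"
    done
}

list_numa_nodes
```

As far as I understand the numactl documentation, --cpunodebind restricts the process to the CPUs of the listed nodes, so the output here should show which cores node 0 actually covers on our machines.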

Thanks again!

On Tuesday, April 17, 2012 8:56:31 AM UTC+8, Dormando wrote:
>
> > The business scenario requires:
> >
> > 50M key-value pairs, 2K each, 100G memory in total.
> >
> > About 40% of the key-value pairs will change each second.
> >
> > The Java application needs to Get() once and set() once for each changed
> > pair, so it will be 50M*40%*2=4M qps (queries per second).
> >
> > We tested memcached, which shows very limited qps.
> > Our benchmarking is very similar to the results shown here:
> > http://xmemcached.googlecode.com/svn/trunk/benchmark/benchmark.html
> >
> > Around 10,000 qps is the limit of one memcached server.
> >
> > That means we would need 40 partitioned memcached servers for our business
> > scenario, which seems very uneconomical and unrealistic.
> >
> > In your experience, is the benchmarking accurate in terms of memcached's
> > designed performance?
> >
> > Any suggestions for tuning the memcached system (client or server)?
> >
> > Or is there any alternative in-memory store that is able to meet the
> > requirement more economically?
> >
> > Many thanks in advance!
>
> You should share your actual benchmark code. Also, what version of
> memcached, OS, network, etc?
>
> After 1.4.10, a single memcached instance can do nearly one million sets
> per second:
>
> http://groups.google.com/group/memcached/browse_thread/thread/972a4cf1f2c1b017/b3aaf416639e81a6
>
> There are a lot of things you need to tune to get that level of
> performance in a real scenario, however:
>
> - fast network. you will be limited by your packets per second. a single
> gige nic might not do more than 600,000 per second, but also could be as
> low as 250,000 before packet loss.
>
> - batch as many commands as you can (using binary protocol, with
> "noreply"). fewer round trips, fewer packets on the wire.
>
> - use as many clients as you can (a single connection doing synchronous
> sets will be slow in *any* benchmark)
>
> - as noted in the above link, binding one memcached instance per NUMA node
> can improve performance
>
> - tune the number of threads correctly
>
> - always use the latest version
>
> performance should continue to improve over the coming months, but it's
> very difficult to see results of the improvements on actual hardware. I'd
> say you'd need 10 half decent servers to achieve that level of performance
> and have good headroom. If you really tune things hard you could get that
> down to 6. If you left me alone in a room for a few months with a giant
> pile of money I could do it with two. three for redundancy.
>
> -Dormando
>
>
