On 26-11-2011 19:28 Les Mikesell wrote:
On Sat, Nov 26, 2011 at 7:15 AM, Arjen van der Meijden <[email protected]> wrote:
Wouldn't more servers become increasingly slower (as seen from the application)
as you force your clients to connect to more of them?

Assuming all machines have enough processing power and network bandwidth,
I'd expect performance of the last of these variants to be best:
16x  1GB machines
 8x  2GB machines
 4x  4GB machines
 2x  8GB machines
 1x 16GB machine

In the first variant you may end up with 16 different TCP/IP connections per
client. Obviously, connection pooling and proxies can alleviate some of that
overhead. Still, a single multi-get might actually hit all 16 servers.
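To illustrate that fan-out (a minimal sketch, not how any particular client library actually hashes keys): each key maps to one server, so a multi-get over enough keys tends to touch every server once the cluster is large. The modulo hashing below is a simplification; real clients typically use consistent hashing such as ketama.

```python
import hashlib

def server_for(key, num_servers):
    # Hypothetical modulo hashing over an MD5 digest; real clients
    # typically use consistent hashing (e.g. ketama) instead.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

def servers_hit(keys, num_servers):
    # Number of distinct servers a single multi-get must contact.
    return len({server_for(k, num_servers) for k in keys})

keys = ["user:%d" % i for i in range(16)]
for n in (1, 2, 4, 8, 16):
    print("%2d servers -> multi-get touches %2d of them" % (n, servers_hit(keys, n)))
```

With one server the multi-get is a single request; with 16 servers it almost certainly becomes many smaller requests.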

That doesn't make sense.  Why would you expect 16 servers acting in
parallel to be slower than a single server?  And in many/most cases
the application will also be spread over multiple servers, so the load
is distributed independently there as well.

Why not? Will it really be in parallel? Most application code is fairly linear, so any parallelism will have to come from the client library. And even with true parallelism you still have to connect to all the servers, be hindered by TCP slow starts, etc. (a connection pool may help here). I'm just wondering whether the connection and other TCP/IP overheads will be outweighed by any load-spreading gains, especially since memcached's part of the job is fairly quick.
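As a back-of-the-envelope model of that concern (the round-trip time below is invented for illustration): if the client library issues its per-server requests one after another, latency grows linearly with the number of servers, while a genuinely parallel client pays roughly one round trip regardless.

```python
def fetch_latency_ms(num_servers, rtt_ms=0.5, parallel=True):
    # Toy model: a parallel client overlaps all round trips,
    # a sequential one pays them back to back.
    return rtt_ms if parallel else num_servers * rtt_ms

for n in (1, 4, 16):
    print("%2d servers: sequential %4.1f ms, parallel %4.1f ms"
          % (n, fetch_latency_ms(n, parallel=False), fetch_latency_ms(n, parallel=True)))
```

So whether 16 servers look faster or slower from the application depends heavily on how the client library schedules its requests.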

Here's another variant on my question I hadn't even thought about:
http://highscalability.com/blog/2009/10/26/facebooks-memcached-multiget-hole-more-machines-more-capacit.html
And here's Dormando's response to that;
http://dormando.livejournal.com/521163.html

So his post also suggests it might not be a good idea to issue many small requests to many servers rather than a few large requests to a few servers.
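The "multiget hole" argument from those posts can be sketched numerically (the per-request and per-key costs below are made-up numbers, purely for illustration): each extra server contacted adds a fixed per-request overhead, while the per-key work stays the same however the keys are spread, so splitting one large multi-get into many small ones adds cost without adding useful capacity.

```python
def multiget_cost_us(num_keys, num_servers, per_request_us=100.0, per_key_us=5.0):
    # Crude model: each server contacted costs a fixed per-request
    # overhead; per-key work is identical however keys are spread.
    servers_contacted = min(num_keys, num_servers)
    return servers_contacted * per_request_us + num_keys * per_key_us

for n in (1, 4, 16):
    print("%2d servers -> %6.0f us for a 16-key multi-get" % (n, multiget_cost_us(16, n)))
```

Under this model the 16-server layout spends most of its time on request overhead rather than on fetching keys, which is the effect both linked posts are arguing about.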

Best regards,

Arjen
