Hi Les,

On Thu, Oct 07, 2010 at 01:52:57PM -0400, Les Stroud wrote:
(...)
> Also, one interesting thing to note is that at no time am I able to max out
> the cpu on any of the boxes.  In fact, I barely touch them.  I don't see a
> memory or io constraint either.  All I can assume is a network constraint.

That's typical of VM usage for network. Network-oriented workloads are among
the nastiest ones on virtual machines, because they require very low
communication overhead between the userland and the hardware. When you're
in a VM, each system call experiences a huge overhead, and each I/O access
to the network interface takes an even higher hit. That's why you don't see
any high CPU usage in the VMs: the CPU is wasted outside the VMs.

In your scenario 1, you were limited to 5500 hits/s through haproxy. That's
about what I've observed on ESX too. At 5500 hits/s, assuming the usual 10
packets per connection on each side and approximately 10 system calls per
connection, you get about 55000 syscalls per second and 110000 packets per
second. Let's assume now that each of them costs only one microsecond. Then
you waste 165 milliseconds each second between the VM and the hardware,
which means 16.5% of the CPU time is wasted in the virtualizer. From my
experience, the numbers are even higher, because I've observed that a
machine which runs haproxy at 5000 hits/s on ESX does about 35000-40000
hits/s when running native. I'm not saying that virtualization is bad, just
that it's not suited to every workload and that it can sometimes be
responsible for an 80% overhead on such workloads, which translates into
5 times more powerful servers to achieve the same job.
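
In compact form, that rough estimate looks like this (orders of magnitude
only, not measurements) :

    5500 conn/s * 10 syscalls/conn        =  55000 syscalls/s
    5500 conn/s * 10 pkts/conn * 2 sides  = 110000 packets/s
    (55000 + 110000) * 1 microsecond      =  165 ms/s, i.e. 16.5% of a CPU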

Now I guess you have found why you get lower performance through haproxy
than on tomcat directly? Tomcat does half the job through the VM (fewer
syscalls per connection and half the packets, since it only has one side).

It is possible to reduce the number of packets per session in haproxy,
which is sometimes very efficient in VMs (a sketch of the corresponding
configuration lines follows the list below):

  1) use "option forceclose" instead of "option httpclose". It will actively
     reset the server connection instead of sending a FIN and receiving an ACK.
     Alternatively you can use "option http-server-close" which does the same
     on the server side but allows the client side to use keep-alive (use ab -k
     for that).

  2) use "option tcp-smart-accept". This makes haproxy ask the system not to
     immediately acknowledge reception of the TCP segment which holds the
     HTTP request. If the response comes in less than 200 ms, it saves one
     more ACK.

  3) use "option tcp-smart-connect". This makes haproxy ask the system to
     send the request to the server in the ACK packet which follows the
     SYN-ACK, instead of sending a bare ACK first. This saves another packet.

Doing that, you can go down to approximately 6 packets on the server side
and 7 on the client side. With keep-alive, you could even go down to 2
packets per request on the client side once the connection is established.
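
For illustration only, the corresponding lines could look roughly like this
(the "www" and "tomcats" section names and the server address are made up,
adapt them to your own configuration) :

    frontend www
        bind :80
        # or "option forceclose" if you don't want client-side keep-alive
        option http-server-close   # keep-alive to the client (test with ab -k)
        option tcp-smart-accept    # delay the ACK of the request segment
        default_backend tomcats

    backend tomcats
        option tcp-smart-connect   # piggy-back the request on the handshake ACK
        server tomcat1 10.0.0.1:8080 check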

> These are virtualized boxes and I do, from time to time, see a "hiccup"
> that does not seem to correlate to java garbage collection.

Do you observe this yourself or only in the logs/stats? If it's just the
latter, then I would recommend monitoring the regularity of your clocks
in the VMs. It's quite common to see the clock speed up or slow down
two-fold then abruptly resync (eg: jump forwards by 30 seconds or
remain stuck at the same second for 5 seconds). That was the reason why
I had to develop the internal monotonic clock in haproxy: inside VMs
the system clock was totally unusable in some environments.

>  Is there a network queue somewhere that could be reaching saturation that
> needs some additional capacity?

There should not be, but you should probably check whether you are losing
any packets. That's quite tricky to do in virtualized environments.
Basically, you need to sniff inside the VM and to sniff outside the machine.
If you see that some packets are missing, you can suspect that network
queues are overflowing at the driver level, but I don't know if VMware
maintains any such statistics, and unfortunately I'm not aware of any
reliable means of capturing packets inside the host itself.

>  In fact, hitting refresh repeatedly on the hastats page, I will see it enter
> this "pause".  The stats frontend will update, but the backend will appear to
> not be doing work.  After a couple of seconds, it will pick back up.  Could
> this be related?  Or, is this something entirely different?

This would mean that sometimes some packets are lost once the connection
is established and you have to wait for a retransmit. I assume it only
happens under load. The game is now to try to find out where those packets
are lost :-/

I've just noticed something: all your servers have the same cookie value
('A'). This means that only the first one will get the traffic from any
client which correctly supports cookies. I don't remember if ab supports
them, but this will definitely prevent you from scaling in many tests.
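
For example, each server should get its own cookie value, along these lines
(server names and addresses are made up, and I'm assuming cookie insertion
mode here, adjust to whatever cookie mode you actually use) :

    backend tomcats
        cookie SERVERID insert indirect
        server tomcat1 10.0.0.1:8080 cookie A check
        server tomcat2 10.0.0.2:8080 cookie B check
        server tomcat3 10.0.0.3:8080 cookie C check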

One last thing while I'm at it: you should really remove that "nbproc 4"
line; it makes debugging even harder, as you never know which process gets
which request, nor which one gets the stats.

Regards,
Willy

