Thank you for taking the time to reply.  It is VERY helpful.  Please see my 
further questions :) below.

LES

On Oct 7, 2010, at 5:11 PM, Willy Tarreau wrote:

> Hi Les,
> 
> On Thu, Oct 07, 2010 at 01:52:57PM -0400, Les Stroud wrote:
> (...)
>> Also, one interesting thing to note is that at no time am I able to max out
>> the cpu on any of the boxes.  In fact, I barely touch them.  I don’t see a
>> memory or io constraint either.  All I can assume is a network constraint.
> 
> That's typical of VM usage for network. Network-oriented workloads are among
> the nastiest ones on virtual machines, because they require very low
> communication overhead between the userland and the hardware. When you're
> in a VM, each system call experiences a huge overhead, and each I/O access
> to the network interface takes an even higher hit. That's why you don't see
> any high CPU usage in the VMs: the CPU is wasted outside the VMs.
> 
> In your scenario 1, you were limited to 5500 hits/s through haproxy. That's
> about what I've observed on ESX too. At 5500 hits/s, assuming the usual 10
> packets per connection, you have 55000 packets per second on each side, and
> approximately 10 system calls per connection. That means approximately 55000
> syscalls per second and 110000 packets per second. Let's assume now that each
> of them costs only one microsecond. Then you waste 165 milliseconds each
> second between the VM and the hardware, which is 16.5% of the CPU time wasted
> in the virtualizer. Now, from my experience, numbers are even higher because
> I've observed that a machine which runs haproxy at 5000 hits/s on ESX does
> about 35-40000 when running native. I'm not saying that virtualization is
> bad, just that it's not suited to every workload and that it can sometimes
> be responsible for an 80% overhead for such workloads, which translates into
> 5 times more powerful servers to achieve the same job.
> 
> Now I guess you found why you get lower perf through haproxy than on tomcat
> directly?  Tomcat does half the job through the VM (fewer syscalls per
> connection and half the packets since it only has one side).
> 

So it sounds like, in this case, haproxy itself (or rather the VM overhead it
incurs) is the limiting factor.  Tomcat outperforms it in some ways because it
does less of its work through the VM layer (fewer syscalls and only one network
side).  I would assume that some of the extreme optimization you do
specifically targets the hardware architecture rather than the software
architecture that vmware presents.  So, am I right to conclude that this is
about as good as I can get, from a throughput perspective, in this environment?
And if I wanted the full throughput advantage that haproxy can deliver, I would
need to move the haproxy install to a physical box, correct?

> It is possible to reduce the number of packets per session in haproxy;
> this is sometimes very effective in VMs:
> 
>  1) use "option forceclose" instead of "option httpclose". It will actively
>     reset the server connection instead of sending a FIN and receiving an ACK.
>     Alternatively you can use "option http-server-close" which does the same
>     on the server side but allows the client side to use keep-alive (use ab -k
>     for that).
> 
>  2) use "option tcp-smart-accept". This makes haproxy ask the system not to
>     immediately acknowledge reception of the TCP segment which holds the
>     HTTP request. If the response comes in less than 200 ms, it saves one
>     more ACK.
> 
>  3) use "option tcp-smart-connect". This makes haproxy ask the system to
>     send the request to the server with the initial ACK packets right after
>     the SYN-ACK. This saves another packet.
> 
> Doing that, you can go down to approximately 6 packets on the server side
> and 7 on the client side. With keep-alive, you could even go down to 2
> packets per request on the client side once the connection is established.
> 
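
That makes sense.  Just so I apply these correctly, here is roughly how I am
planning to add them.  The section and server names below are placeholders,
not my real config:

    frontend www
        bind :80
        mode http
        # keep-alive toward the clients (ab -k), active close toward the servers
        option http-server-close
        # do not immediately ACK the segment carrying the HTTP request
        option tcp-smart-accept
        default_backend tomcats

    backend tomcats
        mode http
        # send the request along with the ACK that completes the server handshake
        option tcp-smart-connect
        server tomcat1 10.0.0.11:8080 check

Does it matter whether these go in the individual sections like this, or would
you normally just put them in defaults?
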
>> These are virtualized boxes and I do, from time to time, see a ‘hiccup’
>> that does not seem to correlate to java garbage collection.
> 
> do you observe this yourself or only in logs/stats?  If it's just the
> latter, then I would recommend monitoring the regularity of your clocks
> in the VMs. It's quite common to see the clock speed up or slow down
> two-fold then abruptly resync (eg: jump forwards by 30 seconds or
> remain at the same second for 5 seconds). That was the reason why I
> had to develop the monotonic clock in haproxy, inside VMs the system
> clock was totally unusable in some environments.

I think that I did observe the clock issue that you are referring to.  Now that
you mention it, I have noticed top refreshing more quickly than it should
(skipping some seconds) from time to time.

I also noticed a related but slightly different behavior.  During some of the
tests, I was refreshing the stats page constantly (manually).  I had a separate
port listener set up for the stats.  On some of the test runs (but not
consistently), the test would ‘pause’, but the stats page would continue to
return in its normal quick fashion.  The counters on the stats listener were
increasing, but the counters for the backend were not changing.  This would
last a few seconds and then everything would return to normal.  I suppose it
could be a clock synchronization happening on the backend servers, but it seems
unlikely that it would hit all of them at the same time.  Any thoughts?
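
For reference, the stats listener I was refreshing is just a separate listen
section on its own port, something like the following (the port number here is
just a placeholder):

    listen stats
        bind :8090
        mode http
        # stats served on a dedicated port, separate from the traffic frontends
        stats enable
        stats uri /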

> 
>> Is there a network queue somewhere that could be reaching saturation that
>> needs some additional capacity?
> 
> there should not be, but you should probably check if you don't have any
> network losses. That's quite tricky to do in virtualized environments.
> Basically, you need to sniff inside the VM and to sniff outside the machine.
> If you see that some packets are missing, you can suspect that network queues
> are overflowing at the driver level, but I don't know if VMware maintains any
> such statistics, and unfortunately I'm not aware of any reliable means of
> capturing packets inside the host itself.

I think I may pass this one off to someone else to run down.. :)

> 
>> In fact, hitting refresh repeatedly on the haproxy stats page, I will see it
>> enter this ‘pause’.  The stats frontend will update, but the backend will
>> appear to not be doing work.  After a couple of seconds, it will pick back
>> up.  Could this be related?  Or is this something entirely different?
> 
> This would mean that sometimes some packets are lost once the connection
> is established and you have to wait for a retransmit. I assume it only
> happens under load. The game is now to try to find out where those packets
> are lost :-/
> 
> I've just noticed something. All your servers have the same cookie value
> ('A'). This means that only the first one will get the traffic for a client
> which correctly supports cookies. I don't remember if ab supports them, but
> this will definitely prevent you from scaling in many tests.

Thank you for that.  I modified it and it fixed an issue I was having (a
session-dependent problem).
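
In case it is useful to anyone else reading, the fix was simply giving each
server its own cookie value instead of 'A' for all of them.  The names,
addresses and cookie mode below are just illustrative, not my exact config:

    backend tomcats
        mode http
        cookie SERVERID insert indirect nocache
        # each server now gets a distinct cookie value
        server tomcat1 10.0.0.11:8080 cookie A check
        server tomcat2 10.0.0.12:8080 cookie B check
        server tomcat3 10.0.0.13:8080 cookie C check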

> 
> Last thing while I'm at it, you should really remove that "nbproc 4" line,
> it makes the debugging even harder as you never know what process gets what
> request, nor which one gets the stats.
> 

Got it.  Out of curiosity, is there an optimal setting (or formula) for the
number of processes to use with nbproc?

> Regards,
> Willy
> 

