On Thu, Jul 23, 2009 at 12:06 AM, Willy Tarreau<w...@1wt.eu> wrote:
> On Wed, Jul 22, 2009 at 11:24:15PM -0500, Dan O'Bryan wrote:
>> I'm in the process of moving our core traffic from a local datacenter
>> to ec2, using haproxy for load balancing.
>>
>> I am unable to get full usage of the 2 virtual cores.  Previewed the
>> full traffic load today and hit cpu limits immediately.  Initially,
>> with nbproc = 1, I see the first core is used at 100% utilization, the
>> second core remains completely idle.  Tried setting with nbproc = 2,
>> same result, the second core stays completely idle.
>
> Have you checked if this was user or system CPU usage ? Maybe it's even
> softirq ? That would explain why it cannot scale if it is the system
> which is saturating on a non-scalable thing.

From sar:

09:15:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:35:01 AM       0     19.04      0.00     49.69      0.00     22.84      8.43

Idle time showed up only after I decreased inbound traffic, but the
proportions are representative of the full load.  I also discovered that
cpu0 is handling all the interrupts for eth0, and haproxy also runs on cpu0.
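
For anyone wanting to repeat the check, here is a rough Python sketch of how per-CPU interrupt counts for a device can be read out of /proc/interrupts-style text.  The sample text and the helper name irq_counts are made up for illustration; the real file on a Xen guest may label things differently.

```python
def irq_counts(interrupts_text, device):
    """Sum per-CPU interrupt counts for lines that mention `device`."""
    lines = interrupts_text.strip().splitlines()
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1"]
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        fields = line.split()
        # Everything after the IRQ label and the per-CPU counters is
        # descriptive text (controller type, device names).
        names = [f.rstrip(",") for f in fields[len(cpus) + 1:]]
        if device not in names:
            continue
        for cpu, count in zip(cpus, fields[1:1 + len(cpus)]):
            totals[cpu] += int(count)
    return totals


# Hypothetical sample, NOT taken from the machine discussed here.
SAMPLE = """\
           CPU0       CPU1
 16:    9876543          0  Dynamic-irq  eth0
 17:      12345        678  Dynamic-irq  blkif
NMI:          0          0  Non-maskable interrupts
"""

eth0 = irq_counts(SAMPLE, "eth0")
```

On a live system the text would come from open("/proc/interrupts").read() instead of SAMPLE.  A count that only grows in the CPU0 column is the pattern described above.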


>
>> All
>> recommendations I've read indicate that nbproc=1 is the preferred
>> setting and that multiple cores should still get utilized, but not
>> having any luck.
>>
>> After a lot of reading and tinkering, ended up with the following setup:
>>    EC2: m1.large instance type, 2 virtual cores, 7.5GB ram,
>>    OS: based on canonical amis, 2.6.27-23-xen #1 SMP Thu Apr 16
>> 14:36:38 UTC 2009 x86_64 GNU/Linux.
>>    HAProxy 1.3.18 with http-ecv patch
>>
>> Hitting peak CPU bottlenecks when session rate hits about 2500, avg 6k
>> per request.
>
> Well, I have never tried xen yet, it is said to be faster than vmware
> on such workloads (which itself is particularly slow). But do you have
> an idea of the real-world equivalence of those "virtual cores" ? If
> they sell you the equivalent of a 200 MHz processor, you'll not get
> far. Also, it is possible that the host system is the bottleneck. The
> problem with networked applications in virtual environments is that
> they perform expensive processing at an extremely high rate. At 2500
> reqs/s of 6kB, you have about 40000 packets/s in+out. I don't know
> whether the host or the virtualization layer is able to sustain this
> times the number of virtual machines.

It's roughly comparable to a 2-core 2GHz Xeon if I've read the
documentation correctly (http://aws.amazon.com/ec2/instance-types).
Since cpu1 is completely idle, it seems that there is much more
capability available.  On other applications, we've had no trouble
saturating both CPUs.  However, the virtualization layer is a black
box, with Amazon metering all shared resources.
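
As a sanity check on the 40000 packets/s figure quoted above, here is one possible back-of-the-envelope accounting in Python.  The breakdown is my own assumption, not something stated in the quoted message: one TCP connection per request, a 1460-byte MSS, one ACK per two data segments, and only one side of the proxy counted.

```python
import math


def packets_per_request(response_bytes, mss=1460):
    """Rough packet count for one short-lived HTTP exchange (assumed model)."""
    handshake = 3                              # SYN, SYN-ACK, ACK
    request = 1                                # request fits in one segment
    data = math.ceil(response_bytes / mss)     # response data segments
    acks = math.ceil(data / 2)                 # delayed ACKs, one per two segments
    teardown = 4                               # FIN/ACK in each direction
    return handshake + request + data + acks + teardown


def packets_per_second(req_rate, response_bytes):
    return req_rate * packets_per_request(response_bytes)


per_request = packets_per_request(6000)
total = packets_per_second(2500, 6000)
```

Under these assumptions a 6kB response costs 16 packets, so 2500 requests/s comes to 40000 packets/s.  Counting the proxy-to-server connections as well would push the number higher.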

>
> Also, check if you would have ip_conntrack loaded. It might consume
> a lot there too.

Confirmed that this is not loaded.

>
>> Do you have any suggestions to increase usage of both cores to improve
>> my per-node capacity?
>
> Not that many ideas, as I don't know how the hypervisor balances the
> load across your two virtual cores. It may even be possible that both
> cores are in fact the same one running alternately.
>
> In fact, there are two things you can try :
>  - reboot on a non-SMP kernel and see if it improves performance,
>    and by how much. Because locking everywhere in SMP costs a lot,
>    and you may end up with a faster machine when using only 1 core.
>
>  - check with "top -d 1" if you see a lot of irq/softirq in your
>    CPU load. If so, try to identify the culprit in /proc/interrupts
>    and assign it one core, and bind haproxy to another one. I don't
>    know if xen emulates interrupt delivery like in a real machine,
>    but there are chances it works similarly. That way you could try
>    to force your system to process network on one core and haproxy
>    on the other one, thus using them both at a time. But it *may*
>    end up slower because the system will have to pass data between
>    both "virtual cores", which is not necessarily desirable.
>

I bound haproxy to cpu1.  Lots of idle time is now being reported,
even as the session rate continues to climb.  It appears that xen is
misleading in its reporting, since steal time on cpu0 is running
between 10 and 20% without anything else running there.  But it's a
much happier setup now.  Thanks for the help.
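
For the archives, a minimal sketch of the two pieces involved in this kind of split: pinning a process to a CPU, and building the hex bitmask that /proc/irq/<n>/smp_affinity expects.  The helper names are hypothetical, and this is not necessarily how the binding above was done (taskset from util-linux is the usual command-line route).  Whether a Xen guest honours IRQ affinity is also an open question.

```python
import os


def cpu_mask_hex(cpus):
    """Hex bitmask with one bit set per CPU index, as used by smp_affinity."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_process(pid, cpus):
    """Restrict `pid` to the given CPUs (Linux only); returns the new set."""
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)


irq_mask = cpu_mask_hex([0])    # network interrupts stay on cpu0
proxy_cpus = [1]                # haproxy would go on cpu1
```

Calling pin_process(haproxy_pid, proxy_cpus) needs permission to change that process, and writing irq_mask to /proc/irq/<n>/smp_affinity needs root, with <n> being whatever IRQ number /proc/interrupts shows for eth0.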

> Regards,
> Willy
>
>
