Thanks for the response.  Can you explain what nbproc does, in case I am using
it incorrectly?  My VM shows 4 CPU cores.

I will run tcpdump in the background for a few hours to try and see if
there is major network latency.
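
Alongside tcpdump, a small probe that times locally initiated TCP connects could help show whether connections reach the back ends in time. This is only a rough sketch: the function names, the 5-second timeout, and the threshold are illustrative assumptions, not values from this thread, and the host and port would have to be filled in for a real back end.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return the seconds taken to establish a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection() performs the full TCP handshake before returning.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None


def slow_samples(samples, threshold):
    """Return the samples that failed (None) or took longer than threshold seconds."""
    return [s for s in samples if s is None or s > threshold]
```

Calling time_connect() in a loop for a few hours and passing the results to slow_samples() would give a list of stalls that can be matched by timestamp against the tcpdump capture.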

Thanks!

On 6/4/10, Willy Tarreau <[email protected]> wrote:
> Hi,
>
> On Wed, Jun 02, 2010 at 01:19:59AM -0400, Geoffrey Mina wrote:
>> Greetings,
>> We recently deployed HAProxy in a virtualized environment.  I am having some
>> problems with occasional socket accept errors.
>
> Where do you observe those errors? I'm seeing you have "nbproc 4" which you
> shouldn't be using. It is possible that you're just observing some of the
> processes waking up, performing an accept() which returns -1 EAGAIN and that
> you take that for an error.
>
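The EAGAIN case described above can be reproduced in isolation. The following is a minimal sketch, not HAProxy's code: it uses a single process and a loopback listener (both assumptions made for illustration) to show that a non-blocking accept() on an empty queue fails with EAGAIN/EWOULDBLOCK, and that this is a normal "nothing to do" result rather than a real error. The helper names try_accept and demo are hypothetical.

```python
import errno
import select
import socket


def try_accept(listener):
    """Attempt one non-blocking accept(); return the connection or None.

    None means the accept queue was empty (EAGAIN/EWOULDBLOCK), which is a
    normal outcome for a non-blocking listener, not a failure.
    """
    try:
        conn, _addr = listener.accept()
        return conn
    except BlockingIOError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(8)
    listener.setblocking(False)

    # Nothing has connected yet, so accept() has nothing to return.
    empty = try_accept(listener)

    # Make one real connection and wait until the listener is readable.
    client = socket.create_connection(listener.getsockname())
    select.select([listener], [], [], 5.0)
    conn = try_accept(listener)

    got_connection = conn is not None
    for s in (conn, client, listener):
        if s is not None:
            s.close()
    return empty is None, got_connection
```

With several processes sharing one listening socket, a process that wakes up after another has already taken the connection would see the same None result from try_accept().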
>> We are seeing the problems
>> primarily on the pure TCP load balancing portion of our configuration.  The
>> load balancer is running in the Rackspace Cloud under Xen. Basically what I
>> am seeing is that sockets never get nailed up, even though I am 110% sure
>> all the back-end servers are operating fine.  We have secondary monitoring
>> processes which are constantly setting up and tearing down sockets directly
>> (bypassing HAProxy) to ensure that the servers are up and running.  I have
>> provided our config and some other information below.  If anyone can point
>> me in the right direction for figuring out this issue, I would greatly
>> appreciate it.
>
> I suppose you have already performed the usual tuning bits (tune or disable
> iptables, etc...).
>
> One thing that can happen in virtualized environments is that the haproxy VM
> starves without getting access to the CPU for long periods of time if your
> hosting provider sells more power than the physical machines can provide.
> It is also possible that network packets get queued up for a very long time
> between VMs because the physical network (or even the physical machines) are
> overloaded. I have already observed ping times of up to 7 seconds on an EC2
> platform that finally migrated to Rackspace to solve such issues, but since it
> was more than a year ago, it does not mean they cannot experience similar
> trouble now :-)
>
> The most important thing to do in such environments is to sniff traffic in
> real time. Since you have zero control over the resource allocation and the
> timing, the best you can do is observe if locally initiated I/Os reach their
> target in time or not.
>
> Regards,
> Willy
>
>

-- 
Sent from my mobile device
