Re: Problems with load balancing on cloud servers
Hi,

On Tue, Sep 13, 2011 at 11:02:26AM +0800, Liong Kok Foo wrote:
> Top for server 70 (load problem)
> top - 10:51:23 up 32 days, 22:21, 1 user, load average: 3.09, 2.99, 2.50
> Tasks: 115 total, 3 running, 112 sleeping, 0 stopped, 0 zombie
> Cpu(s): 38.5%us, 11.0%sy, 0.0%ni, 48.2%id, 0.0%wa, 0.0%hi, 2.3%si, 0.0%st
> Mem: 2050000k total, 1049708k used, 1000292k free, 264264k buffers
> Swap: 1052248k total, 876k used, 1051372k free, 418272k cached
>
> Sometimes server B's load will shoot up to 20 or more while server A
> (and the rest) remain at around 5.
>
> Would really appreciate any input on this matter.

When you look at the stats, you notice that this server has a much higher
retransmit count than the other servers. This almost always translates to
connectivity issues. And if there are connectivity issues, the server has
more difficulty pushing responses out to the clients and ends up with more
concurrent processes than the other ones, leading to higher load and memory
usage.

You should run tests between this server and another one: transfer a large
file (500 MB) several times. You should reach about 1 Gbps (118 MB/s). Do
this in each direction. Often you'll notice that one direction is
approximately OK while the other one is terrible. Do the same with other
reference servers that work well, and if you find that only communications
with this server are affected, ask the provider to replace it. Sometimes
it's just a cable issue, sometimes it's a switch port, sometimes it's a NIC.
Those issues are quite common in datacenters.

You can also look at the network statistics:

    $ netstat -s | grep retrans

I suspect that you'll notice more retransmits on this one than on the other
servers. Be careful, those counters run since the last boot, so you have to
take uptime into account.

Regards,
Willy
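The two pieces of arithmetic in this reply (retransmits relative to uptime, and the 118 MB/s figure) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample counters are made up, and the counter wording ("segments send out", "segments retransmited") is what net-tools `netstat -s` prints on many Linux systems but may differ on yours. The throughput estimate assumes a plain 1 Gbps Ethernet link, a 1500-byte MTU, and TCP timestamps enabled.

```python
import re


def parse_retransmits(netstat_text):
    """Return (segments_sent, segments_retransmitted) from `netstat -s` text."""
    sent = re.search(r"(\d+)\s+segments sen[dt] out", netstat_text)
    # Matches both the "retransmited" and "retransmitted" spellings.
    retrans = re.search(r"(\d+)\s+segments retransmit", netstat_text)
    if not sent or not retrans:
        raise ValueError("TCP counters not found in input")
    return int(sent.group(1)), int(retrans.group(1))


def retransmit_summary(sent, retrans, uptime_days):
    """Normalize the counters so servers with different uptimes compare fairly."""
    return {
        "percent_of_sent": 100.0 * retrans / sent,
        "per_day": retrans / uptime_days,
    }


def gigabit_goodput_mb_s(mtu=1500, ip_tcp_headers=52, wire_overhead=38):
    """Best-case TCP payload rate on 1 Gbps Ethernet, in MB/s (10**6 bytes).

    ip_tcp_headers: 20 (IP) + 20 (TCP) + 12 (timestamp option), an assumption.
    wire_overhead: preamble 8 + Ethernet header 14 + FCS 4 + inter-frame gap 12.
    """
    payload = mtu - ip_tcp_headers
    on_wire = mtu + wire_overhead
    return 125.0 * payload / on_wire  # 1 Gbps is 125 MB/s of raw bits


# Hypothetical counters, NOT taken from the servers in this thread.
SAMPLE = """
Tcp:
    52000000 segments received
    48000000 segments send out
    960000 segments retransmited
"""

sent, retrans = parse_retransmits(SAMPLE)
# Server 70 reported "up 32 days, 22:21", roughly 32.9 days.
summary = retransmit_summary(sent, retrans, uptime_days=32.9)
print(summary)
print(round(gigabit_goodput_mb_s(), 1), "MB/s")
```

Running the same summary on each backend and comparing the per-day and percent figures, rather than the raw counters, is one way to account for the very different uptimes (32 days versus 145 days).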
Re: Problems with load balancing on cloud servers
Hi Liong,

You can also tune vm.swappiness to keep your Ubuntu server from using its swap.

Cheers
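For reference, the sysctl is spelled `vm.swappiness`, and on Linux its current value is exposed at `/proc/sys/vm/swappiness`. A minimal Python sketch for inspecting it follows. The helper names are hypothetical, the value 10 is an arbitrary example rather than a recommendation, and the 0..100 range is an assumption that holds for kernels of the Ubuntu 10.04 era.

```python
from pathlib import Path

SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def sysctl_line(value):
    """Build the line one would add to /etc/sysctl.conf to persist a value."""
    if not 0 <= value <= 100:
        raise ValueError("this sketch only accepts values in 0..100")
    return f"vm.swappiness = {value}"


if __name__ == "__main__":
    print("current:", read_swappiness())
    print("to persist a lower value:", sysctl_line(10))
```

A lower value makes the kernel less eager to swap out process memory. It does not disable swap.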
Re: Problems with load balancing on cloud servers
Liong:

Have you looked at average query response time statistics? They could shed
some light on your issue. A graph of response time vs. current requests for
a given resource might help you narrow the problem down if it's not a load
balancer issue. Alternatively, simply comparing requests processed per
second between the two servers would give you more conclusive evidence of
whether the load balancer is doing the wrong thing.

-Jerry

Jerry Champlin
Absolute Performance Inc.
--
Enabling businesses to deliver critical applications at lower cost and
higher value to their customers.

On Mon, Sep 12, 2011 at 9:38 PM, Liong Kok Foo wrote:
> Hi Jerry,
>
> Appreciate the quick reply and pointing out this difference.
>
> However, we are aware that some servers have swap and some do not. We
> have tried turning swap on and off, but it doesn't solve the issue.
>
> Thanks.
>
> Liong Kok Foo
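The comparison Jerry suggests could be sketched as follows in Python. The server names and response times below are invented for illustration, and `per_server_stats` is a hypothetical helper. In practice the samples would come from wherever per-request timings are recorded, for example the haproxy or Apache logs.

```python
from collections import defaultdict


def per_server_stats(samples, window_seconds):
    """Summarize one time window of (server_name, response_time_ms) samples."""
    times = defaultdict(list)
    for server, ms in samples:
        times[server].append(ms)
    return {
        server: {
            "requests_per_second": len(ms_list) / window_seconds,
            "avg_response_ms": sum(ms_list) / len(ms_list),
        }
        for server, ms_list in times.items()
    }


# Hypothetical samples collected over a 2-second window.
samples = [("web69", 40), ("web70", 180), ("web69", 60), ("web70", 220)]
stats = per_server_stats(samples, window_seconds=2)
for server, values in sorted(stats.items()):
    print(server, values)
```

If both servers show a similar request rate but one has a much higher average response time, the load balancer is probably distributing requests evenly and the difference lies on the server or network side. A clearly higher request rate on one server would point back at the balancing itself.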
Re: Problems with load balancing on cloud servers
Hi Jerry,

Appreciate the quick reply and pointing out this difference.

However, we are aware that some servers have swap and some do not. We have
tried turning swap on and off, but it doesn't solve the issue.

Thanks.

Liong Kok Foo

On 9/13/2011 11:15 AM, Jerry Champlin wrote:
> Liong:
>
> The only significant difference in the stats you posted is that the
> server without a load problem has no swap configured and the other one
> does. I would not expect that to cause a problem, but that's the only
> difference that jumps off the page.
>
> -Jerry
>
> Jerry Champlin
> Absolute Performance Inc.
Re: Problems with load balancing on cloud servers
Liong:

The only significant difference in the stats you posted is that the server
without a load problem has no swap configured and the other one does. I
would not expect that to cause a problem, but that's the only difference
that jumps off the page.

-Jerry

Jerry Champlin
Absolute Performance Inc.
--
Enabling businesses to deliver critical applications at lower cost and
higher value to their customers.

On Mon, Sep 12, 2011 at 9:02 PM, Liong Kok Foo wrote:
> Hi,
>
> We have been using haproxy for many years now and have implemented it in
> a few of our systems. However, we have been facing an odd problem, and we
> are not sure whether it is related to haproxy: one of the servers has a
> higher load than the others. It is the last server in the LB, but we
> tried moving it to second last and still see only this server showing
> high load. We also tried cloning the server from one of the existing
> servers that doesn't have this problem, but the clone still shows the
> problem.
>
> We have checked with the cloud provider to see if the odd server is
> hosted in a different cloud segment in their datacenter. It doesn't seem
> so from their reply.
>
> We have set up 5 server instances in the cloud hosted by Voxel. One is
> used as the haproxy server and the other 4 run Apache, serving the
> website.
>
> Specs of the cloud servers:
> 1 core CPU
> 2GB RAM
> Ubuntu 10.04 LTS
> Haproxy 1.3.25
>
> Check the screen shot for the haproxy stats. We allocated weights of
> 1:1:1:1. We just don't understand why there is extra load on this server.
>
> Top for server 69 (no problem):
> top - 10:51:08 up 145 days, 16:34, 1 user, load average: 1.06, 1.22, 1.27
> Tasks: 117 total, 2 running, 115 sleeping, 0 stopped, 0 zombie
> Cpu(s): 28.8%us, 7.6%sy, 0.0%ni, 61.3%id, 0.0%wa, 0.0%hi, 2.1%si, 0.2%st
> Mem: 2050000k total, 1587608k used, 462392k free, 760100k buffers
> Swap: 0k total, 0k used, 0k free, 432848k cached
>
> Top for server 70 (load problem):
> top - 10:51:23 up 32 days, 22:21, 1 user, load average: 3.09, 2.99, 2.50
> Tasks: 115 total, 3 running, 112 sleeping, 0 stopped, 0 zombie
> Cpu(s): 38.5%us, 11.0%sy, 0.0%ni, 48.2%id, 0.0%wa, 0.0%hi, 2.3%si, 0.0%st
> Mem: 2050000k total, 1049708k used, 1000292k free, 264264k buffers
> Swap: 1052248k total, 876k used, 1051372k free, 418272k cached
>
> Sometimes server B's load will shoot up to 20 or more while server A
> (and the rest) remain at around 5.
>
> Would really appreciate any input on this matter.
>
> Thanks.
>
> --
> Liong Kok Foo
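To put the two `top` headers from the original post side by side, a short Python sketch can extract the load averages and relate them to the core count. The header strings are copied from the post and the single core comes from the listed specs. The function names are hypothetical, and reading load per core as a rough measure of queued work is a common rule of thumb rather than an exact metric.

```python
import re


def load_averages(top_header):
    """Extract the 1, 5 and 15 minute load averages from a top header line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", top_header
    )
    if match is None:
        raise ValueError("no load average found")
    return tuple(float(value) for value in match.groups())


def load_per_core(top_header, cores=1):
    """Divide each load average by the number of CPU cores."""
    return tuple(value / cores for value in load_averages(top_header))


SERVER_69 = "top - 10:51:08 up 145 days, 16:34, 1 user, load average: 1.06, 1.22, 1.27"
SERVER_70 = "top - 10:51:23 up 32 days, 22:21, 1 user, load average: 3.09, 2.99, 2.50"

for name, header in (("server 69", SERVER_69), ("server 70", SERVER_70)):
    print(name, load_per_core(header, cores=1))
```

On a single-core machine, a load average well above 1 roughly means that tasks are waiting for the CPU (or for I/O) most of the time. That fits server 70 standing out even though all four backends carry the same weight.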