Re: Haproxy running on 100% CPU and slow downloads

2016-05-24 Thread Sachin Shetty
To close this thread out: we found the issue to be in the 1.6.4-20160426
patch that I was using. The issue is fixed in 1.6.5.

Thanks Willy and Lukas.

Thanks
Sachin
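
For anyone checking whether a host is still on a build older than the fixed release, here is a minimal sketch. It assumes GNU `sort -V` is available and that the first line of `haproxy -v` looks like the "HA-Proxy version 1.5.4 2014/09/02" banner quoted later in this thread. The function names `haproxy_version` and `version_ge` are made up for illustration.

```shell
# Extract the version field from `haproxy -v` style output read on stdin.
# Assumption: the banner line starts with "HA-Proxy version" or "HAProxy version".
haproxy_version() {
    awk '/^HA-?Proxy version/ { print $3; exit }'
}

# Succeed when version $1 is at least version $2, using GNU sort -V ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use would be `version_ge "$(haproxy -v | haproxy_version)" 1.6.5 || echo "older than 1.6.5"`.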

On 5/13/16, 8:14 PM, "Willy Tarreau"  wrote:

>On Fri, May 13, 2016 at 07:32:36PM +0530, Sachin Shetty wrote:
>> In 24 hours all servers had connections growing, we have reverted the
>> patch for now.
>> 
>> I have the show sess all output if you would like to see.
>
>Interestingly in the "show sess all" from yesterday I'm seeing only
>negative "tofwd" values for stuck sessions. Exactly the type of thing
>which is supposedly fixed now (it's the problem with 2-4GB transfers).
>I don't understand since I tested the backport and had the confirmation
>from another user that it was OK for him. Maybe there's a corner case I
>haven't figured out, which may depend on certain options.
>
>Could you please send me privately your config (remove the confidential
>stuff) ? I think you gave it to me a few times already but I don't want
>to keep those you know.
>
>Thanks,
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-05-13 Thread Willy Tarreau
On Fri, May 13, 2016 at 07:32:36PM +0530, Sachin Shetty wrote:
> In 24 hours all servers had connections growing, we have reverted the
> patch for now.
> 
> I have the show sess all output if you would like to see.

Interestingly in the "show sess all" from yesterday I'm seeing only
negative "tofwd" values for stuck sessions. Exactly the type of thing
which is supposedly fixed now (it's the problem with 2-4GB transfers).
I don't understand since I tested the backport and had the confirmation
from another user that it was OK for him. Maybe there's a corner case I
haven't figured out, which may depend on certain options.

Could you please send me privately your config (remove the confidential
stuff) ? I think you gave it to me a few times already but I don't want
to keep those you know.

Thanks,
Willy
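
A rough way to spot the pattern described above in a saved "show sess all" dump is to count the lines that carry a negative "tofwd" value. This is only a sketch: it assumes the dump was saved to a file and that the field is printed as `tofwd=<number>`. The function name is hypothetical.

```shell
# Count lines in a saved "show sess all" dump that show a negative tofwd.
# Assumption: the field appears literally as "tofwd=<number>" in the dump.
count_negative_tofwd() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c 'tofwd=-[0-9]' "$1" || true
}
```

Running it on dumps taken a few minutes apart should show whether the number of affected channels keeps growing.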




Re: Haproxy running on 100% CPU and slow downloads

2016-05-13 Thread Willy Tarreau
Hi Sachin,

On Fri, May 13, 2016 at 07:32:36PM +0530, Sachin Shetty wrote:
> In 24 hours all servers had connections growing, we have reverted the
> patch for now.
> 
> I have the show sess all output if you would like to see.

Thank you very much, that's extremely useful. I'll probably get back to
you in the next few days if I find that I need more information. Indeed,
do not take risks on your production, our development model makes it
easy for you to limit the risks by switching back, so stay safe!

Best regards,
Willy




Re: Haproxy running on 100% CPU and slow downloads

2016-05-13 Thread Sachin Shetty
Within 24 hours, connections were growing on all servers, so we have
reverted the patch for now.

I have the "show sess all" output if you would like to see it.

Thanks
Sachin

On 5/12/16, 10:08 PM, "Sachin Shetty"  wrote:

>Hi Lukas,
>
>Attached output.
>
>Thanks
>Sachin
>
>On 5/12/16, 7:41 PM, "Lukas Tribus"  wrote:
>
>>Hi,
>>
>>
>>On 12.05.2016 at 14:37, Sachin Shetty wrote:
>>> Hi Willy,
>>>
>>> We are seeing a strange problem  on the patched server. We have several
>>> haproxy servers running but only one with the latest patch, and this
>>> haproxy has frozen twice in last two days, basically it hits max open
>>> connections 2000 on frontend and then stalls. From the logs it has 1999
>>> connections on one of the backends which is nginx, but nginx_status
>>>shows
>>> me only a few active connections. It only happens on the patched
>>>haproxy
>>> server and does not happen anywhere else. Interesting thing is this
>>> haproxy is not the one doing SSL, we have two haproxies on the same box
>>> with the latest binary, the SSL one seems ok but the non SSL one keeps
>>>on
>>> accumulating connections.
>>>
>>> Right now, I see connections building on one backend hitting 150 in the
>>> last few hours, but the backend nginx only shows about 20 active
>>> connections.
>>
>>Can you collect "show sess all" output from the admin socket?
>>
>>Lukas





Re: Haproxy running on 100% CPU and slow downloads

2016-05-12 Thread Lukas Tribus

Hi,


On 12.05.2016 at 14:37, Sachin Shetty wrote:

> Hi Willy,
>
> We are seeing a strange problem on the patched server. We have several
> haproxy servers running but only one with the latest patch, and this
> haproxy has frozen twice in last two days, basically it hits max open
> connections 2000 on frontend and then stalls. From the logs it has 1999
> connections on one of the backends which is nginx, but nginx_status shows
> me only a few active connections. It only happens on the patched haproxy
> server and does not happen anywhere else. Interesting thing is this
> haproxy is not the one doing SSL, we have two haproxies on the same box
> with the latest binary, the SSL one seems ok but the non SSL one keeps on
> accumulating connections.
>
> Right now, I see connections building on one backend hitting 150 in the
> last few hours, but the backend nginx only shows about 20 active
> connections.


Can you collect "show sess all" output from the admin socket?

Lukas
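
One common way to collect this is to send the command to the stats socket with socat. The sketch below assumes a `stats socket` is configured with sufficient access level and that it lives at /var/run/haproxy.sock. Both the path and the function name `dump_sessions` are assumptions, not taken from Sachin's setup.

```shell
# Save "show sess all" from the HAProxy stats socket to a file.
# $1: socket path (assumed default: /var/run/haproxy.sock)
# $2: output file (default: timestamped name in the current directory)
dump_sessions() {
    sock=${1:-/var/run/haproxy.sock}
    out=${2:-sess-$(date +%Y%m%d-%H%M%S).txt}
    # Print the output file name only if socat succeeded.
    echo "show sess all" | socat stdio "unix-connect:$sock" > "$out" && echo "$out"
}
```

Taking a dump while the connection count is climbing, and another a few minutes later, makes it easier to see which sessions are stuck.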




Re: Haproxy running on 100% CPU and slow downloads

2016-05-12 Thread Sachin Shetty
Hi Willy,

We are seeing a strange problem on the patched server. We have several
haproxy servers running but only one with the latest patch, and this
haproxy has frozen twice in the last two days: it hits the limit of 2000
open connections on the frontend and then stalls. According to the logs it
has 1999 connections on one of the backends, which is nginx, but
nginx_status shows me only a few active connections. It only happens on
the patched haproxy server and nowhere else. Interestingly, this haproxy
is not the one doing SSL: we have two haproxies on the same box with the
latest binary, and the SSL one seems OK but the non-SSL one keeps
accumulating connections.

Right now, I see connections building up on one backend, reaching 150 in
the last few hours, but the backend nginx only shows about 20 active
connections.


On 5/10/16, 5:47 PM, "Willy Tarreau"  wrote:

>On Tue, May 10, 2016 at 11:10:14AM +0530, Sachin Shetty wrote:
>> We deployed the latest and we saw throughput still dropped around peak
>> hours a bit, then we switched to nbproc 4 which is holding up ok.
>
>So probably you were reaching the processing limits for a single process,
>that can easily happen with SSL if a lot of rekeying has to be done.
>
>> Note that
>> 4 Cpus was not sufficient earlier, so I believe the latest version is
>> scaling better. 
>
>Good, that confirms that you're not facing these bugs anymore. I'm
>currently
>starting a new release, that will make it easier for you to deploy.
>
>Thanks for the report,
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-05-10 Thread Willy Tarreau
On Tue, May 10, 2016 at 11:10:14AM +0530, Sachin Shetty wrote:
> We deployed the latest and we saw throughput still dropped around peak
> hours a bit, then we switched to nbproc 4 which is holding up ok.

So probably you were reaching the processing limits for a single process,
that can easily happen with SSL if a lot of rekeying has to be done.

> Note that
> 4 Cpus was not sufficient earlier, so I believe the latest version is
> scaling better. 

Good, that confirms that you're not facing these bugs anymore. I'm currently
starting a new release, that will make it easier for you to deploy.

Thanks for the report,
Willy




Re: Haproxy running on 100% CPU and slow downloads

2016-05-09 Thread Sachin Shetty
We deployed the latest and throughput still dropped a bit around peak
hours, so we switched to nbproc 4, which is holding up OK. Note that
4 CPUs were not sufficient earlier, so I believe the latest version is
scaling better.

Thanks Lukas and Willy.


On 4/29/16, 11:09 AM, "Willy Tarreau"  wrote:

>Hi guys,
>
>On Tue, Apr 26, 2016 at 08:46:37AM +0200, Lukas Tribus wrote:
>> Hi Sachin,
>> 
>> 
>> there is another fix Willy recently committed, its ff9c7e24fb [1]
>> and its in the snapshots [2] since 1.6.4-20160426.
>> 
>> This is supposed to fix the issue altogether.
>> 
>> Please let us know if this works for you.
>
>Yes it should fix this. Please note that I've got one report in 1.5 of
>some huge transfers (multi-GB) stalling after this patch, and since I
>can't find any case where it could be wrong nor can I reproduce it, I
>suspect we may have a bug somewhere else (at least in 1.5) that was
>hidden by the bug this series of patches fix. We had no such report on
>1.6 however.
>
>There's another case of high CPU usage which Cyril managed to isolate.
>The issue has been present since 1.4 and is *very* hard to reproduce,
>I even had to tweak some sysctls on my laptop to see it and am careful
>not to reboot it. It is triggered by *some* pipelined requests. We're
>currently working on fixing it, there are several ways to fix it but
>all of them come with their downsides for now (one of them being a
>different code path between 1.7 and 1.6/1.5/1.4 which doesn't appeal
>me much).
>
>This is why I'm still waiting before issuing a new series of versions.
>
>In the mean time, feel free to test latest 1.6 snapshot and report any
>issues you may face. I've really committed into getting these issues
>fixed once for all, it's getting irritating to see such bugs surviving
>but I never give up the fight :-)
>
>Best regards,
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-04-28 Thread Sachin Shetty
Thanks Lukas and Willy. I am in the process of getting 1.6.4-20160426
deployed in our QA, I will keep you guys posted.


On 4/29/16, 11:09 AM, "Willy Tarreau"  wrote:

>Hi guys,
>
>On Tue, Apr 26, 2016 at 08:46:37AM +0200, Lukas Tribus wrote:
>> Hi Sachin,
>> 
>> 
>> there is another fix Willy recently committed, its ff9c7e24fb [1]
>> and its in the snapshots [2] since 1.6.4-20160426.
>> 
>> This is supposed to fix the issue altogether.
>> 
>> Please let us know if this works for you.
>
>Yes it should fix this. Please note that I've got one report in 1.5 of
>some huge transfers (multi-GB) stalling after this patch, and since I
>can't find any case where it could be wrong nor can I reproduce it, I
>suspect we may have a bug somewhere else (at least in 1.5) that was
>hidden by the bug this series of patches fix. We had no such report on
>1.6 however.
>
>There's another case of high CPU usage which Cyril managed to isolate.
>The issue has been present since 1.4 and is *very* hard to reproduce,
>I even had to tweak some sysctls on my laptop to see it and am careful
>not to reboot it. It is triggered by *some* pipelined requests. We're
>currently working on fixing it, there are several ways to fix it but
>all of them come with their downsides for now (one of them being a
>different code path between 1.7 and 1.6/1.5/1.4 which doesn't appeal
>me much).
>
>This is why I'm still waiting before issuing a new series of versions.
>
>In the mean time, feel free to test latest 1.6 snapshot and report any
>issues you may face. I've really committed into getting these issues
>fixed once for all, it's getting irritating to see such bugs surviving
>but I never give up the fight :-)
>
>Best regards,
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-04-28 Thread Willy Tarreau
Hi guys,

On Tue, Apr 26, 2016 at 08:46:37AM +0200, Lukas Tribus wrote:
> Hi Sachin,
> 
> 
> there is another fix Willy recently committed, its ff9c7e24fb [1]
> and its in the snapshots [2] since 1.6.4-20160426.
> 
> This is supposed to fix the issue altogether.
> 
> Please let us know if this works for you.

Yes it should fix this. Please note that I've got one report in 1.5 of
some huge transfers (multi-GB) stalling after this patch, and since I
can't find any case where it could be wrong nor can I reproduce it, I
suspect we may have a bug somewhere else (at least in 1.5) that was
hidden by the bug this series of patches fixes. We had no such report on
1.6 however.

There's another case of high CPU usage which Cyril managed to isolate.
The issue has been present since 1.4 and is *very* hard to reproduce,
I even had to tweak some sysctls on my laptop to see it and am careful
not to reboot it. It is triggered by *some* pipelined requests. We're
currently working on fixing it, there are several ways to fix it but
all of them come with their downsides for now (one of them being a
different code path between 1.7 and 1.6/1.5/1.4 which doesn't appeal
to me much).

This is why I'm still waiting before issuing a new series of versions.

In the meantime, feel free to test the latest 1.6 snapshot and report any
issues you may face. I'm really committed to getting these issues
fixed once and for all, it's getting irritating to see such bugs surviving
but I never give up the fight :-)

Best regards,
Willy




Re: Haproxy running on 100% CPU and slow downloads

2016-04-26 Thread Lukas Tribus

Hi Sachin,


there is another fix Willy recently committed, it's ff9c7e24fb [1]
and it's in the snapshots [2] since 1.6.4-20160426.

This is supposed to fix the issue altogether.

Please let us know if this works for you.



Thanks,

Lukas


[1] 
http://www.haproxy.org/git?p=haproxy-1.6.git;a=commitdiff_plain;h=ff9c7e24fbbc33074e5257297e38473a3411f407

[2] http://www.haproxy.org/download/1.6/src/snapshot/





Re: Haproxy running on 100% CPU and slow downloads

2016-04-25 Thread Sachin Shetty
Hi Lukas,

We tried the patch, and it seems better. When we switched nbproc off,
throughput did not drop immediately like it did with the earlier version;
it deteriorated slowly as traffic increased towards peak hours, but
eventually it did crash to the same levels as before.

CPU usage was also better: only at peak hours did I see 100% CPU consumed
by haproxy, otherwise it stayed between 60% and 80%.

Please see the attached image measuring throughput: nbproc=20 until ~10 PM,
nbproc=1 from ~10 PM to ~10 AM, and nbproc reverted to 20 from 10 AM onwards.
The Y-axis is speed in MBPS.

Thanks
Sachin

On 4/21/16, 12:57 PM, "Lukas Tribus"  wrote:

>Hi,
>
>
>On 21.04.2016 at 08:11, Sachin Shetty wrote:
>> Hi,
>>
>> any hints to further isolate this - we have deferred the problem by
>>adding
>> all the cores we had, but I have a feeling that our request rate is not
>> that high (7K per minute a peak)  and it will show up again as traffic
>> increases.
>>
>> Thanks
>> Sachin
>>
>
>Try the fix 9c09ee87 [1], which is in snapshots since 1.6.4-20160412.
>
>
>cheers,
>
>lukas
>
>[1] 
>http://www.haproxy.org/git?p=haproxy-1.6.git;a=commitdiff_plain;h=9c09ee87836bb2efd78a17f9b16d8afe0ec64018;hp=3bee40bfb7a35b624c5cc9d88daff5a9e3b99f33
>[2] http://www.haproxy.org/download/1.6/src/snapshot/



Re: Haproxy running on 100% CPU and slow downloads

2016-04-21 Thread Lukas Tribus

Hi,


On 21.04.2016 at 08:11, Sachin Shetty wrote:

> Hi,
>
> any hints to further isolate this - we have deferred the problem by adding
> all the cores we had, but I have a feeling that our request rate is not
> that high (7K per minute a peak) and it will show up again as traffic
> increases.
>
> Thanks
> Sachin



Try the fix 9c09ee87 [1], which is in snapshots since 1.6.4-20160412.


cheers,

lukas

[1] 
http://www.haproxy.org/git?p=haproxy-1.6.git;a=commitdiff_plain;h=9c09ee87836bb2efd78a17f9b16d8afe0ec64018;hp=3bee40bfb7a35b624c5cc9d88daff5a9e3b99f33

[2] http://www.haproxy.org/download/1.6/src/snapshot/



Re: Haproxy running on 100% CPU and slow downloads

2016-04-21 Thread Sachin Shetty
Hi,

Any hints to further isolate this? We have deferred the problem by adding
all the cores we had, but I have a feeling that our request rate is not
that high (7K per minute at peak) and that the problem will show up again
as traffic increases.

Thanks
Sachin

On 4/18/16, 12:22 PM, "Sachin Shetty"  wrote:

>Hi Lukas,
>
>We upgraded to 1.6, went back to nbproc 1 from 12 and the problem showed
>up again. Haproxy hitting 90-100% and monitors reported download speed
>drop from 100MBPS to 10MBPS immediately.
>
>I ran strace as you said, output is huge, have attached a small subset of
>it in the email. Please let me know if you need more of strace output.
>
>Thanks
>Sachin
>
>
>
>On 4/7/16, 5:51 PM, "Lukas Tribus"  wrote:
>
>>Hi,
>>
>>On 05.04.2016 at 09:38, Sachin Shetty wrote:
>>> Hi Lukas, Pavlos,
>>>
>>> Thanks for your response, more info as requested.
>>>
>>> 1. Attached conf with some obfuscation
>>> 2. Haproxy -vv
>>> HA-Proxy version 1.5.4 2014/09/02
>>> Copyright 2000-2014 Willy Tarreau 
>>>
>>
>>I would upgrade to something more recent, the number of bugfixes
>>since 1.5.4 amount to more than 100!
>>
>>That said, I've not stumbled upon a particular bug explaining what
>>you are seeing.
>>
>>My suggestion would be to go back to nbproc 1 (its easier to
>>troubleshoot), and run the 100% spinning process through
>>strace -tt -p and post the output.
>>
>>
>>
>>
>>Thanks,
>>
>>Lukas





Re: Haproxy running on 100% CPU and slow downloads

2016-04-18 Thread Sachin Shetty
Hi Lukas,

We upgraded to 1.6, went back to nbproc 1 from 12 and the problem showed
up again: haproxy hit 90-100% CPU and our monitors reported download speed
dropping from 100MBPS to 10MBPS immediately.

I ran strace as you said. The output is huge, so I have attached a small
subset of it to this email. Please let me know if you need more of the
strace output.

Thanks
Sachin



On 4/7/16, 5:51 PM, "Lukas Tribus"  wrote:

>Hi,
>
>On 05.04.2016 at 09:38, Sachin Shetty wrote:
>> Hi Lukas, Pavlos,
>>
>> Thanks for your response, more info as requested.
>>
>> 1. Attached conf with some obfuscation
>> 2. Haproxy -vv
>> HA-Proxy version 1.5.4 2014/09/02
>> Copyright 2000-2014 Willy Tarreau 
>>
>
>I would upgrade to something more recent, the number of bugfixes
>since 1.5.4 amount to more than 100!
>
>That said, I've not stumbled upon a particular bug explaining what
>you are seeing.
>
>My suggestion would be to go back to nbproc 1 (its easier to
>troubleshoot), and run the 100% spinning process through
>strace -tt -p and post the output.
>
>
>
>
>Thanks,
>
>Lukas

23:30:41.257757 sendto(120, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.258001 sendto(87, "...", 919, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 919
23:30:41.258077 read(33, "\27\3\3\0020", 5) = 5
23:30:41.258134 read(33, "...", 560) = 560
23:30:41.258201 read(3, "\26\3\3\0F", 5) = 5
23:30:41.258244 read(3, "...", 70) = 70
23:30:41.259294 read(3, "\24\3\3\0\1", 5) = 5
23:30:41.259347 read(3, "\1", 1) = 1
23:30:41.259514 read(3, "\26\3\3\0@", 5) = 5
23:30:41.259559 read(3, "...", 64) = 64
23:30:41.259668 write(3, "...", 75) = 75
23:30:41.259748 read(3, 0x7feeaed21343, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.259818 read(71, "\26\3\1\2\6", 5) = 5
23:30:41.259863 read(71, "...", 518) = 518
23:30:41.280711 read(71, "\24\3\1\0\1", 5) = 5
23:30:41.280790 read(71, "\1", 1) = 1
23:30:41.280967 read(71, "\26\3\1\", 5) = 5
23:30:41.281012 read(71, "...", 48) = 48
23:30:41.281121 write(71, "...", 59) = 59
23:30:41.281199 read(71, 0x7feeaed21343, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281246 read(51, "...", 14977) = 14977
23:30:41.281405 sendto(56, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.281472 read(38, 0x7feeaeb15183, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281517 read(140, "...", 7677) = 5840
23:30:41.281562 read(140, 0x7feeaec87a2b, 1837) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.281605 read(45, "\27\3\3\2\240", 5) = 5
23:30:41.281647 read(45, "...", 672) = 672
23:30:41.281699 read(31, "...", 48) = 48
23:30:41.281811 sendto(272, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.281948 write(167, "...", 15525) = 15525
23:30:41.282025 read(72, "...", 15923) = 8184
23:30:41.282076 read(72, "...", 7739) = 1364
23:30:41.282119 read(72, 0x7feeaebf89c1, 6375) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282162 read(24, "...", 1837) = 1837
23:30:41.282278 sendto(107, "...", 16384, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_MORE, NULL, 0) = 16384
23:30:41.282328 recvfrom(41, "...", 16384, 0, NULL, NULL) = 16384
23:30:41.282382 recvfrom(81, "...", 15360, 0, NULL, NULL) = 214
23:30:41.282438 write(21, "...", 389) = 389
23:30:41.282497 write(25, "...", 389) = 389
23:30:41.282563 write(25, "...", 53) = 53
23:30:41.282613 shutdown(25, SHUT_WR) = 0
23:30:41.282660 read(18, 0x7feeae813be3, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282704 sendto(92, "...", 818, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 818
23:30:41.282753 read(39, 0x7feeae813be3, 5) = -1 EAGAIN (Resource temporarily unavailable)
23:30:41.282796 sendto(88, "...", 2062, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 2062
23:30:41.282944 getsockname(33, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("Some-IP")}, [16]) = 0
23:30:41.283008 getsockopt(33, SOL_IP, 0x50 /* IP_??? */, "\2\0\1\273\n\31\220\17\0\0\0\0\0\0\0\0", [16]) = 0
23:30:41.283082 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 77
23:30:41.283132 fcntl(77, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
23:30:41.283188 setsockopt(77, SOL_TCP, TCP_NODELAY, [1], 4) = 0
23:30:41.283233 connect(77, {sa_family=AF_INET, sin_port=htons(7300), sin_addr=inet_addr("Some-IP")}, 16) = -1 EINPROGRESS (Operation now in progress)
23:30:41.283415 getsockname(45, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("Some-IP")}, [16]) = 0
23:30:41.283467 getsockopt(45, SOL_IP, 0x50 /* IP_??? */, "\2\0\1\273\n\31\220\17\0\0\0\0\0\0\0\0", [16]) = 0
23:30:41.283521 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
23:30:41.283565 fcntl(79, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
23:30:41.283605 setsockopt(79, SOL_TCP, TCP_NODELAY, [1], 4) = 0
23:30:41.283647 connect(79, {sa_family=AF_INET, sin_port=htons(9930), sin_addr=inet_addr("Some-IP")}, 16) = -1 EINPROGRESS (Operation now in progress)
23:30:41.283723 setsockopt(81, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
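
Since the full capture is huge, a per-syscall summary can be easier to share than raw lines. The sketch below assumes the capture was written to a file with something like `strace -tt -p PID -o FILE`, so that each call starts with a timestamp as in the excerpt above. The function name `summarize_strace` is hypothetical.

```shell
# Summarise an strace -tt capture: syscall name, number of calls,
# and how many of those calls returned EAGAIN.
summarize_strace() {
    awk '
        # Only lines starting with an HH:MM:SS.usec timestamp begin a call;
        # wrapped continuation lines are skipped.
        /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+ / {
            name = $2
            sub(/\(.*/, "", name)      # keep the text before the first "("
            calls[name]++
            if ($0 ~ /EAGAIN/) eagain[name]++
        }
        END {
            for (n in calls)
                printf "%s %d %d\n", n, calls[n], eagain[n] + 0
        }
    ' "$1" | sort -k2,2nr -k1,1
}
```

A very high share of EAGAIN reads relative to successful ones can hint at busy polling, but the numbers need to be read against the traffic level at the time.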

Re: Haproxy running on 100% CPU and slow downloads

2016-04-07 Thread Sachin Shetty
Agreed on both points.

Thanks
Sachin

On 4/7/16, 11:24 PM, "Willy Tarreau"  wrote:

>On Thu, Apr 07, 2016 at 10:59:24PM +0530, Sachin Shetty wrote:
>> Hi Willy,
>> 
>> Sorry for the confusion. I wrote to you much before in my
>>investigation. I
>> will take care going forward.
>
>OK but in general the point remains, and it's not just for you but for
>everyone in general, the mailing list is here to reach around 1000 persons
>at once, so once your message is posted, you have to keep in mind that
>several of them will start to think about your problem even if they don't
>respond, which is why it is very important to be transparent about any
>progress made on parallel investigation or parallel contacts. Just like
>when you ask something to two distinct coworkers, one gives you a fast
>response, the other one comes the next day and says "I've set up a lab
>yesterday to check what you asked me and I found this last night". You'll
>feel bad telling him "Oh I already got the response, thank you anyway".
>
>> Only now I realized that I messed up the version numbers because it
>>seems
>> we have different versions in our cluster.
>
>OK similarly there's nothing wrong telling errors in bug reports, we all
>do this because we test lots of stuff and we end up confusing things. But
>once you notice something was wrong, simply respond again and fix the
>information. Reliable versions helps eliminate candidate patches and also
>help people joining saying "same problem here".
>
>> We are now testing with 1.6.4 and trying to fast track it.
>
>OK thanks for the feedback!
>
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-04-07 Thread Willy Tarreau
On Thu, Apr 07, 2016 at 10:59:24PM +0530, Sachin Shetty wrote:
> Hi Willy,
> 
> Sorry for the confusion. I wrote to you much before in my investigation. I
> will take care going forward.

OK but in general the point remains, and it's not just for you but for
everyone in general, the mailing list is here to reach around 1000 persons
at once, so once your message is posted, you have to keep in mind that
several of them will start to think about your problem even if they don't
respond, which is why it is very important to be transparent about any
progress made on parallel investigation or parallel contacts. Just like
when you ask something to two distinct coworkers, one gives you a fast
response, the other one comes the next day and says "I've set up a lab
yesterday to check what you asked me and I found this last night". You'll
feel bad telling him "Oh I already got the response, thank you anyway".

> Only now I realized that I messed up the version numbers because it seems
> we have different versions in our cluster.

OK similarly there's nothing wrong with making errors in bug reports, we all
do this because we test lots of stuff and we end up confusing things. But
once you notice something was wrong, simply respond again and fix the
information. Reliable version information helps eliminate candidate patches
and also helps people join in saying "same problem here".

> We are now testing with 1.6.4 and trying to fast track it.

OK thanks for the feedback!

Willy




Re: Haproxy running on 100% CPU and slow downloads

2016-04-07 Thread Sachin Shetty
Hi Willy,

Sorry for the confusion. I wrote to you much earlier in my investigation. I
will take care going forward.

Only now did I realize that I messed up the version numbers, because it seems
we have different versions in our cluster.

We are now testing with 1.6.4 and trying to fast track it.

Thanks
Sachin

On 4/7/16, 6:31 PM, "Willy Tarreau"  wrote:

>Hi Sachin,
>
>On Thu, Apr 07, 2016 at 02:21:16PM +0200, Lukas Tribus wrote:
>> Hi,
>> 
>> On 05.04.2016 at 09:38, Sachin Shetty wrote:
>> >Hi Lukas, Pavlos,
>> >
>> >Thanks for your response, more info as requested.
>> >
>> >1. Attached conf with some obfuscation
>> >2. Haproxy -vv
>> >HA-Proxy version 1.5.4 2014/09/02
>> >Copyright 2000-2014 Willy Tarreau 
>> >
>> 
>> I would upgrade to something more recent, the number of bugfixes
>> since 1.5.4 amount to more than 100!
>(...)
>
>I'm just discovering that you opened this thread twice in parallel,
>once with me in private and once with the ML, resulting in everyone
>doing the work twice and giving you the same advice twice. Please
>avoid this in the future, it wastes everyone's time and discourages
>people from responding to such questions. The place to ask is the ML,
>and if you contact someone privately please at least point to the
>public question so that the response is public and it saves others'
>valuable time.
>
>Also the version you reported to me was different :
>
>   HA-Proxy version 1.5.9 2014/11/25
>
>Thanks,
>Willy
>





Re: Haproxy running on 100% CPU and slow downloads

2016-04-07 Thread Willy Tarreau
Hi Sachin,

On Thu, Apr 07, 2016 at 02:21:16PM +0200, Lukas Tribus wrote:
> Hi,
> 
> On 05.04.2016 at 09:38, Sachin Shetty wrote:
> >Hi Lukas, Pavlos,
> >
> >Thanks for your response, more info as requested.
> >
> >1. Attached conf with some obfuscation
> >2. Haproxy -vv
> >HA-Proxy version 1.5.4 2014/09/02
> >Copyright 2000-2014 Willy Tarreau 
> >
> 
> I would upgrade to something more recent, the number of bugfixes
> since 1.5.4 amount to more than 100!
(...)

I'm just discovering that you opened this thread twice in parallel,
once with me in private and once with the ML, resulting in everyone
doing the work twice and giving you the same advice twice. Please
avoid this in the future, it wastes everyone's time and discourages
people from responding to such questions. The place to ask is the ML,
and if you contact someone privately please at least point to the
public question so that the response is public and it saves others'
valuable time.

Also the version you reported to me was different :

   HA-Proxy version 1.5.9 2014/11/25

Thanks,
Willy




Re: Haproxy running on 100% CPU and slow downloads

2016-04-07 Thread Lukas Tribus

Hi,

On 05.04.2016 at 09:38, Sachin Shetty wrote:

> Hi Lukas, Pavlos,
>
> Thanks for your response, more info as requested.
>
> 1. Attached conf with some obfuscation
> 2. Haproxy -vv
> HA-Proxy version 1.5.4 2014/09/02
> Copyright 2000-2014 Willy Tarreau



I would upgrade to something more recent, the number of bugfixes
since 1.5.4 amounts to more than 100!

That said, I've not stumbled upon a particular bug explaining what
you are seeing.

My suggestion would be to go back to nbproc 1 (it's easier to
troubleshoot), and run the 100% spinning process through
strace -tt -p and post the output.




Thanks,

Lukas



Re: Haproxy running on 100% CPU and slow downloads

2016-04-05 Thread Sachin Shetty
Hi Lukas, Pavlos,

Thanks for your response, more info as requested.

1. Attached conf with some obfuscation
2. Haproxy -vv
HA-Proxy version 1.5.4 2014/09/02
Copyright 2000-2014 Willy Tarreau 


Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -DTCP_USER_TIMEOUT=18
  OPTIONS = USE_LINUX_TPROXY=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_PCRE=1


Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200


Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.7
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.32 2012-11-30
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND


Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

3. uname -a

Linux avl-www10.dc.egnyte.lan 3.10.0-327.10.1.el7.x86_64 #1 SMP Tue Feb 16 17:03:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[sshetty@avl-www10 haproxy_l1_sync]$

4. rfc5077-client seems ok

[✔] Prepare tests.
[✔] Run tests without use of tickets.
[✔] Display result set:
│ IP address    │ Try │ Cipher               │ Reuse │ SSL Session ID      │ Master key          │ Ticket │ Answer
────────────────┼─────┼──────────────────────┼───────┼─────────────────────┼─────────────────────┼────────┼────────────────
│ 208.83.105.14 │   0 │ ECDHE-RSA-AES256-SHA │   ✘   │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │   ✘    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   1 │ ECDHE-RSA-AES256-SHA │   ✔   │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │   ✘    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   2 │ ECDHE-RSA-AES256-SHA │   ✔   │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │   ✘    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   3 │ ECDHE-RSA-AES256-SHA │   ✔   │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │   ✘    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   4 │ ECDHE-RSA-AES256-SHA │   ✔   │ 40A2D3E903C2457551… │ B4A08BB73457356AA2… │   ✘    │ HTTP/1.1 200 OK
[✔] Dump results to file.
[✔] Run tests with use of tickets.
[✔] Display result set:
│ IP address    │ Try │ Cipher               │ Reuse │ SSL Session ID      │ Master key          │ Ticket │ Answer
────────────────┼─────┼──────────────────────┼───────┼─────────────────────┼─────────────────────┼────────┼────────────────
│ 208.83.105.14 │   0 │ ECDHE-RSA-AES256-SHA │   ✘   │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │   ✔    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   1 │ ECDHE-RSA-AES256-SHA │   ✔   │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │   ✔    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   2 │ ECDHE-RSA-AES256-SHA │   ✔   │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │   ✔    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   3 │ ECDHE-RSA-AES256-SHA │   ✔   │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │   ✔    │ HTTP/1.1 200 OK
│ 208.83.105.14 │   4 │ ECDHE-RSA-AES256-SHA │   ✔   │ E4559330FD100E69F5… │ 05F768F5574FD27E88… │   ✔    │ HTTP/1.1 200 OK
[✔] Dump results to file.







On 4/5/16, 12:14 AM, "Lukas Tribus"  wrote:

>Hi Sachin,
>
>
>(due to email troubles on my side this may look like a new thread, sorry
>about that)
>
>
> > We have quite a few regex and acls in our config, is there a way to
>profile
> > haproxy and see what could be slowing it down?
>
>You can use strace for syscalls or ltrace for library calls to see if
>something
>in particular shows up, but perf may be the better tool for this job (I
>never
>used it though).
>
>
>Like Pavlos said, lets collect some basic informations first:
>
>- haproxy -vv output
>- uname -a
>- configuration (replace proprietary informations but leave everything
>else intact)
>- does TLS resumption correctly work? Check with rfc5077-client:
>
>git clone https://github.com/vincentbernat/rfc5077.git
>cd rfc5077
>make rfc5077-client
>
>
>./rfc5077-client 
>
>
>
>There's a chance that it is SSL/TLS related.
>
>
>
>Regards,
>
>Lukas
>



haproxy.sync.conf
Description: Binary data


Re: Haproxy running on 100% CPU and slow downloads

2016-04-04 Thread Lukas Tribus

Hi Sachin,


(due to email troubles on my side this may look like a new thread, sorry
about that)


> We have quite a few regex and acls in our config, is there a way to
> profile haproxy and see what could be slowing it down?

You can use strace for syscalls or ltrace for library calls to see if
something in particular shows up, but perf may be the better tool for
this job (I never used it though).


Like Pavlos said, let's collect some basic information first:

- haproxy -vv output
- uname -a
- configuration (replace proprietary information but leave everything
  else intact)
- does TLS resumption work correctly? Check with rfc5077-client:

git clone https://github.com/vincentbernat/rfc5077.git
cd rfc5077
make rfc5077-client


./rfc5077-client 



There's a chance that it is SSL/TLS related.



Regards,

Lukas
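
If building rfc5077-client is not convenient, a rougher check of session reuse is possible with `openssl s_client -reconnect`, which reconnects several times with the same session and prints a line starting with "New," or "Reused," for each handshake. This is a swapped-in alternative, not the tool suggested above, and the function name `count_tls_reuse` is made up. Behaviour with newer TLS versions may differ.

```shell
# Count new vs. reused handshakes in `openssl s_client ... -reconnect` output
# read on stdin. Assumption: each handshake prints a line starting with
# "New, " or "Reused, ".
count_tls_reuse() {
    awk '/^New, /    { new++ }
         /^Reused, / { reused++ }
         END { printf "new=%d reused=%d\n", new, reused }'
}
```

A possible invocation is `openssl s_client -connect HOST:443 -reconnect < /dev/null 2>/dev/null | count_tls_reuse`, where HOST is a placeholder for the frontend address. A result with reused=0 would suggest resumption is not working.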




Re: Haproxy running on 100% CPU and slow downloads

2016-04-04 Thread Pavlos Parissis
On 04/04/2016 05:23 PM, Sachin Shetty wrote:
> Hi,
> 
> I am chasing some weird capacity issues in our setup. 
> 
> Haproxy, which also does SSL, is forwarding requests to various other
> servers upstream. I am seeing a simple 100MB file download from our
> upstream components slow down from time to time, dropping as low as
> 1MBPS; usually it is greater than 100MBPS. When this happens, I tried
> downloading the file from the upstream component, bypassing haproxy, from
> the same box, and that is fast enough – 100MBPS. So it seems like
> haproxy is getting jammed on something.

Did you use HTTPS on the server as well?

> 
> The only suspicious thing I see is that haproxy will be spinning at 100%
> CPU. So we added nbproc 4 and I still see the same pattern: when the
> speed drops, all haproxy processes are hitting 80-100%. The request rate
> when the speed drops is about 5K/minute, which is only 2X the request rate
> when things are normal and download speeds are fine.

What are the user and sys levels of CPU usage?

> 
> We have quite a few regex and acls in our config, is there a way to
> profile haproxy and see what could be slowing it down?
> 

You'd better include the actual config, it will increase the level of
support that you may get.

Cheers,
Pavlos
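
One way to answer the user/sys question on Linux is to read the cumulative CPU counters for the process from /proc. The sketch below parses a /proc/PID/stat line, where utime and stime are fields 14 and 15 (in clock ticks). The function name `cpu_ticks` is hypothetical.

```shell
# Print cumulative user and system CPU ticks from a /proc/PID/stat line on stdin.
cpu_ticks() {
    # Strip "pid (comm) " first so a process name containing spaces cannot
    # shift the fields; utime and stime are fields 14 and 15 of the full
    # line, which makes them fields 12 and 13 after stripping.
    sed 's/^.*) //' | awk '{ printf "user=%s sys=%s\n", $12, $13 }'
}
```

Running `cpu_ticks < /proc/PID/stat` twice, a few seconds apart, and comparing the deltas shows whether the time is going mostly to user space (for example SSL or regex work) or to the kernel (syscalls and networking).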


