Hi Nick,

I've experienced increased CPU usage going from v1.7 to v1.9 and v2.0 as well.
I don't know if it's for the same reason as your workload. My thread subject
is "Upgrade from 1.7 to 2.0 = increased CPU usage".
There is also a similar conversation on Discourse:
https://discourse.haproxy.org/t/2-0-1-cpu-usage-at-near-100-after-upgrade-from-1-5/

/Elias


On Wed, Jul 24, 2019 at 9:43 PM ngaugler <ngaug...@ngworld.net> wrote:

> Hello,
>
>
> I am currently running HAProxy 1.6.14-1ppa1~xenial-66af4a1 2018/01/06.
> There are many features implemented in 1.8, 1.9 and 2.0 that would benefit
> my deployments.  I tested 2.0.3-1ppa1~xenial last night but unfortunately
> found it to be using excessive amounts of CPU and had to revert.  For this
> implementation, I have two separate use cases in HAProxy: the first is
> external HTTP/HTTPS load balancing from external clients to a cluster; the
> second is internal HTTP load balancing between two different applications
> (for simplicity's sake we can call them front and back).  The excessive
> CPU usage was noticed on the second use case, HTTP between the front and
> back applications.  I previously leveraged nbproc and cpu-map to isolate
> the use cases, but in 2.0 moved to nbthread (default) and cpu-map (auto)
> to isolate them (sketched below).  The CPU usage was so excessive that I
> had to move the second use case to two cores to avoid utilizing 100% of
> the processor, and I was still getting timeouts.  It took some time to
> rewrite the config files from 1.6 to 2.0, but I was able to get them all
> configured properly, and I leveraged top and mpstat to ensure threads and
> use cases were on the proper cores.
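>
> In case it helps, here is roughly how that isolation is set up in each
> version.  This is a simplified sketch from memory, not a copy of the real
> configs (the comments and core numbering are illustrative):
>
> # 1.6-style: one process per use case, pinned with nbproc/cpu-map
> global
>         nbproc 2
>         cpu-map 1 0                 # process 1 (front/back HTTP) on core 0
>         cpu-map 2 1                 # process 2 (external HTTP/HTTPS) on core 1
>
> # 2.0-style: a single process with one thread per core, auto-pinned
> global
>         nbthread 40                 # the default on this box (one thread per core)
>         cpu-map auto:1/1-40 0-39    # thread N of process 1 -> core N-1
>
> # listeners are then tied to specific threads with the bind-level
> # 'process' keyword, e.g. 'process 1/40' as in the config further down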
>
>
> Because of the problems with use case #2 I did not even get a chance to
> evaluate use case #1, but again, I use cpu-map and 'process' to isolate
> these use cases as much as possible.  Upon reverting to 1.6 (install and
> configs), everything worked as expected.
>
>
>
> Here is the CPU usage on 1.6 from mpstat -P ALL 5:
> 08:33:02 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 08:33:07 PM    0    7.48    0.00   16.63    0.00    0.00    0.00    0.00    0.00    0.00   75.88
>
>
>
> Here is the CPU usage on 2.0.3 when using one thread:
> 08:29:35 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 08:29:40 PM   39   35.28    0.00   55.24    0.00    0.00    0.00    0.00    0.00    0.00    9.48
>
>
> Here is the CPU usage on 2.0.3 when using two threads (the front
> application still experienced timeouts to the back application even
> without 100% CPU utilization on the cores):
> 08:30:48 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 08:30:53 PM    0   22.93    0.00   19.75    0.00    0.00    0.00    0.00    0.00    0.00   57.32
> 08:30:53 PM   39   21.60    0.00   25.10    0.00    0.00    0.00    0.00    0.00    0.00   53.29
>
>
>
> Also note that our front generally keeps connections open to our back for
> an extended period of time, as it pools them internally, so many requests
> are sent over each connection via HTTP/1.1 keep-alive.  I think we had
> roughly 1,000 connections established during these tests.
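>
> As an illustration only (these lines are not from my config), the idle
> time such pooled keep-alive connections may sit on the proxy is bounded
> by directives along these lines:
>
>         option  http-keep-alive          # keep-alive processing (the default http mode)
>         timeout http-keep-alive 10s      # max idle time waiting for the next request on a connection
>         timeout client  50s              # general client-side inactivity timeout
>         timeout server  50s              # general server-side inactivity timeout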
>
>
> Some configuration settings that might be relevant to your analysis
> (there are more, but they are pretty much standard: user, group, stats,
> log, chroot, etc.):
>
> global
>         cpu-map auto:1/1-40 0-39
>
>         maxconn 500000
>
>         spread-checks 2
>
>         server-state-file global
>         server-state-base /var/lib/haproxy/
>
>
> defaults
>         option  dontlognull
>         option  dontlog-normal
>         option  redispatch
>
>         option  tcp-smart-accept
>         option  tcp-smart-connect
>
>         timeout connect 2s
>         timeout client  50s
>         timeout server  50s
>         timeout client-fin 1s
>         timeout server-fin 1s
>
>
> This part has been sanitized and I reduced the number of servers from 14
> to 2.
>
> listen back
>         bind    10.0.0.251:8080    defer-accept  process 1/40
>         bind    10.0.0.252:8080    defer-accept  process 1/40
>         bind    10.0.0.253:8080    defer-accept  process 1/40
>         bind    10.0.0.254:8080    defer-accept  process 1/40
>
>         mode    http
>         maxconn 65000
>         fullconn 65000
>
>         balance leastconn
>         http-reuse safe
>
>         source 10.0.1.100
>
>         option httpchk GET /ping HTTP/1.0
>         http-check expect string OK
>
>         server  s1     10.0.2.1:8080   check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.100
>         server  s2     10.0.2.2:8080   check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.101
>
>
> To configure multiple cores, I changed the bind line to add 'process 1/1'.
> I also removed 'process 1/1' from the other use case.
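>
> Something like this, as a sketch of the idea only (the exact thread
> numbers and lines may have differed slightly):
>
>         # single-thread test: every bind pinned to one thread
>         bind    10.0.0.251:8080    defer-accept  process 1/40
>
>         # two-thread test: binds spread across two threads
>         bind    10.0.0.251:8080    defer-accept  process 1/1
>         bind    10.0.0.252:8080    defer-accept  process 1/40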
>
>
>
> The OS is Ubuntu 16.04.3 LTS; the processors are 2x E5-2630 with 64GB of
> RAM.  The output from haproxy -vv looked very typical for both versions:
> epoll, OpenSSL 1.0.2g (not used in this case), etc.
>
>
> Please let me know if there is any additional information I can provide to
> assist in isolating the cause of this issue.
>
>
>
> Thank you!
>
> Nick
>
>
