Hello,
I am currently running HAProxy 1.6.14-1ppa1~xenial-66af4a1 2018/01/06. Many features implemented in 1.8, 1.9, and 2.0 would benefit my deployments, so I tested 2.0.3-1ppa1~xenial last night, but unfortunately I found it using excessive amounts of CPU and had to revert.

For this deployment I have two separate use cases in HAProxy: the first is external HTTP/HTTPS load balancing from external clients to a cluster; the second is internal HTTP load balancing between two different applications (for simplicity's sake, call them "front" and "back"). The excessive CPU was noticed on the second use case, the HTTP traffic between the front and back applications.

I previously leveraged nbproc and cpu-map to isolate the two use cases; in 2.0 I moved to nbthread (default) and cpu-map (auto) for the same purpose (a simplified before/after sketch is included after the config excerpt below). The CPU usage was so excessive that I had to move the second use case to two cores to avoid utilizing 100% of the processor, and I was still getting timeouts. It took some time to rewrite the config files from 1.6 to 2.0, but I was able to get them all configured properly, and I used top and mpstat to confirm that threads and use cases were on the proper cores. Because of the problems with use case #2 I never got a chance to evaluate use case #1, but again, I use cpu-map and 'process' to isolate these use cases as much as possible. Upon reverting back to 1.6 (install and configs), everything worked as expected.

Here is the CPU usage on 1.6 from mpstat -P ALL 5:

08:33:02 PM  CPU    %usr   %nice    %sys  %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:33:07 PM    0    7.48    0.00   16.63     0.00    0.00    0.00    0.00    0.00    0.00   75.88

Here is the CPU usage on 2.0.3 when using one thread:

08:29:35 PM  CPU    %usr   %nice    %sys  %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:29:40 PM   39   35.28    0.00   55.24     0.00    0.00    0.00    0.00    0.00    0.00    9.48

Here is the CPU usage on 2.0.3 when using two threads (the front application still experienced timeouts to the back application even without 100% CPU utilization on the cores):

08:30:48 PM  CPU    %usr   %nice    %sys  %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:30:53 PM    0   22.93    0.00   19.75     0.00    0.00    0.00    0.00    0.00    0.00   57.32
08:30:53 PM   39   21.60    0.00   25.10     0.00    0.00    0.00    0.00    0.00    0.00   53.29

Also note that our front generally keeps its connections to our back open for an extended period of time, as it pools them internally, so many requests are sent over each connection via HTTP/1.1 keep-alive. I think we had roughly ~1000 connections established during these tests.

Here are some configuration settings that might be relevant to your analysis (there are more, but they are pretty much standard: user, group, stats, log, chroot, etc.):

global
    cpu-map auto:1/1-40 0-39
    maxconn 500000
    spread-checks 2
    server-state-file global
    server-state-base /var/lib/haproxy/

defaults
    option dontlognull
    option dontlog-normal
    option redispatch
    option tcp-smart-accept
    option tcp-smart-connect
    timeout connect 2s
    timeout client 50s
    timeout server 50s
    timeout client-fin 1s
    timeout server-fin 1s
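For reference, here is a simplified sketch of how that isolation was expressed in each version. The process counts, core numbers, and nbthread value below are illustrative rather than copied from the real files:

# 1.6-style (illustrative): one process per use case, each pinned to its own core(s)
global
    nbproc 2
    cpu-map 1 0      # process 1: use case #1 (external HTTP/HTTPS)
    cpu-map 2 1      # process 2: use case #2 (front -> back)
    # each frontend/listen then selects its process with 'bind ... process <n>'

# 2.0-style: a single process with one thread per core
global
    nbthread 40                  # default when built with threads; shown explicitly here
    cpu-map auto:1/1-40 0-39     # pin threads 1-40 to cores 0-39
    # each 'bind' line then selects its thread(s) with 'process 1/<thread-set>'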
The following section has been sanitized, and I reduced the number of servers from 14 to 2.

listen back
    bind 10.0.0.251:8080 defer-accept process 1/40
    bind 10.0.0.252:8080 defer-accept process 1/40
    bind 10.0.0.253:8080 defer-accept process 1/40
    bind 10.0.0.254:8080 defer-accept process 1/40
    mode http
    maxconn 65000
    fullconn 65000
    balance leastconn
    http-reuse safe
    source 10.0.1.100
    option httpchk GET /ping HTTP/1.0
    http-check expect string OK
    server s1 10.0.2.1:8080 check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.100
    server s2 10.0.2.2:8080 check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.101

To test with multiple cores, I changed the bind lines to add 'process 1/1', and I removed process 1/1 from the other use case.

The OS is Ubuntu 16.04.3 LTS, with 2x E5-2630 processors and 64 GB of RAM. The output from haproxy -vv looked very typical on both versions: epoll, OpenSSL 1.0.2g (not used in this case), etc.

Please let me know if there is any additional information I can provide to help isolate the cause of this issue.

Thank you!
Nick
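P.S. For the two-thread test, based on the cpu-map above, the traffic ended up on threads 1 and 40 (cores 0 and 39 in the mpstat output). The snippet below is only an illustrative sketch of that bind layout, with a single address shown and the duplicate bind relying on SO_REUSEPORT; the real file differs in its details:

listen back
    # one bind line per thread so that threads 1 and 40 both accept on this address
    bind 10.0.0.251:8080 defer-accept process 1/1
    bind 10.0.0.251:8080 defer-accept process 1/40
    # (remaining bind lines, options, and servers as shown above)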