Hello,

I am currently running HAProxy 1.6.14-1ppa1~xenial-66af4a1 2018/01/06. There are many features implemented in 1.8, 1.9 and 2.0 that would benefit my deployments. I tested 2.0.3-1ppa1~xenial last night but unfortunately found it to be using excessive amounts of CPU and had to revert. For this deployment, I have two separate use cases in haproxy: the first is external HTTP/HTTPS load balancing to a cluster from external clients, and the second is internal HTTP load balancing between two different applications (for simplicity's sake we can call them front and back). The excessive CPU was noticed on the second use case, HTTP between the front and back applications. I previously leveraged nbproc and cpu-map to isolate the use cases, but in 2.0 I moved to nbthread (default) and cpu-map (auto) to isolate them. The CPU usage was so excessive that I had to move the second use case to two cores to avoid using 100% of the processor, and I was still getting timeouts. It took some time to rewrite the config files from 1.6 to 2.0, but I was able to get them all configured properly, and I leveraged top and mpstat to ensure that the threads and use cases were on the proper cores.
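
For context, here is a rough sketch of how the two use cases are isolated in each version (the 1.6 lines are abbreviated and from memory, so treat them as illustrative rather than exact; the 40 cores match the cpu-map in the config further down):

# 1.6: one process per core; each use case restricted to its own
# processes via 'bind-process' / 'process' on the bind lines
global
        nbproc 40
        cpu-map 1 0
        cpu-map 2 1
        # ... and so on, up to: cpu-map 40 39

# 2.0: one process, nbthread left at its default (one thread per core
# here); threads pinned to cores, and each use case restricted to its
# own threads via 'process 1/<thread>' on the bind lines
global
        cpu-map auto:1/1-40 0-39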

Because of the problems with use case #2 I did not even get a chance to evaluate use case #1, but again, I use cpu-map and 'process' to isolate these use cases as much as possible. Upon reverting back to 1.6 (install and configs) everything worked as expected.

Here is the CPU usage on 1.6 from mpstat -P ALL 5:

08:33:02 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:33:07 PM    0    7.48    0.00   16.63    0.00    0.00    0.00    0.00    0.00    0.00   75.88

Here is the CPU usage on 2.0.3 when using one thread:

08:29:35 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:29:40 PM   39   35.28    0.00   55.24    0.00    0.00    0.00    0.00    0.00    0.00    9.48

Here is the CPU usage on 2.0.3 when using two threads (the front application still experienced timeouts to the back application even without 100% cpu utilization on the cores):

08:30:48 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
08:30:53 PM    0   22.93    0.00   19.75    0.00    0.00    0.00    0.00    0.00    0.00   57.32
08:30:53 PM   39   21.60    0.00   25.10    0.00    0.00    0.00    0.00    0.00    0.00   53.29

Also, note that our front generally keeps connections open to our back for an extended period of time, as it pools them internally, so many requests are sent over each connection via HTTP/1.1 keep-alive. I think we had roughly 1,000 connections established during these tests.
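
In case it helps reproduce the traffic pattern, something like the following is a rough approximation (wrk holds HTTP/1.1 keep-alive connections open and sends many requests over each; the address and path are placeholders taken from the config below, and the real traffic comes from the front application rather than a benchmark tool):

wrk -t4 -c1000 -d60s http://10.0.0.251:8080/ping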

Some configuration settings that might be relevant to your analysis (there are more, but they are pretty much standard: user, group, stats, log, chroot, etc.):

global
        cpu-map auto:1/1-40 0-39

        maxconn 500000

        spread-checks 2

        server-state-file global
        server-state-base /var/lib/haproxy/

defaults
        option  dontlognull
        option  dontlog-normal
        option  redispatch

        option  tcp-smart-accept
        option  tcp-smart-connect

        timeout connect 2s
        timeout client  50s
        timeout server  50s
        timeout client-fin 1s
        timeout server-fin 1s

This part has been sanitized and I reduced the number of servers from 14 to 2.

listen back
        bind    10.0.0.251:8080    defer-accept  process 1/40
        bind    10.0.0.252:8080    defer-accept  process 1/40
        bind    10.0.0.253:8080    defer-accept  process 1/40
        bind    10.0.0.254:8080    defer-accept  process 1/40

        mode    http
        maxconn 65000
        fullconn 65000

        balance leastconn
        http-reuse safe

        source 10.0.1.100

        option httpchk GET /ping HTTP/1.0
        http-check expect string OK

        server  s1     10.0.2.1:8080   check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.100
        server  s2     10.0.2.2:8080   check agent-check agent-port 8009 agent-inter 250ms inter 500ms fastinter 250ms downinter 1000ms weight 100 source 10.0.1.101

To configure multiple cores, I changed the bind lines to add 'process 1/1'; I also removed 'process 1/1' from the other use case. A simplified sketch of the difference between the two tests is below.
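
To illustrate the intent (simplified to a single address; the exact syntax I used may have differed slightly):

listen back
        # one-thread test: back listener pinned to thread 40 (CPU 39 via the cpu-map above)
        bind    10.0.0.251:8080    defer-accept  process 1/40

        # two-thread test: thread 1 (CPU 0) used as well; one way to express
        # this is a second bind line per address
        bind    10.0.0.251:8080    defer-accept  process 1/1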

The OS is Ubuntu 16.04.3 LTS, with 2x E5-2630 processors and 64GB of RAM. The output from haproxy -vv looked typical for both versions: epoll, OpenSSL 1.0.2g (not used in this case), etc.

Please let me know if there is any additional information I can provide to 
assist in isolating the cause of this issue.

Thank you!



Nick
