Hi,

On 02/10/2015 22:48, Daren Sefcik wrote:
I hope this is the right place to ask for help. If not, please flame me
and send me on my way.

So I had haproxy 1.5 installed (as a front end for a cluster of squid
proxies) on a low-end Dell server with pfSense (PFS) 2.1.5 and was
experiencing slowdowns with 1500+ connections, so I built up a new PFS
2.2.4 machine on a brand new Dell R630 with 64 GB RAM, dual CPUs, bad
ass RAID disks etc., then loaded and configured haproxy with several squid
backends and some ICAP backends. Things work great until I hit about
1500 or more connections, and then everything just slows to a crawl.
Restarting haproxy helps momentarily, but it slows back down again
very quickly. If I offload clients to the point of only 300-400
connections, it becomes responsive again. The haproxy stats page
shows 97% idle or similar, and the output from top shows maybe
5% CPU for haproxy. If I configure the browser client to use one of the
squid backends directly it works fast, but as soon as I put the browser
proxy config back to use the haproxy frontend IP it slows down.

I am not really sure how to troubleshoot this and would appreciate any
help. I have done the usual searching and tried many of the fixes others
have posted but my problem continues. I can post any info here that
would help someone determine where my problems may be, I am just not
sure what is useful. Below are a few of my essential configs to start with.

This may not be the issue, but first of all you should read about the different maxconn keywords:
- global: http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.2-maxconn
- for the proxy (listen/frontend): http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-maxconn
- for each server: http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#5.2-maxconn
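
To make the three maxconn levels concrete, here is a minimal sketch of where each keyword goes. Only the lines relevant to maxconn are shown, and they are meant to be merged into the existing sections. The 50000 comes from the posted configuration; the frontend and server values (20000 and 3000) are placeholders I made up for illustration, not recommendations.

```haproxy
global
    # Process-wide ceiling shared by all frontends (value from the posted config).
    maxconn 50000

frontend HTPL_PROXY
    bind 10.1.4.105:8181
    mode http
    # Per-frontend limit. Assumed value; without this line the frontend
    # uses the built-in default of 2000.
    maxconn 20000
    default_backend HTPL_WEB_PROXY_http_ipvANY

backend HTPL_WEB_PROXY_http_ipvANY
    mode http
    # Per-server limit. Assumed value; connections beyond it wait in the
    # backend queue instead of being sent to the server.
    server HTPL-PROXY-01 10.1.4.103:3128 maxconn 3000 check
```

The frontend limit should stay at or below the global one, since the global maxconn caps the whole process.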

Can you verify in your statistics that a limit is not reached? Could you also provide a screenshot of the statistics page taken when it happens (hide any private information first)?

See below for some comments.

TIA..!

*/var/etc/haproxy.cfg file contents:*
global
    maxconn         50000

=> Here you allow 50000 connections for all the frontends but...


    log         /var/run/log   local0   info
    stats socket /tmp/haproxy.socket level admin
    uid         80
    gid         80
    nbproc         1
    chroot         /tmp/haproxy_chroot
    daemon
    spread-checks 5

listen HAProxyLocalStats
    bind 127.0.0.1:2200 name localstats
    mode http
    stats enable
    stats admin if TRUE
    stats uri /haproxy_stats.php?haproxystats=1
    timeout client 5000
    timeout connect 5000
    timeout server 5000

frontend HTPL_PROXY

=> No maxconn is defined here (and there is no defaults section), so your frontend falls back to the default limit of 2000 concurrent connections. Beyond that, it won't accept any new connection until one is closed.

    bind 10.1.4.105:8181 name 10.1.4.105:8181
    mode         http
    log         global
    option         http-server-close
    option         forwardfor
    acl https ssl_fc
    reqadd X-Forwarded-Proto:\ http if !https
    reqadd X-Forwarded-Proto:\ https if https

Unrelated, but your second reqadd will never match, as this frontend is 100% plain HTTP (there is no ssl on the bind line, so ssl_fc is never true).

    timeout client      30000

As you didn't specify any "timeout http-keep-alive", "timeout client" also applies to idle keep-alive connections, so they are only closed after 30 seconds of inactivity. You may want a shorter timeout for keep-alive.

    default_backend      HTPL_WEB_PROXY_http_ipvANY
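
Putting the remarks on this frontend together, a possible version could look like the sketch below. The 2000 ms keep-alive timeout is an assumed value chosen only to illustrate the idea; the right number depends on your clients. Everything else is taken from the posted section, with the two conditional reqadd lines collapsed into one.

```haproxy
frontend HTPL_PROXY
    bind 10.1.4.105:8181 name 10.1.4.105:8181
    mode http
    log global
    option http-server-close
    option forwardfor
    # The bind line has no ssl, so ssl_fc is never true:
    # a single unconditional header is enough.
    reqadd X-Forwarded-Proto:\ http
    # Applies while a request or response is in flight on the client side.
    timeout client 30000
    # Assumed value (2 s): how long an idle keep-alive connection may wait
    # for the next request before haproxy closes it.
    timeout http-keep-alive 2000
    default_backend HTPL_WEB_PROXY_http_ipvANY
```

With a short keep-alive timeout, idle browser connections stop counting against the frontend limit much sooner.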

frontend HTPL_CONTENT_FILTER

=> Same here

    bind 10.1.4.106:8182 name 10.1.4.106:8182
    mode         tcp
    log         global
    timeout client      30000
    default_backend      HTPL_CONT_FILTER_tcp_ipvANY

backend HTPL_WEB_PROXY_http_ipvANY
    mode         http
    cookie SERVERID insert indirect
    stick-table type ip size 1m expire 5m
    stick on src

This is not related to your issue, but are you sure you want to mix cookie persistence and stick tables?

    balance         roundrobin
    timeout connect      50000
    timeout server      50000
    retries         3
    server         HTPL-PROXY-01 10.1.4.103:3128 cookie HTPLPROXY01 check inter 60000 weight 150 fastinter 1000 fall 5
    server         HTPL-PROXY-02 10.1.4.104:3128 cookie HTPLPROXY02 check inter 60000 weight 100 fastinter 1000 fall 5
    server         HTPL-PROXY-03 10.1.4.107:3128 cookie HTPLPROXY03 check inter 60000 weight 50 fastinter 1000 fall 5
    server         HTPL-PROXY-04 10.1.4.108:3128 cookie HTPLPROXY04 check inter 60000 weight 200 fastinter 1000 fall 5
    server         HTHPL-PROXY-01 10.1.4.101:3128 cookie HTHPLPROXY1 check inter 60000 weight 150 fastinter 1000 fall 5
    server         HTHPL-PROXY-02 10.1.4.102:3128 cookie HTPHLPROXY02 check inter 60000 weight 100 fastinter 1000 fall 5
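
On the persistence question raised for this backend: if only one mechanism turns out to be wanted, the two alternatives might look roughly like this. This is a sketch, not a recommendation for either one; only two servers are shown, and the check/weight options are trimmed for brevity. The two blocks are alternatives, so only one of them would go into the file.

```haproxy
# Alternative 1: cookie persistence only.
backend HTPL_WEB_PROXY_http_ipvANY
    mode http
    balance roundrobin
    cookie SERVERID insert indirect
    server HTPL-PROXY-01 10.1.4.103:3128 cookie HTPLPROXY01 check
    server HTPL-PROXY-02 10.1.4.104:3128 cookie HTPLPROXY02 check

# Alternative 2: source-IP stickiness only (no cookie keywords needed).
backend HTPL_WEB_PROXY_http_ipvANY
    mode http
    balance roundrobin
    stick-table type ip size 1m expire 5m
    stick on src
    server HTPL-PROXY-01 10.1.4.103:3128 check
    server HTPL-PROXY-02 10.1.4.104:3128 check
```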

backend HTPL_CONT_FILTER_tcp_ipvANY
    mode         tcp
    balance         roundrobin
    timeout connect      50000
    timeout server      50000
    retries         3
    server         HTHPL-PROXY-01 10.1.4.101:1344 check inter 60000 disabled weight 100 fastinter 1000 fall 5
    server         HTHPL-PROXY-02 10.1.4.102:1344 check inter 60000 disabled weight 100 fastinter 1000 fall 5
    server         HTPL-WEB-01 10.1.4.153:1344 check inter 60000 weight 200 fastinter 1000 fall 5
    server         HTPL-WEB-02 10.1.4.154:1344 check inter 60000 weight 200 fastinter 1000 fall 5

Some sysctl stuff
kern.ostype: FreeBSD
kern.osrelease: 10.1-RELEASE-p15
kern.osrevision: 199506
kern.version: FreeBSD 10.1-RELEASE-p15 #0 c5ab052(releng/10.1)-dirty:
Sat Jul 25 20:20:58 CDT 2015

root@pfs22-amd64-builder:/usr/obj.amd64/usr/pfSensesrc/src/sys/pfSense_SMP.10

kern.maxvnodes: 200000
kern.maxproc: 70788
kern.maxfiles: 204800
kern.argmax: 262144
kern.securelevel: -1
kern.hostname: HTPL-PROXY-03.hth.hightechhigh.org
kern.hostid: 1053306123
kern.clockrate: { hz = 1000, tick = 1000, profhz = 8128, stathz = 127 }
kern.posix1version: 200112
kern.ngroups: 1023
kern.job_control: 1
kern.saved_ids: 0
kern.boottime: { sec = 1443678149, usec = 901465 } Wed Sep 30 22:42:29 2015
kern.domainname:
kern.osreldate: 1001000
kern.bootfile: /boot/kernel/kernel
kern.maxfilesperproc: 300000
kern.maxprocperuid: 63709
kern.ipc.maxsockbuf: 4262144
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.max_linkhdr: 16
kern.ipc.max_protohdr: 60
kern.ipc.max_hdr: 76
kern.ipc.max_datalen: 76
kern.ipc.maxmbufmem: 217774080
kern.ipc.nmbclusters: 262144
kern.ipc.nmbjumbop: 13291
kern.ipc.nmbjumbo9: 11814
kern.ipc.nmbjumbo16: 8860
kern.ipc.nmbufs: 1048590
kern.ipc.maxpipekva: 1071579136
kern.ipc.pipekva: 163840
kern.ipc.pipefragretry: 0
kern.ipc.pipeallocfail: 0
kern.ipc.piperesizefail: 0
kern.ipc.piperesizeallowed: 1
kern.ipc.msgmax: 16384
kern.ipc.msgmni: 40
kern.ipc.msgmnb: 8192
kern.ipc.msgtql: 2048
kern.ipc.msgssz: 32
kern.ipc.msgseg: 512
kern.ipc.semmni: 50
kern.ipc.semmns: 340
kern.ipc.semmnu: 150
kern.ipc.semmsl: 340
kern.ipc.semopm: 100
kern.ipc.semume: 50
kern.ipc.semusz: 632
kern.ipc.semvmx: 32767
kern.ipc.semaem: 16384
kern.ipc.shmmax: 536870912
kern.ipc.shmmin: 1
kern.ipc.shmmni: 192
kern.ipc.shmseg: 128
kern.ipc.shmall: 131072
kern.ipc.shm_use_phys: 0
kern.ipc.shm_allow_removed: 0
kern.ipc.soacceptqueue: 4096
kern.ipc.numopensockets: 3448
kern.ipc.maxsockets: 2092935
kern.ipc.sendfile.readahead: 1
kern.dummy: 0
kern.ps_strings: 140737488351200
kern.usrstack: 140737488351232
kern.logsigexit: 1
kern.iov_max: 1024
kern.hostuuid: 1d9f393c-6870-11e5-9ebd-000e1e9c38d0
kern.cam.sort_io_queues: 1
kern.cam.boot_delay: 0
kern.cam.num_doneqs: 6
kern.cam.dflags: 0
kern.cam.debug_delay: 0
kern.cam.pmp.retry_count: 1
kern.cam.pmp.default_timeout: 30
kern.cam.pmp.hide_special: 1
kern.cam.cam_srch_hi: 0
kern.cam.scsi_delay: 5000
kern.cam.cd.poll_period: 3
kern.cam.cd.retry_count: 4
kern.cam.cd.timeout: 30000
kern.cam.ada.legacy_aliases: 1
kern.cam.ada.retry_count: 4
kern.cam.ada.default_timeout: 30
kern.cam.ada.send_ordered: 1
kern.cam.ada.spindown_shutdown: 1
kern.cam.ada.spindown_suspend: 1
kern.cam.ada.read_ahead: 1
kern.cam.ada.write_cache: 1
kern.cam.da.poll_period: 3
kern.cam.da.retry_count: 4
kern.cam.da.default_timeout: 60
kern.cam.da.send_ordered: 1
kern.cam.enc.emulate_array_devices: 1
kern.tty_pty_warningcnt: 1
kern.random.adaptors: yarrow,dummy
kern.random.active_adaptor: yarrow
kern.random.live_entropy_sources: Hardware, Intel Secure Key RNG
kern.random.yarrow.gengateinterval: 10
kern.random.yarrow.bins: 10
kern.random.yarrow.fastthresh: 96
kern.random.yarrow.slowthresh: 128
kern.random.yarrow.slowoverthresh: 2
kern.random.sys.seeded: 1
kern.random.sys.harvest.ethernet: 0
kern.random.sys.harvest.point_to_point: 0
kern.random.sys.harvest.interrupt: 0
kern.random.sys.harvest.swi: 1
kern.rndtest.retest: 120
kern.rndtest.verbose: 1
kern.vt.enable_altgr: 1
kern.vt.debug: 0
kern.vt.deadtimer: 15
kern.vt.suspendswitch: 1
kern.vt.kbd_halt: 1
kern.vt.kbd_poweroff: 1
kern.vt.kbd_reboot: 1
kern.vt.kbd_debug: 1
kern.vt.kbd_panic: 0
kern.disks: mfisyspd9 mfisyspd8 mfisyspd7 mfisyspd6 mfisyspd5 mfisyspd4
mfisyspd3 mfisyspd2 mfisyspd1 mfisyspd0
kern.geom.eli.version: 7
kern.geom.eli.debug: 0
kern.geom.eli.tries: 3
kern.geom.eli.visible_passphrase: 0
kern.geom.eli.overwrites: 5
kern.geom.eli.threads: 0
kern.geom.eli.batch: 0
kern.geom.eli.boot_passcache: 1
kern.geom.eli.key_cache_limit: 8192
kern.geom.eli.key_cache_hits: 0
kern.geom.eli.key_cache_misses: 0
kern.geom.dev.delete_max_sectors: 262144
kern.geom.disk.mfisyspd0.led:
kern.geom.disk.mfisyspd1.led:
kern.geom.disk.mfisyspd2.led:
kern.geom.disk.mfisyspd3.led:
kern.geom.disk.mfisyspd4.led:
kern.geom.disk.mfisyspd5.led:
kern.geom.disk.mfisyspd6.led:
kern.geom.disk.mfisyspd7.led:
kern.geom.disk.mfisyspd8.led:
kern.geom.disk.mfisyspd9.led:
kern.geom.transient_maps: 33202
kern.geom.transient_map_retries: 10
kern.geom.transient_map_hard_failures: 0
kern.geom.transient_map_soft_failures: 0
kern.geom.inflight_transient_maps: 0
kern.geom.confxml: <mesh>


--
Cyril Bonté
