I went ahead and added the performance tuning you recommended (changing the maxconn to 1024). Hopefully this adds some stability.
As for the port, we’re using 1027 for our SSL traffic vs 443. We are currently getting SSL traffic that isn’t always failing on handshake.

As for what is in front of our HAP: our clients have HAP servers that are forwarding requests off to our HAP setup.

From: Igor Cicimov <[email protected]>
Date: Wednesday, March 16, 2016 at 6:51 PM
To: Zachary Punches <[email protected]>
Cc: Baptiste <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Help! HAProxy randomly failing health checks!

On Thu, Mar 17, 2016 at 12:46 PM, Igor Cicimov <[email protected]> wrote:

On Thu, Mar 17, 2016 at 11:14 AM, Zachary Punches <[email protected]> wrote:

I wanna say average is like 4-6 connections a second? Super minimal.

From what I’ve seen in the logs during the SSL errors, the log hangs, then outputs a bunch of SSL errors all at once.

Here is the output from sysctl -p:

    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.conf.default.accept_source_route = 0
    kernel.sysrq = 0
    kernel.core_uses_pid = 1
    net.ipv4.tcp_syncookies = 1
    error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
    error: "net.bridge.bridge-nf-call-iptables" is an unknown key
    error: "net.bridge.bridge-nf-call-arptables" is an unknown key
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    kernel.shmmax = 68719476736
    kernel.shmall = 4294967296

What would lowering the tune.ssl.default-dh-param to 1024 do?

From: Igor Cicimov <[email protected]>
Date: Wednesday, March 16, 2016 at 5:01 PM
To: Zachary Punches <[email protected]>
Cc: Baptiste <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Help! HAProxy randomly failing health checks!
On Thu, Mar 17, 2016 at 10:55 AM, Zachary Punches <[email protected]> wrote:

Thanks for the reply!

Ok, so based on what you saw in my config, does it look like we’re misconfigured enough to cause this to happen? If we were misconfigured, one would assume we would go down all the time, yeah?

From: Igor Cicimov <[email protected]>
Date: Wednesday, March 16, 2016 at 4:50 PM
To: Zachary Punches <[email protected]>
Cc: Baptiste <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Help! HAProxy randomly failing health checks!

On Thu, Mar 17, 2016 at 10:47 AM, Igor Cicimov <[email protected]> wrote:

On Thu, Mar 17, 2016 at 5:29 AM, Zachary Punches <[email protected]> wrote:

I’m not, these guys aren’t sitting behind an ELB. They sit behind Route53 routing. If one of the proxy boxes fails 3 checks in 30 seconds (with 4 checks done a second), then Route53 changes its routing from the first proxy box to the second.

On 3/15/16, 9:46 PM, "Baptiste" <[email protected]> wrote:

>Maybe you're checking a third party VM :)
>

AFAIK the Route53 health checks come from different points around the globe, and it is possible that at some time of the day AWS has scheduled some specific endpoints to perform the HC. It is also possible that those have different SSL settings from the ones performing the HC during your daytime. I would suggest you bring up this issue with AWS support, let them know your SSL cipher settings in HAP, and ask if they are compatible with ALL their servers performing SSL health checks.

I personally haven't seen any issues with failed SSL handshakes coming from AWS servers, and I have HAPs running in the AU and UK regions.
Igor

That is, if you are absolutely sure that the failed handshakes are not caused by overload or misconfigured (system) settings on HAP.

I was saying this in regard to system (kernel) settings. For example, assuming Unix/Linux, is your net.core.somaxconn actually set *higher* than your maxconn, which is set to 30000 and 15000 in HAP? Any other kernel settings you might have changed? (output of the "sysctl -p" command)

What is your peak-hour load? How many connections/sessions are you seeing on each HAP?

Another suggestion is to maybe set tune.ssl.default-dh-param to 1024 and see if that helps.

Ok, so on a default Ubuntu cloud instance this is what we have:

    # sysctl -a | grep maxconn
    net.core.somaxconn = 128

which is too low for a production server. Check your value and adjust it to your needs.

By the way, what is accept-proxy doing there in your setup? Is there anything else in front of HAP using the PROXY protocol?

Wait a minute:

    bind *:1027 # Health checking port

Are you maybe using an https health check on a non-SSL port?
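
To make the somaxconn question above concrete, here is a rough Python sketch (not from the original thread). On Linux the kernel silently caps the backlog passed to listen() at net.core.somaxconn, so a large maxconn does not by itself give a large accept queue. The sketch assumes HAProxy requests a backlog equal to maxconn (its documented default when no explicit backlog is set). The function names are hypothetical; 30000 and 15000 are the maxconn values mentioned above, and 128 is the default seen on the Ubuntu cloud instance.

```python
from pathlib import Path

# Linux-only location of the kernel setting discussed in the thread.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return net.core.somaxconn as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Linux truncates a listen() backlog that is larger than somaxconn."""
    return min(requested, somaxconn)


def check_maxconn(maxconn_values, somaxconn):
    """Return one human-readable line per maxconn value."""
    lines = []
    for maxconn in maxconn_values:
        backlog = effective_backlog(maxconn, somaxconn)
        status = "OK" if backlog == maxconn else "capped by somaxconn"
        lines.append(f"maxconn={maxconn} effective_backlog={backlog} ({status})")
    return lines


if __name__ == "__main__":
    somaxconn = read_somaxconn()
    if somaxconn is None:
        somaxconn = 128  # assumed fallback: the value quoted in the thread
    for line in check_maxconn([30000, 15000], somaxconn):
        print(line)
```

If the report shows the backlog being capped, the kernel value can be raised with sysctl -w net.core.somaxconn=N, with N chosen for your own load.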

