On Wed, Oct 14, 2015 at 3:03 PM, Krishna Kumar (Engineering) <[email protected]> wrote:
> Hi all,
>
> We are occasionally getting these messages (about 25 errors/per occurrence,
> 1 occurrence per hour) in the *error* log:
>
> 10.xx.xxx.xx:60086 [14/Oct/2015:04:21:25.048] Alert-FE
> Alert-BE/10.xx.xx.xx 0/5000/1/32/+5033 200 +149 - - --NN 370/4/1/0/+1
> 0/0 {10.xx.x.xxx||367||} {|||432} "POST /fk-alert-service/nsca
> HTTP/1.1"
> 10.xx.xxx.xx:60046 [14/Oct/2015:04:21:19.936] Alert-FE
> Alert-BE/10.xx.xx.xx 0/5000/1/21/+5022 200 +149 - - --NN 302/8/2/0/+1
> 0/0 {10.xx.x.xxx||237||} {|||302} "POST /fk-alert-service/nsca
> HTTP/1.1"
> ...
>
> We are unsure what errors were seen at the client. What could possibly be the
> reason for these? Every error line has retries value as "+1", as seen above.
> The specific options in the configuration are (HAProxy v1.5.12):
>
> 1. "retries 1"
> 2. "option redispatch"
> 3. "option logasap"
> 4. "timeout connect 5000", server and client timeouts are high - 300s
> 5. Number of backend servers is 7.
> 6. ulimit is 512K
> 7. balance is "roundrobin"
>
> Thank you for any leads/insights.
>
> Regards,
> - Krishna Kumar
>

Hi Krishna,

First, I don't understand how "retries 1" and "option redispatch" work
together in your case. I mean, redispatch is supposed to be applied at
'retries - 1'...

So basically, what may be happening:
- because of logasap, HAProxy does not wait until the end of the session
  to generate the log line
- this log line is in the error log because a connection was attempted
  (and failed) on a server

You should not set any ulimit yourself; let HAProxy do the job for you.

Baptiste
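
PS: to check the points above across many log lines, here is a minimal Python sketch that splits one of the quoted lines into its timers and connection counters. The field names (Tq/Tw/Tc/Tr/Tt and actconn/feconn/beconn/srv_conn/retries) are taken from the HTTP log format section of the HAProxy documentation as I understand it; parse_line, LOG_RE and SAMPLE are hypothetical names made up for this example, and the regex only assumes the layout of the two lines quoted above. A "+" in front of the retries counter marks a redispatch, and a "+" in front of the last timer means the line was emitted before the session ended (logasap).

```python
import re

# Field names as given in the HAProxy HTTP log format documentation.
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")
CONN_NAMES = ("actconn", "feconn", "beconn", "srv_conn", "retries")

# Hypothetical pattern: timers, status, bytes, two cookie fields,
# termination state, then the five connection counters.
LOG_RE = re.compile(
    r"(?P<timers>[+-]?\d+(?:/[+-]?\d+){4}) "
    r"(?P<status>\d{3}) \+?\d+ \S+ \S+ "
    r"(?P<term>\S{4}) "
    r"(?P<conns>\d+/\d+/\d+/\d+/\+?\d+) "
)


def parse_line(line):
    """Return the timers and counters of one log line, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    raw_timers = m.group("timers").split("/")
    raw_conns = m.group("conns").split("/")
    return {
        "timers": {n: int(v) for n, v in zip(TIMER_NAMES, raw_timers)},
        "status": int(m.group("status")),
        "termination_state": m.group("term"),
        "conns": {n: int(v) for n, v in zip(CONN_NAMES, raw_conns)},
        # "+" before the retries counter: the session was redispatched.
        "redispatched": raw_conns[-1].startswith("+"),
        # "+" before Tt: the line was logged before the session ended.
        "logged_early": raw_timers[-1].startswith("+"),
    }


# First log line from the original message, joined back onto one line.
SAMPLE = (
    "10.xx.xxx.xx:60086 [14/Oct/2015:04:21:25.048] Alert-FE "
    "Alert-BE/10.xx.xx.xx 0/5000/1/32/+5033 200 +149 - - --NN 370/4/1/0/+1 "
    '0/0 {10.xx.x.xxx||367||} {|||432} "POST /fk-alert-service/nsca HTTP/1.1"'
)

info = parse_line(SAMPLE)
print(info)
```

On the sample line this reports a redispatch and an early log line, and the second timer comes out as 5000 ms, the same value as the configured "timeout connect 5000".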

