Re: 500s with 1.4.18 and 1.5d7
On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote:
> I am not sure if these counts are exceeding the "never" threshold:
>
>   500  when haproxy encounters an unrecoverable internal error, such as a
>        memory allocation failure, which should never happen
>
> I am not sure what I can do to troubleshoot this since it is in prod :(
> Is there a way to set it to core dump and die when it has a 500?

Are you sure that these are not upstream server 500 errors?

Best regards,
Brane
Re: 500s with 1.4.18 and 1.5d7
On 10/3/11 12:19 PM, Brane F. Gračnar wrote:
> Are you sure, that these are not upstream server 500 errors?

Good point, I don't know how to differentiate from the haproxy logs which 500s originate from haproxy and which are passed through from the backend servers. I wish there was an easy way to tell, since haproxy 500s are much more worrisome. Maybe I am missing something...
Re: 500s with 1.4.18 and 1.5d7
On Mon, Oct 3, 2011 at 11:02 PM, Hank A. Paulson h...@spamproof.nospammail.net wrote:
> Good point, I don't know how to differentiate from the haproxy logs which
> 500s originate from haproxy and which are passed through from the backend
> servers. I wish there was an easy way to tell since haproxy 500s are much
> more worrisome. Maybe I am missing something...

On your log line, you may have a letter as the second character of the termination state flags. If there is a letter, it means there has been an issue between HAProxy and your server.

cheers
Re: 500s with 1.4.18 and 1.5d7
On Mon, Oct 03, 2011 at 11:17:14PM +0200, Baptiste wrote:
> on your log line, you may have a letter on the second character of the
> termination state flags. If there is a letter, then it means there has
> been an issue between HAProxy and your server.

There are very few situations where haproxy may emit a 500 right now:

  - lack of memory during session_accept(): the session will not be logged, so you will not see it in haproxy's logs;
  - internal error: the first char of the flags in the log will *always* be I (for internal error);
  - tarpit: if you configured a tarpit and some users are experiencing it, you will see the flags PT in the logs (P for proxy, T for tarpit).

There are also other hints. For instance, if you see that the session timers are still at -1 for the connect time or the response time, it is guaranteed that the response cannot have come from the server, since the server has not responded.

I have planned to add two status codes in the future, one to log what the server reported, and one to log what haproxy sent to the client. It will make troubleshooting much easier.

Regards,
Willy
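
For anyone who wants to apply the hints above to an existing log file, here is a rough Python sketch. It assumes the standard HTTP log format ("option httplog"), where the Tq/Tw/Tc/Tr/Tt timers are followed by the status code, the byte count, the two captured-cookie fields and the four-character termination state. The function name, the regular expression and the sample lines are made up for illustration and are not part of haproxy; adjust the pattern if your log format differs.

```python
import re

# Assumed layout ("option httplog"):
#   ... Tq/Tw/Tc/Tr/Tt status bytes req_cookie resp_cookie FLAGS ...
# This pattern is an illustration, not something shipped with haproxy.
LOG_RE = re.compile(
    r"(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/\+?(?P<tt>-?\d+)"
    r" (?P<status>-?\d+) \+?\d+ \S+ \S+ (?P<flags>\S{4}) "
)


def classify_500(line):
    """Guess where a logged 500 came from, using the hints from this thread."""
    m = LOG_RE.search(line)
    if m is None:
        return "unparsed"
    if m.group("status") != "500":
        return "not-500"
    flags = m.group("flags")
    if flags[0] == "I":
        # First flag character I: internal error inside haproxy.
        return "haproxy-internal"
    if flags[:2] == "PT":
        # P for proxy, T for tarpit.
        return "haproxy-tarpit"
    if m.group("tc") == "-1" or m.group("tr") == "-1":
        # Connect or response timer still at -1: the server never responded.
        return "haproxy-no-server-response"
    # None of the haproxy-side hints matched, so the backend is the likely source.
    return "probably-server"


# Hypothetical sample lines modeled on the documented log format.
SERVER_500 = ('haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in '
              'static/srv1 10/0/30/69/109 500 2750 - - ---- 1/1/1/1/0 0/0 '
              '"GET /index.html HTTP/1.1"')
TARPIT_500 = ('haproxy[14389]: 10.0.1.2:33318 [06/Feb/2009:12:14:15.000] http-in '
              'http-in/<NOSRV> 10/-1/-1/-1/3000 500 212 - - PT-- 0/0/0/0/0 0/0 '
              '"GET / HTTP/1.1"')

if __name__ == "__main__":
    for sample in (SERVER_500, TARPIT_500):
        print(classify_500(sample))
```

Note that the first case in the list above (lack of memory during session_accept()) cannot be detected this way, since those sessions are never logged. The "probably-server" result is only a guess by elimination.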