Re: 500s with 1.4.18 and 1.5d7

2011-10-03 Thread Brane F. Gračnar
On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote:
> I am not sure if these counts are exceeding the never threshold
>
> 500  when haproxy encounters an unrecoverable internal error, such as a
>      memory allocation failure, which should never happen
>
> I am not sure what I can do to troubleshoot this since it is in prod :(
> Is there a way to set it to core dump and die when it has a 500?

Are you sure that these are not upstream server 500 errors?

Best regards, Brane



Re: 500s with 1.4.18 and 1.5d7

2011-10-03 Thread Hank A. Paulson

On 10/3/11 12:19 PM, Brane F. Gračnar wrote:

> On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote:
> > I am not sure if these counts are exceeding the never threshold
> >
> > 500  when haproxy encounters an unrecoverable internal error, such as a
> >      memory allocation failure, which should never happen
> >
> > I am not sure what I can do to troubleshoot this since it is in prod :(
> > Is there a way to set it to core dump and die when it has a 500?
>
> Are you sure that these are not upstream server 500 errors?
>
> Best regards, Brane


Good point, I don't know how to differentiate from the haproxy logs which 500s 
originate from haproxy and which are passed through from the backend servers. 
I wish there was an easy way to tell since haproxy 500s are much more 
worrisome. Maybe I am missing something...




Re: 500s with 1.4.18 and 1.5d7

2011-10-03 Thread Baptiste
On Mon, Oct 3, 2011 at 11:02 PM, Hank A. Paulson
h...@spamproof.nospammail.net wrote:
> On 10/3/11 12:19 PM, Brane F. Gračnar wrote:
> > On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote:
> > > I am not sure if these counts are exceeding the never threshold
> > >
> > > 500  when haproxy encounters an unrecoverable internal error, such as a
> > >      memory allocation failure, which should never happen
> > >
> > > I am not sure what I can do to troubleshoot this since it is in prod :(
> > > Is there a way to set it to core dump and die when it has a 500?
> >
> > Are you sure that these are not upstream server 500 errors?
> >
> > Best regards, Brane
>
> Good point, I don't know how to differentiate from the haproxy logs which
> 500s originate from haproxy and which are passed through from the backend
> servers. I wish there was an easy way to tell since haproxy 500s are much
> more worrisome. Maybe I am missing something...



In your log line, look at the second character of the termination state
flags. If it is a letter, then there has been an issue between HAProxy
and your server.

cheers



Re: 500s with 1.4.18 and 1.5d7

2011-10-03 Thread Willy Tarreau
On Mon, Oct 03, 2011 at 11:17:14PM +0200, Baptiste wrote:
> On Mon, Oct 3, 2011 at 11:02 PM, Hank A. Paulson
> h...@spamproof.nospammail.net wrote:
> > On 10/3/11 12:19 PM, Brane F. Gračnar wrote:
> > > On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote:
> > > > I am not sure if these counts are exceeding the never threshold
> > > >
> > > > 500  when haproxy encounters an unrecoverable internal error, such as a
> > > >      memory allocation failure, which should never happen
> > > >
> > > > I am not sure what I can do to troubleshoot this since it is in prod :(
> > > > Is there a way to set it to core dump and die when it has a 500?
> > >
> > > Are you sure that these are not upstream server 500 errors?
> > >
> > > Best regards, Brane
> >
> > Good point, I don't know how to differentiate from the haproxy logs which
> > 500s originate from haproxy and which are passed through from the backend
> > servers. I wish there was an easy way to tell since haproxy 500s are much
> > more worrisome. Maybe I am missing something...
>
> on your log line, you may have a letter on the second character of the
> termination state flags.
> If there is a letter, then it means there has been an issue between
> HAProxy and your server.

There are very few situations where haproxy may emit a 500 right now:

  - lack of memory during session_accept(): the session will not be
    logged, so you will not see it in haproxy's logs;

  - internal error: the first char of the flags in the log will *always*
    be I (for internal error);

  - tarpit: if you configured a tarpit and some users are experiencing it,
    you will see the flags PT in the logs (P for proxy, T for tarpit).

There are also other hints. For instance, if you see that the session timers
are still at -1 for the connect time or the response time, it is guaranteed
that it cannot be the server which reported the response since it has not
responded.
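
The checks above can be scripted. What follows is only a rough sketch in Python, assuming the logs use the default HTTP format ("option httplog"), where the Tq/Tw/Tc/Tr/Tt timers are followed by the status code, the byte count, the two captured cookie fields and the four-character termination flags. The regular expression, the function name classify_500 and the returned labels are made up for illustration and are not part of haproxy. Adjust the pattern if your log format differs.

```python
import re

# Assumed field order (default HTTP log format):
#   Tq/Tw/Tc/Tr/Tt status bytes req_cookie resp_cookie flags ...
LOG_RE = re.compile(
    r"(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/\+?(?P<tt>-?\d+) "
    r"(?P<status>-?\d+) \+?(?P<bytes>\d+) \S+ \S+ (?P<flags>\S{4}) "
)


def classify_500(line):
    """Guess where a logged 500 came from; return None for non-500 lines."""
    m = LOG_RE.search(line)
    if m is None or m.group("status") != "500":
        return None
    flags = m.group("flags")
    if flags[0] == "I":
        # First flag character "I": haproxy internal error.
        return "haproxy: internal error"
    if flags.startswith("PT"):
        # "PT": the proxy tarpitted the request.
        return "haproxy: tarpit"
    if m.group("tc") == "-1" or m.group("tr") == "-1":
        # Connect or response timer still at -1: the server never answered.
        return "haproxy: no server response"
    # None of the haproxy hints apply, so the server most likely sent it.
    return "server (probably)"


if __name__ == "__main__":
    import sys

    # Usage: python classify_500.py < /var/log/haproxy.log
    for log_line in sys.stdin:
        origin = classify_500(log_line)
        if origin is not None:
            print(origin, "|", log_line.rstrip())
```

A 500 caused by a memory shortage in session_accept() is never logged, so no log-based script like this one can count it.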

I have planned to add two status codes in the future, one to log what the
server reported, and one to log what haproxy sent to the client. It will
make troubleshooting much easier.

Regards,
Willy