Hi, Please excuse me if the information contained in this email is a bit generic. I'm not the regular administrator, but I've been given the task of troubleshooting some issues with my website. If more details are needed, I'll gladly go look for additional information.
Currently, my website is experiencing a lot of errors and slowness. The server errors I see in the HAProxy log file are mainly 503 errors. The slowness of page loads for my website can be as long as minutes. I'm trying to determine if anyone have had any similar issues with using HAProxy as a high availability load balancer. HAProxy 1.3.21 CentOS running on Citrix XenServer HP blades There are actually almost 100 virtual servers running on the blades. A good many of the virtual servers are application servers running Glassfish. There are a few servers dedicated to CAS for authentication and access. I have three servers running Rsyslogd for writing HAProxy log data to file. A NetApp filer is used for storage. Currently, the website gets about: 73,000 pageviews a day 32,000 unique visitors a day 46,000 visit a day 3,000 pageviews a hr 1,300 unique visitors a hr 1,000 visit a hr I am using Akamai to help manage content delivery. One of the things Akamai is reporting to me is that they are having difficulty requesting content that needs to be refreshed. Akamai tries up to 4 times to get the content with a 2 second timeout to update content whose TTL has expired. After the 4th time, Akamai looks to their own cache before returning a 503 error to the user if the content is not available in the cache. Recently, I've noticed that Akamai is encountering an increasingly large number of 503 and 404 errors from my website. I've traced the 404 errors to missing images, but I'm not sure what the cause of the 503 errors could be. I had some external resources help me verify that they are able to retrieve the content from the Glassfish application servers even when HAProxy is reporting the 503 errors. One thing I did notice about the HAProxy configuration is that there are actually three servers running HAProxy with identical configurations. One serves as the primary high availability load balancer while the other two act as failovers. The keep-alive daemons are configured to accomodate that setup. From this generic description, is there something in the way this architecture is set up or in the configuration of HAProxy that may be causing the 503 errors to be reported to Akamai? As I mentioned, when an external resource makes a request for the same content directly from the application server, the same errors do not appear to occur. thanks, C.Y.

