You do have a bunch of services in http mode that don't seem to have any kind of http close option. There are also some that are not in http mode, and I don't understand why; they probably should be.

Just a note: you may be able to greatly simplify (and possibly speed up) your config using the new capabilities for tables of IPs added in 1.4.6.
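
For example, if the feature in question is loading ACL values from a file with -f (that is my reading of it, so check the 1.4.6 docs), a long run of per-address src ACLs could collapse into something like this. The ACL name, file path and backend name below are made up for illustration:

    # /etc/haproxy/internal_nets.lst is a hypothetical file with one
    # address or network per line, e.g. 192.0.2.0/24
    acl from_internal src -f /etc/haproxy/internal_nets.lst
    use_backend internal_servers if from_internal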

solr should probably be in http mode, and anywhere else that you have http mode you probably want an http close option turned on.
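
A rough sketch of what I mean, with a made-up section name, port and server address (solr may well listen somewhere else in your setup):

    listen solr_example 0.0.0.0:8983
        mode http
        # adds "Connection: close" in both directions so the connection
        # is closed after each response
        option httpclose
        server solr1 192.0.2.10:8983 check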

I am not sure why they chose dispatch for the prod glassfish server. My guess is they are running apache and mod_jk or something and then forwarding the requests to different glassfish servers. Is there really more than one prod glassfish server? I am wondering if the previous admin set up more than one copy of haproxy and that is why several services are redirected to the same machine. Take glassfish prod: there is no other reference to port 4850 in this config, so what is running on port 4850? haproxy, apache, or (heaven forbid) glassfish itself? You can check what is listening there with:
    netstat -antope | fgrep LIST | fgrep 4850

I think one of the problems is "inter_server": it doesn't have http mode set, so if more than one request comes in on an open connection, your request parsing rules are not run on any request except the first one (as Willy keeps reminding people). That might work OK for most things, since you are mostly splitting traffic by service (liferay goes to the liferay servers, etc). The problem comes in if you have a portal that people sign into and that has a menu/navbar where they can choose different services that should be going to different frontends/backends.
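
Roughly what I would expect that section to look like, as a sketch only. The section name, bind address, ACL, path and backend names are invented since I can't see the real ones:

    frontend inter_server_example
        bind 0.0.0.0:8080
        mode http
        # without a close option, only the first request on a kept-alive
        # connection gets matched against the rules below
        option httpclose
        acl is_liferay path_beg /liferay
        use_backend liferay_servers if is_liferay
        default_backend portal_servers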

On 5/18/10 3:49 PM, Chih Yin wrote:


On Mon, May 17, 2010 at 11:11 PM, Hank A. Paulson wrote:

    On 5/17/10 10:24 PM, Willy Tarreau wrote:

        On Mon, May 17, 2010 at 07:42:03PM -0700, Hank A. Paulson wrote:

            I have some sites running a similar set up - Xen domU,
            keepalived,
            fedora not RHEL and they get 50+ million hits per day with
            pretty
            fast response. you might want to use the "log separate
            errors" (sp?)
            option and review those 50X errors carefully, you might see
            a pattern
            - do you have http-close* in all your configs? That got me
            weird, slow
            results when I missed it once.


        Indeed, that *could* be a possibility if combined with a server
        maxconn
        because connections would be kept for a long time on the server
        (waiting
        for either the client or the server to close) and during that
        time nobody
        else could connect. The typical problem with keep-alive to the
        servers in
        fact. The 503 could be caused by requests waiting too long in
        the queue
        then.


    My example was just to assure Chih Yin that haproxy on xen should be
    able to handle his current load depending, of course, on the
    glassfish servers.

    I meant some kind of httpclose option
    (httpclose/forceclose/http-server-close/etc) turned on regardless of
    keep-alive status - you know, like you are always reminding people :)

    I noticed when I forgot it on a section (that was not keepalive
    related) it caused wacky results - hanging browsers,
    images/icons/css not showing up, etc. Obviously it should not affect
    single requests like you would assume Akamai would be sending, it
    was a pure guess.


Thank you everyone for your feedback.  I really appreciate your help.

Sorry for taking so long to respond.  I had to get permission from my
director to post some of the log data and our haproxy configuration
file.  I also had to hide a bit more of the configuration than was
suggested because of concerns about making the issues we're encountering
too public.  I hope you understand.

 From my research on HAProxy and high availability websites in general,
it seemed to me that compared to other websites, our traffic volume is
actually rather light.  In addition to how we have configured HAProxy
for our infrastructure, I'm definitely also taking a look at our
application servers and our content as well.

I started looking at the log files and the HAProxy configuration file
more closely today.
I attached the (poorly) cleaned HAProxy configuration file.  Looking at
it, I can already see that the httpclose option isn't consistently
included in all the sections, both the frontend and the backend.  I will
make sure this option is in all sections.  Should I also add this to the
global settings for HAProxy?  Is it okay if this option is listed more
than once in a section (I noticed that this happened a couple of times)?


        Chih Yin, Xani was right, please take a look at your logs. Also,
        sending
        us your config would help a lot. Replace IP addresses and
        passwords with
        "XXX" if you want, we'll comment on the rest. BTW you should
        tell your
        admin that 1.3.21 has an annoying bug which makes it crash when
        connecting
        to the stats socket. Thus, this reduces your possibilities of
        debugging it.
        When you have some time, you should upgrade it to 1.3.22 or
        later (1.3.24)
        which fix a small number of remaining bugs.

            example stats page screenshot attached.


        Nice stats Hank :-)


    That is just the page frames (mostly) not including images, css, js,
    static icons or any other "stuff" but neither is it just for one
    day, it is longer.


I have already reported to my director to let him know that we really
need to upgrade to 1.3.22 or later.

As for the logs, it seems that I'll need to look at the configuration
for HAProxy a bit more to make some adjustments first.  A few months
back, I know I saw messages indicating the status of server (e.g. 3
active, 2 backup).  I also see messages when the HAProxy configuration
was reloaded or when HAProxy was restarted.  I no longer see these
status messages in the log files.

That is a good reason to turn on the log separate errors option: the errors go into both log files, but it is easier to review the error log without all the normal accesses. It doesn't really add any load, it just makes life easier.
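
Something along these lines, as a sketch. The syslog address and facility are placeholders, and you should check that your haproxy version knows the option before relying on it:

    global
        log 127.0.0.1 local0

    defaults
        log global
        mode http
        option httplog
        # log failed/errored requests at a higher level than normal ones
        # so syslog can route them to their own file
        option log-separate-errors

You still need a rule on the syslog side that sends the higher-level messages to the separate file.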

I recall that the system
administrator who initially configured HAProxy mentioned that he removed
the logging of some inter-server traffic to make the log file sizes
smaller.  I'm wondering if he also removed these status messages as well.


Maybe, but that would be surprising, since those messages should be infrequent and are somewhat important. It is more probable that they adjusted the apache logging (for example on the cas servers) to not log the hits to /security/check.txt, given that you are hitting it on all the cas servers every 7 seconds, so those start to add up if your real traffic is low.
    option      httpchk HEAD /security/check.txt HTTP/1.0
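
The 7 seconds presumably comes from the inter setting on the server lines, something like the following sketch (backend name and server address are placeholders, and 7000 is in milliseconds):

    backend cas_example
        mode http
        option httpchk HEAD /security/check.txt HTTP/1.0
        server cas1 192.0.2.21:80 check inter 7000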

Again, thank you all for your help and suggestions.
C.Y.


        Cheers,
        Willy



