Hi Willy,

I've been following this with some interest (I'm due to roll out 1.4.9 on a number of load balancers soon), and just wondered whether these patches are recommended even if you build with libpcre (as I do), or whether it is safe to deploy 1.4.9 as is?

Cheers
Chris

-----Original Message-----
From: Willy Tarreau <[email protected]>
To: Hogan Yu <[email protected]>
Cc: Cyril Bonté <[email protected]>, [email protected]
Subject: Re: appsession does not work in Haproxy 1.4.9
Date: Fri, 19 Nov 2010 12:02:30 +0100

Hi Hogan,

I have good news. I found two bugs that could cause what you describe under out-of-memory conditions, and I could trigger them with your config by limiting my amount of memory.

One of them is triggered by the cookie capture. If it failed to allocate memory to store the cookie, it would emit an alert but still use it. That was the case only for the response cookies; the request cookies already had the correct fix. Not sure how that slipped through the fixes, but it's fixed now.

The other one was caused by a risk of freeing a non-allocated pool in the appsession code in case of memory shortage: some fields were released while not yet allocated, which also resulted in a segfault, during the call to the libc's free().

However, I found a third bug with my glibc: the call to regexec() spins forever when it fails to allocate memory. It is very possible that with your libc, instead of looping it crashes.
And the only way to have this code called with your config is precisely by adding the reqirep rule:

  0x6f755d7d in mmap () from /lib/libc.so.6
  (gdb) bt
  #0  0x6f755d7d in mmap () from /lib/libc.so.6
  #1  0x6f703bb4 in sYSMALLOc () from /lib/libc.so.6
  #2  0x6f702835 in _int_realloc () from /lib/libc.so.6
  #3  0x6f701100 in realloc () from /lib/libc.so.6
  #4  0x6f731d6f in re_string_realloc_buffers () from /lib/libc.so.6
  #5  0x6f73dfb3 in extend_buffers () from /lib/libc.so.6
  #6  0x6f73b222 in transit_state () from /lib/libc.so.6
  #7  0x6f739a6f in check_matching () from /lib/libc.so.6
  #8  0x6f7392a5 in re_search_internal () from /lib/libc.so.6
  #9  0x6f73874d in regexec@@GLIBC_2.3.4 () from /lib/libc.so.6
  #10 0x0806b8da in apply_filter_to_req_line (t=0x0, req=0x8ba5da8, exp=0x8b07b08) at src/proto_http.c:5534
  #11 0x0806badc in apply_filters_to_request (s=0x8ba5650, req=0x8ba5da8, px=0x8afec80) at src/proto_http.c:5649
  #12 0x080682fe in http_process_req_common (s=0x8ba5650, req=0x8ba5da8, an_bit=-12, px=0x8afec80) at src/proto_http.c:3013
  #13 0x0807e776 in process_session (t=0x8ba5a08) at src/session.c:1068
  #14 0x0804ec8d in process_runnable_tasks (next=0x77bcd29c) at src/task.c:234
  #15 0x0804b005 in run_poll_loop () at src/haproxy.c:974
  #16 0x0804b389 in main (argc=6, argv=0x77bcd334) at src/haproxy.c:1255

I usually build with libpcre, which is much faster than libc and which I have never seen fail.

So what I'm suggesting for your case:

1) I'm certain that you're running out of memory, maybe because of too many learned cookies, or maybe because your machine is undersized for the amount of traffic that passes through it. However, since you're saying it takes one hour to die, I suspect that it's a combination of both. You need a lot of memory, and the appsession entries that slowly pile up progressively reduce the amount of memory available until it dies. So you must find out why you have so little memory (maybe a ulimit -m or something like this).

2) Rebuild with support for libpcre, which handles out-of-memory conditions much better than libc. An out-of-memory condition should in theory never happen, but since it does in your case, let's be careful.

3) Apply the two patches below to your current version to fix the two bugs:

  http://git.1wt.eu/web?p=haproxy-1.4.git;a=commitdiff_plain;h=62e360
  http://git.1wt.eu/web?p=haproxy-1.4.git;a=commitdiff_plain;h=75eae4

Best regards,
Willy
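
For readers following the thread, the two allocation bugs described in the message above can be illustrated with a small C sketch. This is not HAProxy's code and not the content of the two patches: `struct capture`, `capture_cookie()`, `capture_release()`, `alloc_fn` and `failing_alloc()` are hypothetical names invented for the example. It only shows the general pattern, assuming the fix amounts to (a) returning right after the alert when the allocation fails instead of using the buffer, and (b) keeping not-yet-allocated fields at NULL so that releasing them is harmless.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-session capture slot (not HAProxy's struct).
 * Fields stay NULL until they have been successfully allocated. */
struct capture {
    char *cookie;
    char *sessid;
};

/* Hypothetical allocator hook so that a memory shortage can be simulated.
 * Assumption: any hook passed here returns malloc()-compatible memory. */
typedef void *(*alloc_fn)(size_t);

/* Simulates an out-of-memory condition. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Stores a copy of val in c->cookie.
 * Returns 0 on success, -1 if memory could not be allocated.
 * On failure the alert is emitted and the function returns immediately;
 * the buggy pattern would emit the alert and then still write to buf. */
static int capture_cookie(struct capture *c, const char *val, size_t len,
                          alloc_fn alloc)
{
    char *buf = alloc(len + 1);

    if (buf == NULL) {
        fprintf(stderr, "alert: out of memory while capturing cookie\n");
        return -1;
    }
    memcpy(buf, val, len);
    buf[len] = '\0';
    free(c->cookie);        /* free(NULL) is a no-op */
    c->cookie = buf;
    return 0;
}

/* Releases every field. Safe to call on a partially filled slot because
 * unallocated fields are NULL, and safe to call twice because fields are
 * reset to NULL after being freed. */
static void capture_release(struct capture *c)
{
    free(c->cookie);
    c->cookie = NULL;
    free(c->sessid);
    c->sessid = NULL;
}
```

A caller would start from a zeroed slot (`struct capture c = { NULL, NULL };`), pass `malloc` as the allocator in normal operation, and skip any use of `c.cookie` when `capture_cookie()` returns -1.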

