Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi Cyril,

On Sat, Jan 9, 2010 at 1:47 PM, Cyril Bonté cyril.bo...@free.fr wrote:
> Same here, I can't reproduce it.
> Bart, is it quickly reproducible or does it happen after a lot of traffic?
> How much memory does haproxy use when it segfaults (I find the appsession timeout quite big)?

Haproxy segfaults just after a few requests. I didn't explicitly look at the memory usage, but I don't expect it to use a lot of memory.

Bart
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi Willy,

Thanks for the quick response and the patch. I've applied the patch and it seems to be running fine now, and I can't reproduce the segfault.

Regards,
Bart

On Fri, Jan 8, 2010 at 10:32 PM, Willy Tarreau w...@1wt.eu wrote:
> Hi again Bart,
>
> I confirm that I forgot to reinit the session cookie in keep-alive. Could you please apply the attached patch to your sources and try again? I tried to reproduce the issue but failed to do so, which is why I'm asking for a test.
>
> Thanks
> Willy
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi Bart,

On Sun, Jan 10, 2010 at 04:02:18PM +0100, Bart van der Schans wrote:
> Hi Willy,
>
> Thanks for the quick response and the patch. I've applied the patch and it seems to be running fine now and I can't reproduce the segfault.

Thanks. It has been merged into mainline.

Regards,
Willy
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi again Willy,

On Sunday 10 January 2010 00:47:14, Willy Tarreau wrote:
> Good catch! Aleks and I have spent some time in the past tracking memory leaks in this area. This is a sensitive area because it's one where we're dynamically allocating memory. Obviously those two have escaped us. I'm applying your patch.

Maybe appsession should be forbidden in the 'defaults' section, as it will not work in the backends. Moreover, haproxy segfaults when compiled with DEBUG_HASH.

--- haproxy-1.4-dev6/src/cfgparse.c 2010-01-08 07:49:44.0 +0100
+++ haproxy-1.4-dev6-appsession/src/cfgparse.c 2010-01-10 16:51:52.0 +0100
@@ -1578,6 +1578,12 @@
     else if (!strcmp(args[0], "appsession")) { /* cookie name */
         int cur_arg;
 
+        if (curproxy == defproxy) {
+            Alert("parsing [%s:%d] : '%s' not allowed in 'defaults' section.\n", file, linenum, args[0]);
+            err_code |= ERR_ALERT | ERR_FATAL;
+            goto out;
+        }
+
         if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], NULL))
             err_code |= ERR_WARN;

Still about appsession: I've seen in the "timeout http-keep-alive" commit that "timeout appsession" is supported (tested: it modifies the timeout defined in the previous appsession line, depending on the declaration order), but this appears nowhere in the documentation.

--
Cyril Bonté
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
On Sun, Jan 10, 2010 at 05:01:47PM +0100, Cyril Bonté wrote:
> Hi again Willy,
>
> On Sunday 10 January 2010 00:47:14, Willy Tarreau wrote:
> > Good catch! Aleks and I have spent some time in the past tracking memory leaks in this area. This is a sensitive area because it's one where we're dynamically allocating memory. Obviously those two have escaped us. I'm applying your patch.
>
> Maybe appsession should be forbidden in the 'defaults' section, as it will not work in the backends.

OK, I agree since that's not allowed in the doc.

> Moreover, haproxy segfaults when compiled with DEBUG_HASH.

I'm not sure when that feature last worked. There are several DEBUG_* build directives that are regularly broken because they're not built by default. Most often they're fixed when we need them :-)

> --- haproxy-1.4-dev6/src/cfgparse.c 2010-01-08 07:49:44.0 +0100
> +++ haproxy-1.4-dev6-appsession/src/cfgparse.c 2010-01-10 16:51:52.0 +0100
> @@ -1578,6 +1578,12 @@
>      else if (!strcmp(args[0], "appsession")) { /* cookie name */
>          int cur_arg;
>  
> +        if (curproxy == defproxy) {
> +            Alert("parsing [%s:%d] : '%s' not allowed in 'defaults' section.\n", file, linenum, args[0]);
> +            err_code |= ERR_ALERT | ERR_FATAL;
> +            goto out;
> +        }
> +
>          if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[0], NULL))
>              err_code |= ERR_WARN;

OK, I will apply this patch. However, please check your mailer: it replaces tabs with spaces, so the patch does not apply and I have to redo it by hand (it was the same with the last one).

> Still about appsession: I've seen in the "timeout http-keep-alive" commit that "timeout appsession" is supported (tested: it modifies the timeout defined in the previous appsession line, depending on the declaration order), but this appears nowhere in the documentation.

I've dug through the history and found that it was introduced after 1.3.13 with all the timeouts, then removed from the doc exactly 2 years ago after 1.3.14 (844e3ac5), but was not removed from the code. I believe I added it when reviewing all timeouts without thinking that this one made no sense. I'm removing it now since it's undocumented and buggy by design.

Thanks,
Willy
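The check in Cyril's patch follows a common config-parser pattern: reject a keyword outright when it appears in the 'defaults' section, and only warn when it appears in a section that lacks the required capability. A minimal standalone sketch of that pattern follows. All names here (demo_proxy, demo_parse_keyword, the DEMO_ERR_* flags) are hypothetical stand-ins and not haproxy's real structures or API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error flags, loosely modelled on the ERR_* bits in the patch. */
#define DEMO_ERR_NONE  0x00
#define DEMO_ERR_WARN  0x01
#define DEMO_ERR_FATAL 0x02

/* Hypothetical, heavily reduced proxy description. */
struct demo_proxy {
    const char *name;
    int is_backend;   /* non-zero if the section has backend capability */
};

/* One shared object stands for the 'defaults' section, so the parser can
 * recognise it by address, as the patch does with its defaults proxy. */
struct demo_proxy demo_defaults = { "defaults", 1 };
struct demo_proxy demo_backend  = { "cms", 1 };
struct demo_proxy demo_frontend = { "www", 0 };

/* Returns a combination of DEMO_ERR_* flags for one keyword. */
int demo_parse_keyword(const struct demo_proxy *cur, const char *keyword,
                       const char *file, int line)
{
    int err = DEMO_ERR_NONE;

    if (strcmp(keyword, "appsession") == 0) {
        /* Hard error: the keyword makes no sense in 'defaults'. */
        if (cur == &demo_defaults) {
            fprintf(stderr,
                    "parsing [%s:%d] : '%s' not allowed in 'defaults' section.\n",
                    file, line, keyword);
            return err | DEMO_ERR_FATAL;
        }
        /* Soft error: keyword is ignored where there is no backend. */
        if (!cur->is_backend) {
            fprintf(stderr,
                    "parsing [%s:%d] : '%s' ignored, section '%s' has no backend capability.\n",
                    file, line, keyword, cur->name);
            err |= DEMO_ERR_WARN;
        }
    }
    return err;
}
```

Doing the fatal check before the capability check means a 'defaults' section never falls through to the warning path, which matches the order of the two tests in the patch.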
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
I wanted to report, after using 1.4-dev6 for several sites for a couple of days, that the results seem very good. One site was peaking at over 150 Mbps and over 65 million hits over the past couple of days; during that time memory use stayed steady between 1.5-2.5 GB and went down when load went down.

On 1/7/10 11:05 PM, Willy Tarreau wrote:
> Hi all,
>
> well, some of you have encountered issues with 1.4-dev5 with sessions left in CLOSE_WAIT state or with memory leaks.
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
On Sat, Jan 09, 2010 at 11:03:16AM -0800, Hank A. Paulson wrote:
> I wanted to report after using 1.4-dev6 for several sites for a couple days that the results seem very good. One site was peaking at over 150 Mbps and over 65 million hits past couple of days, during that time memory use stayed steady between 1.5-2.5 GB and went down when load went down.

Excellent, thanks a lot for your report Hank!

BTW, if you're running with many concurrent connections causing that amount of memory to be consumed, you may want to try building with dlmalloc (check the makefile for that). It makes extensive use of mmap() and is able to release lots of unused memory, more than the libc's malloc. This is particularly appreciated during soft restarts, when you need to make two processes coexist.

Regards,
Willy
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi Willy,

On Saturday 9 January 2010 14:01:36, Willy Tarreau wrote:
> (...)
> One thing I suspect would be that we simply fail to free lots of allocated memory and that the last pool_alloc() returns NULL due to lack of memory, hence the segfault. But I also suspect that we *may* end up corrupting some lists or pools if we reuse some data across two consecutive requests. Anyway I've committed the fix.

This doesn't directly concern this issue, but I've tried to follow all the pool_alloc2/pool_free2 calls in the code to track memory leaks. I've found one which only happens when there's already no more memory when allocating a new appsession cookie:

--- haproxy-1.4-dev6/src/proto_http.c 2010-01-10 00:14:47.0 +0100
+++ haproxy-1.4-dev6-freememory/src/proto_http.c 2010-01-10 00:15:16.0 +0100
@@ -5954,6 +5954,7 @@
         if ((asession->sessid = pool_alloc2(apools.sessid)) == NULL) {
             Alert("Not enough Memory process_srv():asession->sessid:malloc().\n");
             send_log(t->be, LOG_ALERT, "Not enough Memory process_srv():asession->sessid:malloc().\n");
+            t->be->htbl_proxy.destroy(asession);
             return;
         }
         memcpy(asession->sessid, t->sessid, t->be->appsession_len);
@@ -5963,6 +5964,7 @@
         if ((asession->serverid = pool_alloc2(apools.serverid)) == NULL) {
             Alert("Not enough Memory process_srv():asession->sessid:malloc().\n");
             send_log(t->be, LOG_ALERT, "Not enough Memory process_srv():asession->sessid:malloc().\n");
+            t->be->htbl_proxy.destroy(asession);
             return;
         }
         asession->serverid[0] = '\0';

--
Cyril Bonté
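The leak fixed by this patch is the classic "partially built object" case: the entry itself was already allocated when one of its fields failed to allocate, and the early return dropped it. A minimal sketch of the corrected shape is below. It uses plain malloc/free with a counting wrapper instead of haproxy's pool allocator, and every name in it (demo_entry, demo_alloc, demo_leak_check, ...) is a hypothetical illustration rather than haproxy code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counting allocator so the example can show that nothing is leaked.
 * demo_fail_at is the 1-based index of the allocation that must fail
 * (0 means "never fail"). */
static int demo_live;
static int demo_calls;
static int demo_fail_at;

static void *demo_alloc(size_t size)
{
    void *p;

    demo_calls++;
    if (demo_calls == demo_fail_at)
        return NULL;              /* simulated out-of-memory */
    p = malloc(size);
    if (p != NULL)
        demo_live++;
    return p;
}

static void demo_free(void *p)
{
    if (p != NULL) {
        demo_live--;
        free(p);
    }
}

/* Hypothetical stand-in for an appsession entry. */
struct demo_entry {
    char *sessid;
    char *serverid;
};

/* Releases the entry and whichever fields were already allocated. */
static void demo_entry_destroy(struct demo_entry *e)
{
    if (e == NULL)
        return;
    demo_free(e->sessid);
    demo_free(e->serverid);
    demo_free(e);
}

/* Builds an entry; on any failure the partial entry is destroyed before
 * returning, which is the step the patch adds. */
static struct demo_entry *demo_entry_new(const char *sessid, size_t len)
{
    struct demo_entry *e = demo_alloc(sizeof(*e));

    if (e == NULL)
        return NULL;
    memset(e, 0, sizeof(*e));

    if ((e->sessid = demo_alloc(len + 1)) == NULL) {
        demo_entry_destroy(e);
        return NULL;
    }
    memcpy(e->sessid, sessid, len);
    e->sessid[len] = '\0';

    if ((e->serverid = demo_alloc(len + 1)) == NULL) {
        demo_entry_destroy(e);
        return NULL;
    }
    e->serverid[0] = '\0';
    return e;
}

/* Returns the number of blocks still allocated after one attempt. */
int demo_leak_check(int fail_at)
{
    struct demo_entry *e;

    demo_live = 0;
    demo_calls = 0;
    demo_fail_at = fail_at;

    e = demo_entry_new("0123456789", 10);
    demo_entry_destroy(e);        /* safe when e is NULL */
    return demo_live;
}
```

Calling demo_leak_check() with 1, 2 or 3 makes the first, second or third allocation fail; in each case the count of live blocks should come back to zero.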
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi,

First of all, I really like haproxy and I'm really excited about the keep-alive support! So I gave the new 1.4-dev6 a spin, but I'm getting a segfault. When I run haproxy (with -d) in gdb I get:

Program received signal SIGSEGV, Segmentation fault.
0x00433cf9 in manage_server_side_cookies (t=0x25dd1f0, rtr=0x25e1ae0) at src/proto_http.c:5929
5929    if ((t->sessid = pool_alloc2(apools.sessid)) == NULL) {
(gdb)
(gdb)
(gdb) bt
#0 0x00433cf9 in manage_server_side_cookies (t=0x25dd1f0, rtr=0x25e1ae0) at src/proto_http.c:5929
#1 0x004306f4 in http_process_res_common (t=0x25dd1f0, rep=0x25e1ae0, an_bit=262144, px=0x2554230) at src/proto_http.c:4587
#2 0x0044c0ba in process_session (t=0x25dd660) at src/session.c:982
#3 0x00409e6b in process_runnable_tasks (next=0x7fff914ba01c) at src/task.c:234
#4 0x004045bd in run_poll_loop () at src/haproxy.c:935
#5 0x00404d1f in main (argc=4, argv=0x7fff914ba188) at src/haproxy.c:1210

So it looks like something goes wrong with the session cookie handling. The relevant part in my config (do you need my complete config?):

backend cms
    mode http
    balance leastconn
    appsession JSESSIONID len 20 timeout 12h
    option httpchk GET /ping/ HTTP/1.1\r\nHost:\ cms.mycompany.com
    server cms1 172.16.1.33:80 check inter 3s weight 100
    server cms2 172.16.1.34:80 check inter 3s weight 100

Any help would be really welcome!

Regards,
Bart

On Fri, Jan 8, 2010 at 8:05 AM, Willy Tarreau w...@1wt.eu wrote:
> Hi all,
>
> well, some of you have encountered issues with 1.4-dev5 with sessions left in CLOSE_WAIT state or with memory leaks. With the help of Cyril Bonté and Hank A. Paulson who have sent a lot of feedback and tested almost all intermediate versions, we finally managed to nail all the problems down and to reach a working state. Of course that does not mean there is no bug left, but it's working reliably on two servers here (including haproxy.1wt.eu) without any of those issues.
>
> Also, after some thinking I realized that the changes had affected the behaviour of "option httpclose" to mimic a bit "option forceclose" and I realized that there was no reason for this. So the behaviour of "option httpclose" has been restored so that it only ensures the headers are correct and relies on both ends to agree on when to close. "option forceclose" must be used if you want haproxy to follow the request/response and enforce the close at the right moment. This is important, because some issues that people have been facing with 1.4-dev5 should not have impacted them if this change had not been performed (though it helped to spot the bugs faster).
>
> As usual, Krzysztof has sent a bunch of nice improvements for the checks and the stats interface, and a new very cool feature: default-server. This will make it possible to move almost all of the server settings to a central place so that you won't have to copy/paste them all for all servers.
>
> So it was time to release 1.4-dev6 so that you have a better experience with it and can test the keep-alive mode without the fear of a degraded service. It's available at the usual place:
>
> http://haproxy.1wt.eu/download/1.4/src/
>
> I hope we won't have to run after the bugs as we did this week, it has been quite painful, but very useful. Thanks to all those who helped, and thanks in advance to all those who will test this one.
>
> Willy
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi Bart,

On Fri, Jan 08, 2010 at 01:00:42PM +0100, Bart van der Schans wrote:
> Hi,
>
> First of all, I really like haproxy and I'm really excited about the keep-alive support! So I gave the new 1.4-dev6 a spin, but I'm getting a segfault. When I run haproxy (with -d) in gdb I get:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00433cf9 in manage_server_side_cookies (t=0x25dd1f0, rtr=0x25e1ae0) at src/proto_http.c:5929
> 5929    if ((t->sessid = pool_alloc2(apools.sessid)) == NULL) {
> (gdb)
> (gdb)
> (gdb) bt
> #0 0x00433cf9 in manage_server_side_cookies (t=0x25dd1f0, rtr=0x25e1ae0) at src/proto_http.c:5929
> #1 0x004306f4 in http_process_res_common (t=0x25dd1f0, rep=0x25e1ae0, an_bit=262144, px=0x2554230) at src/proto_http.c:4587
> #2 0x0044c0ba in process_session (t=0x25dd660) at src/session.c:982
> #3 0x00409e6b in process_runnable_tasks (next=0x7fff914ba01c) at src/task.c:234
> #4 0x004045bd in run_poll_loop () at src/haproxy.c:935
> #5 0x00404d1f in main (argc=4, argv=0x7fff914ba188) at src/haproxy.c:1210
>
> So it looks like something goes wrong with the session cookie handling. The relevant part in my config (do you need my complete config?):

Yes, I think that this one requires particular handling during the session resetting. I'll investigate. Could you try with cookie prefix mode, which should be 100% compatible in your case since you don't look it up in the URL:

    cookie JSESSIONID prefix
    server cms1 172.16.1.33:80 cookie s1 check inter 3s weight 100
    server cms2 172.16.1.34:80 cookie s2 check inter 3s weight 100

If it works, it will also ensure you can restart haproxy at any time without losing your session cookies.

In parallel, I'll check in the code if I can spot anything causing issues with the appsession cookie in case of keep-alive mode (as I assume this was with the http-server-close option). I'll ask you for your complete config if I can't reproduce it here.

Thanks!
Willy
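Prefix mode survives a restart because the server choice travels inside the cookie itself rather than in a table held in haproxy's memory. A rough sketch of the idea follows, assuming the '~' delimiter that haproxy places between the server's cookie value (s1, s2 above) and the application's own value. The function names and buffer sizes are hypothetical and this is not haproxy's implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the value sent to the client: "<server id>~<application value>".
 * Returns 0 on success, -1 if the output buffer is too small. */
int demo_prefix_cookie(char *out, size_t outlen,
                       const char *srv, const char *value)
{
    int n = snprintf(out, outlen, "%s~%s", srv, value);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Splits a cookie received from the client. Copies the server id into
 * srv and returns a pointer to the application value inside the input,
 * or NULL if there is no prefix or srv is too small. */
const char *demo_strip_prefix(const char *cookie, char *srv, size_t srvlen)
{
    const char *sep = strchr(cookie, '~');
    size_t idlen;

    if (sep == NULL)
        return NULL;
    idlen = (size_t)(sep - cookie);
    if (idlen >= srvlen)
        return NULL;
    memcpy(srv, cookie, idlen);
    srv[idlen] = '\0';
    return sep + 1;
}

/* Round trip: what the server set, what the client sees, and what the
 * server gets back. Returns 1 if every step matches. */
int demo_roundtrip(void)
{
    char wire[64];
    char srv[16];
    const char *value;

    if (demo_prefix_cookie(wire, sizeof(wire), "s1", "0123ABCD") != 0)
        return 0;
    if (strcmp(wire, "s1~0123ABCD") != 0)
        return 0;
    value = demo_strip_prefix(wire, srv, sizeof(srv));
    return value != NULL
        && strcmp(srv, "s1") == 0
        && strcmp(value, "0123ABCD") == 0;
}
```

Since the routing decision can be recovered from the request alone, no per-session state has to be kept or rebuilt by the proxy, unlike the in-memory appsession table.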
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi again Bart,

I confirm that I forgot to reinit the session cookie in keep-alive. Could you please apply the attached patch to your sources and try again? I tried to reproduce the issue but failed to do so, which is why I'm asking for a test.

Thanks
Willy

diff --git a/src/proto_http.c b/src/proto_http.c
index 32284fe..4986422 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -6357,6 +6357,8 @@ void http_end_txn(struct session *s)
     pool_free2(pool2_requri, txn->uri);
     pool_free2(pool2_capture, txn->cli_cookie);
     pool_free2(pool2_capture, txn->srv_cookie);
+    pool_free2(apools.sessid, s->sessid);
+    s->sessid = NULL;
     txn->uri = NULL;
     txn->srv_cookie = NULL;
     txn->cli_cookie = NULL;
diff --git a/src/session.c b/src/session.c
index 5e8c990..65e22f6 100644
--- a/src/session.c
+++ b/src/session.c
@@ -78,9 +78,6 @@ void session_free(struct session *s)
     pool_free2(pool2_buffer, s->req);
     pool_free2(pool2_buffer, s->rep);
 
-    if (s->sessid)
-        pool_free2(apools.sessid, s->sessid);
-
     http_end_txn(s);
 
     if (fe) {
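The patch moves the release of s->sessid from session teardown into the end-of-transaction path and resets the pointer, so the cleanup runs once per request on a keep-alive connection instead of once per connection. A minimal sketch of that "free and reset at end of transaction" pattern is below. It uses malloc/free rather than haproxy's pools, and demo_session, demo_store_sessid and demo_end_txn are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a session; only the field discussed in the
 * thread is modelled. */
struct demo_session {
    char *sessid;   /* per-request application session id, or NULL */
};

/* Stores the session id seen in a response. Returns 0 on success,
 * -1 on allocation failure (the previous value is then left untouched). */
static int demo_store_sessid(struct demo_session *s, const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, value, len);
    free(s->sessid);
    s->sessid = copy;
    return 0;
}

/* End-of-transaction cleanup, run after every request: release the
 * buffer and reset the pointer so the next request on the same
 * connection starts clean and a later cleanup is harmless. */
static void demo_end_txn(struct demo_session *s)
{
    free(s->sessid);
    s->sessid = NULL;
}

/* Simulates two requests on one keep-alive connection followed by the
 * final teardown. Returns 1 if the state is clean after each step. */
int demo_two_requests(void)
{
    struct demo_session s = { NULL };
    int ok = 1;

    if (demo_store_sessid(&s, "ABCDEF") != 0)
        return 0;
    demo_end_txn(&s);
    ok &= (s.sessid == NULL);

    if (demo_store_sessid(&s, "123456") != 0)
        return 0;
    demo_end_txn(&s);
    ok &= (s.sessid == NULL);

    demo_end_txn(&s);   /* teardown calls the same path; free(NULL) is a no-op */
    return ok;
}
```

Having the teardown call the same function, as session_free() does with http_end_txn() in the patch, leaves a single place responsible for the field and removes the need for a separate NULL check there.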
[ANNOUNCE] haproxy 1.4-dev6 : many fixes
Hi all,

well, some of you have encountered issues with 1.4-dev5 with sessions left in CLOSE_WAIT state or with memory leaks. With the help of Cyril Bonté and Hank A. Paulson, who have sent a lot of feedback and tested almost all intermediate versions, we finally managed to nail all the problems down and to reach a working state. Of course that does not mean there is no bug left, but it's working reliably on two servers here (including haproxy.1wt.eu) without any of those issues.

Also, after some thinking I realized that the changes had affected the behaviour of "option httpclose" so that it mimicked "option forceclose" a bit, and that there was no reason for this. So the behaviour of "option httpclose" has been restored: it only ensures the headers are correct and relies on both ends to agree on when to close. "option forceclose" must be used if you want haproxy to follow the request/response and enforce the close at the right moment. This is important, because some issues that people have been facing with 1.4-dev5 should not have impacted them if this change had not been performed (though it helped to spot the bugs faster).

As usual, Krzysztof has sent a bunch of nice improvements for the checks and the stats interface, and a new very cool feature: default-server. This will make it possible to move almost all of the server settings to a central place so that you won't have to copy/paste them all for all servers.

So it was time to release 1.4-dev6 so that you have a better experience with it and can test the keep-alive mode without the fear of a degraded service. It's available at the usual place:

http://haproxy.1wt.eu/download/1.4/src/

I hope we won't have to run after the bugs as we did this week; it has been quite painful, but very useful. Thanks to all those who helped, and thanks in advance to all those who will test this one.

Willy