glibc double free or corruption with 1.5-dev20
Hi,

I've compiled 1.5-dev20 on debian wheezy and now I get a double free or corruption bug. Haproxy will not start.

*** glibc detected *** /usr/sbin/haproxy: double free or corruption (fasttop): 0x03c5a880 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x76d76)[0x6853e222fd76]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x6853e2234aac]
/usr/sbin/haproxy[0x466c36]
/usr/sbin/haproxy[0x467224]
/usr/sbin/haproxy[0x460ddd]
/usr/sbin/haproxy[0x46129e]
/usr/sbin/haproxy[0x418549]
/usr/sbin/haproxy[0x421472]
/usr/sbin/haproxy[0x407f2a]
/usr/sbin/haproxy[0x406639]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x6853e21d7ead]
/usr/sbin/haproxy[0x4071fd]
=== Memory map:
0040-00496000 r-xp 08:05 65203 /usr/sbin/haproxy
00695000-0069d000 rw-p 00095000 08:05 65203 /usr/sbin/haproxy
0069d000-006a9000 rw-p 00:00 0
006a9000-03b8e000 ---p 00:00 0
03b8e000-03c68000 rw-p 00:00 0 [heap]
6853dc00-6853dc021000 rw-p 00:00 0
6853dc021000-6853e000 ---p 00:00 0
6853e1568000-6853e157d000 r-xp 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6853e157d000-6853e177d000 ---p 00015000 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6853e177d000-6853e177e000 rw-p 00015000 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6853e177e000-6853e1789000 r-xp 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6853e1789000-6853e1988000 ---p b000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6853e1988000-6853e1989000 r--p a000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6853e1989000-6853e198a000 rw-p b000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6853e198a000-6853e1994000 r-xp 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6853e1994000-6853e1b93000 ---p a000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6853e1b93000-6853e1b94000 r--p 9000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6853e1b94000-6853e1b95000 rw-p a000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6853e1b95000-6853e1baa000 r-xp 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6853e1baa000-6853e1da9000 ---p 00015000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6853e1da9000-6853e1daa000 r--p 00014000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6853e1daa000-6853e1dab000 rw-p 00015000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6853e1dab000-6853e1dad000 rw-p 00:00 0
6853e1dad000-6853e1db4000 r-xp 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6853e1db4000-6853e1fb3000 ---p 7000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6853e1fb3000-6853e1fb4000 r--p 6000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6853e1fb4000-6853e1fb5000 rw-p 7000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6853e1fb5000-6853e1fb7000 r-xp 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6853e1fb7000-6853e21b7000 ---p 2000 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6853e21b7000-6853e21b8000 r--p 2000 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6853e21b8000-6853e21b9000 rw-p 3000 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6853e21b9000-6853e2339000 r-xp 08:02 211866 /lib/x86_64-linux-gnu/libc-2.13.so
6853e2339000-6853e2539000 ---p 0018 08:02 211866 /lib/x86_64-linux-gnu/libc-2.13.so
6853e2539000-6853e253d000 r--p 0018 08:02 211866 /lib/x86_64-linux-gnu/libc-2.13.so
6853e253d000-6853e253e000 rw-p 00184000 08:02 211866 /lib/x86_64-linux-gnu/libc-2.13.so
6853e253e000-6853e2543000 rw-p 00:00 0
6853e2543000-6853e257f000 r-xp 08:02 211948 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
6853e257f000-6853e277f000 ---p 0003c000 08:02 211948 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
6853e277f000-6853e278 rw-p 0003c000 08:02 211948 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
6853e278-6853e2782000 r-xp 08:05 978315 /usr/lib/x86_64-linux-gnu/libpcreposix.so.3.13.1
6853e2782000-6853e2981000 ---p 2000 08:05 978315 /usr/lib/x86_64-linux-gnu/libpcreposix.so.3.13.1
6853e2981000-6853e2982000 rw-p 1000 08:05 978315
Re: glibc double free or corruption with 1.5-dev20
Hi Sander,

On Mon, Dec 16, 2013 at 09:43:07AM +0100, Sander Klein wrote:
> Hi, I've compiled 1.5-dev20 on debian wheezy and now I get a double free or
> corruption bug. Haproxy will not start.

Interesting, I never experienced this one. Could you please run it through gdb and issue bt full? Otherwise if you can send me privately the config you use to reproduce this, without sensitive information, it would be great!

Thanks,
Willy
Re: glibc double free or corruption with 1.5-dev20
On , Willy Tarreau wrote:
> Interesting, I never experienced this one. Could you please run it through
> gdb and issue bt full? Otherwise if you can send me privately the config
> you use to reproduce this, without sensitive information, it would be great!

Hmmm, I think something is not right here. I do have debugging symbols in the binary but I get nothing AFAICS. Am I doing something wrong here? Or is the SIGABRT the problem?

I'll send you my config.

GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details.
This GDB was configured as x86_64-linux-gnu.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/sander/haproxy...done.
(gdb) run -f /etc/haproxy/haproxy.cfg -D
Starting program: /home/sander/haproxy -f /etc/haproxy/haproxy.cfg -D
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x6d43ce93c000
*** glibc detected *** /home/sander/haproxy: double free or corruption (fasttop): 0x00fe3b90 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x76d76)[0x6d43cd53bd76]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x6d43cd540aac]
/home/sander/haproxy[0x466c36]
/home/sander/haproxy[0x467224]
/home/sander/haproxy[0x460ddd]
/home/sander/haproxy[0x46129e]
/home/sander/haproxy[0x418549]
/home/sander/haproxy[0x421472]
/home/sander/haproxy[0x407f2a]
/home/sander/haproxy[0x406639]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x6d43cd4e3ead]
/home/sander/haproxy[0x4071fd]
=== Memory map:
0040-00496000 r-xp 08:08 6635592 /home/sander/haproxy
00695000-0069d000 rw-p 00095000 08:08 6635592 /home/sander/haproxy
0069d000-006a9000 rw-p 00:00 0
006a9000-00f18000 ---p 00:00 0
00f18000-01002000 rw-p 00:00 0 [heap]
6d43c800-6d43c8021000 rw-p 00:00 0
6d43c8021000-6d43cc00 ---p 00:00 0
6d43cc874000-6d43cc889000 r-xp 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6d43cc889000-6d43cca89000 ---p 00015000 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6d43cca89000-6d43cca8a000 rw-p 00015000 08:02 211757 /lib/x86_64-linux-gnu/libgcc_s.so.1
6d43cca8a000-6d43cca95000 r-xp 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6d43cca95000-6d43ccc94000 ---p b000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6d43ccc94000-6d43ccc95000 r--p a000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6d43ccc95000-6d43ccc96000 rw-p b000 08:02 211810 /lib/x86_64-linux-gnu/libnss_files-2.13.so
6d43ccc96000-6d43ccca r-xp 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6d43ccca-6d43cce9f000 ---p a000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6d43cce9f000-6d43ccea r--p 9000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6d43ccea-6d43ccea1000 rw-p a000 08:02 211924 /lib/x86_64-linux-gnu/libnss_nis-2.13.so
6d43ccea1000-6d43cceb6000 r-xp 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6d43cceb6000-6d43cd0b5000 ---p 00015000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6d43cd0b5000-6d43cd0b6000 r--p 00014000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6d43cd0b6000-6d43cd0b7000 rw-p 00015000 08:02 211919 /lib/x86_64-linux-gnu/libnsl-2.13.so
6d43cd0b7000-6d43cd0b9000 rw-p 00:00 0
6d43cd0b9000-6d43cd0c r-xp 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6d43cd0c-6d43cd2bf000 ---p 7000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6d43cd2bf000-6d43cd2c r--p 6000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6d43cd2c-6d43cd2c1000 rw-p 7000 08:02 211824 /lib/x86_64-linux-gnu/libnss_compat-2.13.so
6d43cd2c1000-6d43cd2c3000 r-xp 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6d43cd2c3000-6d43cd4c3000 ---p 2000 08:02 211807 /lib/x86_64-linux-gnu/libdl-2.13.so
6d43cd4c3000-6d43cd4c4000 r--p 2000 08:02 211807
Re: glibc double free or corruption with 1.5-dev20
On Mon, Dec 16, 2013 at 10:19:42AM +0100, Sander Klein wrote:
> Hmmm, I think something is not right here. I do have debugging symbols in
> the binary but I get nothing AFAICS. Am I doing something wrong here? Or is
> the SIGABRT the problem?

Yes I think that's because the process died.

> I'll send you my config.

OK thank you!

Willy
Re: glibc double free or corruption with 1.5-dev20
OK here's the fix, it was not a big deal, just a missing NULL after a free when loading patterns from a file. Thank you for your quick help Sander!

Willy

From 6762a3061ac0d1d8c8860a2191c602a3c526205c Mon Sep 17 00:00:00 2001
From: Willy Tarreau w...@1wt.eu
Date: Mon, 16 Dec 2013 10:40:28 +0100
Subject: BUG/MAJOR: patterns: fix double free caused by loading strings from files

A null pointer assignment was missing after a free in commit 7148ce6 (MEDIUM: pattern: Extract the index process from the pat_parse_*() functions), causing a double free after loading a file of string patterns.

This bug was introduced in 1.5-dev20, no backport is needed.

Thanks to Sander Klein for reporting this bug and providing the config needed to trigger it.
---
 src/pattern.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/pattern.c b/src/pattern.c
index ce60f76..8380c63 100644
--- a/src/pattern.c
+++ b/src/pattern.c
@@ -882,6 +882,7 @@ int pattern_register(struct pattern_expr *expr, const char **args,

 		/* the map_parser_str() function always duplicate string information */
 		free((*pattern)->ptr.str);
+		(*pattern)->ptr.str = NULL;

 		/* we pre-set the data pointer to the tree's head so that functions
 		 * which are able to insert in a tree know where to do that.
--
1.7.12.2.21.g234cd45.dirty
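
For readers unfamiliar with this bug class, here is a minimal sketch of why resetting the pointer after free() prevents a double free. It is not HAProxy code: struct demo_pattern and both helper functions are hypothetical names invented for illustration, and only the free-then-NULL idiom itself is taken from the patch above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pattern that owns a heap-allocated string. */
struct demo_pattern {
	char *str;
};

/* Release the owned string and reset the pointer.
 * free(NULL) is defined to do nothing, so calling this twice is safe
 * only because of the NULL assignment below.
 */
static void demo_pattern_release_str(struct demo_pattern *pat)
{
	free(pat->str);
	pat->str = NULL; /* the kind of assignment the patch adds */
}

/* Store a private copy of src in the pattern.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int demo_pattern_set_str(struct demo_pattern *pat, const char *src)
{
	size_t len = strlen(src) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, src, len);
	demo_pattern_release_str(pat); /* drop any previous string first */
	pat->str = copy;
	return 0;
}
```

Without the NULL assignment, a second call to demo_pattern_release_str() on the same pattern would hand an already-freed address to free() again, which is the situation glibc reports as "double free or corruption".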
Re: glibc double free or corruption with 1.5-dev20
On , Willy Tarreau wrote:
> OK here's the fix, it was not a big deal, just a missing NULL after a free
> when loading patterns from a file. Thank you for your quick help Sander!

Something is fishy. I've compiled a new version with your patch, haproxy starts but it 'just doesn't work (tm)'. I know this is a useless vague description but it is the best I have right now. I try and have a look later to see why web pages do not load with this new haproxy version.

Greets,
Sander
Re: glibc double free or corruption with 1.5-dev20
On , Sander Klein wrote:
> Something is fishy. I've compiled a new version with your patch, haproxy
> starts but it 'just doesn't work (tm)'. I know this is a useless vague
> description but it is the best I have right now. I try and have a look later
> to see why web pages do not load with this new haproxy version.

Replying to myself a bit. All connections seem to get status CQ. Haproxy 1.5-ss-20131105 doesn't have this problem.

Again, I'll try and see if I can get a better description later today or tomorrow.

Greets,
Sander
Re: glibc double free or corruption with 1.5-dev20
On Mon, Dec 16, 2013 at 01:10:11PM +0100, Sander Klein wrote:
> Something is fishy. I've compiled a new version with your patch, haproxy
> starts but it 'just doesn't work (tm)'. I know this is a useless vague
> description but it is the best I have right now. I try and have a look later
> to see why web pages do not load with this new haproxy version.

You should check logs to see if you think the traffic follows the correct backends. We could indeed imagine an ACL match issue related to the thing I just fixed.

Thanks,
Willy
Re: glibc double free or corruption with 1.5-dev20
On , Willy Tarreau wrote:
> You should check logs to see if you think the traffic follows the correct
> backends. We could indeed imagine an ACL match issue related to the thing I
> just fixed.

I see that the correct backends are selected. It looks like this.

Dec 16 13:05:45 localhost haproxy[28322]: x.x.x.x:49389 [16/Dec/2013:13:05:40.833] cluster1-in cluster1-53/web008 8/4314/-1/-1/4322 503 1995 - - CQVN 552/401/211/6/0 68/0 {some.site.com|Mozilla/5.0 (Win||http://some.site.com/url/goes/here/24?q_searchfield=something} {} GET /url/goes/here/36?q_searchfield=something HTTP/1.1

Greets,
Sander
Re: glibc double free or corruption with 1.5-dev20
On Mon, Dec 16, 2013 at 02:19:28PM +0100, Sander Klein wrote:
> I see that the correct backends are selected. It looks like this.
>
> Dec 16 13:05:45 localhost haproxy[28322]: x.x.x.x:49389 [16/Dec/2013:13:05:40.833] cluster1-in cluster1-53/web008 8/4314/-1/-1/4322 503 1995 - - CQVN 552/401/211/6/0 68/0 {some.site.com|Mozilla/5.0 (Win||http://some.site.com/url/goes/here/24?q_searchfield=something} {} GET /url/goes/here/36?q_searchfield=something HTTP/1.1

It indicates the visitor aborts while waiting in the queue, so typically a click on the STOP button while waiting. There are 68 other requests in the backend's queue, 211 connections on the backend and 6 on the server.

In your config, I'm seeing a minconn 100 on the server, so the server is not full. The slowstart could possibly limit the accepted concurrency however. I'll have to see if something changed with slowstart (I'm not aware of any change there).

Regards,
Willy
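
As an illustration of how the log fields discussed above can be read programmatically, here is a small sketch, not part of HAProxy. The struct and function names are hypothetical. The field order actconn/feconn/beconn/srv_conn/retries and the meaning of the first two termination-state characters are assumed from the HAProxy HTTP log format documentation, so please check them against the manual for your version.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the five slash-separated connection counters. */
struct demo_conn_counts {
	int actconn;  /* connections on the whole process */
	int feconn;   /* connections on the frontend */
	int beconn;   /* connections on the backend */
	int srv_conn; /* connections on the server */
	int retries;  /* connection retries */
};

/* Parse a field such as "552/401/211/6/0".
 * Returns 0 on success, -1 if the field does not hold five integers.
 */
static int demo_parse_conn_counts(const char *field, struct demo_conn_counts *out)
{
	int n = sscanf(field, "%d/%d/%d/%d/%d",
	               &out->actconn, &out->feconn, &out->beconn,
	               &out->srv_conn, &out->retries);

	return n == 5 ? 0 : -1;
}

/* Describe only the "CQ" case seen in this thread; every other state is
 * deliberately left undecoded in this sketch.
 */
static const char *demo_describe_termination(const char *state)
{
	if (strlen(state) < 2)
		return "unknown";
	if (state[0] == 'C' && state[1] == 'Q')
		return "client aborted while the request was waiting in the queue";
	return "other";
}
```

Applied to the log line above, the counters field yields 211 for the backend and 6 for the server, which matches the figures quoted in the reply.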
Re: glibc double free or corruption with 1.5-dev20
On , Willy Tarreau wrote:
> In your config, I'm seeing a minconn 100 on the server, so the server is not
> full. The slowstart could possibly limit the accepted concurrency however.
> I'll have to see if something changed with slowstart (I'm not aware of any
> change there).

Hmmm, dev20 does a slowstart when haproxy starts. Dev19 (and before) doesn't do that. It even does a slowstart when I reload the config file. That doesn't seem right to me.

Greets,
Sander
Re: glibc double free or corruption with 1.5-dev20
On Mon, Dec 16, 2013 at 04:46:55PM +0100, Sander Klein wrote:
> Hmmm, dev20 does a slowstart when haproxy starts. Dev19 (and before) doesn't
> do that. It even does a slowstart when I reload the config file. That
> doesn't seem right to me.

But you're perfectly right! I introduced this regression when fixing another issue with the agent checks (they prevented one from starting a server after boot). And I was happy to see the slowstart ramp up... I didn't realize it was not normal during boot :-/

I now fixed it and pushed it after having tested on your config that it's OK now. I'm attaching the fix for your convenience.

Thanks for reporting this!
Willy

From 02541e8be22c212493ceeb88a54ccd32190db36d Mon Sep 17 00:00:00 2001
From: Willy Tarreau w...@1wt.eu
Date: Mon, 16 Dec 2013 18:08:36 +0100
Subject: BUG/MEDIUM: checks: servers must not start in slowstart mode

In 1.5-dev20, commit bb9665e (BUG/MEDIUM: checks: ensure we can enable a server after boot) tried to fix a side effect of having both regular checks and agent checks condition the up state propagation to servers.

Unfortunately it was still not fine because after this fix, servers which make use of slowstart start in this mode. We must not check the agent's health if agent checks are not enabled, and likewise, we must not check the regular check's health if they are not enabled.

Reading the code, it seems like we could avoid entering this function at all if (s->state & SRV_RUNNING) is not satisfied. Let's reserve this for a later patch if needed.

Thanks to Sander Klein for reporting this abnormal situation.
---
 src/checks.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index 8014a66..115cc85 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -480,8 +480,10 @@ void set_server_up(struct check *check) {
 	}

 	if (s->track ||
-	    (s->check.health == s->check.rise && (s->agent.health >= s->agent.rise || !(s->agent.state & CHK_ST_ENABLED))) ||
-	    (s->agent.health == s->agent.rise && (s->check.health >= s->check.rise || !(s->check.state & CHK_ST_ENABLED)))) {
+	    ((s->check.state & CHK_ST_ENABLED) && (s->check.health == s->check.rise) &&
+	     (s->agent.health >= s->agent.rise || !(s->agent.state & CHK_ST_ENABLED))) ||
+	    ((s->agent.state & CHK_ST_ENABLED) && (s->agent.health == s->agent.rise) &&
+	     (s->check.health >= s->check.rise || !(s->check.state & CHK_ST_ENABLED)))) {
 		if (s->proxy->srv_bck == 0 && s->proxy->srv_act == 0) {
 			if (s->proxy->last_change < now.tv_sec)		// ignore negative times
 				s->proxy->down_time += now.tv_sec - s->proxy->last_change;
--
1.7.12.1
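
To make the effect of the changed condition easier to follow, here is a stand-alone sketch that evaluates the condition before and after the patch. It is an illustration only: struct demo_check, the DEMO_CHK_ST_ENABLED value and both function names are hypothetical, the s->track term is left out, and the health/rise numbers suggested below are made up rather than taken from the thread.

```c
#include <stdbool.h>

#define DEMO_CHK_ST_ENABLED 0x0001 /* hypothetical flag value */

/* Hypothetical reduction of a check to the three fields the condition uses. */
struct demo_check {
	unsigned int state;
	int health;
	int rise;
};

/* Condition before the patch: a disabled check whose health happens to
 * equal its rise value can still satisfy one of the two clauses.
 */
static bool demo_up_condition_before(const struct demo_check *check,
                                     const struct demo_check *agent)
{
	return (check->health == check->rise &&
	        (agent->health >= agent->rise || !(agent->state & DEMO_CHK_ST_ENABLED))) ||
	       (agent->health == agent->rise &&
	        (check->health >= check->rise || !(check->state & DEMO_CHK_ST_ENABLED)));
}

/* Condition after the patch: each clause first requires that the check
 * whose health is compared against rise is actually enabled.
 */
static bool demo_up_condition_after(const struct demo_check *check,
                                    const struct demo_check *agent)
{
	return ((check->state & DEMO_CHK_ST_ENABLED) && check->health == check->rise &&
	        (agent->health >= agent->rise || !(agent->state & DEMO_CHK_ST_ENABLED))) ||
	       ((agent->state & DEMO_CHK_ST_ENABLED) && agent->health == agent->rise &&
	        (check->health >= check->rise || !(check->state & DEMO_CHK_ST_ENABLED)));
}
```

With a regular check that is enabled and already above its rise threshold (say health 3, rise 2) and an agent check that is disabled with health equal to rise (say 1 and 1), the first function returns true through the agent clause while the second returns false. That is the kind of spurious "just came up" transition the commit message describes.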