[PATCH] fine guard for ssl random extraction functions
Hello, yet another patch that removes several occurrences of HA_OPENSSL_VERSION also, fetches enabled for BoringSSL and LibreSSL-2.7.0 and higher Ilya From dcdfb25d51e44bf84175514a4a6b786b9c15e20e Mon Sep 17 00:00:00 2001 From: Ilya Shipitsin Date: Thu, 25 Mar 2021 00:41:41 +0500 Subject: [PATCH] BUILD: ssl: introduce fine guard for ssl random extraction functions SSL_get_{client,server}_random are supported in OpenSSL-1.1.0, BoringSSL, LibreSSL-2.7.0 let us introduce HAVE_SSL_EXTRACT_RANDOM for that purpose --- include/haproxy/openssl-compat.h | 4 src/ssl_sample.c | 6 +++--- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/include/haproxy/openssl-compat.h b/include/haproxy/openssl-compat.h index 396810a0a..d26deccc6 100644 --- a/include/haproxy/openssl-compat.h +++ b/include/haproxy/openssl-compat.h @@ -41,6 +41,10 @@ #define OpenSSL_version_num SSLeay #endif +#if (LIBRESSL_VERSION_NUMBER >= 0x2070100fL) || defined(OPENSSL_IS_BORINGSSL) || (!defined(LIBRESSL_VERSION_NUMBER) && (OPENSSL_VERSION_NUMBER >= 0x1010L)) +#define HAVE_SSL_EXTRACT_RANDOM +#endif + #if ((OPENSSL_VERSION_NUMBER >= 0x10101000L) && !defined(OPENSSL_IS_BORINGSSL) && !defined(LIBRESSL_VERSION_NUMBER)) #define HAVE_SSL_RAND_KEEP_RANDOM_DEVICES_OPEN #endif diff --git a/src/ssl_sample.c b/src/ssl_sample.c index e2479f501..4c7d9aa9d 100644 --- a/src/ssl_sample.c +++ b/src/ssl_sample.c @@ -1029,7 +1029,7 @@ smp_fetch_ssl_fc_session_id(const struct arg *args, struct sample *smp, const ch #endif -#if HA_OPENSSL_VERSION_NUMBER >= 0x1010L +#ifdef HAVE_SSL_EXTRACT_RANDOM static int smp_fetch_ssl_fc_random(const struct arg *args, struct sample *smp, const char *kw, void *private) { @@ -1462,7 +1462,7 @@ static struct sample_fetch_kw_list sample_fetch_keywords = {ILH, { #if HA_OPENSSL_VERSION_NUMBER > 0x0090800fL { "ssl_bc_session_id", smp_fetch_ssl_fc_session_id, 0, NULL,SMP_T_BIN, SMP_USE_L5SRV }, #endif -#if HA_OPENSSL_VERSION_NUMBER >= 0x1010L +#ifdef HAVE_SSL_EXTRACT_RANDOM { "ssl_bc_client_random", smp_fetch_ssl_fc_random, 0, NULL,SMP_T_BIN, SMP_USE_L5SRV }, { "ssl_bc_server_random", smp_fetch_ssl_fc_random, 0, NULL,SMP_T_BIN, SMP_USE_L5SRV }, { "ssl_bc_session_key", smp_fetch_ssl_fc_session_key, 0, NULL,SMP_T_BIN, SMP_USE_L5SRV }, @@ -1514,7 +1514,7 @@ static struct sample_fetch_kw_list sample_fetch_keywords = {ILH, { #if HA_OPENSSL_VERSION_NUMBER > 0x0090800fL { "ssl_fc_session_id", smp_fetch_ssl_fc_session_id, 0, NULL,SMP_T_BIN, SMP_USE_L5CLI }, #endif -#if HA_OPENSSL_VERSION_NUMBER >= 0x1010L +#ifdef HAVE_SSL_EXTRACT_RANDOM { "ssl_fc_client_random", smp_fetch_ssl_fc_random, 0, NULL,SMP_T_BIN, SMP_USE_L5CLI }, { "ssl_fc_server_random", smp_fetch_ssl_fc_random, 0, NULL,SMP_T_BIN, SMP_USE_L5CLI }, { "ssl_fc_session_key", smp_fetch_ssl_fc_session_key, 0, NULL,SMP_T_BIN, SMP_USE_L5CLI }, -- 2.30.2
Re: [2.2.9] 100% CPU usage
śr., 24 mar 2021 o 10:37 Christopher Faulet napisał(a): > However, reading the other trace Maciej sent (bussy_thread_peers.txt), it > seems > possible to stop a memory allocation from other places. Thus, I guess we > must > find a more global way to prevent the lua stack dump. > I'm not sure which part of trace you're referring to but I need to clarify that both "bussy_thread_peers.txt" and "free_thread_spoe_lua.txt" occurred at the same time but on different threads (free_thread_spoe_lua on thread 10 and bussy_thread_peers on other threads). So If I understand it correctly thread 10 locked itself and other threads looping to acquire lock.
Re: [2.2.9] 100% CPU usage
Le 24/03/2021 à 10:16, Willy Tarreau a écrit : On Wed, Mar 24, 2021 at 10:11:19AM +0100, Maciej Zdeb wrote: Wow, that's it! :) 0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 0x55d94949e96e <+62>: callq 0x55d9494782c0 0x55d94949e973 <+67>: subl $0x1,%fs:0xfffdd688 [...] 0x55d94949e99f <+111>: ja 0x55d94949e9b0 0x55d94949e9a1 <+113>: mov%rbx,%rdi 0x55d94949e9a4 <+116>: callq 0x55d949477d50 0x55d94949e9a9 <+121>: test %rax,%rax [...] 0x55d94949e9c1 <+145>: addl $0x1,%fs:0xfffdd688 0x55d94949e9ca <+154>: callq 0x55d949477b50 0x55d94949e9cf <+159>: subl $0x1,%fs:0xfffdd688 Cool! Ok I'll make hlua_not_dumpable volatile instead of applying compiler barriers. Yes, it's safer for the long term as nobody will have to think about it anymore. We'll have to integrate this one as well. Thanks Guys, I will push a fix to make the variable volatile. However, reading the other trace Maciej sent (bussy_thread_peers.txt), it seems possible to stop a memory allocation from other places. Thus, I guess we must find a more global way to prevent the lua stack dump. -- Christopher Faulet
Re: [2.2.9] 100% CPU usage
On Wed, Mar 24, 2021 at 10:11:19AM +0100, Maciej Zdeb wrote: > Wow, that's it! :) > >0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 >0x55d94949e96e <+62>: callq 0x55d9494782c0 >0x55d94949e973 <+67>: subl $0x1,%fs:0xfffdd688 > [...] >0x55d94949e99f <+111>: ja 0x55d94949e9b0 >0x55d94949e9a1 <+113>: mov%rbx,%rdi >0x55d94949e9a4 <+116>: callq 0x55d949477d50 >0x55d94949e9a9 <+121>: test %rax,%rax > [...] >0x55d94949e9c1 <+145>: addl $0x1,%fs:0xfffdd688 >0x55d94949e9ca <+154>: callq 0x55d949477b50 >0x55d94949e9cf <+159>: subl $0x1,%fs:0xfffdd688 > Cool! > Ok I'll make hlua_not_dumpable volatile instead of applying compiler > barriers. Yes, it's safer for the long term as nobody will have to think about it anymore. We'll have to integrate this one as well. I'm wondering how many programs do *really* benefit from having gcc have a builtin malloc that bypasses the library-provided allocator, compared to all the ones experiencing pain and breakage, especially when using a non-legacy allocator or a debug-enabled one :-/ Willy
Re: [2.2.9] 100% CPU usage
Wow, that's it! :) 0x55d94949e965 <+53>: addl $0x1,%fs:0xfffdd688 0x55d94949e96e <+62>: callq 0x55d9494782c0 0x55d94949e973 <+67>: subl $0x1,%fs:0xfffdd688 [...] 0x55d94949e99f <+111>: ja 0x55d94949e9b0 0x55d94949e9a1 <+113>: mov%rbx,%rdi 0x55d94949e9a4 <+116>: callq 0x55d949477d50 0x55d94949e9a9 <+121>: test %rax,%rax [...] 0x55d94949e9c1 <+145>: addl $0x1,%fs:0xfffdd688 0x55d94949e9ca <+154>: callq 0x55d949477b50 0x55d94949e9cf <+159>: subl $0x1,%fs:0xfffdd688 Ok I'll make hlua_not_dumpable volatile instead of applying compiler barriers. śr., 24 mar 2021 o 10:00 Willy Tarreau napisał(a): > On Wed, Mar 24, 2021 at 09:52:22AM +0100, Willy Tarreau wrote: > > So yes, it looks like gcc decides that a function called "malloc" will > > never use your program's global variables but that "blablalloc" may. I > > have no explanation to this except "optimization craziness" resulting > > in breaking valid code, since it means that if I provide my own malloc > > function it might not work. Pfff > > And that's exactly it! > > https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins > > By default a number of functions including malloc, strchr etc are mapped > to builtins so it figures it knows better how these work and how to > optimize > around. Once built with -fno-builtin-malloc no more probmlem. So the > volatile > is the way to go to respect the risk of a signal hitting it. > > I found an old report of gcc-4.5 breaking chromium and tcmalloc because of > this, which put me on the track: > > https://github.com/gperftools/gperftools/issues/239 > > Willy >
Re: [2.2.9] 100% CPU usage
On Wed, Mar 24, 2021 at 09:52:22AM +0100, Willy Tarreau wrote: > So yes, it looks like gcc decides that a function called "malloc" will > never use your program's global variables but that "blablalloc" may. I > have no explanation to this except "optimization craziness" resulting > in breaking valid code, since it means that if I provide my own malloc > function it might not work. Pfff And that's exactly it! https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins By default a number of functions including malloc, strchr etc are mapped to builtins so it figures it knows better how these work and how to optimize around. Once built with -fno-builtin-malloc no more probmlem. So the volatile is the way to go to respect the risk of a signal hitting it. I found an old report of gcc-4.5 breaking chromium and tcmalloc because of this, which put me on the track: https://github.com/gperftools/gperftools/issues/239 Willy
Re: Fwd: [PATCH] cleanup unused definitions
On Wed, Mar 24, 2021 at 08:09:05AM +0100, Willy Tarreau wrote: > William, > > could you please have a quick look at these ? I didn't touch them because > I don't know if these are things you intend to use in the short-to-mid > term. If you take them, please be careful to fix the subject to mention > "ssl" in them. > > Thanks, > Willy > > - Forwarded message from ??? - > > > Date: Wed, 24 Mar 2021 11:29:03 +0500 > > From: ??? > > Subject: Re: [PATCH] cleanup unused definitions > > To: HAProxy , Willy Tarreau > > Delivered-To: haproxy@formilux.org > > List-Id: Haproxy > > > > ping > > > > ??, 20 ???. 2021 ?. ? 22:43, ??? : > > > > > while refactoring HA_OPENSSL_VERSION usage, > > > I've found unused definitions. nice. > > > > > > > > > Ilya > > > > > - End forwarded message - > - Forwarded message from ??? - > > > Date: Wed, 24 Mar 2021 11:29:19 +0500 > > From: ??? > > Subject: Re: [PATCH] BUILD: ssl: use feature guard instead of openssl > > version for ecdh functions > > To: HAProxy , Willy Tarreau > > Delivered-To: haproxy@formilux.org > > List-Id: Haproxy > > > > ping > > > > ??, 21 ???. 2021 ?. ? 13:02, ??? : > > > > > Hello, > > > > > > yet another patch that reduces number of HA_OPENSSL_VERSION use > > > > > > Ilya > > > > > > > > > > > - End forwarded message - > Thanks, both merged. -- William Lallemand
Re: [PATCH] BUILD: ssl: use feature guard instead of openssl version for ecdh functions
On Wed, Mar 24, 2021 at 11:29:19AM +0500, Илья Шипицин wrote: > ping > > вс, 21 мар. 2021 г. в 13:02, Илья Шипицин : > > > Hello, > > > > yet another patch that reduces number of HA_OPENSSL_VERSION use > > > > Ilya > > > > > > Thanks, merged. -- William Lallemand
Re: [PATCH] cleanup unused definitions
On Wed, Mar 24, 2021 at 11:29:03AM +0500, Илья Шипицин wrote: > ping > > сб, 20 мар. 2021 г. в 22:43, Илья Шипицин : > > > while refactoring HA_OPENSSL_VERSION usage, > > I've found unused definitions. nice. > > > > > > Ilya > > Thanks, merged. -- William Lallemand
Re: [2.2.9] 100% CPU usage
On Wed, Mar 24, 2021 at 09:41:03AM +0100, Willy Tarreau wrote: > This is particularly strange. Could you please disassemble hlua_alloc ? > (dis hlua_alloc) ? > > You should find something like this: > >0x004476c3 <+147>: addDWORD PTR fs:0xfffdd678,0x1 >0x004476cc <+156>: lock add QWORD PTR [rip+0x25df8b],0x1 > # 0x6a5660 <_.44806> >0x004476d5 <+165>: lock add QWORD PTR [rip+0x25df8b],rbx > # 0x6a5668 <_.44806+8> >0x004476dd <+173>: movrdi,rbx >0x004476e0 <+176>: call 0x4248b0 >0x004476e5 <+181>: subDWORD PTR fs:0xfffdd678,0x1 >0x004476ee <+190>: test rax,rax >0x004476f1 <+193>: jne0x44769d > > The "add DWORD..." is the ++, and the "sub DWORD..." is the --, both of > them around the function call. > > If the variable had been declared static I would have agreed with the > possibility of reordering. But here it's global and it's not legal to > move it when you don't know if the called function will make use of it. > Otherwise global variables cannot be used anymore! > > By the way, in your case above it should even equal 2. That's rather > strange. > > One thing you can test is simply to declare it volatile. If it changes > anything, it will mean that gcc has some specific assumptions on this > malloc function (i.e. it decides it will never read your global variables) :-/ And this seems to be the case :-( I was wondering where the two lock add instructions were coming from and figure that I was building with DEBUG_MEM_STATS. Removing it resulted in the add/sub being remove, but only around malloc, not around realloc nor free. Malloc is declared with __attribute__((malloc)) on my system so I wondered if it could cause this. I tried adding this in the function: extern void *malloc (size_t __size); And it didn't change the output code. I just *renamed* the function to blablalloc() and now the code is correct again: 22cb: 64 83 04 25 00 00 00addl $0x1,%fs:0x0 22d2: 00 01 22cf: R_X86_64_TPOFF32 hlua_not_dumpable 22d4: 48 89 dfmov%rbx,%rdi 22d7: e8 00 00 00 00 callq 22dc 22d8: R_X86_64_PLT32blablalloc-0x4 22dc: 64 83 2c 25 00 00 00subl $0x1,%fs:0x0 22e3: 00 01 22e0: R_X86_64_TPOFF32 hlua_not_dumpable 22e5: 48 85 c0test %rax,%rax 22e8: 75 be jne22a8 So yes, it looks like gcc decides that a function called "malloc" will never use your program's global variables but that "blablalloc" may. I have no explanation to this except "optimization craziness" resulting in breaking valid code, since it means that if I provide my own malloc function it might not work. Pfff At least the volatile solves it and is cleaner since it indicates it should assume it doesn't know it (e.g. for use from a signal handler which is what we're doing by the way): diff --git a/include/haproxy/hlua.h b/include/haproxy/hlua.h index 29656258e..46fbba6ec 100644 --- a/include/haproxy/hlua.h +++ b/include/haproxy/hlua.h @@ -50,7 +50,7 @@ void hlua_applet_tcp_fct(struct appctx *ctx); void hlua_applet_http_fct(struct appctx *ctx); struct task *hlua_process_task(struct task *task, void *context, unsigned short state); -extern THREAD_LOCAL unsigned int hlua_not_dumpable; +extern THREAD_LOCAL volatile unsigned int hlua_not_dumpable; #else /* USE_LUA */ / For use when Lua is disabled / diff --git a/src/hlua.c b/src/hlua.c index 5bef67a48..6ccce72dd 100644 --- a/src/hlua.c +++ b/src/hlua.c @@ -237,7 +237,7 @@ struct hlua_mem_allocator { static struct hlua_mem_allocator hlua_global_allocator; /* > 0 if lua is in a non-rentrant part, thus with a non-dumpable stack */ -THREAD_LOCAL unsigned int hlua_not_dumpable = 0; +THREAD_LOCAL unsigned int volatile hlua_not_dumpable = 0; /* These functions converts types between HAProxy internal args or * sample and LUA types. Another function permits to check if the Willy
Re: [2.2.9] 100% CPU usage
On Wed, Mar 24, 2021 at 08:55:33AM +0100, Maciej Zdeb wrote: > After reading I wasn't sure anymore I even tested properly patched package. > :) Hehe, I know that this happens quite a lot when starting to play with different binaries. > Fortunately I have a core file so I verified if hlua_not_dumpable > variable exists. I don't know how it is possible that hlua_not_dumpable is > 0 in this state (bt), if Christopher theory about lacking compiler barriers > is wrong. > > (gdb) p hlua_not_dumpable > $1 = 0 > > (gdb) bt > #0 __lll_lock_wait_private () at > ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95 > #1 0x7f52fb16c34b in __GI___libc_malloc (bytes=bytes@entry=41) at > malloc.c:3063 > #2 0x55d94949e9a9 in hlua_alloc (ud=0x55d9498e90d0 > , ptr=, osize=4, nsize=41) at > src/hlua.c:8210 > #3 0x55d949622271 in luaM_realloc_ () > #4 0x55d949621dbe in luaC_newobj () > #5 0x55d949626f4d in internshrstr () > #6 0x55d9496271fc in luaS_new () > #7 0x55d94961bd03 in lua_pushstring () > #8 0x55d94962d48a in luaL_traceback () > #9 0x55d9495a1b62 in ha_task_dump (buf=buf@entry=0x7f5247c7f628, > task=0x7f5234057ba0, pfx=pfx@entry=0x55d949672f19 ' ' ) > at src/debug.c:227 > #10 0x55d9495a1fd7 in ha_thread_dump (buf=0x7f5247c7f628, > thr=, calling_tid=7) at src/debug.c:91 > #11 0x55d9495a2236 in debug_handler (sig=, si= out>, arg=) at src/debug.c:847 > #12 > #13 _int_malloc (av=av@entry=0x7f523420, bytes=bytes@entry=56) at > malloc.c:4100 > #14 0x7f52fb16c35d in __GI___libc_malloc (bytes=bytes@entry=56) at > malloc.c:3065 > #15 0x55d94949e9a9 in hlua_alloc (ud=0x55d9498e90d0 > , ptr=, osize=5, nsize=56) at > src/hlua.c:8210 This is particularly strange. Could you please disassemble hlua_alloc ? (dis hlua_alloc) ? You should find something like this: 0x004476c3 <+147>: addDWORD PTR fs:0xfffdd678,0x1 0x004476cc <+156>: lock add QWORD PTR [rip+0x25df8b],0x1# 0x6a5660 <_.44806> 0x004476d5 <+165>: lock add QWORD PTR [rip+0x25df8b],rbx# 0x6a5668 <_.44806+8> 0x004476dd <+173>: movrdi,rbx 0x004476e0 <+176>: call 0x4248b0 0x004476e5 <+181>: subDWORD PTR fs:0xfffdd678,0x1 0x004476ee <+190>: test rax,rax 0x004476f1 <+193>: jne0x44769d The "add DWORD..." is the ++, and the "sub DWORD..." is the --, both of them around the function call. If the variable had been declared static I would have agreed with the possibility of reordering. But here it's global and it's not legal to move it when you don't know if the called function will make use of it. Otherwise global variables cannot be used anymore! By the way, in your case above it should even equal 2. That's rather strange. One thing you can test is simply to declare it volatile. If it changes anything, it will mean that gcc has some specific assumptions on this malloc function (i.e. it decides it will never read your global variables) :-/ Willy
Re: [2.2.9] 100% CPU usage
Hi, wt., 23 mar 2021 o 18:36 Willy Tarreau napisał(a): > > It is most probably because of compiler optimizations. Some compiler > > barriers are necessary to avoid instructions reordering. It is the > purpose > > of attached patches. Sorry to ask you it again, but could you make some > > tests ? > Sure, I'll verify! I don't believe in it at all. free(), malloc() etc are free to manipulate > global variables so the compiler cannot reorder these operations around > them. We're probably facing other corner cases (or the same but not > totally addressed maybe). > After reading I wasn't sure anymore I even tested properly patched package. :) Fortunately I have a core file so I verified if hlua_not_dumpable variable exists. I don't know how it is possible that hlua_not_dumpable is 0 in this state (bt), if Christopher theory about lacking compiler barriers is wrong. (gdb) p hlua_not_dumpable $1 = 0 (gdb) bt #0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95 #1 0x7f52fb16c34b in __GI___libc_malloc (bytes=bytes@entry=41) at malloc.c:3063 #2 0x55d94949e9a9 in hlua_alloc (ud=0x55d9498e90d0 , ptr=, osize=4, nsize=41) at src/hlua.c:8210 #3 0x55d949622271 in luaM_realloc_ () #4 0x55d949621dbe in luaC_newobj () #5 0x55d949626f4d in internshrstr () #6 0x55d9496271fc in luaS_new () #7 0x55d94961bd03 in lua_pushstring () #8 0x55d94962d48a in luaL_traceback () #9 0x55d9495a1b62 in ha_task_dump (buf=buf@entry=0x7f5247c7f628, task=0x7f5234057ba0, pfx=pfx@entry=0x55d949672f19 ' ' ) at src/debug.c:227 #10 0x55d9495a1fd7 in ha_thread_dump (buf=0x7f5247c7f628, thr=, calling_tid=7) at src/debug.c:91 #11 0x55d9495a2236 in debug_handler (sig=, si=, arg=) at src/debug.c:847 #12 #13 _int_malloc (av=av@entry=0x7f523420, bytes=bytes@entry=56) at malloc.c:4100 #14 0x7f52fb16c35d in __GI___libc_malloc (bytes=bytes@entry=56) at malloc.c:3065 #15 0x55d94949e9a9 in hlua_alloc (ud=0x55d9498e90d0 , ptr=, osize=5, nsize=56) at src/hlua.c:8210
Fwd: [PATCH] cleanup unused definitions
William, could you please have a quick look at these ? I didn't touch them because I don't know if these are things you intend to use in the short-to-mid term. If you take them, please be careful to fix the subject to mention "ssl" in them. Thanks, Willy - Forwarded message from ??? - > Date: Wed, 24 Mar 2021 11:29:03 +0500 > From: ??? > Subject: Re: [PATCH] cleanup unused definitions > To: HAProxy , Willy Tarreau > Delivered-To: haproxy@formilux.org > List-Id: Haproxy > > ping > > ??, 20 ???. 2021 ?. ? 22:43, ??? : > > > while refactoring HA_OPENSSL_VERSION usage, > > I've found unused definitions. nice. > > > > > > Ilya > > - End forwarded message - - Forwarded message from ??? - > Date: Wed, 24 Mar 2021 11:29:19 +0500 > From: ??? > Subject: Re: [PATCH] BUILD: ssl: use feature guard instead of openssl version > for ecdh functions > To: HAProxy , Willy Tarreau > Delivered-To: haproxy@formilux.org > List-Id: Haproxy > > ping > > ??, 21 ???. 2021 ?. ? 13:02, ??? : > > > Hello, > > > > yet another patch that reduces number of HA_OPENSSL_VERSION use > > > > Ilya > > > > > > - End forwarded message -
Re: [PATCH] BUILD: ssl: use feature guard instead of openssl version for ecdh functions
ping вс, 21 мар. 2021 г. в 13:02, Илья Шипицин : > Hello, > > yet another patch that reduces number of HA_OPENSSL_VERSION use > > Ilya > > >
Re: [PATCH] cleanup unused definitions
ping сб, 20 мар. 2021 г. в 22:43, Илья Шипицин : > while refactoring HA_OPENSSL_VERSION usage, > I've found unused definitions. nice. > > > Ilya >