OpenSSL locking was Re: mod_ssl broken as DSO in HEAD
--On Tuesday, June 17, 2003 4:34 PM -0400 Geoff Thorpe [EMAIL PROTECTED] wrote: Well first off, the original backtrace showed a segfault down inside the bowels of the error handlers, something to do with CRYPTO_thread_id() IIRC. That you were getting this, combined with the fact you couldn't get a segfault under a different debugger and/or other variations of running it (again, IIRC), would suggest that at least part of what you're seeing is environmental and/or platform specific. You would still have been getting library errors of course, but it doesn't explain away the segfaults. Finally got a chance to dig some more and found that if I add: CRYPTO_set_id_callback(openssl_id); to ssl_hook_pre_config before the ENGINE_load_builtin_engines and where openssl_id is defined as: static unsigned long openssl_id(void) { /* FIXME: This is lame and not portable. -aaron */ return (unsigned long) apr_os_thread_current(); } (The above is used in flood - hence Aaron's comment.) The SEGVs go away if we do this. Does this make sense given the internals of OpenSSL? On this same tangent, do we need to be doing all of the CRYPTO_lock stuff? I don't believe we are doing that. And, I know in flood, we had lots of problems until we called them. So, I think mod_ssl should be passing the lock structures - especially for worker MPM builds... While I've got someone's attention, I have some real issues in that both of the OpenSSL lock implementations (CRYPTO_set_locking_callback and CRYPTO_set_dynlock_*) require global variables to implement. I don't think I can say 'ick' loud enough. What would the possibility/feasibility of passing application-specific opaque values through the CRYPTO_lock callbacks? Say static void openssl_lock(int mode, int n, void *app, const char *file, int line) static CRYPTO_dynlock_value *ssl_dyn_create(void *app, const char* file, int line) We can have app passed into the CRYPTO_set functions - in the case of httpd, it would be the pools for ssl_dyn_create *or* the global array of locks that openssl_lock would require. You don't know how much happier that would make me if we could do that. globals are just so icky. -- justin
Re: mod_ssl broken as DSO in HEAD
Hi there, On June 16, 2003 10:14 pm, William A. Rowe, Jr. wrote: At 08:13 PM 6/16/2003, Justin Erenkrantz wrote: The change to move OpenSSL's initialization to ssl_hook_pre_config is busted when mod_ssl is a DSO. It will try to call all of the OpenSSL init sequences twice which I don't believe OpenSSL supports. We may need to register proper 'cleanups' to 'unload' all of the engines, as you point out in this problem; If things really are supposed to initialise (and unload) two times on startup, then I guess this is what you should do for openssl too. Right. What is more interesting (as a DSO) is that our module, and ergo the SSL library, should have been unloaded and reinitialized. Apparently we can't count on that behavior, though. Depending on what gets initialised, there are a variety of things you might need to unload from openssl's point of view. However, this is usually an issue for people from a memory leak point of view, not a double initialisation. Potential issue: in a few places, openssl uses a global is_set variable to handle first-time initialisation of callbacks (eg. malloc). The idea is that once they're set, they can't be overriden. If you don't need to do *that* twice on startup, there's no problem. If you do, then it shouldn't matter *provided* you can live without error checking - if it fails then the (correct) global is still set from the previous initialisation, and if it succeeds then the globals needed setting anyway. Of course if it fails on the first initialisation then you have a bug that you'll never spot ... BTW: this is only for stuff where you are providing callback overrides for memory management, thread locking, error stacks, etc. Well, if it isn't done before pre-config, the entire ENGINE support code in the command handler for SSLCryptoDevice is borked. I know, I already spent two days there. gulp SSLCryptoDevice ... I don't like the sound of that ;-) If we force the SSL Library to stay loaded (reliably, across platforms) between startup phases (with the SSL module itself unloading and reloading - ick), then saving the initialization in pprocess might be a viable alternative, but where? register_hooks is no better than pre_config in this regard. Or we simply work out an ENGINE_unload_engines() as a cleanup. But of course, OpenSSL doesn't really like to be cleared and restarted from scratch, so I don't know if this is really trivial or a nightmare. This can be done easily enough, it's just a case of knowing what bits you need to clean up. What is beyond my reach however is a way of convincing Apache to run in one process and start up and close down cleanly - this is normally the easiest way to check for missing cleanups because they show up as leaks. I could otherwise give you some tips on cleanup sequences, but without a way of checking the cleanup code for leaks we'd be acting on faith that the sequence is (a) complete, and (b) the minimum necessary (no point putting redundant calls in and risking extra static linking). Let me know if, and how, I can help. If you want to hunt around yourself, check out the apps_startup() and apps_shutdown() macro definitions inside apps/apps.h in the openssl source. You may not need all of those cleanups, and of course YMMV :-) Cheers, Geoff -- Geoff Thorpe [EMAIL PROTECTED] http://www.geoffthorpe.net/
RE: OpenSSL locking was Re: mod_ssl broken as DSO in HEAD
-Original Message- From: Justin Erenkrantz [mailto:[EMAIL PROTECTED] [SNIP] On this same tangent, do we need to be doing all of the CRYPTO_lock stuff? I don't believe we are doing that. And, I know in flood, we had lots of problems until we called them. So, I think mod_ssl should be passing the lock structures - especially for worker MPM builds... Hmmn.. I have a vague recollection of posting a patch do to exactly that.. and it was rejected because of some reason. I already have a patch which does that if ppl. are interested in evaluating it. (Off to dig my mails to find what happened.) -Madhu
Re: OpenSSL locking was Re: mod_ssl broken as DSO in HEAD
Hi, On June 18, 2003 02:28 am, Justin Erenkrantz wrote: Finally got a chance to dig some more and found that if I add: CRYPTO_set_id_callback(openssl_id); to ssl_hook_pre_config before the ENGINE_load_builtin_engines and where openssl_id is defined as: static unsigned long openssl_id(void) { /* FIXME: This is lame and not portable. -aaron */ return (unsigned long) apr_os_thread_current(); } (The above is used in flood - hence Aaron's comment.) The SEGVs go away if we do this. Does this make sense given the internals of OpenSSL? I'm not the best person to talk about with respect to locking in openssl (I more or less detest threads, an Alan Cox quote about state-machines springs to mind...). That said, :-), I took another look at the backtrace you posted and it certainly seems odd. Digging ... the implementation of CRYPTO_thread_id() is; unsigned long CRYPTO_thread_id(void) { unsigned long ret=0; if (id_callback == NULL) { #ifdef OPENSSL_SYS_WIN16 ret=(unsigned long)GetCurrentTask(); #elif defined(OPENSSL_SYS_WIN32) ret=(unsigned long)GetCurrentThreadId(); #elif defined(GETPID_IS_MEANINGLESS) ret=1L; #else ret=(unsigned long)getpid(); #endif } else ret=id_callback(); return(ret); } The segfault is (or appears to be, stack smashing possibilities aside) occuring in a function called from this, and the address is less than the other openssl library functions in the trace, but not by much. Which begs the question - is id_callback set to something (and if so, what?) or does the address correspond to getpid()? Unfortunately, I think the easiest way to see this would be to hack the above function (and perhaps others in crypto/cryptlib.c that deal with id_callback) to watch what is going on. On this same tangent, do we need to be doing all of the CRYPTO_lock stuff? I don't believe we are doing that. And, I know in flood, we had lots of problems until we called them. So, I think mod_ssl should be passing the lock structures - especially for worker MPM builds... Whenever I build Apache, I typically build a version of openssl configured with no-threads so this sort of issue is implicitly ruled out. It also removes a few meaningless function calls given that (multi-forked, traditional) Apache doesn't need any thread-safe locking. However I know that isn't the answer to your question in general, as many people will be using the openssl libs bundled with their system and/or trying to use Apache2 in a threaded setup. While I've got someone's attention, I have some real issues in that both of the OpenSSL lock implementations (CRYPTO_set_locking_callback and CRYPTO_set_dynlock_*) require global variables to implement. I don't think I can say 'ick' loud enough. What would the possibility/feasibility of passing application-specific opaque values through the CRYPTO_lock callbacks? Say static void openssl_lock(int mode, int n, void *app, const char *file, int line) static CRYPTO_dynlock_value *ssl_dyn_create(void *app, const char* file, int line) We can have app passed into the CRYPTO_set functions - in the case of httpd, it would be the pools for ssl_dyn_create *or* the global array of locks that openssl_lock would require. You don't know how much happier that would make me if we could do that. globals are just so icky. -- justin For this and the previous question, the short answer is maybe, but it doesn't solve the problem of working with existing versions. In both cases, ie. the segfault you're seeing *and* the issue about attaching an opaque pointer to caller-provide locking callbacks, it would make sense to take that conversation over to openssl-dev. In particular, we should get any appropriate details into the request-tracking system. BTW: Once we're over there, Richard Levitte may be able to provide better comments than I on this locking stuff, particularly as the dynlock stuff was (IIRC) his creation. Cheers, Geoff -- Geoff Thorpe [EMAIL PROTECTED] http://www.geoffthorpe.net/
Re: mod_ssl broken as DSO in HEAD
William A. Rowe, Jr. wrote: At 09:15 PM 6/16/2003, Jeff Trawick wrote: Justin Erenkrantz wrote: Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin set an environment variable on first pass to know which pass it is Won't work across platforms. Across Unix, perhaps. where and why, if you have the time (other enquiring modules want to know)?
Re: mod_ssl broken as DSO in HEAD
--On Monday, June 16, 2003 11:15 PM -0500 William A. Rowe, Jr. [EMAIL PROTECTED] wrote: I just wonder why the global pool ID is Unknowable. Why did we ever decree that the 'one true root pool' is unknowable to the masses (modules)? That would be a really sweet place to drop process-lifetime userdata. You can always create a pool descended from it by passing a NULL parent to apr_pool_create. That will create a separate pool chain outside of httpd's. But, I really think exposing the global pool would open ourselves to a lot of misuse by modules. Anything allocated in the global pool would never be cleaned up. So, while attaching the userdata information to the global pool might sound nice, I think it opens a can of worms we don't want to open as lots of people may misuse this pool. But, they can already do this if they absolutely must (via pool_create/parent_get, so I doubt we have to make it easier). As I said before, post_config gets the process pool, so why doesn't pre_config? That seems like the best solution so far. However, that means that we can't backport the mod_ssl changes to 2.0 or we have to come up with another strategy for 2.0. Oh, well. -- justin
Re: mod_ssl broken as DSO in HEAD
Justin Erenkrantz wrote: --On Monday, June 16, 2003 11:15 PM -0500 William A. Rowe, Jr. [EMAIL PROTECTED] wrote: I just wonder why the global pool ID is Unknowable. Why did we ever decree that the 'one true root pool' is unknowable to the masses (modules)? That would be a really sweet place to drop process-lifetime userdata. You can always create a pool descended from it by passing a NULL parent to apr_pool_create. That will create a separate pool chain outside of httpd's. lots of wasted storage for a flag :) As I said before, post_config gets the process pool, so why doesn't pre_config? That seems like the best solution so far. However, that means that we can't backport the mod_ssl changes to 2.0 or we have to come up with another strategy for 2.0. Oh, well. -- justin ap_hook_pre_config_ex() should work
Re: mod_ssl broken as DSO in HEAD
--On Tuesday, June 17, 2003 12:07 PM -0400 Jeff Trawick [EMAIL PROTECTED] wrote: ap_hook_pre_config_ex() should work I'd be fine with a new hook in 2.0, but 2.1+ should just have pre_config modified to always take that pool parameter. I don't see a need to keep cruft around just for cruft's sake. If that sounds okay, I'll work up patches in the next few days - unless someone wants to beat me to it. -- justin
RE: mod_ssl broken as DSO in HEAD
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 17, 2003 6:11 PM --On Tuesday, June 17, 2003 12:07 PM -0400 Jeff Trawick [EMAIL PROTECTED] wrote: ap_hook_pre_config_ex() should work I'd be fine with a new hook in 2.0, but 2.1+ should just have pre_config modified to always take that pool parameter. I don't see a need to keep cruft around just for cruft's sake. If that sounds okay, I'll work up patches in the next few days - unless someone wants to beat me to it. -- justin +1. I'm happy that this thread didn't go further in the direction of exposing the parent pool ;). Sander
Re: mod_ssl broken as DSO in HEAD
At 07:28 AM 6/17/2003, Jeff Trawick wrote: William A. Rowe, Jr. wrote: At 09:15 PM 6/16/2003, Jeff Trawick wrote: Justin Erenkrantz wrote: Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin set an environment variable on first pass to know which pass it is Won't work across platforms. Across Unix, perhaps. where and why, if you have the time (other enquiring modules want to know)? My bad - setenv() will 'work' on win32 to modify the program's 'C' environment table. Where it falls down is modifying the 'inherited' environment, which must be maintained with SetEnvironmentVariable. Why setenv() on Win32 didn't do this has always puzzled me. By the way, it's _setenv() on win32, unless you link oldnames.lib. But I'm liking Justin's comment that creating a pool of our own based on the NULL pool might be the way to go. Some pools need to last a lifetime. My last question for everyone with insight ... I load modfoo.so, which is bound to libfoo.so. Once modfoo.so is unloaded and loaded again, is it really guarenteed that libfoo.so will remain loaded throughout, since it was implicitly loaded by the now reloaded modfoo.so? Bill
Re: mod_ssl broken as DSO in HEAD
If folks would be so kind as to cvs up their httpd-2.1 trees and check out the latest cleanup reordering and introduction of ENGINE_ and CONF_ cleanups - I would be most grateful. Geoff, if you could review the latest commit for sanity, I'd appreciate that, too. Bill At 12:17 PM 6/17/2003, Geoff Thorpe wrote: Hi there, On June 16, 2003 10:14 pm, William A. Rowe, Jr. wrote: At 08:13 PM 6/16/2003, Justin Erenkrantz wrote: The change to move OpenSSL's initialization to ssl_hook_pre_config is busted when mod_ssl is a DSO. It will try to call all of the OpenSSL init sequences twice which I don't believe OpenSSL supports. We may need to register proper 'cleanups' to 'unload' all of the engines, as you point out in this problem; If things really are supposed to initialise (and unload) two times on startup, then I guess this is what you should do for openssl too. Right. What is more interesting (as a DSO) is that our module, and ergo the SSL library, should have been unloaded and reinitialized. Apparently we can't count on that behavior, though. Depending on what gets initialised, there are a variety of things you might need to unload from openssl's point of view. However, this is usually an issue for people from a memory leak point of view, not a double initialisation. Potential issue: in a few places, openssl uses a global is_set variable to handle first-time initialisation of callbacks (eg. malloc). The idea is that once they're set, they can't be overriden. If you don't need to do *that* twice on startup, there's no problem. If you do, then it shouldn't matter *provided* you can live without error checking - if it fails then the (correct) global is still set from the previous initialisation, and if it succeeds then the globals needed setting anyway. Of course if it fails on the first initialisation then you have a bug that you'll never spot ... BTW: this is only for stuff where you are providing callback overrides for memory management, thread locking, error stacks, etc. Well, if it isn't done before pre-config, the entire ENGINE support code in the command handler for SSLCryptoDevice is borked. I know, I already spent two days there. gulp SSLCryptoDevice ... I don't like the sound of that ;-) If we force the SSL Library to stay loaded (reliably, across platforms) between startup phases (with the SSL module itself unloading and reloading - ick), then saving the initialization in pprocess might be a viable alternative, but where? register_hooks is no better than pre_config in this regard. Or we simply work out an ENGINE_unload_engines() as a cleanup. But of course, OpenSSL doesn't really like to be cleared and restarted from scratch, so I don't know if this is really trivial or a nightmare. This can be done easily enough, it's just a case of knowing what bits you need to clean up. What is beyond my reach however is a way of convincing Apache to run in one process and start up and close down cleanly - this is normally the easiest way to check for missing cleanups because they show up as leaks. I could otherwise give you some tips on cleanup sequences, but without a way of checking the cleanup code for leaks we'd be acting on faith that the sequence is (a) complete, and (b) the minimum necessary (no point putting redundant calls in and risking extra static linking). Let me know if, and how, I can help. If you want to hunt around yourself, check out the apps_startup() and apps_shutdown() macro definitions inside apps/apps.h in the openssl source. You may not need all of those cleanups, and of course YMMV :-) Cheers, Geoff -- Geoff Thorpe [EMAIL PROTECTED] http://www.geoffthorpe.net/
Re: mod_ssl broken as DSO in HEAD
William A. Rowe, Jr. wrote: My bad - setenv() will 'work' on win32 to modify the program's 'C' environment table. Where it falls down is modifying the 'inherited' environment, which must be maintained with SetEnvironmentVariable. Why setenv() on Win32 didn't do this has always puzzled me. By the way, it's _setenv() on win32, unless you link oldnames.lib. But apr_env_set() does the right thing I suppose... But I'm liking Justin's comment that creating a pool of our own based on the NULL pool might be the way to go. Some pools need to last a lifetime. Lots of storage for a flag. And how do you maintain addressibility to the pool across unload/reload of the DSO?
Re: mod_ssl broken as DSO in HEAD
--On Tuesday, June 17, 2003 12:46 PM -0500 William A. Rowe, Jr. [EMAIL PROTECTED] wrote: If folks would be so kind as to cvs up their httpd-2.1 trees and check out the latest cleanup reordering and introduction of ENGINE_ and CONF_ cleanups - I would be most grateful. Geoff, if you could review the latest commit for sanity, I'd appreciate that, too. Nope, still SEGVs in the same location even though the cleanups are invoked. -- justin
Re: mod_ssl broken as DSO in HEAD
Actually, that is Geoff's code; would you mind taking a look? I suppose I should point out that I was building libssl/libcrypto into mod_ssl, which is why I never observed this bug. No matter which ultimate solution we adopt - we need to stay cognizant of both static and dynamic (the one I forgot to vet) builds. Invoking the setups only once would toggle similar problems in the static builds. Bill At 03:06 PM 6/17/2003, Justin Erenkrantz wrote: --On Tuesday, June 17, 2003 12:46 PM -0500 William A. Rowe, Jr. [EMAIL PROTECTED] wrote: If folks would be so kind as to cvs up their httpd-2.1 trees and check out the latest cleanup reordering and introduction of ENGINE_ and CONF_ cleanups - I would be most grateful. Geoff, if you could review the latest commit for sanity, I'd appreciate that, too. Nope, still SEGVs in the same location even though the cleanups are invoked. -- justin
Re: mod_ssl broken as DSO in HEAD
Hi, On June 17, 2003 04:06 pm, Justin Erenkrantz wrote: --On Tuesday, June 17, 2003 12:46 PM -0500 William A. Rowe, Jr. [EMAIL PROTECTED] wrote: If folks would be so kind as to cvs up their httpd-2.1 trees and check out the latest cleanup reordering and introduction of ENGINE_ and CONF_ cleanups - I would be most grateful. Geoff, if you could review the latest commit for sanity, I'd appreciate that, too. Nope, still SEGVs in the same location even though the cleanups are invoked. -- justin Well first off, the original backtrace showed a segfault down inside the bowels of the error handlers, something to do with CRYPTO_thread_id() IIRC. That you were getting this, combined with the fact you couldn't get a segfault under a different debugger and/or other variations of running it (again, IIRC), would suggest that at least part of what you're seeing is environmental and/or platform specific. You would still have been getting library errors of course, but it doesn't explain away the segfaults. But with respect to the missing cleanups; it's quite possible that adding those cleanup calls won't guarantee complete unloading if there are references left lingering by the application code. If there are objects allocated (or references held) by Apache code, these should all be released prior to calling the cleanup functions otherwise things will be left lingering. The cleanup functions should only release the references held by the library's internal state - the application must do it's part too. This includes any of the following objects mod_ssl may have lying around; RSA, DSA, EVP_PKEY, SSL, SSL_CTX, ENGINE, X509, ... (you get the picture). Note that these things can have references between them, eg. an SSL object may hold a reference to an RSA key which in turn may hold a reference to an ENGINE, etc. This is why the easiest way usually to track down this sort of thing is to run the cleanup, exit(0), and then check for memory leaks using valgrind or whatever. Other than those vague thoughts, I can't suggest much else. If this continues to resist your efforts, please let me know how I can reproduce and debug this (in linux) from the apache point of view. I'm a bit shallow in my knowledge of apache's internals and how to track its loading, unloading, memory handling, etc. Cheers, Geoff -- Geoff Thorpe [EMAIL PROTECTED] http://www.geoffthorpe.net/
mod_ssl broken as DSO in HEAD
The change to move OpenSSL's initialization to ssl_hook_pre_config is busted when mod_ssl is a DSO. It will try to call all of the OpenSSL init sequences twice which I don't believe OpenSSL supports. Namely, that is a problem for ENGINE_load_builtin_engines() which will 'detect' conflicting engine IDs: (dbx 1) where [1] 0xdd32cbe0(), at 0xdd32cbdf [2] CRYPTO_thread_id(), at 0xddab652f [3] ERR_get_state(), at 0xdda72732 [4] ERR_put_error(0x26, 0x78, 0x67, 0xddac1ca8, 0x72), at 0xdda71e49 [5] engine_list_add(0x81e9720), at 0xdda586f1 [6] ENGINE_add(0x81e9720), at 0xdda58a5d [7] ENGINE_load_dynamic(), at 0xdda5b78e [8] ENGINE_load_builtin_engines(), at 0xdda5a565 =[9] ssl_hook_pre_config(pconf = 0x8197d28, plog = 0x81d5e20, ptemp = 0x81dbe38), line 245 in mod_ssl.c [10] ap_run_pre_config(0x8197d28, 0x81d5e20, 0x81dbe38), at 0x80dcd07 [11] main(argc = 2, argv = 0x8046a70), line 640 in main.c ERR_put_error's third argument (reason) is 0x67 (103) which is ENGINE_R_CONFLICTING_ENGINE_ID. 0x72 (114) is the line number that has that ID in crypto/engine/eng_list.c. So, this code path makes sense (to me) since if I break on the debugger, I see that ENGINE_load_builtin_engines() is called twice since ssl_hook_pre_config is called twice. Note that I can't reproduce this SEGV running under dbx, but can reproduce out of the debugger (this trace is from a core). This leads me to believe that we are having a double-init problem. Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin
Re: mod_ssl broken as DSO in HEAD
At 08:13 PM 6/16/2003, Justin Erenkrantz wrote: The change to move OpenSSL's initialization to ssl_hook_pre_config is busted when mod_ssl is a DSO. It will try to call all of the OpenSSL init sequences twice which I don't believe OpenSSL supports. We may need to register proper 'cleanups' to 'unload' all of the engines, as you point out in this problem; Namely, that is a problem for ENGINE_load_builtin_engines() which will 'detect' conflicting engine IDs: (dbx 1) where [1] 0xdd32cbe0(), at 0xdd32cbdf [2] CRYPTO_thread_id(), at 0xddab652f [3] ERR_get_state(), at 0xdda72732 [4] ERR_put_error(0x26, 0x78, 0x67, 0xddac1ca8, 0x72), at 0xdda71e49 [5] engine_list_add(0x81e9720), at 0xdda586f1 [6] ENGINE_add(0x81e9720), at 0xdda58a5d [7] ENGINE_load_dynamic(), at 0xdda5b78e [8] ENGINE_load_builtin_engines(), at 0xdda5a565 =[9] ssl_hook_pre_config(pconf = 0x8197d28, plog = 0x81d5e20, ptemp = 0x81dbe38), line 245 in mod_ssl.c [10] ap_run_pre_config(0x8197d28, 0x81d5e20, 0x81dbe38), at 0x80dcd07 [11] main(argc = 2, argv = 0x8046a70), line 640 in main.c ERR_put_error's third argument (reason) is 0x67 (103) which is ENGINE_R_CONFLICTING_ENGINE_ID. 0x72 (114) is the line number that has that ID in crypto/engine/eng_list.c. So, this code path makes sense (to me) since if I break on the debugger, I see that ENGINE_load_builtin_engines() is called twice since ssl_hook_pre_config is called twice. Right. What is more interesting (as a DSO) is that our module, and ergo the SSL library, should have been unloaded and reinitialized. Apparently we can't count on that behavior, though. Note that I can't reproduce this SEGV running under dbx, but can reproduce out of the debugger (this trace is from a core). This leads me to believe that we are having a double-init problem. Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin Well, if it isn't done before pre-config, the entire ENGINE support code in the command handler for SSLCryptoDevice is borked. I know, I already spent two days there. If we force the SSL Library to stay loaded (reliably, across platforms) between startup phases (with the SSL module itself unloading and reloading - ick), then saving the initialization in pprocess might be a viable alternative, but where? register_hooks is no better than pre_config in this regard. Or we simply work out an ENGINE_unload_engines() as a cleanup. But of course, OpenSSL doesn't really like to be cleared and restarted from scratch, so I don't know if this is really trivial or a nightmare. Bill
Re: mod_ssl broken as DSO in HEAD
Justin Erenkrantz wrote: Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin set an environment variable on first pass to know which pass it is
Re: mod_ssl broken as DSO in HEAD
At 09:15 PM 6/16/2003, Jeff Trawick wrote: Justin Erenkrantz wrote: Since pre_config doesn't get the process info, we can't use the userdata trick to prevent double initialization. My solution would be to grow the pre_config hook to get the process info. But, that can't be what was intended. =) -- justin set an environment variable on first pass to know which pass it is Won't work across platforms. Across Unix, perhaps. I just wonder why the global pool ID is Unknowable. Why did we ever decree that the 'one true root pool' is unknowable to the masses (modules)? That would be a really sweet place to drop process-lifetime userdata. Bill
Re: Optimizing dir_merge() AND RE: [BUG] mod_ssl broken
From: Sander Striker [EMAIL PROTECTED] Sent: Thursday, September 13, 2001 7:30 AM Ok, now I have a repro recipe that doesn't require mod_dav and mod_dav_svn. The last commit should have fixed the problem (and does with your mod_ssl example.) Could you go back and check mod_dav with mod_dav_svn to assure I've licked it? Bill
RE: Optimizing dir_merge() AND RE: [BUG] mod_ssl broken
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] Sent: 21 September 2001 08:38 From: Sander Striker [EMAIL PROTECTED] Sent: Thursday, September 13, 2001 7:30 AM Ok, now I have a repro recipe that doesn't require mod_dav and mod_dav_svn. The last commit should have fixed the problem (and does with your mod_ssl example.) Could you go back and check mod_dav with mod_dav_svn to assure I've licked it? Bill I just tried it. Works like a charm. Thanks Bill. Now I'll go back hunting for more segfaults :) [I saw another one the other day, but am not sure it is related or not, will investigate] Sander
RE: Optimizing dir_merge() AND RE: [BUG] mod_ssl broken
Hi, I can reproduce the problem easily now with openssl s_client. If anyone is interested to hunt this bug down (I am personally not familiar enough with the location_walk code to find it (without spending more time on it than I have at the moment)), I can give an account on my box to observe the problem first hand. Once again, it is not in the optimization code, I reverted to request.c 1.47 (with patch 1.49 applied). If someone has time, please help hunt this down so I can focus on working on the ssl side of subversion ;) Sander
RE: Optimizing dir_merge() AND RE: [BUG] mod_ssl broken
On Thu, 13 Sep 2001, Sander Striker wrote: can also reproduce with less in the mix: Listen 8999 VirtualHost _default_:8999 Location /manual #this can be any module with per-directory create+merge functions RequestHeader add Foo bar /Location /VirtualHost Location /manual Order deny,allow Deny from all Allow from 127.0.0.1 /Location 0x40246225 in merge_headers_config (p=0x8230334, basev=0x0, ^^^ overridesv=0x8220b54) at mod_headers.c:216 216 newconf-fixup_in = apr_array_append(p, base-fixup_in, overrides-fixup_in); producable with any module that has a per-directory create and merge function. seems there needs to be merging with r-server-lookup_defaults, that is not currently happening. if you remove the Location /manual from outside the vhost and put its contents into the Location /manual inside the vhost, all is well.
RE: Optimizing dir_merge() AND RE: [BUG] mod_ssl broken
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] My location_walk optimization (which suffers a potential bug, per our svn friends) takes an entirely different tact, which renders that whole idea DOA. Ok, to rule out the possibility it is in the optimization code I reverted to request.c 1.47. The bug remains: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1024 (LWP 21805)] 0x806dd1d in ssl_config_perdir_merge (p=0x814e20c, basev=0x0, addv=0x8192e2c) at ssl_engine_config.c:269 269 cfgMergeArray(aRequirement); (gdb) bt #0 0x806dd1d in ssl_config_perdir_merge (p=0x814e20c, basev=0x0, addv=0x8192e2c) at ssl_engine_config.c:269 #1 0x80a2c15 in ap_merge_per_dir_configs (p=0x814e20c, base=0x8192f14, new_conf=0x8192c34) at config.c:262 #2 0x80b7284 in ap_location_walk (r=0x814e23c) at request.c:1169 #3 0x80b6511 in ap_process_request_internal (r=0x814e23c) at request.c:154 #4 0x8080ba0 in ap_process_request (r=0x814e23c) at http_request.c:284 #5 0x807c4fa in ap_process_http_connection (c=0x814c2fc) at http_core.c:287 #6 0x80ac6db in ap_run_process_connection (c=0x814c2fc) at connection.c:82 #7 0x80ac8b0 in ap_process_connection (c=0x814c2fc) at connection.c:219 #8 0x80a13b2 in child_main (child_num_arg=1) at prefork.c:829 #9 0x80a150a in make_child (s=0x80e831c, slot=1) at prefork.c:916 #10 0x80a174f in perform_idle_server_maintenance (p=0x80e6cc4) at prefork.c:1057 #11 0x80a1b2f in ap_mpm_run (_pconf=0x80e6cc4, plog=0x811ee84, s=0x80e831c) at prefork.c:1235 #12 0x80a7323 in main (argc=1, argv=0xbd9c) at main.c:431 #13 0x401f026a in __libc_start_main (main=0x80a6d98 main, argc=1, ubp_av=0xbd9c, init=0x8062460 _init, fini=0x80be7c4 _fini, rtld_fini=0x4000daa4 _dl_fini, stack_end=0xbd8c) at ../sysdeps/generic/libc-start.c:129 Somewhere between ap_merge_per_dir_configs and ssl_config_perdir_merge base is reset to NULL. Will investigate further. For completeness, the relevant part of my config: VirtualHost _default_:443 SSLEngine on SSLCACertificatePath /var/openssl/ca/private SSLCACertificateFile /var/openssl/ca/private/cacert.pem SSLCertificateFile /var/openssl/ca/certs/striker.xs4all.nl-cert.pem SSLCertificateKeyFile /var/openssl/ca/certs/striker.xs4all.nl-key.pem DocumentRoot /opt/httpd/htdocs ServerName striker.xs4all.nl Location /svn SSLRequireSSL #SSLVerifyClient require #SSLVerifyDepth 1 DAV svn SVNPath /home/svn /Location /VirtualHost Location /svn DAV svn SVNPath /home/svn /Location [trying again] Ah, the first loop through ap_merge_per_dir_configs seems to be ok. Second time through ap_merge_per_dir_configs: Breakpoint 4, ap_merge_per_dir_configs (p=0x814e20c, base=0x8192f14, new_conf=0x8192c34) at config.c:262 262 conf_vector[i] = (*df) (p, base_vector[i], new_vector[i]); (gdb) p *modp $6 = {version = 20010825, minor_version = 0, module_index = 7, name = 0x80c0caf mod_ssl.c, dynamic_load_handle = 0x0, next = 0x80da140, magic = 1095774768, rewrite_args = 0, create_dir_config = 0x806dc6c ssl_config_perdir_create, merge_dir_config = 0x806dce0 ssl_config_perdir_merge, create_server_config = 0x806d9c4 ssl_config_server_create, merge_server_config = 0x806dab4 ssl_config_server_merge, cmds = 0x80da180, register_hooks = 0x806d760 ssl_register_hooks} (gdb) s 0x806dce3 in ssl_config_perdir_merge (p=0x814e20c, basev=0x0, addv=0x8192e2c) at ssl_engine_config.c:263 263 { Now, why is base_vector[i] == NULL now? It isn't wrowes caching code, because that is not in here (request.c 1.47). Thoughts anyone? Sander
RE: mod_ssl broken
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] From: Ben Laurie [EMAIL PROTECTED] Sent: Sunday, September 09, 2001 3:50 PM Sander Striker wrote: [...] The problem is that basev == NULL, which causes apr_array_append to barf. I'll be looking into this next week (or someone working on mod_ssl will). I've looked into this slightly, and the underlying problem is that ap_merge_per_dir_configs finds a previous cached merge (for /?) and tries to use it but the entry for mod_ssl is NULL. The URL, btw, is /svn. This is very, very odd. Would someone please post the svn create/merge dir config code, just in case it's clobbering a sibling? The last silly guess, perhaps we have a subrequest where the prev-pool has been destroyed (sounds unlikely to me.) #define INHERIT_VALUE(parent, child, field) \ ((child)-field ? (child)-field : (parent)-field) [...] static void *dav_svn_create_dir_config(apr_pool_t *p, char *dir) { /* NOTE: dir==NULL creates the default per-dir config */ return apr_pcalloc(p, sizeof(dav_svn_dir_conf)); } static void *dav_svn_merge_dir_config(apr_pool_t *p, void *base, void *overrides) { dav_svn_dir_conf *parent = base; dav_svn_dir_conf *child = overrides; dav_svn_dir_conf *newconf; newconf = apr_pcalloc(p, sizeof(*newconf)); newconf-fs_path = INHERIT_VALUE(parent, child, fs_path); return newconf; } Looks ok to me... There are no conditional create/merge results in mod_ssl, whatsoever. Ergo, it should always be given a base and addv, and return a new conf. I got stuck at that point, coz it really isn't at all clear where the entry was expected to be set in the first place. BTW, all but the first entry in the cached merge were also NULL. I don't have time before bedtime to figure this out, I'm afraid. It sounds as if I've failed to build/record/retrieve the cache-walked result properly. I'll hack at this a bit tommorow. Hint, server/request.c:1085 is the start of this block, and we should have a stack of walk_walked_t structures for every merge performed, recorded in the (cache_walk_t*)cache-walked apr_array. Bill
Re: mod_ssl broken
From: Sander Striker [EMAIL PROTECTED] Sent: Monday, September 10, 2001 3:24 AM I've looked into this slightly, and the underlying problem is that ap_merge_per_dir_configs finds a previous cached merge (for /?) and tries to use it but the entry for mod_ssl is NULL. The URL, btw, is /svn. This is very, very odd. Would someone please post the svn create/merge dir config code, just in case it's clobbering a sibling? #define INHERIT_VALUE(parent, child, field) \ ((child)-field ? (child)-field : (parent)-field) Looks ok to me... Looks OK to me too... Who wants to toss together a patch to pull all the helper code from mod_ssl, change it from eastern to western European, and drop it back in the right place (AP_ decorated) such as ap_config.h? If _we_ need these sort of helpers... #define cfgMerge(el,unset) new-el = (add-el == (unset)) ? base-el : add-el #define cfgMergeArray(el) new-el = apr_array_append(p, add-el, base-el) #define cfgMergeTable(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeCtx(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeString(el) cfgMerge(el, NULL) #define cfgMergeBool(el)cfgMerge(el, UNSET) #define cfgMergeInt(el) cfgMerge(el, UNSET) then apparently (as your example shows) the rest of the world needs these as well. Bill
RE: mod_ssl broken
[dropped dev@subversion] From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] Sent: 10 September 2001 14:37 From: Sander Striker [EMAIL PROTECTED] Sent: Monday, September 10, 2001 3:24 AM I've looked into this slightly, and the underlying problem is that ap_merge_per_dir_configs finds a previous cached merge (for /?) and tries to use it but the entry for mod_ssl is NULL. The URL, btw, is /svn. This is very, very odd. Would someone please post the svn create/merge dir config code, just in case it's clobbering a sibling? #define INHERIT_VALUE(parent, child, field) \ ((child)-field ? (child)-field : (parent)-field) Looks ok to me... Looks OK to me too... Who wants to toss together a patch to pull all the helper code from mod_ssl, change it from eastern to western European, and drop it back in the right place (AP_ decorated) such as ap_config.h? If _we_ need these sort of helpers... #define cfgMerge(el,unset) new-el = (add-el == (unset)) ? base-el : add-el #define cfgMergeArray(el) new-el = apr_array_append(p, add-el, base-el) #define cfgMergeTable(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeCtx(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeString(el) cfgMerge(el, NULL) #define cfgMergeBool(el)cfgMerge(el, UNSET) #define cfgMergeInt(el) cfgMerge(el, UNSET) then apparently (as your example shows) the rest of the world needs these as well. I don't mind putting a patch together that does this (and to use it in all core modules). I would appreciate suggestions for the final names though (naming isn't my strong side). AP_CFG_MERGE AP_CFG_MERGE_ARRAY ... ? Bill Sander
Re: mod_ssl broken
On Monday 10 September 2001 06:28, Sander Striker wrote: [dropped dev@subversion] From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] Sent: 10 September 2001 14:37 From: Sander Striker [EMAIL PROTECTED] Sent: Monday, September 10, 2001 3:24 AM I've looked into this slightly, and the underlying problem is that ap_merge_per_dir_configs finds a previous cached merge (for /?) and tries to use it but the entry for mod_ssl is NULL. The URL, btw, is /svn. This is very, very odd. Would someone please post the svn create/merge dir config code, just in case it's clobbering a sibling? #define INHERIT_VALUE(parent, child, field) \ ((child)-field ? (child)-field : (parent)-field) Looks ok to me... Looks OK to me too... Who wants to toss together a patch to pull all the helper code from mod_ssl, change it from eastern to western European, and drop it back in the right place (AP_ decorated) such as ap_config.h? If _we_ need these sort of helpers... #define cfgMerge(el,unset) new-el = (add-el == (unset)) ? base-el : add-el #define cfgMergeArray(el) new-el = apr_array_append(p, add-el, base-el) #define cfgMergeTable(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeCtx(el) new-el = apr_table_overlay(p, add-el, base-el) #define cfgMergeString(el) cfgMerge(el, NULL) #define cfgMergeBool(el)cfgMerge(el, UNSET) #define cfgMergeInt(el) cfgMerge(el, UNSET) then apparently (as your example shows) the rest of the world needs these as well. I don't mind putting a patch together that does this (and to use it in all core modules). I would appreciate suggestions for the final names though (naming isn't my strong side). AP_CFG_MERGE AP_CFG_MERGE_ARRAY Don't do that. That is just Ralf's coding style. It is not proof that we need this in every single core module. Ryan __ Ryan Bloom [EMAIL PROTECTED] Covalent Technologies [EMAIL PROTECTED] --
Re: mod_ssl broken
From: Ryan Bloom [EMAIL PROTECTED] Sent: Monday, September 10, 2001 8:39 AM I don't mind putting a patch together that does this (and to use it in all core modules). I would appreciate suggestions for the final names though (naming isn't my strong side). AP_CFG_MERGE AP_CFG_MERGE_ARRAY Don't do that. That is just Ralf's coding style. It is not proof that we need this in every single core module. HUH? The point is that even subversion uses such helpers for clarity. It would help all authors to offer these, consistently. We've (collectively) proven that config merge errors are simple to author, and difficult to debug. Bill
RE: merge macros (was: Re: mod_ssl broken)
From: Greg Stein [mailto:[EMAIL PROTECTED]] Sent: 10 September 2001 22:08 On Mon, Sep 10, 2001 at 09:23:59AM -0500, William A. Rowe, Jr. wrote: From: Ryan Bloom [EMAIL PROTECTED] Sent: Monday, September 10, 2001 8:39 AM I don't mind putting a patch together that does this (and to use it in all core modules). I would appreciate suggestions for the final names though (naming isn't my strong side). AP_CFG_MERGE AP_CFG_MERGE_ARRAY Don't do that. That is just Ralf's coding style. It is not proof that we need this in every single core module. HUH? The point is that even subversion uses such helpers for clarity. It would help all authors to offer these, consistently. We've (collectively) proven that config merge errors are simple to author, and difficult to debug. And the SVN merge macros came from mod_dav. The only reason mod_dav_fs doesn't have a similar macro is that it only has a single item to merge, so I just spelled it out. So you could say that I'm hitting 3 for 3 with using that INHERIT_VALUE macro, and would appreciate a core version of it. AP_CFG_MERGE_* located in http_config.h would make some sense to me. In the mod_ssl version, there was the unset param; I would suggest the simplest for assumes NULL. A second one would take an unset param. Cheers, -g Ryan, was that a -1, or just a 'don't invest time in it, I don't think it's worth it'? Consistency wise I would find it usefull to put a patch together. But if I know beforehand that it is not going in, I'd rather defer and use my energy elsewhere. Sander
Re: merge macros (was: Re: mod_ssl broken)
On Monday 10 September 2001 15:11, Sander Striker wrote: Ryan, was that a -1, or just a 'don't invest time in it, I don't think it's worth it'? The latter. I would never veto something like this, I just don't believe that it is required. Ryan __ Ryan Bloom [EMAIL PROTECTED] Covalent Technologies [EMAIL PROTECTED] --
Re: mod_ssl broken
On Sun, 9 Sep 2001, Sander Striker wrote: mod_ssl is working fine here, passes all httpd-test tests (t/TEST -ssl) that includes perdir merging. are you up to date with httpd-2.0 from cvs? mod_ssl segfaults in ssl_config_perdir_merge: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1024 (LWP 9130)] 0x806dd1d in ssl_config_perdir_merge (p=0x814e22c, basev=0x0, addv=0x8192e4c) at ssl_engine_config.c:269 269 cfgMergeArray(aRequirement); (gdb) bt #0 0x806dd1d in ssl_config_perdir_merge (p=0x814e22c, basev=0x0, addv=0x8192e4c) at ssl_engine_config.c:269 i don't think basev should ever be NULL here. can somebody confirm? #1 0x80a2c15 in ap_merge_per_dir_configs (p=0x814e22c, base=0x8192f34, new_conf=0x8192c54) at config.c:262 #2 0x80b7376 in ap_location_walk (r=0x814e25c) at request.c:1236 this is in the area of new directory/location walk optmizations, i have a feeling the bug is there, not mod_ssl. wrowe could probably provide some insight here.
Re: mod_ssl broken
To be completely accurate, the request is: OPTIONS /svn HTTP/1.1 Cheers, Ben. -- http://www.apache-ssl.org/ben.html There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit. - Robert Woodruff
Re: mod_ssl broken
From: Ben Laurie [EMAIL PROTECTED] Sent: Sunday, September 09, 2001 3:50 PM Sander Striker wrote: Hi, I have a local test machine running now. subversion over http works as expected. https is broken. This is not on the svn side (I didn't get that far). mod_ssl is broken when it comes to configs such as these: VirtualHost _default_:443 SSLEngine on SSLCACertificatePath /var/openssl/ca/private SSLCACertificateFile /var/openssl/ca/private/cacert.pem SSLCertificateFile /var/openssl/ca/certs/striker.xs4all.nl-cert.pem SSLCertificateKeyFile /var/openssl/ca/certs/striker.xs4all.nl-key.pem DocumentRoot /opt/httpd/htdocs ServerName my.server.name Location /svn SSLRequireSSL #SSLVerifyClient require #SSLVerifyDepth 1 DAV svn SVNPath /home/svn /Location /VirtualHost mod_ssl segfaults in ssl_config_perdir_merge: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 1024 (LWP 9130)] 0x806dd1d in ssl_config_perdir_merge (p=0x814e22c, basev=0x0, addv=0x8192e4c) at ssl_engine_config.c:269 269 cfgMergeArray(aRequirement); (gdb) bt #0 0x806dd1d in ssl_config_perdir_merge (p=0x814e22c, basev=0x0, addv=0x8192e4c) at ssl_engine_config.c:269 #1 0x80a2c15 in ap_merge_per_dir_configs (p=0x814e22c, base=0x8192f34, new_conf=0x8192c54) at config.c:262 #2 0x80b7376 in ap_location_walk (r=0x814e25c) at request.c:1236 #3 0x80b6511 in ap_process_request_internal (r=0x814e25c) at request.c:154 #4 0x8080ba0 in ap_process_request (r=0x814e25c) at http_request.c:284 #5 0x807c4fa in ap_process_http_connection (c=0x814c31c) at http_core.c:287 #6 0x80ac6db in ap_run_process_connection (c=0x814c31c) at connection.c:82 #7 0x80ac8b0 in ap_process_connection (c=0x814c31c) at connection.c:219 #8 0x80a13b2 in child_main (child_num_arg=0) at prefork.c:829 #9 0x80a150a in make_child (s=0x80e833c, slot=0) at prefork.c:916 #10 0x80a158c in startup_children (number_to_start=1) at prefork.c:939 #11 0x80a1985 in ap_mpm_run (_pconf=0x80e6ce4, plog=0x811eea4, s=0x80e833c) at prefork.c:1155 #12 0x80a7323 in main (argc=1, argv=0xbdac) at main.c:431 #13 0x401f026a in __libc_start_main (main=0x80a6d98 main, argc=1, ubp_av=0xbdac, init=0x8062460 _init, fini=0x80be834 _fini, rtld_fini=0x4000daa4 _dl_fini, stack_end=0xbd9c) at ../sysdeps/generic/libc-start.c:129 The problem is that basev == NULL, which causes apr_array_append to barf. I'll be looking into this next week (or someone working on mod_ssl will). I've looked into this slightly, and the underlying problem is that ap_merge_per_dir_configs finds a previous cached merge (for /?) and tries to use it but the entry for mod_ssl is NULL. The URL, btw, is /svn. This is very, very odd. Would someone please post the svn create/merge dir config code, just in case it's clobbering a sibling? The last silly guess, perhaps we have a subrequest where the prev-pool has been destroyed (sounds unlikely to me.) There are no conditional create/merge results in mod_ssl, whatsoever. Ergo, it should always be given a base and addv, and return a new conf. I got stuck at that point, coz it really isn't at all clear where the entry was expected to be set in the first place. BTW, all but the first entry in the cached merge were also NULL. I don't have time before bedtime to figure this out, I'm afraid. It sounds as if I've failed to build/record/retrieve the cache-walked result properly. I'll hack at this a bit tommorow. Hint, server/request.c:1085 is the start of this block, and we should have a stack of walk_walked_t structures for every merge performed, recorded in the (cache_walk_t*)cache-walked apr_array. Bill