Re: Memory Leaks in SSL_Library_init()
Nitin M wrote: Darryl, What is your opinion on this finding? As you were also keen on knowing the result of this experimentation. :) What is you opinion? Here is the valgrind output for the above program for your reference. ==10877== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10877-- --10877-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10877== malloc/free: in use at exit: 36 bytes in 2 blocks. ==10877== malloc/free: 89 allocs, 87 frees, 1632 bytes allocated. ==10877== ==10877== searching for pointers to 2 not-freed blocks. ==10877== checked 2852348 bytes. Here is my output from 0.9.7f: --5448-- supp:4 dl_relocate_object ==5448== malloc/free: in use at exit: 0 bytes in 0 blocks. ==5448== malloc/free: 79 allocs, 79 frees, 2,640 bytes allocated. No leaks, but you are testing with 0.9.8, which version? I don't have time at the minute to investigate with 0.9.8. I'd also like to see a SSL_library_cleanup(), which in turn calls SSL_free_comp_methods() just to start the ball rolling on reducing the OpenSSL voodoo. This is still true. The cleanup is an area of the library which could do with the rough edges taking off it and the officially agreed methods to be documented in the same place as SSL_library_init(). With a call SSL_library_cleanup(), which internally calls EVP_cleanup() and SSL_free_comp_methods() do you see any harm or any scenario in which it might break. I feel that if used properly for every corresponding SSL_library_init(), it should not causse any problem. Agreed on this. A function SSL_library_cleanup() should be created, which will reverse the action of calling SSL_library_init() and using the OpenSSL library in any way possible. It is fine by me to document that the SSL_library_cleanup() is not thread-safe to clearly warn users to ensure they revert back to single threaded access to the OpenSSL library for calling cleanup functions. It should also be documented that the application code should have already free'ed any OpenSSL objects it created and retained before calling SSL_library_cleanup(). If should be documented that the state of the OpenSSL library reverts to the same state as when an process first loads the library, so the same rules apply again (i.e. its maybe necessary to call SSL_library_init() before making use of some OpenSSL provided functions). It is fine by me to implement the necessary locking within the sub-functions it calls to make those respective function thread-safe (since there maybe legitimate reasons to call them independently of SSL_library_cleanup() function) I think this is already done anyway. It is fine by me for the proposed SSL_free_comp_methods() to be implemented and also fine to remove the previously discussed first to lines from the posted patch which attempt to reduce lock contention and locking overhead. You will have to petition one of the project comitters to actually get any action on these points. Maybe posting a fresh patch which adds the above and the necessary new documentation will help get things moving. rant_modeI already have an outstanding contribution for the project that still remains unaddressed (see postings on SSL_shutdown()) so although this issue is something I'd normally pickup on and try to formalize (as I think I understand the issue well enough to do that) I can only contribute as fast as the consortium allows. My extra energy which would otherwise be spent assisting the greater good is eaten up by maintaining a local fork./rant_mode Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Mon, 02 Apr 2007 18:56:40 +0100 Nitin M wrote: Darryl, What is your opinion on this finding? As you were also keen on knowing the result of this experimentation. :) What is you opinion? Here is the valgrind output for the above program for your reference. ==10877== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10877-- --10877-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10877== malloc/free: in use at exit: 36 bytes in 2 blocks. ==10877== malloc/free: 89 allocs, 87 frees, 1632 bytes allocated. ==10877== ==10877== searching for pointers to 2 not-freed blocks. ==10877== checked 2852348 bytes. Here is my output from 0.9.7f: --5448-- supp:4 dl_relocate_object ==5448== malloc/free: in use at exit: 0 bytes in 0 blocks. ==5448== malloc/free: 79 allocs, 79 frees, 2,640 bytes allocated. No leaks, but you are testing with 0.9.8, which version? I am testing with 0.9.8d. I don't have time at the minute to investigate with 0.9.8. I'd also like to see a SSL_library_cleanup(), which in turn calls SSL_free_comp_methods() just to start the ball rolling on reducing the OpenSSL voodoo. This is still true. The cleanup is an area of the library which could do with the rough edges taking off it and the officially agreed methods to be documented in the same place as SSL_library_init(). With a call SSL_library_cleanup(), which internally calls EVP_cleanup() and SSL_free_comp_methods() do you see any harm or any scenario in which it might break. I feel that if used properly for every corresponding SSL_library_init(), it should not causse any problem. Agreed on this. A function SSL_library_cleanup() should be created, which will reverse the action of calling SSL_library_init() and using the OpenSSL library in any way possible. Just out of curiosity I wanted to know in case any one has any Idea why was the earlier patch submitted by Jonatahan Green was not included in mainstream? Any specific technical reason? It is fine by me to document that the SSL_library_cleanup() is not thread-safe to clearly warn users to ensure they revert back to single threaded access to the OpenSSL library for calling cleanup functions. It should also be documented that the application code should have already free'ed any OpenSSL objects it created and retained before calling SSL_library_cleanup(). If should be documented that the state of the OpenSSL library reverts to the same state as when an process first loads the library, so the same rules apply again (i.e. its maybe necessary to call SSL_library_init() before making use of some OpenSSL provided functions). It is fine by me to implement the necessary locking within the sub-functions it calls to make those respective function thread-safe (since there maybe legitimate reasons to call them independently of SSL_library_cleanup() function) I think this is already done anyway. It is fine by me for the proposed SSL_free_comp_methods() to be implemented and also fine to remove the previously discussed first to lines from the posted patch which attempt to reduce lock contention and locking overhead. U mean that these lines as per the previous arguments and discusssions is not required ? 1289 if (ssl_comp_methods == NULL) 1290 return; You will have to petition one of the project comitters to actually get any action on these points. Maybe posting a fresh patch which adds the above and the necessary new documentation will help get things moving. rant_modeI already have an outstanding contribution for the project that still remains unaddressed (see postings on SSL_shutdown()) so although this issue is something I'd normally pickup on and try to formalize (as I think I understand the issue well enough to do that) I can only contribute as fast as the consortium allows. My extra energy which would otherwise be spent assisting the greater good is eaten up by maintaining a local fork./rant_mode Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Sign in and get updated on all the action from Formula One http://content.msn.co.in/Sports/FormulaOne/Default __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Hi All! Any inputs on this finding? I would like to check with the core development team also, if their opinion differs on this, whether there is a need of SSL_free_comp_methods() funtion to cleanup the ssl_comp_methods? Darryl, What is your opinion on this finding? As you were also keen on knowing the result of this experimentation. :) Thanks in advance regards -Nitin From: Nitin M [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org CC: [EMAIL PROTECTED] Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 28 Mar 2007 04:14:17 + 1 #includeopenssl/ssl.h 2 int main() 3 { 4 SSL_library_init(); 5 6 /* thread-local cleanup */ 7 ERR_remove_state(0); 8 9 /* Globals we finished with */ 10 // my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); 11 // In the particular application I am keeping an RSA 12 // // object around. 13 // if(tmp_rsa != NULL) { 14 // RSA_free(tmp_rsa); 15 // tmp_rsa = NULL; 16 // } 17 18 19 // /* thread-safe cleanup */ 20 ENGINE_cleanup(); 21 CONF_modules_unload(1); 22 // 23 // /* global application exit cleanup (after all SSL activity is shutdown) */ 24 ERR_free_strings(); 25 EVP_cleanup(); 26 // SSL_free_comp_methods(); 27 CRYPTO_cleanup_all_ex_data(); 28 } I also tried with the above code and found that 36 bytes are still reachable. Then after retining only EVP_cleanup() from the clanup sequence of calls, I felt that EVP_clenup is doing the job of cleaning up most of what SSL_library_init() created. In one of your earlier posts you have said that while you use the mentioned sequence of cleanup calls you did not get any memory leak at all. You dont even get the Still reachable msg with that sequence? if this is true,is it possible for you to post a small example code here? Also I think the function my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); is your application specific function and since I did not have any RSA object I did not call RSA_free(tmp_rsa); With a call SSL_library_cleanup(), which internally calls EVP_cleanup() and SSL_free_comp_methods() do you see any harm or any scenario in which it might break. I feel that if used properly for every corresponding SSL_library_init(), it should not causse any problem. What is you opinion? Here is the valgrind output for the above program for your reference. ==10877== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10877-- --10877-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10877== malloc/free: in use at exit: 36 bytes in 2 blocks. ==10877== malloc/free: 89 allocs, 87 frees, 1632 bytes allocated. ==10877== ==10877== searching for pointers to 2 not-freed blocks. ==10877== checked 2852348 bytes. ==10877== ==10877== 36 bytes in 2 blocks are still reachable in loss record 1 of 1 ==10877==at 0x1B903CC8: malloc (vg_replace_malloc.c:131) ==10877==by 0x1B978CAA: default_malloc_ex (in /home/nitin/software/lib/libcrypto.so.0.9.8) ==10877== ==10877== LEAK SUMMARY: ==10877==definitely lost: 0 bytes in 0 blocks. ==10877==possibly lost: 0 bytes in 0 blocks. ==10877==still reachable: 36 bytes in 2 blocks. ==10877== suppressed: 0 bytes in 0 blocks. From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 28 Mar 2007 04:53:02 +0100 Nitin M wrote: I wrote a simple program like this to find out the possibility of a memory leak. The program goes like this. #includeopenssl/ssl.h int main() { SSL_library_init(); EVP_cleanup(); } when I run this programm through the valgrind I get the following output. Ok, it is not clear if tried out the cleanup sequence I posted and checked for results after using it ? If you are still getting leaks after; then the function SSL_free_comp_methods(); might be addressing a genuine leak which can't be reclaimed using any of the existing cleanup functions. So I'd vote for it to get more attention and be included. I'd also like to see a SSL_library_cleanup(), which in turn calls SSL_free_comp_methods() just to start the ball rolling on reducing the OpenSSL voodoo. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Spice up your IM conversations. New, colorful and animated emoticons. Get chatting! http://server1.msn.co.in/SP05/emoticons
RE: Memory Leaks in SSL_Library_init()
Richard Salz: Suppose another thread does this: *p=99; *p=98; Out of scope -- the C standard does not define ANY semantics for multiple threads of execution. Exactly. The original example was: extern volatile char* p; int i, j; i = *p; j = *p; The standard says that the compiler may not treat this as j = i = *p; *and that's all it says.* The only case in scope that I can think of is if the process catches a signal. I believe the standard does require that no matter when the process catches a signal, it must not be the case that j's value has changed but not i's. Because the standard has an 'as-if' rule, saying the compiler may not treat A as B only makes sense when you can show a detectable difference between A and B. In any event, you definitely won't get two fetches from memory on most modern computers. So if that's the requirement, everyone's violating it. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
On 3/28/07, Darryl Miles [EMAIL PROTECTED] wrote: Actually 'volatile' doesn't provide a useful fix. [...] The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast place to store data): int cached_copy=variable; if(cached_copy!=NULL) { acquire_lock(); cached_copy=variable; free(cached_copy); cached_copy=NULL; variable=cached_copy; release_lock(); } else variable=cached_copy; If you think this cannot happen, I defy you to explain to me why. What standard or guarantee is being violated? How can POSIX-compliant or C-compliant code break with this optimization? How can you say it can never be faster? This is the precise optimization that 'volatile' inhibits. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. -Kyle H __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
This is the precise optimization that 'volatile' inhibits. For single-threaded code, you are right. But we are talking about multi-threaded code. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. I had this exact same argument a long time ago about code like this: int foo; volatile int have_pointer=0; volatile int *pointer=NULL; In one thread: pointer=foo; have_pointer=1; In another: if(have_pointer!=0) *pointer++; All I could say was that this code breaks the standard. One thread accesses 'have_pointer' while another thread might be modifying it. Did I know how it could fail? No. In fact, nobody could think of a way it could fail because we didn't really realize that the ordering could break and not just the concurrent access/modification. Did it fail? Yes. I believe we first had faults on new UltraSparc CPUs (might have been new Alphas, I can't remember for sure). It was a classic case of worked on old CPUs, failed on new ones. The new ones had some write reordering capability. If you look at Itanium compilers, you will see that they don't put memory barriers between 'volatile' accesses. So 'volatile' does not provide ordering. It is known not to provide atomicity. There are platforms where a 'uint64_t' *can't* be written atomically, so what's a read of a volatile 'uint64_t' even supposed to do? So you have to be saying, you don't get atomicity or ordering, but you do get ... What exactly? And what standard says so? My copy of the pthreads standard says that accessing a variable in one thread while another thread is or might be modifying it is undefined behavior. IMO, it may break on a new CPU is totally unacceptable for code that is going to be distributed and especially for libraries with expected wide distribution. DS PS: There are many, many more stories of I couldn't think of any way it could break code breaking when the real world thought of ways. You have to code based on guarantees in standards, not your own lack of imagination. __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
If you have multiple threads accessing it, you manage access using a mutex. If locking is important to the application that it's in. (Clearing the compression is as important as clearing the library state. If there's a lock around the library state clearing, a lock needs to exist around the compression. If not, it's the same level of priority.) A read of a 'volatile uint64_t', btw, is supposed to make sure that it reads from the original memory locations, not cached copies of it in register or spread across multiple registers. Even if it can't be accessed atomically, the guarantee is that it will fetch from memory, so that it can be changed outside the current program flow (signal handlers, shm, mmap, and the like). Yes, it needs to be locked (all possible concurrent access needs to be locked, ideally, even though that's a performance hog). I'd not doubt that there are platforms where uint32_t can't be accessed atomically. Or even uint16_t. The question becomes, at what point does it become not-cost-effective to support platforms that cannot guarantee things that can be guaranteed by the platforms used by a majority of the user base? So, here's a novel idea: why don't you write a patch to clear the compression structs that adheres to your view of appropriate design, and submit it? -Kyle H On 3/29/07, David Schwartz [EMAIL PROTECTED] wrote: This is the precise optimization that 'volatile' inhibits. For single-threaded code, you are right. But we are talking about multi-threaded code. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. I had this exact same argument a long time ago about code like this: int foo; volatile int have_pointer=0; volatile int *pointer=NULL; In one thread: pointer=foo; have_pointer=1; In another: if(have_pointer!=0) *pointer++; All I could say was that this code breaks the standard. One thread accesses 'have_pointer' while another thread might be modifying it. Did I know how it could fail? No. In fact, nobody could think of a way it could fail because we didn't really realize that the ordering could break and not just the concurrent access/modification. Did it fail? Yes. I believe we first had faults on new UltraSparc CPUs (might have been new Alphas, I can't remember for sure). It was a classic case of worked on old CPUs, failed on new ones. The new ones had some write reordering capability. If you look at Itanium compilers, you will see that they don't put memory barriers between 'volatile' accesses. So 'volatile' does not provide ordering. It is known not to provide atomicity. There are platforms where a 'uint64_t' *can't* be written atomically, so what's a read of a volatile 'uint64_t' even supposed to do? So you have to be saying, you don't get atomicity or ordering, but you do get ... What exactly? And what standard says so? My copy of the pthreads standard says that accessing a variable in one thread while another thread is or might be modifying it is undefined behavior. IMO, it may break on a new CPU is totally unacceptable for code that is going to be distributed and especially for libraries with expected wide distribution. DS PS: There are many, many more stories of I couldn't think of any way it could break code breaking when the real world thought of ways. You have to code based on guarantees in standards, not your own lack of imagination. __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] -- -Kyle H __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
A read of a 'volatile uint64_t', btw, is supposed to make sure that it reads from the original memory locations, not cached copies of it in register or spread across multiple registers. Which it doesn't do on any platform I know of. On every platform, 'volatile' reads through the caches and 'volatile' writes are writes to a cache to be flushed to main memory when it's convenient. The rationale is this simple -- people use 'volatile' in very important cases, such as longjmp and signals. Anything that is very expensive to do that isn't needed in those cases simply will not be put into 'volatile'. If that means you don't get the multi-thread semantics you erroneously thought you guaranteed, that's your problem. Even if it can't be accessed atomically, the guarantee is that it will fetch from memory, so that it can be changed outside the current program flow (signal handlers, shm, mmap, and the like). The problem is that on modern machines fetch from memory is ill-defined because memory is not all in one place. What is a fetch from memory on a ccNUMA system? The 'volatile' keyword comes from the C standard. The C standard doesn't say anything about sharing memory between processes or threads. There are standards for doing that, and 'volatile' only means anything in those contexts if the standards that let you share memory or create threads say so. A multi-threaded standard could say that 'volatile' has some particular set of semantics. However, as far as I know, *none* *does*. At one time, people thought ordering was part of those semantics. Now there are significant platforms where it isn't. At one point, people thought some types of atomicity were part of those semantics. Now there are significant platforms where it isn't. The 'volatile' keyword is only required to work for signals and longjmp. It may or may not work in other cases. If you try to state precisely what semantics you think 'volatile' assures you, I think you'll find that it's impossible. And even if you did, next year it could go on the list of semantics people thought they had. In this case, the semantic you need is basically that a concurrent read/write will not get a value other than one in the variable before the write or the one after. We know for a fact 'volatile' doesn't provide this guarantee, since it doesn't provide it for 'uint64_t' on platforms that don't have atomic 64-bit writes. And it's not because it's impossible to provide it. Atomic 64-bit writes can be faked with spinlocks around all 64-bit atomic accesses. You know why those platforms don't provide those spinlocks? Because 'volatile' does not provide the read/write guarantee you need, so there is no reason to put them there. Yes, it needs to be locked (all possible concurrent access needs to be locked, ideally, even though that's a performance hog). Since a lock is both necessary and sufficient, what help is 'volatile'? I'd not doubt that there are platforms where uint32_t can't be accessed atomically. Or even uint16_t. The question becomes, at what point does it become not-cost-effective to support platforms that cannot guarantee things that can be guaranteed by the platforms used by a majority of the user base? No, that's not the question. The question is, are we going to follow the standards and use the guarantees we have or senselessly write code that relies on behavior that is explicitly undefined by the standards and risk having our code break on new CPUs? So, here's a novel idea: why don't you write a patch to clear the compression structs that adheres to your view of appropriate design, and submit it? I suggested either removing the locking entirely or simply removing the unlocked check for NULL. I think cutting/pasting a patch would be more work than simply making the change. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
A read of a 'volatile uint64_t', btw, is supposed to make sure that it reads from the original memory locations, not cached copies of it in register or spread across multiple registers. No. The computing model in ANSI/ISO C doesn't really go below the level of source code. Volatile only guarantees that the compiler will not remove reads/writes that, based on its analysis of the program source, it thinks are redundant. For example: volatile unsigned char* p = get_address(); *p = *p = 5 The compiler is not allowed to remove either assignment. int i = *p + *p; The compiler must dereference the pointer twice. There is no guarantee about caching, atomicity, etc. In fact, there is no mention of it. Even if it can't be accessed atomically, the guarantee is that it will fetch from memory, so that it can be changed outside the current program flow (signal handlers, shm, mmap, and the like). Yes, it needs to be locked (all possible concurrent access needs to be locked, ideally, even though that's a performance hog). This is mostly right, but the one wrong bit (the guarantee) is the most important part. -- STSM Senior Security Architect DataPower SOA Appliances __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Kyle Hamilton wrote: On 3/28/07, Darryl Miles [EMAIL PROTECTED] wrote: The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast place to store data): int cached_copy=variable; if(cached_copy!=NULL) { acquire_lock(); cached_copy=variable; free(cached_copy); cached_copy=NULL; variable=cached_copy; release_lock(); } else variable=cached_copy; If you think this cannot happen, I defy you to explain to me why. What standard or guarantee is being violated? How can POSIX-compliant or C-compliant code break with this optimization? How can you say it can never be faster? This is the precise optimization that 'volatile' inhibits. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. Agreed with the sorts of things volatile inhibits. But this whole tangent is irrelevant to the original posters code. The above pseudo code is not a valid interpretation. The last line David wrote variable=cached_copy; can't happen for the code: if(variable) { free(variable); variable=NULL; } Like David has claimed. A compiler will not generate a store instruction to put back a cached_copy into the variable location. Principally because there was no assignment operation in the original code and because even a non-optimizing compiler knows it can just dump the cached_copy temporary register on the floor. The conversation stemmed from the code: 1289 if (ssl_comp_methods == NULL) 1290 return; 1291 CRYPTO_w_lock(CRYPTO_LOCK_SSL); 1292 if (ssl_comp_methods != NULL) 1993 { ...SNIP... 1296 } 1297 CRYPTO_w_unlock(CRYPTO_LOCK_SSL); If ssl_comp_methods is NULL, then no store instruction to the memory location of ssl_comp_methods will ever happen. So nothing is subject to the so called lost write concurrency problem. So marking it 'volatile' will not gain the above code anything. But it will inhibit valid optimizations the compiler might make to all code that uses the variable 'ssl_comp_methods' that are inside the locked regions. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
This is the precise optimization that 'volatile' inhibits. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. Agreed with the sorts of things volatile inhibits. You are both wrong. The C standard is *silent* on the concept of underlying hardware. It only talks about the flow of control as defined in the module being translated. It created a new term, sequence point, to talk about control flow in an abstract way. The ISO C web site seems to be: http://www.open-std.org/jtc1/sc22/wg14/ and if you go there you can find copies of the standard and the committee mailings, e.g. /r$ -- STSM Senior Security Architect DataPower SOA Appliances __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Richard Salz wrote: This is the precise optimization that 'volatile' inhibits. 'volatile' requires that the value not be cached in cheap-to-access locations like registers, instead being re-loaded from expensive-to-access locations like the original memory -- because it may be changed from outside the control flow that the compiler knows about. Agreed with the sorts of things volatile inhibits. You are both wrong. Just to clarify what exactly your comment is aimed at. Is it still on the tangent subject of volatile usage ? or is this in relation to the original poster subject matter ? If its the meaning of volatile, I used the term sorts of to be deliberately vague to mean that Kyle is in the right ball park (which he is even under your double *p++ *p++ example I can relate that to what Kyle wrote above) and as it is a tangent subject the main point of my comment was to close the tangent down; so thats that on volatile usage from me. However if your comment is addressed at the more useful part of my replies (the bit you didn't quote) then I am definitely interested if you would elaborate on that. The C standard is *silent* on the concept of underlying hardware. It only talks about the flow of control as defined in the module being translated. It created a new term, sequence point, to talk about control flow in an abstract way. I can't actually relate that comment to either discussion. No one is disputing the definition of flow control. No one is claiming that the C standard defines anything at the hardware layer. When the hardware layer has been talked about it was either cited as an example of usage (not cited as the definition); or it was talked about in relation to what a compiler does and does not do when it generates assembly language instructions. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Darryl Mile wrote: A compiler will not generate a store instruction to put back a cached_copy into the variable location. Principally because there was no assignment operation in the original code and because even a non-optimizing compiler knows it can just dump the cached_copy temporary register on the floor. That may be true, but it doesn't help. The compiler generating a store instruction is only one way a cached copy can be written back to memory. Anything a compiler can do could also be done by the CPU or memory hardware. Only operations with specific guaranteed semantics solve that problem. The conversation stemmed from the code: 1289 if (ssl_comp_methods == NULL) 1290 return; 1291 CRYPTO_w_lock(CRYPTO_LOCK_SSL); 1292 if (ssl_comp_methods != NULL) 1993 { ...SNIP... 1296 } 1297 CRYPTO_w_unlock(CRYPTO_LOCK_SSL); If ssl_comp_methods is NULL, then no store instruction to the memory location of ssl_comp_methods will ever happen. So nothing is subject to the so called lost write concurrency problem. You are correct that no store *instruction* will happen, but that doesn't mean the CPU won't generate a store anyway. While it's true that current caches keep track of whether data is modified and don't write back data unless it is, that doesn't mean future CPUs might not find it more efficient to avoid having to keep the flag and write back in all cases. What if the flag gets to be expensive but write operations are cheap? There is precedent. On the x86, for example, at least some compare-exchange operations will write back the original data if the compare fails. In other words, if you ask for: if(a==7) a=3; You might get (in *microcode* even though the instructions say what you want): a = (a==7) ? 3 : a; This can be an optimization because the the code as specified requires a micro-branch internal to the compare-exchange instruction, which can be expensive if mispredicted. The unconditional store avoids the branch and can be implemented mathematically. If the compiler was designed before anyone knew a CPU might do this, it could easily generate a compare-exchange instruction for a volatile variable that had an access of this form. The compiler author knows the CPU follows the requested code precisely. The next CPU, however, makes the store unconditional as an optimization because a branch is cheaper than a store. So sorry, your new CPU breaks your old executables. The code we are talking about is of this precise form. It compares the value of a pointer and changes the value of that pointer only if the pointer was not NULL to begin with. Now the code we are talking about contains locks inside the conditional, so it will not fail this exact way. (I believe the x86 will only do this for bus-locked instructions anyway, but there is no reason why the next x86 couldn't do the same thing for non-bus-locked transactions too.) You argument is of the form I can't think of any way it could fail. This says more about your imagination than the validity of the code. ;) I agree none of the way I can think of that it can fail seem particularly likely. The problem is the ways it can fail that neither of us can think of -- until it fails. Experience has shown me (and I cited a few examples in this thread) that will be ways it can fail you can't think of. That is why standards provide guarantees. So marking it 'volatile' will not gain the above code anything. But it will inhibit valid optimizations the compiler might make to all code that uses the variable 'ssl_comp_methods' that are inside the locked regions. I agree with that. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
I was not commenting on any part of the message that I didn't quote. :) Kyle's claim about things like cache's and registers is wrong, not even sort-of right. The standard talks about only in terms of sequence points, and volatile limits what can be done in terms of sequence points. So extern volatile char* p; int i, j; i = *p; j = *p; The standard says that the compiler may not treat this as j = i = *p; *and that's all it says.* -- STSM Senior Security Architect DataPower SOA Appliances __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Richard Salz wrote: Kyle's claim about things like cache's and registers is wrong, not even sort-of right. The standard talks about only in terms of sequence points, and volatile limits what can be done in terms of sequence points. So extern volatile char* p; int i, j; i = *p; j = *p; The standard says that the compiler may not treat this as j = i = *p; *and that's all it says.* Suppose another thread does this: *p=99; *p=98; Would a compiler that let another code running your code above get 'i' equal to 98 and 'p' equal to 99 be broken in your opinion? I've seen code generated on weakly-ordered memory platforms, and memory barriers are not placed between stores to volatiles. The problem is that it is not well-defined to talk about the order in which operations take place. The order can be different depending upon where you look (one can be first in the instruction stream but last out of the write posting buffer). The standard is discussed in terms of an abstract machine and the machine the code is running on may bear only as much resemblance to the abstract machine as is needed to make strictly-compliant code work. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Suppose another thread does this: *p=99; *p=98; Out of scope -- the C standard does not define ANY semantics for multiple threads of execution. The standard is discussed in terms of an abstract machine and the machine the code is running on may bear only as much resemblance to the abstract machine as is needed to make strictly-compliant code work. Yes. Google for mashey volatile will turn up some interesting stories from an expert, including this one: http://yarchive.net/comp/volatile.html /r$ -- STSM Senior Security Architect DataPower SOA Appliances __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Kyle Hamilton wrote: Oh. I'm sorry. Someone needs to use a keyword 'volatile'. Bingo. Problem solved on the improper optimization issue. Actually 'volatile' doesn't provide a useful fix. Ignoring the fact there is no problem with any CPU that currently exists. The volatile is most useful for memory mapped I/O where touching a memory location (for load or store) can have side effects on the I/O device. So 'volatile' can be used to enforce a strict access policy based on the code as written (as opposed to how the compiler might re-order load/stores if it was regular memory). My points here are that even under the best optimizing compiler following 100% safe optimizations will never have a problem with the original posters code. There is no need to use volatile, it provide no benefit and will impact correct optimizations within code making use of the variable. David seems to be thinking ahead into the realms of CPUs that have not been invented yet. I'm don't give two hoots about those CPUs and I certainly wouldn't waste brain cells trying to make my code portable to a piece of hardware that does not exist. On 3/27/07, David Schwartz [EMAIL PROTECTED] wrote: For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Not this old chestnut again. Like it or not, it's a fact. I would disagree. Its merely your opinion; and in my opinion your understanding of C, CPU architecture, CPU cache cohearancy and memory access isn't as good as you think it is. You did point out a scenario that was not at all relevant to the situation in the code of the original post. Of course, that's because those CPUs don't exist yet. But I expect code that I write and compile today to work on future CPUs because I don't rely on guarantees I don't actually ahve. LOL. Hey that trans-wrap-cpu which converts charged ions into bits uses an equally spartan language better able describe all those new features it brings to the party. Umm, what CPU? *Every* CPU OpenSSL does or might ever run on? And that would have to include every compiler OpenSSL does or might support, because the compiler can change things around too. I already pointed out the hows and whys the compiler CAN NOT change anything around in the situation embodied in the original post. The function entrance, the return and the CRYPTO_w_lock() calls all act as the correct memory barriers with regards to compiler optimizations and load/store rescheduling concerns. Even if the CRYPTO_w_lock() is a macro or inline, since the writer would have been sure to insert the correct barriers if he chose to implement CRYPTO_w_lock() as a macro/inline. Where is that documented or guaranteed? You mean that they *happen* to be barriers on the platforms you have used. You have no idea what the next version of GCC might do. We are talking about saving *NOTHING* here. Its documented in the multi-CPU erratum for each CPU that can be wired up to access the same memory. There are different levels to the entire problem area, there is the electrical and timing of the memory bus access to the same memory location, there is the area of CPU cache and shared CPU cache cohearancy. There are all documented for each architecture. Then there is the area of the C language optimizing compiler. Which is less documented as its the language constructs which are documented rather than the optimization. The barriers being discussed in this point are those for the compiler that affect code generation and how load/store operations are managed. But in principle the C language is a sequential language, that means the observable order of events that occurs is STRICTLY in the order laid out in the source code. Any optimization that defies that basic principal is a bug. The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast place to store data): int cached_copy=variable; if(cached_copy!=NULL) { acquire_lock(); cached_copy=variable; free(cached_copy); cached_copy=NULL; variable=cached_copy; release_lock(); } else variable=cached_copy; If you think this cannot happen, I defy you to explain to me why. What standard or guarantee is being violated? How can POSIX-compliant or C-compliant code break with this optimization? How can you say it can never be faster? I presume in this example the variable 'cached_copy' is not a real variable (as in it does not exist in the original C source) but a compiler generated variable during the optimization process. It can not happen because when the compiler makes a function call to acquire_lock() (in the o/p this is CRYPTO_w_lock(); by definition of C
RE: Memory Leaks in SSL_Library_init()
So the point you are trying to make is, while the function would solve the purpose of freeing the compression methods, However the lock are not really required in the usage secnario of this function? If the usage scenario is solely final shutdown of the library, then the lock is not required. If the usage scenario is larger than that, then the double-checked locking should be removed and all tests should be done under the lock. The current code is an insensible compromise. The double-checked locking makes no sense in either usage scenario. If it's purely a shutdown function, then the optimization is a pure complication -- there's no need to acquire the lock in any case. If it's not purely a shutdown function, then the optimization is prohibited by the pthreads standard and its behavior is undefined and it could break on future CPUs, compilers, or platforms. I admit it's unlikely to break on future x86 platforms -- too much existing x86-only code would break. But I have seen similar types of 'optimizations' break on new CPUs when those CPUs used optimizations that were not imaginable at the time the code was written. OpenSSL has a way for platforms to provide safe multi-threading primitives. If additional primitives are needed or beneficial for optimization, they should be provided the same way. OpenSSL itself should not make assumptions that are specifically prohibited by the standards it is designed to work with. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
David seems to be thinking ahead into the realms of CPUs that have not been invented yet. Exactly. That's why there are standards and guarantees. If you follow the standards and rely on the guarantees you have, your code will work on all future platforms that provide those same guarantees you have. At one time, the idea that a compiler would enforce the C standard's aliasing rules was inconceivable. Did a lot of code break on newer compilers that relied on them in optimization? Yes, it did. At one time, the idea that a CPU would reorder writes that had been placed in order in assembly code was inconceivable. Did some code break on newer CPUs that had posted write buffers? Yes, it did. I'm don't give two hoots about those CPUs and I certainly wouldn't waste brain cells trying to make my code portable to a piece of hardware that does not exist. Forgive me for being blunt, but that's a totally idiotic attitude. I for one don't expect to have to recompile or even debug my software when I upgrade my CPU. It is unreasonable to design code so that it breaks on new CPUs, and when you rely on behavior that is not documented or guaranteed, that's exactly what happens. I understand that you think that this one assumption is safe and won't break. But I can tell you from a long history of such assumptions that they are *not* safe and they *do* break. Can I tell you how and when this particular one will break? No, I can't. Have I seen many similar assumptions break in the past? Yes, I have. I have worked as a professional programmer for more than 20 years and have worked directly for or consulted for about 8 companies that develop software as a primary or significant aspect of their business. This type of code violates the published coding standards of the six of those companies that have them. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 21 Mar 2007 11:12:38 + Nitin M wrote: Does this mean that in such scenario the application need not call SSL_library_init since it does not need those extra initialisations and can achieve only the required initialisations with specific calls? If this is true I have two more questions here, 1. In what scenario then an application should use SSL_library_init()? I think not for the simplest use of SSL/Crypto functionality in the application. 2. If SSL_library_init() should be called only once in the lifetime of a process and it is is creating some global data structures only once within that process, why cant we have a function to clear all these global variables which can be called at the the process termination time? If you are using SSL then you are touching many parts of the OpenSSL library as all the digests and crypto algorithms are pulled in to allow them to be available for selection and negotiation. Most users want to simple few function calls to auto-register all those methods. The simplest use of OpenSSL is more inline with how I explained, an app just wants to use a single digest algorithm (like MD5 or SHA1, but not SSL). As I still maintain you need to provide information about the leak(s) in order to progress. Hi! I wrote a simple program like this to find out the possibility of a memory leak. The program goes like this. #includeopenssl/ssl.h int main() { SSL_library_init(); EVP_cleanup(); } when I run this programm through the valgrind I get the following output. ==10036== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10036-- --10036-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10036== malloc/free: in use at exit: 36 bytes in 2 blocks. ==10036== malloc/free: 87 allocs, 85 frees, 1472 bytes allocated. ==10036== ==10036== searching for pointers to 2 not-freed blocks. ==10036== checked 2852348 bytes. ==10036== ==10036== 36 bytes in 2 blocks are still reachable in loss record 1 of 1 ==10036==at 0x1B903CC8: malloc (vg_replace_malloc.c:131) ==10036==by 0x1B978CAA: default_malloc_ex (in /home/nitin/software/lib/libcrypto.so.0.9.8) ==10036== ==10036== LEAK SUMMARY: ==10036==definitely lost: 0 bytes in 0 blocks. ==10036==possibly lost: 0 bytes in 0 blocks. ==10036==still reachable: 36 bytes in 2 blocks. ==10036== suppressed: 0 bytes in 0 blocks. This shows that 36 bytes in 2 block are still reachable. In Valgrinds language A pointer to the start of the block is found. This usually indicates programming sloppiness. Since the block is still pointed at, the programmer could, at least in principle, free it before program exit. But I think these would get accumulated over a period of time if the application uses pluggin DLLs which inturn use OpenSSL and it is not known how often the plugin DLLs are loaded and unloaded. After further looking into source I found that there is no way/function for the The SSL compression methods global stack (ssl_comp_methods) to be freed. I added a small function SSL_free_comp_methods to cleanup this stack based on the patch submitted by Jonathan Green. http://marc.info/?l=openssl-devm=112200683926727w=2 1287 void SSL_free_comp_methods(void) 1288 { 1289 if (ssl_comp_methods == NULL) 1290 return; 1291 CRYPTO_w_lock(CRYPTO_LOCK_SSL); 1292 if (ssl_comp_methods != NULL) 1293 { 1294 sk_SSL_COMP_pop_free(ssl_comp_methods,CRYPTO_free); 1295 ssl_comp_methods = NULL; 1296 } 1297 CRYPTO_w_unlock(CRYPTO_LOCK_SSL); 1298 } So when I call this function also in the application program in addition to EVP_cleanup(), #includeopenssl/ssl.h int main() { SSL_library_init(); EVP_cleanup(); SSL_free_comp_methods(); } the valgrind output is like this ==10055== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10055-- --10055-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10055== malloc/free: in use at exit: 0 bytes in 0 blocks. ==10055== malloc/free: 87 allocs, 87 frees, 1472 bytes allocated. ==10055== ==10055== No malloc'd blocks -- no leaks are possible. Which looks like a more cleaner output. I was just wondering if this issue has been higlighted in some earlier posts and the patch submitted looks like solving the problem, technically was there any problem in including it into mainstream code? Application usages goes like this: * Application initializes the OpenSSL libary. * Application creates and uses data object with the OpenSSL library. * Application wants to shutdown. * Application destroys all OpenSSL data objects it is holding. * Application calls cleanup routines in the OpenSSL library. This is pretty standard stuff, what David maybe highlighting is that many
RE: Memory Leaks in SSL_Library_init()
1287 void SSL_free_comp_methods(void) 1288 { 1289 if (ssl_comp_methods == NULL) 1290 return; 1291 CRYPTO_w_lock(CRYPTO_LOCK_SSL); 1292 if (ssl_comp_methods != NULL) 1293 { 1294 sk_SSL_COMP_pop_free(ssl_comp_methods,CRYPTO_free); 1295 ssl_comp_methods = NULL; 1296 } 1297 CRYPTO_w_unlock(CRYPTO_LOCK_SSL); 1298 } For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Alternatively, lines 1288, 1289, 1290, and 1297 should be removed. It's hard to understand any reasonable scenario in which one thread might be calling SSL_free_comp_methods while another thread is or might be accessing the ssl_comp_methods variable or any of the registered compression methods. But the code as posted is an unreasonable compromise between two reasonable positions. It is sort of safe if called while another thread is or might be accesing ssl_comp_methods. If you think there is no way this code could fail, imagine a hypothetical compiler/platform in which if(ssl_comp_methods == NULL) translates into an operation through a fast caching shadow register like this: shadow=ssl_comp_methods; if(ssl_comp_methods==NULL) { ssl_comp_methods=shadow; /* flush cache, writing back entries */ /* whatever */ } If another thread modified 'ssl_comp_methods' in between the read and the cache flush, that modification would be lost. I, for one, don't like needlessly making code potentially disastrous on future CPUs and compilers. (To save what?!) ... the evils of premature optimization. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
David Schwartz wrote: 1287 void SSL_free_comp_methods(void) 1288 { 1289 if (ssl_comp_methods == NULL) 1290 return; 1291 CRYPTO_w_lock(CRYPTO_LOCK_SSL); 1292 if (ssl_comp_methods != NULL) 1293 { 1294 sk_SSL_COMP_pop_free(ssl_comp_methods,CRYPTO_free); 1295 ssl_comp_methods = NULL; 1296 } 1297 CRYPTO_w_unlock(CRYPTO_LOCK_SSL); 1298 } For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Not this old chestnut again. I can't name a CPU in which an aligned load/store of a natural/primitive data size between overlapping threads actually has undefined behavior. The underlying assembly instructions generated by the C compiler have a defined behavior on multi-CPU systems. Due to this defined behavior of the underlying CPU at a lower level than the scope of the POSIX specification is designed to cover you should treat this behavior as defined. This would not contradict POSIX which is effectively saying out of the scope of the POSIX spec by using the term undefined. The function entrance, the return and the CRYPTO_w_lock() calls all act as the correct memory barriers with regards to compiler optimizations and load/store rescheduling concerns. Even if the CRYPTO_w_lock() is a macro or inline, since the writer would have been sure to insert the correct barriers if he chose to implement CRYPTO_w_lock() as a macro/inline. Alternatively, lines 1288, 1289, 1290, and 1297 should be removed. It's hard to understand any reasonable scenario in which one thread might be calling SSL_free_comp_methods while another thread is or might be accessing the ssl_comp_methods variable or any of the registered compression methods. I would agree with your overall view, but I would on the basis that this function is not performance critical to deserve any special lock contention avoidance treatment. But there is a case for locking to not be necessary at all. If the SSL_free_comp_methods() is only designed to be used during the cleanup phase. During a clean shutdown sequence of the OpenSSL library I'm pretty sure you are meant to have reverted the OpenSSL call execution path to single threaded mode. Meaning while you are executing any cleanup functions it is not safe to be calling any other OpenSSL functions at the same time. By virtue of this requirement locking should not be necessary (but should do more good than harm if implemented). If your application needs to repeatedly init and cleanup, init and cleanup, then your application needs to also manage the state and the serialization of the init/cleanup calls (since they may not be safe to be called out-of-order, called twice, interleved, etc...). But the code as posted is an unreasonable compromise between two reasonable positions. It is sort of safe if called while another thread is or might be accesing ssl_comp_methods. This I'm not so sure I agree with; the It is 'sort of' safe if called while another thread ... part (see the last 2 paragraphs of mine above). If you think there is no way this code could fail, imagine a hypothetical compiler/platform in which if(ssl_comp_methods == NULL) translates into an operation through a fast caching shadow register like this: shadow=ssl_comp_methods; if(ssl_comp_methods==NULL) { ssl_comp_methods=shadow; /* flush cache, writing back entries */ /* whatever */ } If another thread modified 'ssl_comp_methods' in between the read and the cache flush, that modification would be lost. I, for one, don't like needlessly making code potentially disastrous on future CPUs and compilers. (To save what?!) ... the evils of premature optimization. No as I said by virtue of it being a function it already forces the compiler to perform the correct barriers. Maybe you can name a CPU target which does not automatically flush pending writes for return from function and call a function instructions, I can't. The example you cite is not valid for line 1289 and 1290. The only write operation to 'ssl_comp_methods' occurs inside the locked region and after the text condition has been re-evaluated. So your pseudo-code serves a useful purpose to illustrate to others on the list about the concern you raise, but the concern itself is not relevant to the example given. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Nitin M wrote: I wrote a simple program like this to find out the possibility of a memory leak. The program goes like this. #includeopenssl/ssl.h int main() { SSL_library_init(); EVP_cleanup(); } when I run this programm through the valgrind I get the following output. Ok, it is not clear if tried out the cleanup sequence I posted and checked for results after using it ? If you are still getting leaks after; then the function SSL_free_comp_methods(); might be addressing a genuine leak which can't be reclaimed using any of the existing cleanup functions. So I'd vote for it to get more attention and be included. I'd also like to see a SSL_library_cleanup(), which in turn calls SSL_free_comp_methods() just to start the ball rolling on reducing the OpenSSL voodoo. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
1 #includeopenssl/ssl.h 2 int main() 3 { 4 SSL_library_init(); 5 6 /* thread-local cleanup */ 7 ERR_remove_state(0); 8 9 /* Globals we finished with */ 10 // my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); 11 // In the particular application I am keeping an RSA 12 // // object around. 13 // if(tmp_rsa != NULL) { 14 // RSA_free(tmp_rsa); 15 // tmp_rsa = NULL; 16 // } 17 18 19 // /* thread-safe cleanup */ 20 ENGINE_cleanup(); 21 CONF_modules_unload(1); 22 // 23 // /* global application exit cleanup (after all SSL activity is shutdown) */ 24 ERR_free_strings(); 25 EVP_cleanup(); 26 // SSL_free_comp_methods(); 27 CRYPTO_cleanup_all_ex_data(); 28 } I also tried with the above code and found that 36 bytes are still reachable. Then after retining only EVP_cleanup() from the clanup sequence of calls, I felt that EVP_clenup is doing the job of cleaning up most of what SSL_library_init() created. In one of your earlier posts you have said that while you use the mentioned sequence of cleanup calls you did not get any memory leak at all. You dont even get the Still reachable msg with that sequence? if this is true,is it possible for you to post a small example code here? Also I think the function my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); is your application specific function and since I did not have any RSA object I did not call RSA_free(tmp_rsa); With a call SSL_library_cleanup(), which internally calls EVP_cleanup() and SSL_free_comp_methods() do you see any harm or any scenario in which it might break. I feel that if used properly for every corresponding SSL_library_init(), it should not causse any problem. What is you opinion? Here is the valgrind output for the above program for your reference. ==10877== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 17 from 1) --10877-- --10877-- supp: 17 Ugly strchr error in /lib/ld-2.3.2.so ==10877== malloc/free: in use at exit: 36 bytes in 2 blocks. ==10877== malloc/free: 89 allocs, 87 frees, 1632 bytes allocated. ==10877== ==10877== searching for pointers to 2 not-freed blocks. ==10877== checked 2852348 bytes. ==10877== ==10877== 36 bytes in 2 blocks are still reachable in loss record 1 of 1 ==10877==at 0x1B903CC8: malloc (vg_replace_malloc.c:131) ==10877==by 0x1B978CAA: default_malloc_ex (in /home/nitin/software/lib/libcrypto.so.0.9.8) ==10877== ==10877== LEAK SUMMARY: ==10877==definitely lost: 0 bytes in 0 blocks. ==10877==possibly lost: 0 bytes in 0 blocks. ==10877==still reachable: 36 bytes in 2 blocks. ==10877== suppressed: 0 bytes in 0 blocks. From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 28 Mar 2007 04:53:02 +0100 Nitin M wrote: I wrote a simple program like this to find out the possibility of a memory leak. The program goes like this. #includeopenssl/ssl.h int main() { SSL_library_init(); EVP_cleanup(); } when I run this programm through the valgrind I get the following output. Ok, it is not clear if tried out the cleanup sequence I posted and checked for results after using it ? If you are still getting leaks after; then the function SSL_free_comp_methods(); might be addressing a genuine leak which can't be reclaimed using any of the existing cleanup functions. So I'd vote for it to get more attention and be included. I'd also like to see a SSL_library_cleanup(), which in turn calls SSL_free_comp_methods() just to start the ball rolling on reducing the OpenSSL voodoo. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Spice up your IM conversations. New, colorful and animated emoticons. Get chatting! http://server1.msn.co.in/SP05/emoticons/ __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Not this old chestnut again. Like it or not, it's a fact. I can't name a CPU in which an aligned load/store of a natural/primitive data size between overlapping threads actually has undefined behavior. The underlying assembly instructions generated by the C compiler have a defined behavior on multi-CPU systems. Of course, that's because those CPUs don't exist yet. But I expect code that I write and compile today to work on future CPUs because I don't rely on guarantees I don't actually ahve. Due to this defined behavior of the underlying CPU at a lower level than the scope of the POSIX specification is designed to cover you should treat this behavior as defined. This would not contradict POSIX which is effectively saying out of the scope of the POSIX spec by using the term undefined. Umm, what CPU? *Every* CPU OpenSSL does or might ever run on? And that would have to include every compiler OpenSSL does or might support, because the compiler can change things around too. The function entrance, the return and the CRYPTO_w_lock() calls all act as the correct memory barriers with regards to compiler optimizations and load/store rescheduling concerns. Even if the CRYPTO_w_lock() is a macro or inline, since the writer would have been sure to insert the correct barriers if he chose to implement CRYPTO_w_lock() as a macro/inline. Where is that documented or guaranteed? You mean that they *happen* to be barriers on the platforms you have used. You have no idea what the next version of GCC might do. We are talking about saving *NOTHING* here. Alternatively, lines 1288, 1289, 1290, and 1297 should be removed. It's hard to understand any reasonable scenario in which one thread might be calling SSL_free_comp_methods while another thread is or might be accessing the ssl_comp_methods variable or any of the registered compression methods. I would agree with your overall view, but I would on the basis that this function is not performance critical to deserve any special lock contention avoidance treatment. Exactly. This is the worst kind of premature optimization. But there is a case for locking to not be necessary at all. If the SSL_free_comp_methods() is only designed to be used during the cleanup phase. During a clean shutdown sequence of the OpenSSL library I'm pretty sure you are meant to have reverted the OpenSSL call execution path to single threaded mode. Right. You can't let one thread unconditionally shut down a library while another thread is or might be using it. However, it's possible this function was also intended to 'unregister' the compression functions without harming the library. In that case, the lock is needed and the optimization is broken. Meaning while you are executing any cleanup functions it is not safe to be calling any other OpenSSL functions at the same time. By virtue of this requirement locking should not be necessary (but should do more good than harm if implemented). If your application needs to repeatedly init and cleanup, init and cleanup, then your application needs to also manage the state and the serialization of the init/cleanup calls (since they may not be safe to be called out-of-order, called twice, interleved, etc...). It will do more good than harm if the locks haven't been cleaned up! If another thread modified 'ssl_comp_methods' in between the read and the cache flush, that modification would be lost. I, for one, don't like needlessly making code potentially disastrous on future CPUs and compilers. (To save what?!) ... the evils of premature optimization. No as I said by virtue of it being a function it already forces the compiler to perform the correct barriers. Maybe you can name a CPU target which does not automatically flush pending writes for return from function and call a function instructions, I can't. The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast place to store data): int cached_copy=variable; if(cached_copy!=NULL) { acquire_lock(); cached_copy=variable; free(cached_copy); cached_copy=NULL; variable=cached_copy; release_lock(); } else variable=cached_copy; If you think this cannot happen, I defy you to explain to me why. What standard or guarantee is being violated? How can POSIX-compliant or C-compliant code break with this optimization? How can you say it can never be faster? The example you cite is not valid for line 1289 and 1290. The only write operation to 'ssl_comp_methods' occurs inside the locked region and after the text condition has been re-evaluated. So
Re: Memory Leaks in SSL_Library_init()
Oh. I'm sorry. Someone needs to use a keyword 'volatile'. Bingo. Problem solved on the improper optimization issue. Can we commit the patch so that we don't have to keep getting hit by 2 or 3 threads about valgrind complaining about reachable pointers at the end of program execution every month? This /is/ an OpenSSL library bug, and it /does/ need to be addressed -- regardless of it's only 36 bytes, relying on an operating system cleaning up after you is sloppy programming at best and bug-laden at worst. Remember, not all architectures have memory management units, either. -Kyle H On 3/27/07, David Schwartz [EMAIL PROTECTED] wrote: For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Not this old chestnut again. Like it or not, it's a fact. I can't name a CPU in which an aligned load/store of a natural/primitive data size between overlapping threads actually has undefined behavior. The underlying assembly instructions generated by the C compiler have a defined behavior on multi-CPU systems. Of course, that's because those CPUs don't exist yet. But I expect code that I write and compile today to work on future CPUs because I don't rely on guarantees I don't actually ahve. Due to this defined behavior of the underlying CPU at a lower level than the scope of the POSIX specification is designed to cover you should treat this behavior as defined. This would not contradict POSIX which is effectively saying out of the scope of the POSIX spec by using the term undefined. Umm, what CPU? *Every* CPU OpenSSL does or might ever run on? And that would have to include every compiler OpenSSL does or might support, because the compiler can change things around too. The function entrance, the return and the CRYPTO_w_lock() calls all act as the correct memory barriers with regards to compiler optimizations and load/store rescheduling concerns. Even if the CRYPTO_w_lock() is a macro or inline, since the writer would have been sure to insert the correct barriers if he chose to implement CRYPTO_w_lock() as a macro/inline. Where is that documented or guaranteed? You mean that they *happen* to be barriers on the platforms you have used. You have no idea what the next version of GCC might do. We are talking about saving *NOTHING* here. Alternatively, lines 1288, 1289, 1290, and 1297 should be removed. It's hard to understand any reasonable scenario in which one thread might be calling SSL_free_comp_methods while another thread is or might be accessing the ssl_comp_methods variable or any of the registered compression methods. I would agree with your overall view, but I would on the basis that this function is not performance critical to deserve any special lock contention avoidance treatment. Exactly. This is the worst kind of premature optimization. But there is a case for locking to not be necessary at all. If the SSL_free_comp_methods() is only designed to be used during the cleanup phase. During a clean shutdown sequence of the OpenSSL library I'm pretty sure you are meant to have reverted the OpenSSL call execution path to single threaded mode. Right. You can't let one thread unconditionally shut down a library while another thread is or might be using it. However, it's possible this function was also intended to 'unregister' the compression functions without harming the library. In that case, the lock is needed and the optimization is broken. Meaning while you are executing any cleanup functions it is not safe to be calling any other OpenSSL functions at the same time. By virtue of this requirement locking should not be necessary (but should do more good than harm if implemented). If your application needs to repeatedly init and cleanup, init and cleanup, then your application needs to also manage the state and the serialization of the init/cleanup calls (since they may not be safe to be called out-of-order, called twice, interleved, etc...). It will do more good than harm if the locks haven't been cleaned up! If another thread modified 'ssl_comp_methods' in between the read and the cache flush, that modification would be lost. I, for one, don't like needlessly making code potentially disastrous on future CPUs and compilers. (To save what?!) ... the evils of premature optimization. No as I said by virtue of it being a function it already forces the compiler to perform the correct barriers. Maybe you can name a CPU target which does not automatically flush pending writes for return from function and call a function instructions, I can't. The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast
RE: Memory Leaks in SSL_Library_init()
Oh. I'm sorry. Someone needs to use a keyword 'volatile'. Sorry, doesn't help. Bingo. Problem solved on the improper optimization issue. What specification says that 'volatile' causes any particular semantics across threads? I must not have read that one. The 'volatile' keyword is only defined in the C and C++ standards, and neither of those address multi-threading. You might think that multi-threading standards would extend 'volatile' to have multi-thread semantics. The fact is, they don't. If you can cite one threading standard that specifies semantics for 'volatile', please do so. The reason they don't is quite logical. First, it penalizes code that uses 'volatile' for its documented purposes (such as 'longjmp' and signals). Second, they instead provide much more flexible primitives (such as mutexes) which solve a larger range of problems than a variable qualifier ever should. In short, *nobody* has any idea what effect 'volatile' might have on muli-threaded code on future compilers and future CPUs because *no* standard says what it should do. For example, Itanium compilers do not put memory barriers before or after accesses to 'volatile' variables precisely because it heavily penalizes sensible code and better ways to access shared variables are provided. Can we commit the patch so that we don't have to keep getting hit by 2 or 3 threads about valgrind complaining about reachable pointers at the end of program execution every month? This /is/ an OpenSSL library bug, and it /does/ need to be addressed -- regardless of it's only 36 bytes, relying on an operating system cleaning up after you is sloppy programming at best and bug-laden at worst. Where these kinds of things can easily be fixed, they *definitely* should be. Anyone who doesn't want to waste time cleaning up doesn't have to call the cleanup functions at all. I do not in any way oppose making clean up functions do a better job of cleaning up where that is essentially free. Remember, not all architectures have memory management units, either. Even architectures without memory management units keep track of what memory belongs to a process and free it when that process terminates. It's hard to imagine an architecture that couldn't do that where you wouldn't just keep your libraries spun up all the time. I can't think of any. In any event, on such a platform, you probably couldn't use much commodity software. The trade-offs in design would skew very differently. There are SSL libraries aimed at that market, and I don't think OpenSSL is really trying to do that. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
So the point you are trying to make is, while the function would solve the purpose of freeing the compression methods, However the lock are not really required in the usage secnario of this function? From: David Schwartz [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: RE: Memory Leaks in SSL_Library_init() Date: Tue, 27 Mar 2007 21:23:45 -0700 For POSIX threads, the result of reading a variable in one thread while it might be modified in another thread is undefined. Line 1289 and 1290 should be removed. Not this old chestnut again. Like it or not, it's a fact. I can't name a CPU in which an aligned load/store of a natural/primitive data size between overlapping threads actually has undefined behavior. The underlying assembly instructions generated by the C compiler have a defined behavior on multi-CPU systems. Of course, that's because those CPUs don't exist yet. But I expect code that I write and compile today to work on future CPUs because I don't rely on guarantees I don't actually ahve. Due to this defined behavior of the underlying CPU at a lower level than the scope of the POSIX specification is designed to cover you should treat this behavior as defined. This would not contradict POSIX which is effectively saying out of the scope of the POSIX spec by using the term undefined. Umm, what CPU? *Every* CPU OpenSSL does or might ever run on? And that would have to include every compiler OpenSSL does or might support, because the compiler can change things around too. The function entrance, the return and the CRYPTO_w_lock() calls all act as the correct memory barriers with regards to compiler optimizations and load/store rescheduling concerns. Even if the CRYPTO_w_lock() is a macro or inline, since the writer would have been sure to insert the correct barriers if he chose to implement CRYPTO_w_lock() as a macro/inline. Where is that documented or guaranteed? You mean that they *happen* to be barriers on the platforms you have used. You have no idea what the next version of GCC might do. We are talking about saving *NOTHING* here. Alternatively, lines 1288, 1289, 1290, and 1297 should be removed. It's hard to understand any reasonable scenario in which one thread might be calling SSL_free_comp_methods while another thread is or might be accessing the ssl_comp_methods variable or any of the registered compression methods. I would agree with your overall view, but I would on the basis that this function is not performance critical to deserve any special lock contention avoidance treatment. Exactly. This is the worst kind of premature optimization. But there is a case for locking to not be necessary at all. If the SSL_free_comp_methods() is only designed to be used during the cleanup phase. During a clean shutdown sequence of the OpenSSL library I'm pretty sure you are meant to have reverted the OpenSSL call execution path to single threaded mode. Right. You can't let one thread unconditionally shut down a library while another thread is or might be using it. However, it's possible this function was also intended to 'unregister' the compression functions without harming the library. In that case, the lock is needed and the optimization is broken. Meaning while you are executing any cleanup functions it is not safe to be calling any other OpenSSL functions at the same time. By virtue of this requirement locking should not be necessary (but should do more good than harm if implemented). If your application needs to repeatedly init and cleanup, init and cleanup, then your application needs to also manage the state and the serialization of the init/cleanup calls (since they may not be safe to be called out-of-order, called twice, interleved, etc...). It will do more good than harm if the locks haven't been cleaned up! If another thread modified 'ssl_comp_methods' in between the read and the cache flush, that modification would be lost. I, for one, don't like needlessly making code potentially disastrous on future CPUs and compilers. (To save what?!) ... the evils of premature optimization. No as I said by virtue of it being a function it already forces the compiler to perform the correct barriers. Maybe you can name a CPU target which does not automatically flush pending writes for return from function and call a function instructions, I can't. The problem occurs after the beginning of the function. If the compare is done on a cached copy in a register. Look at this example: if (variable!=NULL) { free(variable); variable=NULL; } This could easily be optimized to (cached_copy is a register or other fast place to store data): int cached_copy=variable; if(cached_copy!=NULL) { acquire_lock(); cached_copy=variable; free(cached_copy); cached_copy=NULL; variable=cached_copy; release_lock(); } else variable=cached_copy; If you think
RE: Memory Leaks in SSL_Library_init()
Hi! I have an example case where by the unused memoy allocated by SSL_library_init when not freed, would accumulate. There is an application which takes services from some of the libraries say A, B and C. These libraries are dynamically loaded and unloaded into the application as and when required depending on certain conditions and using g_module_ calls. All of these plugin libraries call SSL_library_init() and since we do not know how many times in the lifespan of the application these libraries would be loaded/unloaded, we dont know how many times SSL_library_init() would get called. This issue is also encountered in pretty much every library of sufficient complexity. All modules in a process that access a complex library must cooperate to meet that library's process-scope rules. Failure to meet this requirement can lead to unexpected and spectacular failures. As a simple example, consider algorithms that are added. These are process-scope. A plugin may find algorithms added that it didn't want added. So you have to design your plugin to tolerate algorithms added by other plugins. In other words, being a plugin that works with other plugins puts design constraints and imposes cooperation requirements. Imagine two plugins that each want to call CRYPTO_set_mem_functions. Whose locking functions do we use? What if one module uses a kernel threads implementation and the user uses emulated threads inside a single kernel thread? Sorry, it's not pretty, but if plugins are going to use OpenSSL, then the program they are plugged into has to coordinate (or there has to be an OpenSSL plugin they both use or some other sensible arrangement). A process has to agree on how it's going to use OpenSSL because the process loads and initializes the library. Making sure initialization is only done once is the least of your problems. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
HI! Thanks again for highlighting those issues. What would be the best way for the application using those pluggins to avoid this issue of SSL_library_init()? regards -Nitin From: David Schwartz [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: RE: Memory Leaks in SSL_Library_init() Date: Wed, 21 Mar 2007 00:14:53 -0700 Hi! I have an example case where by the unused memoy allocated by SSL_library_init when not freed, would accumulate. There is an application which takes services from some of the libraries say A, B and C. These libraries are dynamically loaded and unloaded into the application as and when required depending on certain conditions and using g_module_ calls. All of these plugin libraries call SSL_library_init() and since we do not know how many times in the lifespan of the application these libraries would be loaded/unloaded, we dont know how many times SSL_library_init() would get called. This issue is also encountered in pretty much every library of sufficient complexity. All modules in a process that access a complex library must cooperate to meet that library's process-scope rules. Failure to meet this requirement can lead to unexpected and spectacular failures. As a simple example, consider algorithms that are added. These are process-scope. A plugin may find algorithms added that it didn't want added. So you have to design your plugin to tolerate algorithms added by other plugins. In other words, being a plugin that works with other plugins puts design constraints and imposes cooperation requirements. Imagine two plugins that each want to call CRYPTO_set_mem_functions. Whose locking functions do we use? What if one module uses a kernel threads implementation and the user uses emulated threads inside a single kernel thread? Sorry, it's not pretty, but if plugins are going to use OpenSSL, then the program they are plugged into has to coordinate (or there has to be an OpenSSL plugin they both use or some other sensible arrangement). A process has to agree on how it's going to use OpenSSL because the process loads and initializes the library. Making sure initialization is only done once is the least of your problems. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Catch the complete World Cup coverage with MSN http://content.msn.co.in/Sports/Cricket/Default.aspx __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
hi! If we say that the call SSL_library_init() would initialze some data structures which have process scope and are initialized only once. In such case what is the problem in having a *single* function which exacly cleans up those data structures at the time of process termination? regards -Nitin From: David Schwartz [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: RE: Memory Leaks in SSL_Library_init() Date: Wed, 21 Mar 2007 00:14:53 -0700 Hi! I have an example case where by the unused memoy allocated by SSL_library_init when not freed, would accumulate. There is an application which takes services from some of the libraries say A, B and C. These libraries are dynamically loaded and unloaded into the application as and when required depending on certain conditions and using g_module_ calls. All of these plugin libraries call SSL_library_init() and since we do not know how many times in the lifespan of the application these libraries would be loaded/unloaded, we dont know how many times SSL_library_init() would get called. This issue is also encountered in pretty much every library of sufficient complexity. All modules in a process that access a complex library must cooperate to meet that library's process-scope rules. Failure to meet this requirement can lead to unexpected and spectacular failures. As a simple example, consider algorithms that are added. These are process-scope. A plugin may find algorithms added that it didn't want added. So you have to design your plugin to tolerate algorithms added by other plugins. In other words, being a plugin that works with other plugins puts design constraints and imposes cooperation requirements. Imagine two plugins that each want to call CRYPTO_set_mem_functions. Whose locking functions do we use? What if one module uses a kernel threads implementation and the user uses emulated threads inside a single kernel thread? Sorry, it's not pretty, but if plugins are going to use OpenSSL, then the program they are plugged into has to coordinate (or there has to be an OpenSSL plugin they both use or some other sensible arrangement). A process has to agree on how it's going to use OpenSSL because the process loads and initializes the library. Making sure initialization is only done once is the least of your problems. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Post ads for free. On new MSN Classifieds. http://www.yello.in/home.php?utm_source=hotmailtagutm_medium=textlinkutm_content=inutm_campaign=intro __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
HI! Thanks again for highlighting those issues. What would be the best way for the application using those pluggins to avoid this issue of SSL_library_init()? There are really two good ways that ensure that all problems are resolved. Other ways just deal with problems as they crop up and can never ensure there are no more problems (because you never really know what a future plug-in might do when you write a plug-in so you can never be sure it won't break something you rely on). One way is to design into the application a specification for accessing SSL and cryptographic routines. This specification would generally put one master module (or the application itself) in charge of initializing the library and managing all process-level library parameters. Thus the program 'wraps' OpenSSL. This may mean just replacing the startup and shutdown routines and forbidding changes to process-level state. It could mean wrapping a significant part of the library's API. Another way is to have each plugin use its own copy of OpenSSL with its own everything. This guarantees you start up and shut down your own build of OpenSSL and get exactly the version you want. (For some libraries, you have to change the symbols so the names don't conflict.) The first way has the disadvantage that plugins cannot use differing versions of OpenSSL and must settle on process-level settings that might not make any plugin perfectly happy. It also requires someone to actually create the rules and implement them and all plugins using the common library must follow the specification. The second way has the disadvantage that it's ugly and wastes resource, but it does allow each plugin to own its own process-level OpenSSL. Each plugin can get the version of OpenSSL it wants. Where it's too late for the first way, what is generally resorted to is fixing problems that crop up. Where possible, libraries can be redesigned to minimize process-level state. What happens if two plugins need different versions of OpenSSL? Generally, a plug-in library is strictly limited in what it is permitted to do and going outside that scope can cause weird instabilities. Consider a plugin that always worked fine because it only added 64-bit ciphers, then another plugin adds 256-bit ciphers, and the first plugin starts getting larger ciphers because they are preferred but then copies a key into a 64-bit buffer. Oopsie. Which plug-in is at fault? Plug-ins share all kinds of things, and this type of sharing requires agreement and cooperation. You cannot have all the plug-ins use the same library, where that library has process-level state, and just hope it will all work out somehow. Designing programs with plugin modules is not easy. It takes a *lot* of thinking from the beginning to keep it sensible and be relatively sure new modules won't break old ones. For some strange reason, this thread is making me feel old. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
If we say that the call SSL_library_init() would initialze some data structures which have process scope and are initialized only once. In such case what is the problem in having a *single* function which exacly cleans up those data structures at the time of process termination? See my other posts, particularly this section: Again, you can't put everything last. And anything that goes after something else cannot touch what has already been cleaned up unless you add code to make that work at the point where you touch the thing that might be cleaned up. And that point is usually *not* the cleanup code. The perfect solution involves a lot of very complex dependency work to make absolutely sure that nothing can be used after it's cleaned up. What if the cleanup code for A might need B, and the cleanup code for B might need C, and the cleanup code for C might need A? Do you un-cleanup A? Then when do you clean it back up again? Might you never terminate? There are many examples. Engineering a library for perfect cleanup is a *major* design requirement and if you insist on it, things will have to give in other areas. The problem, in a nutshell, is dependencies. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Nitin M wrote: Does this mean that in such scenario the application need not call SSL_library_init since it does not need those extra initialisations and can achieve only the required initialisations with specific calls? If this is true I have two more questions here, 1. In what scenario then an application should use SSL_library_init()? I think not for the simplest use of SSL/Crypto functionality in the application. 2. If SSL_library_init() should be called only once in the lifetime of a process and it is is creating some global data structures only once within that process, why cant we have a function to clear all these global variables which can be called at the the process termination time? If you are using SSL then you are touching many parts of the OpenSSL library as all the digests and crypto algorithms are pulled in to allow them to be available for selection and negotiation. Most users want to simple few function calls to auto-register all those methods. The simplest use of OpenSSL is more inline with how I explained, an app just wants to use a single digest algorithm (like MD5 or SHA1, but not SSL). As I still maintain you need to provide information about the leak(s) in order to progress. Application usages goes like this: * Application initializes the OpenSSL libary. * Application creates and uses data object with the OpenSSL library. * Application wants to shutdown. * Application destroys all OpenSSL data objects it is holding. * Application calls cleanup routines in the OpenSSL library. This is pretty standard stuff, what David maybe highlighting is that many programmers or a short-lived applications running on a process segmented operating system might skip the last 2 points to save time/money. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Is it required to call SSL_library_init() if I only want to use some crypto functionalities? -Nitin From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 21 Mar 2007 11:12:38 + Nitin M wrote: Does this mean that in such scenario the application need not call SSL_library_init since it does not need those extra initialisations and can achieve only the required initialisations with specific calls? If this is true I have two more questions here, 1. In what scenario then an application should use SSL_library_init()? I think not for the simplest use of SSL/Crypto functionality in the application. 2. If SSL_library_init() should be called only once in the lifetime of a process and it is is creating some global data structures only once within that process, why cant we have a function to clear all these global variables which can be called at the the process termination time? If you are using SSL then you are touching many parts of the OpenSSL library as all the digests and crypto algorithms are pulled in to allow them to be available for selection and negotiation. Most users want to simple few function calls to auto-register all those methods. The simplest use of OpenSSL is more inline with how I explained, an app just wants to use a single digest algorithm (like MD5 or SHA1, but not SSL). As I still maintain you need to provide information about the leak(s) in order to progress. Application usages goes like this: * Application initializes the OpenSSL libary. * Application creates and uses data object with the OpenSSL library. * Application wants to shutdown. * Application destroys all OpenSSL data objects it is holding. * Application calls cleanup routines in the OpenSSL library. This is pretty standard stuff, what David maybe highlighting is that many programmers or a short-lived applications running on a process segmented operating system might skip the last 2 points to save time/money. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Sign in and get updated on all the action from Formula One http://content.msn.co.in/Sports/FormulaOne/Default __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Is it required to call SSL_library_init() if I only want to use some crypto functionalities? All SSL_library_init does is add ciphers and digests to the EVP table. If you don't need any ciphers and digests accessible through the EVP interface or you add those ciphers and digests yourself, you do not need to call SSL_library_init. Look at the ssl/ssl_algs.c file. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Hi! for using valgrind this way of configuring is OK ./config --prefix=/home/user -DPURIFY or this is required ./config --prefix=/home/user -DPURIFY -ggdb regards -Nitin From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Wed, 21 Mar 2007 11:12:38 + Nitin M wrote: Does this mean that in such scenario the application need not call SSL_library_init since it does not need those extra initialisations and can achieve only the required initialisations with specific calls? If this is true I have two more questions here, 1. In what scenario then an application should use SSL_library_init()? I think not for the simplest use of SSL/Crypto functionality in the application. 2. If SSL_library_init() should be called only once in the lifetime of a process and it is is creating some global data structures only once within that process, why cant we have a function to clear all these global variables which can be called at the the process termination time? If you are using SSL then you are touching many parts of the OpenSSL library as all the digests and crypto algorithms are pulled in to allow them to be available for selection and negotiation. Most users want to simple few function calls to auto-register all those methods. The simplest use of OpenSSL is more inline with how I explained, an app just wants to use a single digest algorithm (like MD5 or SHA1, but not SSL). As I still maintain you need to provide information about the leak(s) in order to progress. Application usages goes like this: * Application initializes the OpenSSL libary. * Application creates and uses data object with the OpenSSL library. * Application wants to shutdown. * Application destroys all OpenSSL data objects it is holding. * Application calls cleanup routines in the OpenSSL library. This is pretty standard stuff, what David maybe highlighting is that many programmers or a short-lived applications running on a process segmented operating system might skip the last 2 points to save time/money. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Post ads for free. On new MSN Classifieds. http://www.yello.in/home.php?utm_source=hotmailtagutm_medium=textlinkutm_content=inutm_campaign=intro __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
David Schwartz wrote: The function SSL_library_init() is observed to be introudcing memory leak in the application code. There is still some amount of memory leak left even after the series of cleanup calls suggested in the openssl FAQ. Your first sentence is pretty funny though. How can a function you can only call once in the lifetime of a process introduce a memory leak? DS Once you figure that out, it won't be so funny. ;) You're making assumptions about the OS that aren't always true. b.c. __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
David Schwartz wrote: The function SSL_library_init() is observed to be introudcing memory leak in the application code. There is still some amount of memory leak left even after the series of cleanup calls suggested in the openssl FAQ. Can someone help understand that technically what is the problem in having a single cleanup call like SSL_library_cleanup, instead of a using a series of cleanup calls? Because there are multiple layers in use at the same time. Think of OpenSSL more as a collection of cryptographic functions bundled together rather than a single entity. OpenSSL is able to load dynamically at runtime additional functionality and allow the application to provide functionality to hook in. For example I could expose a bespoke homebrew digest or encryption algorithm in such a way that it could be negotiated via SSL. At the other end of the scale there are users who simply want to use OpenSSL to provide just a single digest algorithm, and consequentially don't need all the extra initialization for SSL support. Out of interest where are those memory leaks that you find ? As for the series of cleanup calls, there are several reasons. The two main ones are that you might need to cleanup different things and some cleanup functions need to be called from a particular context. I'm not heard the argument that doing correct cleanup introduces extra code complexity in relation to OpenSSL. David seems to be providing generic reasons why this leak might exist as opposed to the reasons for this specific leak you have found. Sure I can think up many reasons why this leak exists, at the top of my list would be, maybe there is a genuine bug here that the developers dont know about it yet. So I'd like to continue on the path to proving or disproving that by asking the O/P'er to specify what exactly he has found that is being leaked. valgrind can be a useful in this regard, but dont forget to compile OpenSSL with -DPURIFY or disable uninitialized data checking. Despite what the man page / FAQ suggests you should do, would it be possible to try the following cleanup sequence and confirm if this changes the situation with your leaks: /* thread-local cleanup */ ERR_remove_state(0); /* Globals we finished with */ my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); // In the particular application I am keeping an RSA // object around. // if(tmp_rsa != NULL) { // RSA_free(tmp_rsa); // tmp_rsa = NULL; // } /* thread-safe cleanup */ ENGINE_cleanup(); CONF_modules_unload(1); /* global application exit cleanup (after all SSL activity is shutdown) */ ERR_free_strings(); EVP_cleanup(); CRYPTO_cleanup_all_ex_data(); Last time I run my application through valgrind I did not have any leaks with my OpenSSL usage. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
From: David Schwartz [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: RE: Memory Leaks in SSL_Library_init() Date: Tue, 20 Mar 2007 05:02:01 -0700 The function SSL_library_init() is observed to be introudcing memory leak in the application code. There is still some amount of memory leak left even after the series of cleanup calls suggested in the openssl FAQ. Can someone help understand that technically what is the problem in having a single cleanup call like SSL_library_cleanup, instead of a using a series of cleanup calls? It's pretty much the same issue as every other library in the world. Perfect cleanup is extremely difficult and often adds performance overhead during use. Keepin it apart from the memory leak, i would like to know by example how a perfect cleanup can casue performance problems? Since nobody really cares whether you can free every single byte before you terminate a process, nobody bothers to code it. Not just for a single byte, but in embedded systems and in deviced with resource contraints woudn't we care about even a small amount of memory not being freed after it is no more required? For example, singletons can be very difficult to remove. How and when do you remove a memory manager? As for the series of cleanup calls, there are several reasons. The two main ones are that you might need to cleanup different things and some cleanup functions need to be called from a particular context. Your first sentence is pretty funny though. How can a function you can only call once in the lifetime of a process introduce a memory leak? DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Voice your questions and our experts will answer them http://content.msn.co.in/Lifestyle/AskExpert/Default01.htm __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Keepin it apart from the memory leak, i would like to know by example how a perfect cleanup can casue performance problems? One common case goes like this: 1) You have an object you create very early in the library initialization. 2) The object is accessed a lot, and having to check if it's available or initialized would be tricky to do every place it might be invoked. 3) So long as you never cleanup the object, you know it must exist in any code that library could call, since the first thing the library does is create it. 4) If you destroy it in cleanup, any other cleanup code that runs after that has to deal with the fact that the object doesn't exist. This could in some cases be most of the library. 5) You could destroy it last, but you can't destroy everything last. Consider a linked list of algorithms with sentinels to save pointer compares to NULL. For perfect cleanup, you have to get rid of those dummy objects. But that means any code that runs after that cleanup will blow up if it accesses the list (or all code that accesses the list will have to make sure it hasn't been cleaned up, defeating the whole point of the sentinels). So you have a new ordering rule that affects any code called from cleanup code that might go after those objects are cleaned up -- it most not ever touch that list, and no code it calls can ever touch the list. If you get too many of these rules, it's easier to change to a less-efficient list algorithm that doesn't require sentinels. Again, you can't put everything last. And anything that goes after something else cannot touch what has already been cleaned up unless you add code to make that work at the point where you touch the thing that might be cleaned up. And that point is usually *not* the cleanup code. The perfect solution involves a lot of very complex dependency work to make absolutely sure that nothing can be used after it's cleaned up. What if the cleanup code for A might need B, and the cleanup code for B might need C, and the cleanup code for C might need A? Do you un-cleanup A? Then when do you clean it back up again? Might you never terminate? There are many examples. Engineering a library for perfect cleanup is a *major* design requirement and if you insist on it, things will have to give in other areas. Since nobody really cares whether you can free every single byte before you terminate a process, nobody bothers to code it. Not just for a single byte, but in embedded systems and in deviced with resource contraints woudn't we care about even a small amount of memory not being freed after it is no more required? Sure and in those special cases, you have totally different design requirements and thus totally different designs. Unless you have embedded systems that have enough resources that they can run commodity code, in which case you don't care -- that's why you put the resources there. If you want to do the most with the very least, you have a very special case that bears no resemblance to typical desktop/server code (and there exist SSL/crypto libraries for these unusual cases). If you want to use mainstream code, you have to make the mainstream tradeoffs. Otherwise, you were wrong to want to use mainstream code. One of the reasons people put lots of CPU and lots of memory in modern embedded systems is because it's cheap and allows you to use more mainstream engineering tradeoffs. This is one of those tradeoffs. Welcome to the real world. DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
Re: Memory Leaks in SSL_Library_init()
Hi! All, Thanks very much for your inouts on this. From: Darryl Miles [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: Re: Memory Leaks in SSL_Library_init() Date: Tue, 20 Mar 2007 17:18:40 + David Schwartz wrote: The function SSL_library_init() is observed to be introudcing memory leak in the application code. There is still some amount of memory leak left even after the series of cleanup calls suggested in the openssl FAQ. Can someone help understand that technically what is the problem in having a single cleanup call like SSL_library_cleanup, instead of a using a series of cleanup calls? Because there are multiple layers in use at the same time. Think of OpenSSL more as a collection of cryptographic functions bundled together rather than a single entity. OpenSSL is able to load dynamically at runtime additional functionality and allow the application to provide functionality to hook in. For example I could expose a bespoke homebrew digest or encryption algorithm in such a way that it could be negotiated via SSL. At the other end of the scale there are users who simply want to use OpenSSL to provide just a single digest algorithm, and consequentially don't need all the extra initialization for SSL support. Does this mean that in such scenario the application need not call SSL_library_init since it does not need those extra initialisations and can achieve only the required initialisations with specific calls? If this is true I have two more questions here, 1. In what scenario then an application should use SSL_library_init()? I think not for the simplest use of SSL/Crypto functionality in the application. 2. If SSL_library_init() should be called only once in the lifetime of a process and it is is creating some global data structures only once within that process, why cant we have a function to clear all these global variables which can be called at the the process termination time? Out of interest where are those memory leaks that you find ? As for the series of cleanup calls, there are several reasons. The two main ones are that you might need to cleanup different things and some cleanup functions need to be called from a particular context. I'm not heard the argument that doing correct cleanup introduces extra code complexity in relation to OpenSSL. David seems to be providing generic reasons why this leak might exist as opposed to the reasons for this specific leak you have found. Sure I can think up many reasons why this leak exists, at the top of my list would be, maybe there is a genuine bug here that the developers dont know about it yet. So I'd like to continue on the path to proving or disproving that by asking the O/P'er to specify what exactly he has found that is being leaked. valgrind can be a useful in this regard, but dont forget to compile OpenSSL with -DPURIFY or disable uninitialized data checking. Despite what the man page / FAQ suggests you should do, would it be possible to try the following cleanup sequence and confirm if this changes the situation with your leaks: /* thread-local cleanup */ ERR_remove_state(0); /* Globals we finished with */ my_application_function_to_destroy_OpenSSL_objects_attached_to_local_variables(); // In the particular application I am keeping an RSA // object around. // if(tmp_rsa != NULL) { // RSA_free(tmp_rsa); // tmp_rsa = NULL; // } /* thread-safe cleanup */ ENGINE_cleanup(); CONF_modules_unload(1); /* global application exit cleanup (after all SSL activity is shutdown) */ ERR_free_strings(); EVP_cleanup(); CRYPTO_cleanup_all_ex_data(); Last time I run my application through valgrind I did not have any leaks with my OpenSSL usage. Darryl __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] regards -Nitin _ Watch free concerts with Pink, Rod Stewart, Oasis and more. Visit MSN Presents today. http://music.msn.com/presents?icid=ncmsnpresentstaglineocid=T002MSN03A07001 __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]
RE: Memory Leaks in SSL_Library_init()
Hi! I have an example case where by the unused memoy allocated by SSL_library_init when not freed, would accumulate. There is an application which takes services from some of the libraries say A, B and C. These libraries are dynamically loaded and unloaded into the application as and when required depending on certain conditions and using g_module_ calls. All of these plugin libraries call SSL_library_init() and since we do not know how many times in the lifespan of the application these libraries would be loaded/unloaded, we dont know how many times SSL_library_init() would get called. regards -Nitin From: David Schwartz [EMAIL PROTECTED] Reply-To: openssl-dev@openssl.org To: openssl-dev@openssl.org Subject: RE: Memory Leaks in SSL_Library_init() Date: Tue, 20 Mar 2007 05:02:01 -0700 The function SSL_library_init() is observed to be introudcing memory leak in the application code. There is still some amount of memory leak left even after the series of cleanup calls suggested in the openssl FAQ. Can someone help understand that technically what is the problem in having a single cleanup call like SSL_library_cleanup, instead of a using a series of cleanup calls? It's pretty much the same issue as every other library in the world. Perfect cleanup is extremely difficult and often adds performance overhead during use. Since nobody really cares whether you can free every single byte before you terminate a process, nobody bothers to code it. For example, singletons can be very difficult to remove. How and when do you remove a memory manager? As for the series of cleanup calls, there are several reasons. The two main ones are that you might need to cleanup different things and some cleanup functions need to be called from a particular context. Your first sentence is pretty funny though. How can a function you can only call once in the lifetime of a process introduce a memory leak? DS __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED] _ Spice up your IM conversations. New, colorful and animated emoticons. Get chatting! http://server1.msn.co.in/SP05/emoticons/ __ OpenSSL Project http://www.openssl.org Development Mailing List openssl-dev@openssl.org Automated List Manager [EMAIL PROTECTED]