Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Bruce Momjian wrote: Thanks for the review, Magnus. I have adjusted the patch to use the same mutex every time the counter is accessed, and adjusted the pqsecure_destroy() call to properly decrement in the right place. Also, I renamed the libpq global destroy function to be clearer (the function is not exported). Here is an updated version of the patch to match CVS HEAD. I've updated it to match what's CVS HEAD now, and made some minor modifications. Renamed destroySSL() to make it consistent with initializeSSL(). Added and changed some comments. ssldiff.patch contains my changes against Bruce's patch. I also removed the #ifdef NOT_USED parts. They are in CVS history if we need them, and they're trivial things anyway, so I think this is much cleaner. With this, it looks fine to me. Especially since we've seen some testing from the PHP folks already. //Magnus *** src/backend/libpq/be-secure.c --- src/backend/libpq/be-secure.c *** *** 88,94 static DH *tmp_dh_cb(SSL *s, int is_export, int keylength); static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); static void initialize_SSL(void); - static void destroy_SSL(void); static int open_server_SSL(Port *); static void close_SSL(Port *); static const char *SSLerrmessage(void); --- 88,93 *** *** 193,209 secure_initialize(void) } /* - * Destroy global context - */ - void - secure_destroy(void) - { - #ifdef USE_SSL - destroy_SSL(); - #endif - } - - /* * Indicate if we have loaded the root CA store to verify certificates */ bool --- 192,197 *** *** 844,862 initialize_SSL(void) } /* - * Destroy global SSL context. - */ - static void - destroy_SSL(void) - { - if (SSL_context) - { - SSL_CTX_free(SSL_context); - SSL_context = NULL; - } - } - - /* * Attempt to negotiate SSL connection. */ static int --- 832,837 *** src/interfaces/libpq/fe-secure.c --- src/interfaces/libpq/fe-secure.c *** *** 44,49 --- 44,50 #endif #include arpa/inet.h #endif + #include sys/stat.h #ifdef ENABLE_THREAD_SAFETY *** *** 89,108 static bool verify_peer_name_matches_certificate(PGconn *); static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); static int initialize_SSL(PGconn *); ! static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); static void close_SSL(PGconn *); static char *SSLerrmessage(void); static void SSLerrfree(char *buf); - #endif - #ifdef USE_SSL static bool pq_initssllib = true; - static SSL_CTX *SSL_context = NULL; #endif /* * Macros to handle disabling and then restoring the state of SIGPIPE handling. * Note that DISABLE_SIGPIPE() must appear at the start of a block. --- 90,121 static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); + static void destroy_ssl_system(void); static int initialize_SSL(PGconn *); ! static void destroySSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); static void close_SSL(PGconn *); static char *SSLerrmessage(void); static void SSLerrfree(char *buf); static bool pq_initssllib = true; static SSL_CTX *SSL_context = NULL; + + #ifdef ENABLE_THREAD_SAFETY + static int ssl_open_connections = 0; + + #ifndef WIN32 + static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER; + #else + static pthread_mutex_t ssl_config_mutex = NULL; + static long win32_ssl_create_mutex = 0; #endif + #endif /* ENABLE_THREAD_SAFETY */ + + #endif /* SSL */ + + /* * Macros to handle disabling and then restoring the state of SIGPIPE handling. * Note that DISABLE_SIGPIPE() must appear at the start of a block. *** *** 186,192 void pqsecure_destroy(void) { #ifdef USE_SSL ! destroy_SSL(); #endif } --- 199,205 pqsecure_destroy(void) { #ifdef USE_SSL ! destroySSL(); #endif } *** *** 734,739 client_cert_cb(SSL *ssl, X509 **x509, EVP_PKEY **pkey) --- 747,755 } #ifdef ENABLE_THREAD_SAFETY + /* + * Callback functions for OpenSSL internal locking + */ static unsigned long pq_threadidcallback(void) *** *** 765,818 pq_lockingcallback(int mode, int n, const char *file, int line) #endif /* ENABLE_THREAD_SAFETY */ /* ! * Also see similar code in fe-connect.c, default_threadlock() */ static int init_ssl_system(PGconn *conn) { #ifdef ENABLE_THREAD_SAFETY ! #ifndef WIN32 ! static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER; ! #else ! static pthread_mutex_t init_mutex = NULL; ! static long mutex_initlock = 0; ! ! if (init_mutex == NULL) { ! while
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Thanks for the review, Magnus. I have adjusted the patch to use the same mutex every time the counter is accessed, and adjusted the pqsecure_destroy() call to properly decrement in the right place. Also, I renamed the libpq global destroy function to be clearer (the function is not exported). Here is an updated version of the patch to match CVS HEAD. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/libpq/be-secure.c === RCS file: /cvsroot/pgsql/src/backend/libpq/be-secure.c,v retrieving revision 1.86 diff -c -c -r1.86 be-secure.c *** src/backend/libpq/be-secure.c 20 Nov 2008 09:29:36 - 1.86 --- src/backend/libpq/be-secure.c 20 Nov 2008 21:42:24 - *** *** 88,94 static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); static void initialize_SSL(void); - static void destroy_SSL(void); static int open_server_SSL(Port *); static void close_SSL(Port *); static const char *SSLerrmessage(void); --- 88,93 *** *** 193,209 } /* - * Destroy global context - */ - void - secure_destroy(void) - { - #ifdef USE_SSL - destroy_SSL(); - #endif - } - - /* * Indicate if we have loaded the root CA store to verify certificates */ bool --- 192,197 *** *** 843,853 } } /* ! * Destroy global SSL context. */ static void ! destroy_SSL(void) { if (SSL_context) { --- 831,842 } } + #ifdef NOT_USED /* ! * Destroy global SSL context */ static void ! destroy_global_SSL(void) { if (SSL_context) { *** *** 855,860 --- 844,850 SSL_context = NULL; } } + #endif /* * Attempt to negotiate SSL connection. Index: src/interfaces/libpq/fe-secure.c === RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v retrieving revision 1.107 diff -c -c -r1.107 fe-secure.c *** src/interfaces/libpq/fe-secure.c 13 Nov 2008 09:45:25 - 1.107 --- src/interfaces/libpq/fe-secure.c 20 Nov 2008 21:42:25 - *** *** 44,49 --- 44,50 #endif #include arpa/inet.h #endif + #include sys/stat.h #ifdef ENABLE_THREAD_SAFETY *** *** 57,72 #ifdef USE_SSL #include openssl/ssl.h #include openssl/bio.h #if (SSLEAY_VERSION_NUMBER = 0x00907000L) #include openssl/conf.h #endif #if (SSLEAY_VERSION_NUMBER = 0x00907000L) !defined(OPENSSL_NO_ENGINE) #include openssl/engine.h #endif - #endif /* USE_SSL */ - - - #ifdef USE_SSL #ifndef WIN32 #define USER_CERT_FILE .postgresql/postgresql.crt --- 58,70 #ifdef USE_SSL #include openssl/ssl.h #include openssl/bio.h + #if (SSLEAY_VERSION_NUMBER = 0x00907000L) #include openssl/conf.h #endif #if (SSLEAY_VERSION_NUMBER = 0x00907000L) !defined(OPENSSL_NO_ENGINE) #include openssl/engine.h #endif #ifndef WIN32 #define USER_CERT_FILE .postgresql/postgresql.crt *** *** 91,110 static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); static int initialize_SSL(PGconn *); static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); static void close_SSL(PGconn *); static char *SSLerrmessage(void); static void SSLerrfree(char *buf); - #endif - #ifdef USE_SSL static bool pq_initssllib = true; - static SSL_CTX *SSL_context = NULL; #endif /* * Macros to handle disabling and then restoring the state of SIGPIPE handling. * Note that DISABLE_SIGPIPE() must appear at the start of a block. --- 89,120 static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); + static void destroy_ssl_system(void); static int initialize_SSL(PGconn *); static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); static void close_SSL(PGconn *); static char *SSLerrmessage(void); static void SSLerrfree(char *buf); static bool pq_initssllib = true; static SSL_CTX *SSL_context = NULL; + + #ifdef ENABLE_THREAD_SAFETY + static int ssl_open_connections = 0; + + #ifndef WIN32 + static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER; + #else + static pthread_mutex_t ssl_config_mutex = NULL; + static long win32_ssl_create_mutex = 0; #endif + #endif /* ENABLE_THREAD_SAFETY */ + + #endif /* SSL */ + + /* * Macros to handle disabling and then restoring the state of SIGPIPE handling. * Note that DISABLE_SIGPIPE() must appear at the start of a block. *** *** 725,770
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Bruce Momjian wrote: This is not something we would typically backpatch because of the danger of introducing some unexpected change in libpq. We can provide a patch to anyone who needs it, or if the community wants it backpatched I can certainly do that. If we start deciding we are not backpatching fixes that we know cause crashes, where is the limit? It isn't? It does seem like a bug, which we do typically backpatch ... Well, it's a risk-reward tradeoff. In this case it seems like there's a nontrivial risk of creating new bugs against fixing a problem that evidently affects very few people. I concur with Bruce's feeling that we shouldn't backpatch ... at least not now. Once the patch has been through beta testing we could reconsider. regards, tom lane I would like to see this backpatched. Even though the PostgreSQL community hasn't seen a lot of complaints, there have been a number of reports where the bug has caused crashes. Ubuntu launchpad has 6 duplicates for this bug. php has a bug report for it. So it's not like people don't know about it. They just didn't know how to fix it. All that said, I agree it's safer to wait until the 8.4 beta cycle has given this code change a good run before proceeding. In the mean time distributions can either backpatch it themselves or wait for PostgreSQL community to apply the patch. For the environment where I have this problem, I think it's still going to be a up hill battle to get RedHat to incorporate the fix into RHEL5. That's whichever route the community takes with backpatching. Russell.
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Russell Smith wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Bruce Momjian wrote: This is not something we would typically backpatch because of the danger of introducing some unexpected change in libpq. We can provide a patch to anyone who needs it, or if the community wants it backpatched I can certainly do that. If we start deciding we are not backpatching fixes that we know cause crashes, where is the limit? Stability. That is our limit (goal). A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. Ralph Waldo Emerson It isn't? It does seem like a bug, which we do typically backpatch ... Well, it's a risk-reward tradeoff. In this case it seems like there's a nontrivial risk of creating new bugs against fixing a problem that evidently affects very few people. I concur with Bruce's feeling that we shouldn't backpatch ... at least not now. Once the patch has been through beta testing we could reconsider. regards, tom lane I would like to see this backpatched. Even though the PostgreSQL community hasn't seen a lot of complaints, there have been a number of reports where the bug has caused crashes. Ubuntu launchpad has 6 duplicates for this bug. php has a bug report for it. So it's not like Wow, that is interesting. people don't know about it. They just didn't know how to fix it. All that said, I agree it's safer to wait until the 8.4 beta cycle has given this code change a good run before proceeding. In the mean time distributions can either backpatch it themselves or wait for PostgreSQL community to apply the patch. For the environment where I have this problem, I think it's still going to be a up hill battle to get RedHat to incorporate the fix into RHEL5. That's whichever route the community takes with backpatching. Yea, it is a shame we didn't find/fix this earlier. It is reproducable so I am surprised we did not hear about it sooner. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Russell Smith wrote: Will this be back patched when it's committed? This is not something we would typically backpatch because of the danger of introducing some unexpected change in libpq. We can provide a patch to anyone who needs it, or if the community wants it backpatched I can certainly do that. It isn't? It does seem like a bug, which we do typically backpatch ... -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Alvaro Herrera [EMAIL PROTECTED] writes: Bruce Momjian wrote: This is not something we would typically backpatch because of the danger of introducing some unexpected change in libpq. We can provide a patch to anyone who needs it, or if the community wants it backpatched I can certainly do that. It isn't? It does seem like a bug, which we do typically backpatch ... Well, it's a risk-reward tradeoff. In this case it seems like there's a nontrivial risk of creating new bugs against fixing a problem that evidently affects very few people. I concur with Bruce's feeling that we shouldn't backpatch ... at least not now. Once the patch has been through beta testing we could reconsider. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Russell Smith wrote: Bruce Momjian wrote: Yes, my defines were very messed up; updated version attached. Hi, I've not done a review of this patch, however I did backport it to 8.3 (as attached in unified diff). The patch wasn't made for PG purposes, so it's not in context diff. I tested the backported patch and the issues I was experiencing with the initial bug report have stopped. So the fix works for the initial reported problem. Wow, that is great. I was only guessing on the cause but it seemed logical, and your testing confirmed it. Will this be back patched when it's committed? This is not something we would typically backpatch because of the danger of introducing some unexpected change in libpq. We can provide a patch to anyone who needs it, or if the community wants it backpatched I can certainly do that. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Yes, my defines were very messed up; updated version attached. Hi, I've not done a review of this patch, however I did backport it to 8.3 (as attached in unified diff). The patch wasn't made for PG purposes, so it's not in context diff. I tested the backported patch and the issues I was experiencing with the initial bug report have stopped. So the fix works for the initial reported problem. Will this be back patched when it's committed? Regards Russell diff -uNr postgresql-8.3.3/src/interfaces/libpq/fe-secure.c postgresql-8.3.3.new/src/interfaces/libpq/fe-secure.c --- postgresql-8.3.3/src/interfaces/libpq/fe-secure.c 2008-01-29 13:03:39.0 +1100 +++ postgresql-8.3.3.new/src/interfaces/libpq/fe-secure.c 2008-11-13 20:57:40.0 +1100 @@ -142,12 +142,10 @@ #define ERR_pop_to_mark() ((void) 0) #endif -#ifdef NOT_USED -static int verify_peer(PGconn *); -#endif static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); +static void destroy_ssl_system(void); static int initialize_SSL(PGconn *); static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); @@ -156,11 +154,19 @@ static void SSLerrfree(char *buf); #endif -#ifdef USE_SSL static bool pq_initssllib = true; static SSL_CTX *SSL_context = NULL; +#ifdef ENABLE_THREAD_SAFETY +static int ssl_open_connections = 0; + +#ifndef WIN32 +static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER; +#else +static pthread_mutex_t ssl_config_mutex = NULL; +static long win32_ssl_create_mutex = 0; #endif +#endif/* ENABLE_THREAD_SAFETY */ /* * Macros to handle disabling and then restoring the state of SIGPIPE handling. @@ -839,40 +845,53 @@ init_ssl_system(PGconn *conn) { #ifdef ENABLE_THREAD_SAFETY -#ifndef WIN32 - static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER; -#else - static pthread_mutex_t init_mutex = NULL; - static long mutex_initlock = 0; - - if (init_mutex == NULL) - { - while (InterlockedExchange(mutex_initlock, 1) == 1) - /* loop, another thread own the lock */ ; - if (init_mutex == NULL) - pthread_mutex_init(init_mutex, NULL); - InterlockedExchange(mutex_initlock, 0); - } -#endif - pthread_mutex_lock(init_mutex); - - if (pq_initssllib pq_lockarray == NULL) - { - int i; - - CRYPTO_set_id_callback(pq_threadidcallback); - - pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks()); - if (!pq_lockarray) - { - pthread_mutex_unlock(init_mutex); - return -1; - } - for (i = 0; i CRYPTO_num_locks(); i++) - pthread_mutex_init(pq_lockarray[i], NULL); - - CRYPTO_set_locking_callback(pq_lockingcallback); - } +#ifdef WIN32 + if (ssl_config_mutex == NULL) + { + while (InterlockedExchange(win32_ssl_create_mutex, 1) == 1) + /* loop, another thread own the lock */ ; + if (ssl_config_mutex == NULL) + { + if (pthread_mutex_init(ssl_config_mutex, NULL)) + return -1; + } + InterlockedExchange(win32_ssl_create_mutex, 0); + } +#endif + if (pthread_mutex_lock(ssl_config_mutex)) + return -1; + + if (pq_initssllib) + { + if (pq_lockarray == NULL) + { + int i; + + pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks()); + if (!pq_lockarray) + { + pthread_mutex_unlock(ssl_config_mutex); + return -1; + } + for (i = 0; i CRYPTO_num_locks(); i++) + { + if (pthread_mutex_init(pq_lockarray[i], NULL)) + { + free(pq_lockarray); + pq_lockarray = NULL; + pthread_mutex_unlock(ssl_config_mutex); + return -1; + } + } + } + + if (ssl_open_connections++ == 0) + { + /* These are only required for threaded SSL applications */ + CRYPTO_set_id_callback(pq_threadidcallback); + CRYPTO_set_locking_callback(pq_lockingcallback); + } +} #endif if (!SSL_context) { @@ -894,18 +913,61 @@ err); SSLerrfree(err); #ifdef ENABLE_THREAD_SAFETY - pthread_mutex_unlock(init_mutex); + pthread_mutex_unlock(ssl_config_mutex); #endif return -1; } } #ifdef ENABLE_THREAD_SAFETY - pthread_mutex_unlock(init_mutex); + pthread_mutex_unlock(ssl_config_mutex); #endif return 0; } /* + *This function is needed because if the libpq library is unloaded + *from the application, the callback functions will no longer exist when + *SSL used by other parts of the system. For this reason, + *we unregister the SSL callback functions when the last libpq + *connection is closed. + */ +static void +destroy_ssl_system(void) +{ +#ifdef ENABLE_THREAD_SAFETY + /* Assume mutex is already created */ + if (pthread_mutex_lock(ssl_config_mutex)) +
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Thanks for the review, Magnus. I have adjusted the patch to use the same mutex every time the counter is accessed, and adjusted the pqsecure_destroy() call to properly decrement in the right place. Also, I renamed the libpq global destroy function to be clearer (the function is not exported). There's a problem in this patch which is that it is inconsistent in its use of the ENABLE_THREAD_SAFETY symbol. init_ssl_system() is only going to keep the refcount in the threaded compile; but the safeguards are needed even when threading is not enabled. Moreover, destroy_ssl_system() is locking thread mutexes outside ENABLE_THREAD_SAFETY which is going to cause non-threaded builds to fail. As a suggestion, I'd recommend not fooling around with backend files when you're only modifying libpq. It enlarges the patch without benefit. I think that patch should be committed separately. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Alvaro Herrera wrote: Bruce Momjian wrote: Thanks for the review, Magnus. I have adjusted the patch to use the same mutex every time the counter is accessed, and adjusted the pqsecure_destroy() call to properly decrement in the right place. Also, I renamed the libpq global destroy function to be clearer (the function is not exported). There's a problem in this patch which is that it is inconsistent in its use of the ENABLE_THREAD_SAFETY symbol. init_ssl_system() is only going to keep the refcount in the threaded compile; but the safeguards are needed even when threading is not enabled. Moreover, Actually, CRYPTO_set_locking_callback() and CRYPTO_set_id_callbac() are needed only for threaded SSL programs; I have added a comments mentioning that. destroy_ssl_system() is locking thread mutexes outside ENABLE_THREAD_SAFETY which is going to cause non-threaded builds to fail. Yes, my defines were very messed up; updated version attached. As a suggestion, I'd recommend not fooling around with backend files when you're only modifying libpq. It enlarges the patch without benefit. I think that patch should be committed separately. OK, I will do that, though the backend change is being made to be consistent with the front end. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/libpq/be-secure.c === RCS file: /cvsroot/pgsql/src/backend/libpq/be-secure.c,v retrieving revision 1.85 diff -c -c -r1.85 be-secure.c *** src/backend/libpq/be-secure.c 24 Oct 2008 12:24:35 - 1.85 --- src/backend/libpq/be-secure.c 7 Nov 2008 23:16:28 - *** *** 88,94 static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); static void initialize_SSL(void); - static void destroy_SSL(void); static int open_server_SSL(Port *); static void close_SSL(Port *); static const char *SSLerrmessage(void); --- 88,93 *** *** 191,206 return 0; } /* * Destroy global context */ void secure_destroy(void) { - #ifdef USE_SSL - destroy_SSL(); - #endif } /* * Attempt to negotiate secure session. --- 190,204 return 0; } + #ifdef NOT_USED /* * Destroy global context */ void secure_destroy(void) { } + #endif /* * Attempt to negotiate secure session. *** *** 805,815 } } /* ! * Destroy global SSL context. */ static void ! destroy_SSL(void) { if (SSL_context) { --- 803,814 } } + #ifdef NOT_USED /* ! * Destroy global SSL context */ static void ! destroy_global_SSL(void) { if (SSL_context) { *** *** 817,822 --- 816,822 SSL_context = NULL; } } + #endif /* * Attempt to negotiate SSL connection. Index: src/interfaces/libpq/fe-secure.c === RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v retrieving revision 1.106 diff -c -c -r1.106 fe-secure.c *** src/interfaces/libpq/fe-secure.c 24 Oct 2008 12:29:11 - 1.106 --- src/interfaces/libpq/fe-secure.c 7 Nov 2008 23:16:31 - *** *** 44,49 --- 44,50 #endif #include arpa/inet.h #endif + #include sys/stat.h #ifdef ENABLE_THREAD_SAFETY *** *** 57,72 #ifdef USE_SSL #include openssl/ssl.h #include openssl/bio.h #if (SSLEAY_VERSION_NUMBER = 0x00907000L) #include openssl/conf.h #endif #if (SSLEAY_VERSION_NUMBER = 0x00907000L) !defined(OPENSSL_NO_ENGINE) #include openssl/engine.h #endif - #endif /* USE_SSL */ - - - #ifdef USE_SSL #ifndef WIN32 #define USER_CERT_FILE .postgresql/postgresql.crt --- 58,70 #ifdef USE_SSL #include openssl/ssl.h #include openssl/bio.h + #if (SSLEAY_VERSION_NUMBER = 0x00907000L) #include openssl/conf.h #endif #if (SSLEAY_VERSION_NUMBER = 0x00907000L) !defined(OPENSSL_NO_ENGINE) #include openssl/engine.h #endif #ifndef WIN32 #define USER_CERT_FILE .postgresql/postgresql.crt *** *** 87,112 #define ERR_pop_to_mark() ((void) 0) #endif - #ifdef NOT_USED - static int verify_peer_name_matches_certificate(PGconn *); - #endif static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); static int initialize_SSL(PGconn *); static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); static void close_SSL(PGconn *); static char *SSLerrmessage(void); static void SSLerrfree(char *buf); #endif - #ifdef USE_SSL static bool pq_initssllib = true; - static SSL_CTX *SSL_context = NULL; #endif /* * Macros to handle disabling
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Magnus Hagander wrote: Your analysis of this problem is right on target. When the SSL callbacks were implemented for threaded libpq, there was never any thought on the effect of unloading libpq while the callbacks were still registered. The attached patch unregisters the callback on the close of the last libpq connection. Fortunately we require PQfinish() even if the connection request failed, meaning there should be proper accounting of the number of open connections with the method used in this patch. We do leak some memory for every load/unload of libpq, but the leaks extend beyond the SSL code to the rest of libpq so I didn't attempt to address that in this patch (and no one has complained about it). I also could have implemented a function to unload the SSL callbacks. It would have to have been called before libpq was unloaded, but I considered it inconvenient and unlikely to be adopted by applications using libpq in the short-term. I don't see why destroy_ssl_system sets up it's own mutex (that's also called init_mutex). I think it'd make more sense to make the mutex created in init_ssl_system() visible to the destroy function, and make use of that one instead. You'll need to somehow interlock against these two functions running on different threads after all. Also, the code for destroying/unlinking appears to never be called.. The callchain ends in pqsecure_destroy(), which is never called. Thanks for the review, Magnus. I have adjusted the patch to use the same mutex every time the counter is accessed, and adjusted the pqsecure_destroy() call to properly decrement in the right place. Also, I renamed the libpq global destroy function to be clearer (the function is not exported). -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/libpq/be-secure.c === RCS file: /cvsroot/pgsql/src/backend/libpq/be-secure.c,v retrieving revision 1.85 diff -c -c -r1.85 be-secure.c *** src/backend/libpq/be-secure.c 24 Oct 2008 12:24:35 - 1.85 --- src/backend/libpq/be-secure.c 5 Nov 2008 04:21:14 - *** *** 88,94 static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); static void initialize_SSL(void); - static void destroy_SSL(void); static int open_server_SSL(Port *); static void close_SSL(Port *); static const char *SSLerrmessage(void); --- 88,93 *** *** 191,206 return 0; } /* * Destroy global context */ void secure_destroy(void) { - #ifdef USE_SSL - destroy_SSL(); - #endif } /* * Attempt to negotiate secure session. --- 190,204 return 0; } + #ifdef NOT_USED /* * Destroy global context */ void secure_destroy(void) { } + #endif /* * Attempt to negotiate secure session. *** *** 805,815 } } /* ! * Destroy global SSL context. */ static void ! destroy_SSL(void) { if (SSL_context) { --- 803,814 } } + #ifdef NOT_USED /* ! * Destroy global SSL context */ static void ! destroy_global_SSL(void) { if (SSL_context) { *** *** 817,822 --- 816,822 SSL_context = NULL; } } + #endif /* * Attempt to negotiate SSL connection. Index: src/interfaces/libpq/fe-secure.c === RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v retrieving revision 1.106 diff -c -c -r1.106 fe-secure.c *** src/interfaces/libpq/fe-secure.c 24 Oct 2008 12:29:11 - 1.106 --- src/interfaces/libpq/fe-secure.c 5 Nov 2008 04:21:16 - *** *** 93,98 --- 93,99 static int verify_cb(int ok, X509_STORE_CTX *ctx); static int client_cert_cb(SSL *, X509 **, EVP_PKEY **); static int init_ssl_system(PGconn *conn); + static void destroy_ssl_system(void); static int initialize_SSL(PGconn *); static void destroy_SSL(void); static PostgresPollingStatusType open_client_SSL(PGconn *); *** *** 105,110 --- 106,122 static bool pq_initssllib = true; static SSL_CTX *SSL_context = NULL; + + #ifdef ENABLE_THREAD_SAFETY + static int ssl_open_connections = 0; + + #ifndef WIN32 + static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER; + #else + static pthread_mutex_t ssl_config_mutex = NULL; + static long win32_ssl_create_mutex = 0; + #endif + #endif /* *** *** 760,805 init_ssl_system(PGconn *conn) { #ifdef ENABLE_THREAD_SAFETY ! #ifndef WIN32 ! static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER; ! #else ! static pthread_mutex_t init_mutex = NULL; ! static long mutex_initlock = 0; ! ! if (init_mutex == NULL) { ! while
Re: [HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Bruce Momjian wrote: Russell Smith wrote: Alvaro Herrera wrote: PoolSnoopy wrote: ***PUSH*** this bug is really some annoyance if you use automatic build environments. I'm using phpunit to run tests and as soon as postgres is involved the php cli environment segfaults at the end. this can be worked around by disabling ssl but it would be great if the underlying bug got fixed. This is PHP's bug, isn't it? Why are you complaining here No, this is a problem with the callback/exit functions used by PostgreSQL. We setup callback functions when we use SSL, if somebody else uses SSL we can create a problem. I thought my original report was detailed enough to explain where the problem is coming from. Excerpt from original report; This is part of a comment from the php bug comment history; *[12 Nov 2007 2:45pm UTC] sam at zoy dot org* Hello, I did read the sources and studied them, and I can confirm that it is a matter of callback jumping to an invalid address. libpq's init_ssl_system() installs callbacks by calling CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This function is called each time initialize_SSL() is called (for instance through the PHP pg_connect() function) and does not keep a reference counter, so libpq's destroy_SSL() has no way to know that it should call a destroy_ssl_system() function, and there is no such function anyway. So the callbacks are never removed. But then, upon cleanup, PHP calls zend_shutdown() which properly unloads pgsql.so and therefore the unused libpq. Finally, the zend_shutdown procedure calls zm_shutdown_curl() which in turn calls curl_global_cleanup() which leads to an ERR_free_strings() call and eventually a CRYPTO_lock() call. CRYPTO_lock() checks whether there are any callbacks to call, finds one (the one installed by libpg), calls it, and crashes because libpq was unloaded and hence the callback is no longer in mapped memory. -- Basically postgresql doesn't cancel the callbacks to itself when the pg connection is shut down. So if the libpq library is unloaded before other libraries that use SSL you get a crash as described above. PHP has suggested the fix is to keep a reference counter in libpq so knows when to remove the callbacks. This is a complicated bug, but without real evidence there is no way to go to back to PHP and say it's their fault. Their analysis is relatively comprehensive compared to the feedback that's been posted here so far. I'm not sure how best to setup an environment to replicate the bug in a way I can debug it. And even if I get to the point of nailing it down, I'll just be back asking questions about how you would fix it because I know very little about SSL. All that said, a quick poke in the source of PostgreSQL says that fe-secure.c sets callbacks using CRYPTO_set_xx_callback(...). These are only set in the threaded version it appears. Which is pretty much default in all the installations I encounter. My google research indicated we need to call CRYPTO_set_xx_callback(NULL) when we exit. but that's not done. One idea for a fix is to add a counter to the initialize_ssl function and when destory_ssl is called, decrement the counter. If it reaches 0 then call CRYPT_set_xx_callback(NULL) to remove the callbacks. This is a windows SSL thread that crashes iexplore and testifies to the same problem http://www.mail-archive.com/[EMAIL PROTECTED]/msg53869.html Sorry for the delay in addressing this bug report. Your analysis of this problem is right on target. When the SSL callbacks were implemented for threaded libpq, there was never any thought on the effect of unloading libpq while the callbacks were still registered. The attached patch unregisters the callback on the close of the last libpq connection. Fortunately we require PQfinish() even if the connection request failed, meaning there should be proper accounting of the number of open connections with the method used in this patch. We do leak some memory for every load/unload of libpq, but the leaks extend beyond the SSL code to the rest of libpq so I didn't attempt to address that in this patch (and no one has complained about it). I also could have implemented a function to unload the SSL callbacks. It would have to have been called before libpq was unloaded, but I considered it inconvenient and unlikely to be adopted by applications using libpq in the short-term. I don't see why destroy_ssl_system sets up it's own mutex (that's also called init_mutex). I think it'd make more sense to make the mutex created in init_ssl_system() visible to the destroy function, and make use of that one instead. You'll need to somehow interlock against these two functions running on different threads after all. Also, the code for destroying/unlinking appears to never be called.. The callchain ends in pqsecure_destroy(), which is never called. //Magnus -- Sent via
[HACKERS] Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Russell Smith wrote: Alvaro Herrera wrote: PoolSnoopy wrote: ***PUSH*** this bug is really some annoyance if you use automatic build environments. I'm using phpunit to run tests and as soon as postgres is involved the php cli environment segfaults at the end. this can be worked around by disabling ssl but it would be great if the underlying bug got fixed. This is PHP's bug, isn't it? Why are you complaining here No, this is a problem with the callback/exit functions used by PostgreSQL. We setup callback functions when we use SSL, if somebody else uses SSL we can create a problem. I thought my original report was detailed enough to explain where the problem is coming from. Excerpt from original report; This is part of a comment from the php bug comment history; *[12 Nov 2007 2:45pm UTC] sam at zoy dot org* Hello, I did read the sources and studied them, and I can confirm that it is a matter of callback jumping to an invalid address. libpq's init_ssl_system() installs callbacks by calling CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This function is called each time initialize_SSL() is called (for instance through the PHP pg_connect() function) and does not keep a reference counter, so libpq's destroy_SSL() has no way to know that it should call a destroy_ssl_system() function, and there is no such function anyway. So the callbacks are never removed. But then, upon cleanup, PHP calls zend_shutdown() which properly unloads pgsql.so and therefore the unused libpq. Finally, the zend_shutdown procedure calls zm_shutdown_curl() which in turn calls curl_global_cleanup() which leads to an ERR_free_strings() call and eventually a CRYPTO_lock() call. CRYPTO_lock() checks whether there are any callbacks to call, finds one (the one installed by libpg), calls it, and crashes because libpq was unloaded and hence the callback is no longer in mapped memory. -- Basically postgresql doesn't cancel the callbacks to itself when the pg connection is shut down. So if the libpq library is unloaded before other libraries that use SSL you get a crash as described above. PHP has suggested the fix is to keep a reference counter in libpq so knows when to remove the callbacks. This is a complicated bug, but without real evidence there is no way to go to back to PHP and say it's their fault. Their analysis is relatively comprehensive compared to the feedback that's been posted here so far. I'm not sure how best to setup an environment to replicate the bug in a way I can debug it. And even if I get to the point of nailing it down, I'll just be back asking questions about how you would fix it because I know very little about SSL. All that said, a quick poke in the source of PostgreSQL says that fe-secure.c sets callbacks using CRYPTO_set_xx_callback(...). These are only set in the threaded version it appears. Which is pretty much default in all the installations I encounter. My google research indicated we need to call CRYPTO_set_xx_callback(NULL) when we exit. but that's not done. One idea for a fix is to add a counter to the initialize_ssl function and when destory_ssl is called, decrement the counter. If it reaches 0 then call CRYPT_set_xx_callback(NULL) to remove the callbacks. This is a windows SSL thread that crashes iexplore and testifies to the same problem http://www.mail-archive.com/[EMAIL PROTECTED]/msg53869.html Sorry for the delay in addressing this bug report. Your analysis of this problem is right on target. When the SSL callbacks were implemented for threaded libpq, there was never any thought on the effect of unloading libpq while the callbacks were still registered. The attached patch unregisters the callback on the close of the last libpq connection. Fortunately we require PQfinish() even if the connection request failed, meaning there should be proper accounting of the number of open connections with the method used in this patch. We do leak some memory for every load/unload of libpq, but the leaks extend beyond the SSL code to the rest of libpq so I didn't attempt to address that in this patch (and no one has complained about it). I also could have implemented a function to unload the SSL callbacks. It would have to have been called before libpq was unloaded, but I considered it inconvenient and unlikely to be adopted by applications using libpq in the short-term. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/interfaces/libpq/fe-secure.c === RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v retrieving revision 1.105 diff -c -c -r1.105 fe-secure.c *** src/interfaces/libpq/fe-secure.c16 May 2008 18:30:53 - 1.105 ---