Just saw two crashes during repeated runs of the http2 tests:

One during an SSL_free:
httpd(46062,0x700010634000) malloc: *** error for object 0x7fa576000088: incorrect checksum for freed object - object was probably modified after being freed.
0   libsystem_kernel.dylib              0x00007fff916e6dd6 __pthread_kill + 10
1   libsystem_pthread.dylib             0x00007fff917d2787 pthread_kill + 90
2   libsystem_c.dylib                   0x00007fff9164c4bb __abort + 140
3   libsystem_c.dylib                   0x00007fff9164c42f abort + 144
4   libsystem_malloc.dylib              0x00007fff91746f21 szone_error + 626
5   libsystem_malloc.dylib              0x00007fff9173cf65 tiny_free_list_remove_ptr + 292
6   libsystem_malloc.dylib              0x00007fff91751925 tiny_free_no_lock + 1533
7   libsystem_malloc.dylib              0x00007fff917520b5 free_tiny + 671
8   libcrypto.1.0.0.dylib               0x000000010e1180d5 CRYPTO_free + 37
9   libcrypto.1.0.0.dylib               0x000000010e1bd4ed BIO_free_all + 157
10  libssl.1.0.0.dylib                  0x000000010e0dc738 SSL_free + 152
11  mod_ssl.so                          0x000000010e082ca9 ssl_filter_io_shutdown + 441 (ssl_engine_io.c:1144)
12  mod_ssl.so                          0x000000010e080d37 ssl_io_filter_output + 1207 (ssl_engine_io.c:1826)
13  mod_ssl.so                          0x000000010e080826 ssl_io_filter_coalesce + 822 (ssl_engine_io.c:1763)
14  httpd                               0x000000010dda319e ap_start_lingering_close + 222 (connection.c:89)
15  mod_mpm_event.so                    0x000000010e3ca955 worker_thread + 2005 (event.c:799)
16  libsystem_pthread.dylib             0x00007fff917cfaab _pthread_body + 180
17  libsystem_pthread.dylib             0x00007fff917cf9f7 _pthread_start + 286
18  libsystem_pthread.dylib             0x00007fff917cf1fd thread_start + 13

and one during http/2 request processing:
0   libsystem_c.dylib                   0x00007fff915eeb52 strlen + 18
1   libapr-1.0.dylib                    0x000000010646c211 apr_pstrdup + 33 (apr_strings.c:77)
2   libapr-1.0.dylib                    0x000000010646f8c6 apr_table_set + 502 (apr_tables.c:529)
3   mod_ssl.so                          0x00000001065c50f2 ssl_hook_Fixup + 242 (ssl_engine_kernel.c:1374)
4   httpd                               0x00000001062ff6da ap_process_request_internal + 1658 (request.c:83)
5   httpd                               0x00000001063024ea ap_sub_req_method_uri + 202 (request.c:2276)
6   mod_dir.so                          0x0000000106873a5c dir_fixups + 1036 (mod_dir.c:323)
7   httpd                               0x00000001062ff6da ap_process_request_internal + 1658 (request.c:83)
8   httpd                               0x0000000106316a18 ap_process_async_request + 344 (http_request.c:442)
9   httpd                               0x0000000106316ac9 ap_process_request + 25 (http_request.c:481)
10  mod_http2.so                        0x00000001068a43df h2_task_process_conn + 399 (h2_task.c:623)
11  httpd                               0x00000001062e4c75 ap_run_process_connection + 53 (connection.c:42)
12  mod_http2.so                        0x00000001068a5501 h2_task_do + 337 (h2_task.c:581)
13  mod_http2.so                        0x00000001068a84d6 execute + 118 (h2_worker.c:50)
14  libsystem_pthread.dylib             0x00007fff917cfaab _pthread_body + 180
15  libsystem_pthread.dylib             0x00007fff917cf9f7 _pthread_start + 286
16  libsystem_pthread.dylib             0x00007fff917cf1fd thread_start + 13

The last one led me to look at the hook more closely:

ssl_engine_kernel.c: line 1373
#ifdef HAVE_TLSEXT
    /* add content of SNI TLS extension (if supplied with ClientHello) */
    if ((servername = SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name))) {
        apr_table_set(env, "SSL_TLS_SNI", servername);
    }
#endif

This is called unconditionally on every request, so with http2 it runs from 
several threads concurrently on the same "SSL*". The call itself looks innocent 
enough, so that alone should not be a problem.

However, it seems to me that the change I made in v1.9.0, which cleans up http2 
in an orderly fashion on a pre-cleanup of the conn_rec->pool, comes too late. I 
assume that by the time the conn_rec->pool is cleared, mod_ssl has already said 
goodbye to the connection and freed the SSL*.

But background processing of http2 requests is still going on at that point. 
Those requests access SSL* memory (access to the SSLConnRec* via 
conn_rec->master is unguarded) that may already have been freed, and then all 
sorts of things can happen.

If someone has the time and energy to look at those things, I'd appreciate it. 
I am also looking for a good idea to fix this, and the damn +StdEnvVars as 
well while we are at it. All concurrent access to SSL* needs to go, one way or 
the other.
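
To make the "no concurrent access to SSL*" idea a bit more concrete, here is a 
rough sketch of one possible direction: copy the SNI name once, on the 
connection's own thread, into memory that does not belong to the SSL object, 
and let request threads read only that copy. This is not mod_ssl code. 
fake_ssl, conn_snapshot and all the functions below are hypothetical stand-ins 
so the example compiles on its own without OpenSSL or APR.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for OpenSSL's SSL object; only the SNI name matters. */
typedef struct {
    char *servername;
} fake_ssl;

/* Hypothetical per-connection snapshot, taken once after the handshake. */
typedef struct {
    char *sni; /* private copy; NULL if the client sent no SNI */
} conn_snapshot;

/* Duplicate a string with malloc; returns NULL for NULL input. */
static char *dup_string(const char *s)
{
    size_t n;
    char *copy;

    if (s == NULL) {
        return NULL;
    }
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

fake_ssl *fake_ssl_new(const char *servername)
{
    fake_ssl *ssl = calloc(1, sizeof(*ssl));
    if (ssl != NULL) {
        ssl->servername = dup_string(servername);
    }
    return ssl;
}

/* Overwrite the name before freeing, to mimic memory being reused. */
void fake_ssl_free(fake_ssl *ssl)
{
    if (ssl == NULL) {
        return;
    }
    if (ssl->servername != NULL) {
        memset(ssl->servername, 'X', strlen(ssl->servername));
        free(ssl->servername);
    }
    free(ssl);
}

/* Called once, single-threaded, while the SSL object is known to be alive. */
conn_snapshot *snapshot_create(const fake_ssl *ssl)
{
    conn_snapshot *snap;

    assert(ssl != NULL);
    snap = calloc(1, sizeof(*snap));
    if (snap != NULL) {
        snap->sni = dup_string(ssl->servername);
    }
    return snap;
}

/* What a request thread would read instead of touching the SSL object. */
const char *snapshot_sni(const conn_snapshot *snap)
{
    return snap != NULL ? snap->sni : NULL;
}

void snapshot_destroy(conn_snapshot *snap)
{
    if (snap != NULL) {
        free(snap->sni);
        free(snap);
    }
}
```

In the sketch, a request thread calling snapshot_sni() still gets a valid 
string after fake_ssl_free() has run. It only moves the problem, though: the 
snapshot itself has to outlive every background request that reads it, so the 
lifetime question for whatever owns the copy remains open.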

And yes, this could explain the curious crashes by Stefan Priebe during 
connection shutdown.

Stefan Eissing

<green/>bytes GmbH
Hafenstrasse 16
48155 Münster
www.greenbytes.de
