Folks, 

I have been working for several days to track down an issue where Apache 
segfaults on startup, most of the time, but ONLY on Red Hat and ONLY when the 
CHIL engine is enabled.  

I'm working with OpenSSL, Apache and APR HEAD on an up-to-date 32-bit 
CentOS 5.4 system.  

The segfault occurs when a stale free_func function pointer is called at 
crypto/ex_data.c:522 

        for(i = 0; i < mx; i++)
                {
                if(storage[i] && storage[i]->free_func)
                        {
                        ptr = CRYPTO_get_ex_data(ad,i);
-->                     storage[i]->free_func(obj,ptr,ad,i,
                                storage[i]->argl,storage[i]->argp);
                        }
                }

as we're trying to free an RSA structure.  Setting a breakpoint on that line 
shows that the first few times this gets called, there is one 
CRYPTO_EX_DATA_FUNCS structure on the stack: 

Breakpoint 1, int_free_ex_data (class_index=6, obj=0xb6535fa8, ad=0xb6535fd8) 
at ex_data.c:522
522     storage[i]->free_func(obj,ptr,ad,i,
(gdb) p class_index
$15 = 6
(gdb) p sk_num(def_get_class(6)->meth)
$16 = 1
(gdb) p *(CRYPTO_EX_DATA_FUNCS *)sk_value(def_get_class(6)->meth,0)
$17 = {argl = 0, argp = 0x2abac8, new_func = 0, free_func = 0x2aaee7 
<hwcrhk_ex_free>, dup_func = 0}

However, the second time through Apache's configuration processing sequence, a 
second CRYPTO_EX_DATA_FUNCS structure has appeared:

Breakpoint 1, int_free_ex_data (class_index=6, obj=0xb6801fa8, ad=0xb6801fd8) 
at ex_data.c:522
522     storage[i]->free_func(obj,ptr,ad,i,
(gdb) p class_index
$41 = 6
(gdb) p sk_num(def_get_class(6)->meth)
$42 = 2
(gdb) p *(CRYPTO_EX_DATA_FUNCS *)sk_value(def_get_class(6)->meth,0)
$43 = {argl = 0, argp = 0x2abac8, new_func = 0, free_func = 0x2aaee7, dup_func 
= 0}
(gdb) p *(CRYPTO_EX_DATA_FUNCS *)sk_value(def_get_class(6)->meth,1)
$44 = {argl = 0, argp = 0x7e0ac8, new_func = 0, free_func = 0x7dfee7 
<hwcrhk_ex_free>, dup_func = 0}

As you can see, the first entry still points to the old location of 
hwcrhk_ex_free, which is now stale.  It seems that, among operating systems, 
only Red Hat is likely to load a library (here the vendor-supplied Hardware 
Crypto Hook Library libnfhwcrhk.so) at a different memory location the second 
time around, which would explain why this issue has escaped attention so far.  
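
To illustrate the mechanism, here is a minimal model of it.  This is NOT 
OpenSSL code: the registry, module_load(), module_unload() and count_stale() 
are hypothetical names I made up for the sketch, and instead of calling a 
dangling pointer (which is what actually crashes) it tags each entry with the 
load that registered it.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for CRYPTO_EX_DATA_FUNCS (hypothetical, not OpenSSL). */
typedef void (*free_fn)(void *ptr);

typedef struct {
    free_fn free_func;  /* callback living inside the loaded module */
    int     load_id;    /* which load of the module registered this entry */
} ex_funcs;

#define MAX_ENTRIES 8

static ex_funcs registry[MAX_ENTRIES];  /* survives module unload */
static int registry_len = 0;
static int next_load_id = 1;
static int current_load_id = 0;         /* 0 means "module not loaded" */

static void module_free(void *ptr) { (void)ptr; }

/* Models loading the engine: every load pushes a fresh entry. */
int module_load(void)
{
    assert(registry_len < MAX_ENTRIES);
    current_load_id = next_load_id++;
    registry[registry_len].free_func = module_free;
    registry[registry_len].load_id = current_load_id;
    return registry_len++;
}

/* Models unloading the engine: the registry is left untouched. */
void module_unload(void)
{
    current_load_id = 0;
}

int registry_size(void)
{
    return registry_len;
}

/* Counts entries whose callback belongs to a load that is no longer mapped. */
int count_stale(void)
{
    int i, stale = 0;

    for (i = 0; i < registry_len; i++)
        if (registry[i].free_func != NULL &&
            registry[i].load_id != current_load_id)
            stale++;
    return stale;
}
```

After load, unload, load, the model holds two entries, one of them stale, 
which matches the sk_num() result going from 1 to 2 in the gdb output above.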

My question is: where should this get fixed?  We could prevent Apache from 
loading the Engine twice, but that is only a spot band-aid, and Apache's double 
pass through the config file (including the calls to its post_config hook 
handlers) is not going away any time soon.  I would rather fix this by having 
the CHIL Engine pop that CRYPTO_EX_DATA_FUNCS struct off the stack when it gets 
unloaded.  

Would hwcrhk_finish() be a good spot to do this?  What call would one make to 
get this particular stack cleaned up?  The struct was pushed onto the stack by 
calling RSA_get_ex_new_index() in hwcrhk_init() at e_chil.c:603, but I don't 
see an equivalent function to remove that particular ex_data index and clear 
its helper functions.  
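
To make the question concrete, here is a rough sketch of the kind of call I 
have in mind, written against a toy registry rather than OpenSSL itself.  
Every name in it (registry_new_index(), registry_clear_index(), 
registry_free_all()) is hypothetical; as far as I can tell no such "clear" 
call exists today.  The idea is to NULL out the callbacks but keep the slot, 
so other indices stay valid and a free loop that checks free_func, like the 
one in ex_data.c quoted above, simply skips the entry.

```c
#include <assert.h>
#include <stddef.h>

/* Toy registry modelled on the ex_data idea (hypothetical, not OpenSSL). */
typedef void (*free_fn)(void *ptr, long argl, void *argp);

typedef struct {
    long    argl;
    void   *argp;
    free_fn free_func;
} ex_funcs;

#define MAX_ENTRIES 8

static ex_funcs registry[MAX_ENTRIES];
static int registry_len = 0;
static int free_calls = 0;

/* Example callback that just counts how often it is invoked. */
void counting_free(void *ptr, long argl, void *argp)
{
    (void)ptr; (void)argl; (void)argp;
    free_calls++;
}

int free_call_count(void)
{
    return free_calls;
}

/* Registers a callback and returns its index, or -1 if the table is full. */
int registry_new_index(long argl, void *argp, free_fn fn)
{
    assert(fn != NULL);
    if (registry_len >= MAX_ENTRIES)
        return -1;
    registry[registry_len].argl = argl;
    registry[registry_len].argp = argp;
    registry[registry_len].free_func = fn;
    return registry_len++;
}

/* Hypothetical cleanup: disarm one entry without shifting the others.
 * Returns 1 on success, 0 if idx is out of range. */
int registry_clear_index(int idx)
{
    if (idx < 0 || idx >= registry_len)
        return 0;
    registry[idx].argp = NULL;
    registry[idx].free_func = NULL;
    return 1;
}

/* Same shape as the loop in int_free_ex_data(): skip entries with no callback. */
void registry_free_all(void *ptr)
{
    int i;

    for (i = 0; i < registry_len; i++)
        if (registry[i].free_func != NULL)
            registry[i].free_func(ptr, registry[i].argl, registry[i].argp);
}
```

In this model the engine's finish routine would call registry_clear_index() 
with the index it got at init time, so that a later reload only ever leaves 
one live callback behind.  Whether that maps cleanly onto the real ex_data 
implementation is exactly what I am unsure about.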

What would be best? 

Thanks, 

Sander

-- 
[email protected]              http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4  B7B8 B2BE BC40 1529 24AF
