> From: owner-openssl-us...@openssl.org 
> [mailto:owner-openssl-us...@openssl.org] On Behalf Of David Li
> Sent: Tuesday, May 20, 2014 13:05
<snip>
> I am using SSL_CTX_use_certificate_chain_file() to load my server certificate 
> files at initialization. 
> The PEM file is created by concatenating server cert, server key and CA cert 
> together.  
> I used the following command line to check its format and it seemed OK.

> $ openssl s_server -cert servercert.pem -www
> Using default temp DH parameters
> Using default temp ECDH parameters
> ACCEPT

Note that s_server uses use_certificate_file (and use_PrivateKey_file),
not the _chain_ version, so that really only checks that the server
(first) cert and private key are good, not the rest of the file. Even
so, if the CA cert is somehow bad it should at worst give an error
return (and maybe just be discarded), not a SEGV. And if the CA cert is
a root (possibly your own DIY root), it doesn't matter whether it's in
the file, or good, at all: servers aren't required to send the root of
their chain, because clients can never trust a root (or in general an
anchor) sent by the server and must already have it locally anyway.

> And I can use openssl s_client command line to connect to the above server 
> without any issues.

What did you use for s_client's trust store (-CAfile and/or -CApath)?

> Now when I started my server, the code crashed inside the 
> SSL_CTX_use_certificate_chain_file():
<snip>
> There wasn't any detailed errors printed out but only:Segmentation fault 
> (core dumped)

When a C program gets an unhandled signal (SEGV usually isn't handled,
and often can't be), it aborts without outputting anything that hadn't
already been output (and, where applicable, flushed) before the signal.
This is unlike 'voluntary' error handling, where the code gets a return
value indicating an error (such as -1 from SSL_connect or NULL from
fopen) and can -- and should -- print information about the problem.
It is also unlike some other languages, which (more or less reliably)
catch exceptions and give details for them.
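
A minimal illustration of the two points above, using only the standard
C library. The names open_or_report and trace are made up for this
sketch: the first shows 'voluntary' handling of an error return, the
second flushes each progress message so that it survives a later crash.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* 'Voluntary' error handling: fopen reports failure by returning NULL,
 * and the caller prints why. Returns the open stream or NULL. */
FILE *open_or_report(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    }
    return fp;
}

/* Progress message that is flushed immediately, so it is not lost in a
 * stdio buffer if the program is later killed by a signal. */
void trace(const char *msg)
{
    fprintf(stdout, "trace: %s\n", msg);
    fflush(stdout);
}
```

Calling trace("about to load the chain file") just before the suspect
call at least tells you how far the program got before the SEGV.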

> Can anyone suggest how to debug this issue? 

The same way you debug a SEGV in any C program. In this case you got a
core dump file; open it with the debugger of your choice -- gdb is
common and popular -- and try to look at the stack (bt in gdb).
Sometimes the stack is clobbered by the same bug that caused the SEGV,
but usually it shows where, or nearly where, the code was executing and
where it was called from, and sometimes (often?) the function arguments
at each level.

Alternatively, (re)run the program under the control of a debugger like
gdb from the start. Set a breakpoint before or at the call that fails,
and check that the arguments are good. For
SSL_CTX_use_certificate_chain_file, ctx should point to a validly
allocated and initialized SSL_CTX (to a first approximation, if p *ctx
doesn't give a gdb error and isn't all zero or obvious garbage, it's
likely okay) and file should point to the correct filename, null
terminated (p file shows the string). If they look okay and you either
built OpenSSL from source or have installed the source from which it
was built, step in and see where it fails; but that's only needed if
the bug is in the OpenSSL code, which is unlikely, as thousands or
millions of other people use it without problems. (Though not
completely impossible.)
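
The same argument checks can also be written into the program, to run
just before the failing call. This is a sketch under my own assumptions:
check_chain_args is a made-up name, ctx is taken as a plain pointer so
the snippet builds without the OpenSSL headers, and looking for a
"-----BEGIN " line is only a rough test that the file is PEM at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical pre-flight check for the two arguments of the failing
 * call. Returns 0 if they look sane, or a negative code saying which
 * check failed. */
int check_chain_args(const void *ctx, const char *file)
{
    char line[256];
    int found = 0;
    FILE *fp;

    if (ctx == NULL) {              /* e.g. SSL_CTX_new returned NULL */
        fprintf(stderr, "ctx is NULL\n");
        return -1;
    }
    if (file == NULL || file[0] == '\0') {
        fprintf(stderr, "file name is NULL or empty\n");
        return -2;
    }
    fp = fopen(file, "r");
    if (fp == NULL) {
        perror(file);               /* prints the reason, e.g. ENOENT */
        return -3;
    }
    /* Look for the start of any PEM block. */
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "-----BEGIN ", 11) == 0) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    if (!found) {
        fprintf(stderr, "%s: no PEM block found\n", file);
        return -4;
    }
    return 0;
}
```

A NULL ctx is worth ruling out first: SSL_CTX_new returns NULL on
failure, and passing that on unchecked is one ordinary way to end up
with a bad pointer inside the library.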

Or, if you don't like the debugger, try taking out parts of your code
that don't appear to be related to the problem to see if it still
occurs. While it does, keep reducing until you either find the problem
or get to a small self-contained example that exhibits the problem, and
post it. Unless you are using a good revision control system, it's
usually best to 'remove' code by putting #if 0 and #endif lines around
it instead of actually deleting it, so that you can easily put it back
correctly if necessary.
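
For example, the #if 0 technique looks like this. The function and its
three steps are invented purely for illustration; the counter is there
only so you can see which steps actually ran.

```c
#include <stdio.h>

/* Hypothetical startup routine with one step temporarily disabled.
 * Returns the number of steps that were actually executed. */
int run_reduced_startup(void)
{
    int steps_run = 0;

    puts("step 1: create context");
    steps_run++;

#if 0   /* disabled while hunting the SEGV; change to #if 1 to restore */
    puts("step 2: load optional settings");
    steps_run++;
#endif

    puts("step 3: load certificate chain");
    steps_run++;

    return steps_run;
}
```

The preprocessor drops everything between #if 0 and #endif, so the
disabled code needn't even compile, and restoring it is a one-character
change.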

Unfortunately, if the symptom stops when you remove some code, that
doesn't reliably prove that code was the problem (or the only problem).
Problems at the machine level are usually due to 'undefined behavior'
in C, where your code is wrong in a way that isn't required to be
caught, like using an invalid pointer. The actual results then vary
with seemingly irrelevant factors, like the size of the code before and
after the location of the actual bug, in a complicated way that won't
make any sense unless you understand in detail the machine code
generated for your source code -- and to be frank, if you knew that you
wouldn't be asking a question like this.


______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
