> From: owner-openssl-us...@openssl.org
> [mailto:owner-openssl-us...@openssl.org] On Behalf Of David Li
> Sent: Tuesday, May 20, 2014 13:05
<snip>
> I am using SSL_CTX_use_certificate_chain_file() to load my server certificate
> files at initialization.
> The PEM file is created by concatenating server cert, server key and CA cert
> together.
> I used the following command line to check its format and it seemed OK.
> $ openssl s_server -cert servercert.pem -www
> Using default temp DH parameters
> Using default temp ECDH parameters
> ACCEPT

Note that s_server does use_certificate_file (and also use_PrivateKey_file),
not the _chain_ version, so that really only checks that the server (first)
cert and private key are good, not the rest of the file. However, even if the
CA cert is somehow bad, the call should at worst give an error return (and
maybe just discard the cert), not a SEGV.

Also, if the CA cert is a root (possibly your own DIY root), it doesn't
matter whether it's in the file and good or not, because servers aren't
required to send the root of their chain -- clients can never trust a root
(or in general an anchor) sent by the server and must already have it locally
anyway.

> And I can use openssl s_client command line to connect to the above server
> without any issues.

What did you use for s_client's trust store (-CAfile and/or -CApath)?

> Now when I started my server, the code crashed inside the
> SSL_CTX_use_certificate_chain_file():
<snip>
> There wasn't any detailed errors printed out but only:Segmentation fault
> (core dumped)

When you get an unhandled signal -- and SEGV usually isn't and often can't be
handled -- a C program aborts without outputting anything that wasn't already
output (and, where applicable, flushed) before the signal. This is unlike
'voluntary' error handling, where the code gets a return value indicating an
error (such as -1 from SSL_connect or NULL from fopen) and can -- and should --
print information about the problem. It is also unlike some other languages
that (more or less reliably) catch exceptions and give details for them.

> Can anyone suggest how to debug this issue?

The same way you debug a SEGV in any C program. In this case you got a core
dump file; open it with the debugger of your choice -- gdb is common and
popular -- and look at the stack (bt in gdb).
Sometimes the stack is clobbered by the same bug that caused the SEGV, but
usually it shows where -- or nearly where -- the code was executing, where it
was called from, and sometimes (often?) the function arguments at each level.

Alternatively, (re)run the program under control of a debugger like gdb to
start with. Set breakpoints before or at the call that fails, and check that
the arguments are good. For use_cert_chain that means ctx points to a validly
allocated and initialized SSL_CTX (to a first approximation, if p *ctx doesn't
give a gdb error and isn't all zero or obvious garbage, it's likely okay) and
file points to the correct filename (and is null-terminated). If they look
okay, and you either built openssl from source or have the source from which
it was built installed, step in and see where it fails. But that's only needed
if the bug is in the openssl code, which is unlikely, as thousands or millions
of other people use it without problems. (Though not completely impossible.)

Or if you don't like the debugger, try taking out parts of your code that
don't appear to be related to the problem and see if it still occurs. While it
does, keep reducing until you either find the problem or get to a small
self-contained example that exhibits the problem, and post that. Unless you
are using a good revision control system, it's usually best to 'remove' code
by putting #if 0 and #endif lines around it instead of actually deleting it,
so that you can easily put it back correctly if necessary.
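
To illustrate the #if 0 / #endif approach, here is a minimal sketch. All the names in it (init_server, load_optional_settings, settings.conf) are made up for the example and are not taken from your code or from openssl; the real call under suspicion would replace the placeholder comment.

```c
#include <stddef.h>

/* Hypothetical helper standing in for code that looks unrelated to the crash. */
int load_optional_settings(const char *path)
{
    (void)path;   /* nothing to do in this sketch */
    return 0;
}

/* Hypothetical initialization routine that is being reduced.
 * Returns the number of steps that ran, or -1 if the argument is bad. */
int init_server(const char *cert_path)
{
    int steps = 0;

    /* Check the argument before passing it anywhere else. */
    if (cert_path == NULL || cert_path[0] == '\0')
        return -1;
    steps++;

#if 0   /* temporarily 'removed' while hunting the SEGV; change 0 to 1 to restore */
    if (load_optional_settings("settings.conf") != 0)
        return -1;
    steps++;
#endif

    /* The call under suspicion would go here. */
    steps++;

    return steps;
}
```

With the block disabled, init_server("servercert.pem") runs only the two remaining steps; changing the 0 to 1 brings the third back exactly as it was.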
Unfortunately, if the symptom stops when you remove some code, that doesn't
reliably prove that code was the problem (or the only problem). Problems at
the machine level are usually due to 'undefined behavior' in C, where your
code is wrong in a way that isn't required to be caught, like using an invalid
pointer. The actual results then vary depending on seemingly irrelevant
factors, like the size of the code before and after the location of the actual
bug, in a complicated way that won't make any sense unless you understand in
detail the machine code generated for your source code -- and to be frank, if
you knew that you wouldn't be asking a question like this.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org