On Mon, 7 Aug 2000, Bill Rebey wrote:

> Date: Mon, 7 Aug 2000 14:25:01 -0400 
> From: Bill Rebey <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: "Openssl-Dev (E-mail)" <[EMAIL PROTECTED]>
> Subject: Crash bug exemplified
> 
> The attached program is about as small as I can make a test app that
> exemplifies the problem that my server application is having.  I have posted
> about it repeatedly with no results, probably because nobody can (or wants
> to <g>) reproduce it. This little test program is only about 160 lines long

OK, I have modified the program slightly to build and run on Linux (I use
a fairly standard Red Hat Linux 6.2 on an Intel PII 450MHz, OpenSSL 0.9.5a;
the compiler was egcs-2.91.66 (egcs-1.1.2-30 package)).

The diff is attached, along with the command line (file linux_cc2.how) I
used to build the program.
I am now running two copies of the resulting binary, one for over 36 minutes
(total thread count over 4 million) and another for more than 18 minutes
(22 MBytes of output produced so far), and they just keep running.
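
For anyone reading without the attachments: the only functional change in my diff is that pthread_create now gets the address of a real pthread_t instead of NULL. POSIX says the new thread's ID is stored through that pointer, so NULL is not something to rely on. Below is a minimal sketch of the same pattern, not the original tst.cpp. The names g_lock, g_live, g_total and RunTransientThreads are made up for illustration, and I am assuming the threads are created detached, which the original attribute object may or may not do.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins for the counters guarded by the test's critical section.
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int  g_live  = 0;   // threads currently running
static long g_total = 0;   // threads successfully started so far

// Transient worker: sleep for about a millisecond, then exit.
static void *ThreadMain(void *)
{
    usleep(1000);
    pthread_mutex_lock(&g_lock);
    --g_live;
    pthread_mutex_unlock(&g_lock);
    return NULL;
}

// Start `total` short-lived threads, never more than `maxLive` at once.
// Returns the number of threads actually created.
long RunTransientThreads(long total, int maxLive)
{
    assert(maxLive > 0);

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    // Assumption: detached threads, so nobody has to join them.
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    long created = 0;
    while (created < total) {
        // Reserve a slot under the lock before creating the thread.
        pthread_mutex_lock(&g_lock);
        bool room = (g_live < maxLive);
        if (room) { ++g_live; ++g_total; }
        pthread_mutex_unlock(&g_lock);
        if (!room) { usleep(1000); continue; }

        pthread_t threadID;  // pthread_create stores the new thread's ID here
        int rc = pthread_create(&threadID, &attr, ThreadMain, NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_create failed: %d\n", rc);
            // Give the reserved slot back and stop.
            pthread_mutex_lock(&g_lock);
            --g_live; --g_total;
            pthread_mutex_unlock(&g_lock);
            break;
        }
        ++created;
    }

    // Wait until every worker has finished before returning.
    for (;;) {
        pthread_mutex_lock(&g_lock);
        int live = g_live;
        pthread_mutex_unlock(&g_lock);
        if (live == 0) break;
        usleep(1000);
    }
    pthread_attr_destroy(&attr);
    return created;
}
```

Calling RunTransientThreads(50, 8) from main should start 50 threads with at most 8 alive at a time. This sketch leaves OpenSSL out entirely; it only shows the thread-creation side of the test.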

So I suppose there is either some problem with the SPARC code (I guess the
SPARC Ultra is a 64-bit CPU; I have only 32-bit single-CPU machines around),
or with the build process used by the author of the test program (or with
the Solaris libraries), or possibly with the program's behaviour on
multi-CPU machines (which I cannot test now).

The link line used by the author has -mt at the very end:
CC -o tst tst.o -L$OPENSSLDIR/lib -lssl -lcrypto -lsocket -mt

Is this correct?

Any ideas?



> with comments.  It just tries to keep a bunch of transient threads going at
> once (the threads don't do anything - they just exit after sleeping for a
> millisecond). 
> 
>  <<comp>>  <<link>>  <<tst.cpp>> 
> This problem happens on SPARC Solaris.  This program demonstrates the
> problem very quickly (usually within a minute) on both a SPARC Ultra-2 with
> Solaris 2.6, and a SPARC Ultra-60 with Solaris 2.8.  My "real" app doesn't
> crash nearly this fast, as it doesn't put nearly the stress-test on OpenSSL
> that the test app does - but it most certainly crashes every time I test it;
> it just takes hours instead of seconds.  
> 
> Can anyone reproduce this and fix it? I'm in a VERY bad spot here because I
> can't ship my product until I get OpenSSL to work.  My company pretty much
> threw sand in RSA's face in favor of using OpenSSL, on my recommendation,
> and now I can't make OpenSSL work and we can't ship my product.  This is
> hardly a great career move for me.  If anyone can identify and fix this bug,
> I would greatly appreciate it.  I look pretty stupid right now to the folks
> in upper management, and I feel like my hands are tied.  I'm trying to use
> Purify to determine the problem, but I've never used it before and will
> probably be slow to figure out how to make it work and understand exactly
> what it's telling me.
> 
> If anyone sees any obvious misuse problem, PLEASE let me know.  I would LOVE
> to hear "you're doing it wrong - you forgot to make this function call!" and
> be done with it, but as far as I can tell, I'm obeying the OpenSSL usage
> laws to the letter.  
> 
> If you run the "comp" and "link" scripts to build this little test program,
> then run the resultant "tst" executable, it should crash after a short time
> and if you run dbx against the resultant core, you should get the following
> stack in response to the dbx "where" command:
> 
> core file header read successfully
> Reading ld.so.1
> Reading libsocket.so.1
> Reading libCrun.so.1
> Reading libm.so.1
> Reading libw.so.1
> Reading libthread.so.1
> Reading libc.so.1
> Reading libnsl.so.1
> Reading libdl.so.1
> Reading libmp.so.2
> Reading libc_psr.so.1
> detected a multithreaded program
> t@3937 (l@48) terminated by signal BUS (invalid address alignment)
> Current function is ThreadMain
>   100           int iErr = ERR_get_error ();
> (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) where
> current thread: t@3937
>   [1] t_delete(0x9, 0xff2b6000, 0x150, 0x65300, 0x651a8, 0x150), at
> 0xff241798
>   [2] realfree(0x9, 0xff2bc7b0, 0xff2b6000, 0x65300, 0x153, 0x65308), at
> 0xff241420
>   [3] cleanfree(0x0, 0xff2b6000, 0xff2bc724, 0xff2bc7a4, 0xff2bc730, 0x0),
> at 0xff241cb4
>   [4] _malloc_unlocked(0x60, 0x0, 0xff2b6000, 0x60, 0x5, 0x0), at 0xff240e20
>   [5] malloc(0x60, 0x60, 0x62798, 0x150, 0x0, 0x0), at 0xff240d3c
>   [6] CRYPTO_malloc(0x5a5b0, 0x470d0, 0x77, 0x5a400, 0x470d0, 0x60), at
> 0x17070
>   [7] lh_new(0x1cba0, 0x1cbb8, 0x470d0, 0x2be, 0x1cbb8, 0x14c), at 0x34604
>   [8] ERR_get_state(0x5a400, 0x0, 0x673e0, 0x430d8, 0x673e0, 0xf7509b28), at
> 0x1ce6c
>   [9] get_error_values(0x1, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x1c4a0
> =>[10] ThreadMain(pNothing = (nil)), line 100 in "tst.cpp"
> (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) quit
> 
> Thanks for your help,
> 
> Bill Rebey
> 
> 
> 
> 
Regards,

Wojtek

g++ -Wall -I/usr/local/ssl/include -L/usr/local/ssl/lib -g -D_REENTRANT \
    -DNO_RSA -DNO_RC4 -DNO_RC5 -DNO_BF -DNO_IDEA -fstack-check \
    -o tst2 tst2.cc -lssl -lpthread -lcrypto

--- tst.cpp     Wed Aug  9 12:50:06 2000
+++ tst2.cc     Wed Aug  9 13:36:57 2000
@@ -16,6 +16,7 @@
 \*==============================================================================================*/
 #include <pthread.h>
 
+#include <sys/time.h>
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <netinet/in.h>
@@ -143,8 +144,9 @@
                                ++_iThreadCnt; 
                                ++_iTotThreads;
                        _cCritSec.Leave ();
+                       pthread_t threadID;
 
-                       if (pthread_create (NULL, &_threadAttr, ThreadMain, NULL))
+                       if (pthread_create (&threadID, &_threadAttr, ThreadMain, NULL))
                        {
                                printf ("\nERROR CREATING THREAD!\n");
                        }
@@ -159,3 +161,5 @@
                }
        }       
 }
+
+// vim:sw=4 ts=4
