On 20/06/16 10:49, Mick Saxton via RT wrote:
> I modified your patch to also catch the similar problem in ssleay_rand_bytes.
> Results from the instrumented tests attached.
>
> These tests were run on 64-bit Windows 7.
> I have not specified a locking callback so will be using the default – could
> this be the problem?

Ahhh!!! Yes!!!

https://www.openssl.org/docs/faq.html#PROG1

From the "threads" man page:
https://www.openssl.org/docs/man1.0.2/crypto/threads.html

"OpenSSL can safely be used in multi-threaded applications provided that
at least two callback functions are set, locking_function and
threadid_func.

locking_function(int mode, int n, const char *file, int line) is needed
to perform locking on shared data structures. (Note that OpenSSL uses a
number of global data structures that will be implicitly shared whenever
multiple threads use OpenSSL.) Multi-threaded applications will crash at
random if it is not set."

In version 1.1.0 (not released yet) this requirement has gone - but this
is still needed for all released versions.

Matt

>
> Each thread has its own SSL_ctx and each connection is only ever serviced by
> the same thread.
>
> It looks like state_index is going outside of the expected range.
>
> This is possible if one or more threads do
>     state_index += num_ceil;
>
> and then another thread reads it before
>     if ( state_index > state_num )
>         state_index %= st_num;
>
> Thanks for your help
>
>
> From: Matt Caswell via RT [mailto:r...@openssl.org]
> Sent: 18 June 2016 00:08
> To: Mick Saxton
> Cc: openssl-dev@openssl.org
> Subject: Re: [openssl-dev] [openssl.org #4545] Crash in crypto/rand/md_rand.c
>
>
>
> On 17/06/16 20:56, Matt Caswell via RT wrote:
>>
>>
>> On 17/06/16 19:43, Mick Saxton via RT wrote:
>>> Perhaps we should consider if there are any negative consequences to my
>>> solution?
>>> It does work.
>>>
>>> I am trying really hard to get contention but I am only seeing this problem
>>> in about 1 out of 100,000 successful TLSv1.2 connections
>>> On a heavily congested network.
>>> I require three machines just to run the test that causes the failure.
>>>
>>> All we are trying to do is get a random number – surely getting a slightly
>>> less random number is better than crashing?
>>> It could be that the problematic instances were going to disconnect anyway
>>> due to TCP/IP problems.
>>>
>>
>> I think we need to try instrumenting the code to see if we can get some
>> more information out. I will try and pull something together - but it
>> might be Monday before I get the opportunity.
>
> I got to it quicker than I thought. Please see attached patch. Can you
> apply it to the latest git 1.0.2 version and re-run your test (capture
> stderr output). I'd like to see what we get.
>
> Also is this 32-bit or 64-bit Windows? Are you able to share your
> locking callback implementation?
>
> Thanks
>
> Matt
>
>
> --
> Ticket here:
> http://rt.openssl.org/Ticket/Display.html?id=4545
> Please log in as guest with password guest if prompted
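
For readers following the thread, the interleaving Mick describes can be reproduced outside OpenSSL. The sketch below is not OpenSSL code. It borrows the names state_index, num_ceil and st_idx from the discussion, while STATE_NUM, MAX_THREADS, claim_state, worker and run_workers are hypothetical, and a POSIX mutex stands in for whatever lock the application's locking callback would provide. It assumes pthreads rather than the Windows primitives the reporter would be using.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the shared state named in the thread. */
#define STATE_NUM   1023    /* assumed pool size, for illustration only */
#define MAX_THREADS 16      /* arbitrary cap for this sketch */

static int state_index = 0;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Claim num_ceil bytes of the pool and return the index the caller starts
 * from. The read, the increment and the wrap-around all happen under one
 * lock, so no thread can observe the value between the += and the %=.
 * With the '>' comparison quoted in the thread, the index stays within
 * [0, STATE_NUM] inclusive.
 */
int claim_state(int num_ceil)
{
    int st_idx;

    pthread_mutex_lock(&state_lock);
    st_idx = state_index;
    state_index += num_ceil;
    if (state_index > STATE_NUM)
        state_index %= STATE_NUM;
    pthread_mutex_unlock(&state_lock);

    return st_idx;
}

/* Worker: claim repeatedly and check every starting index is in range. */
static void *worker(void *arg)
{
    int iters = *(int *)arg;
    int i;

    for (i = 0; i < iters; i++) {
        int st_idx = claim_state(1 + i % 40);
        assert(st_idx >= 0 && st_idx <= STATE_NUM);
    }
    return NULL;
}

/* Run nthreads workers to completion; returns 0 on success, -1 on error. */
int run_workers(int nthreads, int iters)
{
    pthread_t tid[MAX_THREADS];
    int started = 0, rc = 0, i;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &iters) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Join before returning so &iters stays valid for every worker. */
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

Calling run_workers(4, 100000) from a program built with -pthread should complete without tripping the assertion. With the two mutex calls removed, a thread can read state_index between another thread's += and %=, which is the out-of-range read discussed above.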