:-( If I recall correctly, what I mention next does indeed happen in
RAND_screen(): IIRC RAND_screen() isn't 'only' reading the screen but is
also doing a system-level heap traversal and a few other things. It was
exactly that system-level heap traversal that slowed a few odd Win boxes
down to a snail's pace, so my guess is that it's doing the same to yours.
This has come up a few times here; there's a known but quite unpredictable
issue in the Windows-specific heap walking code in there (I'd have to check
the code again to see exactly which calls they were). There's been a patch
to at least limit the scan to an upper bound in time, and as with anything
entropy related, there's always the question 'is it really random?' or
better: 'is it random enough?'

The technical (software) part of the issue is that the problem occurs only
on some machines and is quite unpredictable in where it might pop up and
when; since OpenSSL accesses some Win-internal structures there, which have
been documented to some degree, the problem is 'known' in that we know it
can happen, and a few fixes have been added to the code to at least limit
the bother to an upper bound, but the issue of 'slow' isn't exactly
/fixable/: turns out the machine is spending all that time inside the
Windows OS itself and OpenSSL has no control over that, once it's called
that one Win API. Some boxes just take ages in there for unknown reasons.
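
To make the 'upper bound in time' idea concrete, here is a rough sketch in
C. This is NOT the OpenSSL code: bounded_walk(), walk_step_fn, clock_fn and
the demo_* helpers are names I made up for illustration, and the clock is
passed in as a callback only so the behaviour can be shown without touching
any Win API. The point to take away: the time check sits /between/ steps, so
a budget caps the total scan but cannot cut short one single call that hangs
around inside the OS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks, not OpenSSL API. */
typedef int (*walk_step_fn)(void *ctx);  /* 1 = gathered one item, 0 = walk done */
typedef double (*clock_fn)(void *ctx);   /* current time, in seconds */

/* Walk until the source runs dry or the time budget is used up.
 * Returns the number of items gathered. The budget is only checked
 * after a step returns, so one slow step can still overrun it. */
size_t bounded_walk(walk_step_fn step, clock_fn now, void *ctx,
                    double budget_seconds)
{
    assert(step != NULL && now != NULL);

    double start = now(ctx);
    size_t gathered = 0;

    while (step(ctx)) {
        gathered++;
        if (now(ctx) - start >= budget_seconds)
            break;  /* budget spent: stop, keep what we have */
    }
    return gathered;
}

/* Demo source with a fake clock: every step 'costs' a fixed time. */
struct demo_ctx {
    double t;              /* fake current time, in seconds */
    double cost_per_step;  /* how long one step pretends to take */
    size_t remaining;      /* items left in the pretend heap */
};

int demo_step(void *p)
{
    struct demo_ctx *c = p;
    if (c->remaining == 0)
        return 0;
    c->remaining--;
    c->t += c->cost_per_step;  /* time passes while 'inside the OS' */
    return 1;
}

double demo_now(void *p)
{
    struct demo_ctx *c = p;
    return c->t;
}
```

With a 1 second budget, a source whose steps cost 0.5 s each is cut off
after two items, while a source whose very first step takes 3 s still costs
you the full 3 s before the loop gets a chance to bail out. That second case
is the 'some boxes just take ages in there' situation.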


Depending on what your servers do, the 'making certain' move re entropy is
to connect a hardware entropy source to the box, but that's probably
off-topic here (unless it's an IIS webserver working in a military/banking
setting). Anyway, removing [semi-]entropy sources /is/ an option, but it's
dangerous, as removing them one by one in the end delivers zero entropy. We
have a Debian fellow to thank for [accidentally, but /quite/ noticeably]
showing everyone what happens when you like cleaning up so much that you
lose what turns out, after the fact, to be a significant chunk of entropy
gathering, while your street cred ends up in the can, permanently.

Personally, I wouldn't even bother about those 3 secs and let it do what it
wants doing, as they [the 3 secs] only happen at library /init/ time, i.e.
server [re]start. Of course, there'll be plenty who say 'just remove the
code', but it gets quite inconclusive if you only count the 'votes' from
security/cryptography knowledgeable folks. And, no, that heap walk in
RAND_screen() isn't a big source of entropy, probably small, rather, but you
grab what you can, when you don't go the hardware entropy source route: you
have to realize that you're 'faking' the whole entropy thing right there,
all the way, so the game isn't about entropy-as-is but about making it
bloody dang hard for any hacker to predict what your 'random' pool looks
like at time t. There are no hard and fast answers to the question 'when
have I done enough gathering?' And RAND_screen() does add several chunks
of unpredictability to that game. Now how many bits of /entropy/ it's
delivering, I won't (and can't) say (OpenSSL takes a guess, but that's all);
it's checking several sources and eliminating sources [one at a time]
because they bother you is a plenty dangerous game if you don't /exactly/
know what you're doing. Hence my basic answer: 'let it be'; maybe not what
you'd like to hear, but it saves lots of $$$ in discussion / security
review / calamities down the line.

[For the monetarily inclined, this subject has been discussed a lot in the
ML before and when you count those emails @ some hourly rate and see what
the result (or rather: the amount of change) is, then calc that cost sum and
compare with X times a slower restart of N servers and the $-quantified
cost, material and immaterial, of that... yeah. Let it be.]


And when you go about 'removing it' anyway, be very very careful WHAT you
remove, because I don't think it's the screen sampling that'll turn out to
eat the cycles on that one box of yours but the heap traversal sys calls
which are part of RAND_poll()/RAND_screen() and they are only a part of the
whole RAND_whatever entropy collecting thing.


Bottom line: commenting out the call(s) to RAND_screen() would quite
definitely turn you out as 'the IIS guy who's related to that Debian guy
y'all heard about before' several months down the line. A slightly 'smarter'
removal would take out that heap walk loop if it /really/ hurts, but
remember... Cave canem! (And this one has a /serious/ bite to it!)



On Wed, Jun 30, 2010 at 4:11 PM, Brian Makin <ma...@vivisimo.com> wrote:

> I am seeing a very slow initialization on a single Windows 2003 box with
> openssl-0.9.8l.
>
> During initialization the function RAND_screen gets called.  This
> effectively hashes the frame buffer to generate entropy.  In our case we
> are running as an IIS user and I'm not even sure what screen it's
> getting.
>
> This function takes on the order of 3 seconds.
>
> We have other identical boxes which are behaving correctly and a single
> box which is very slow.
>
> Two questions.
> 1. It appears that this is deprecated so would it be reasonable to
> simply remove it?
>
> 2. Does anyone have any idea why this function is misbehaving?
>
> --
> BRIAN MAKIN
> Senior Software Engineer
> ma...@vivisimo.com
>
> Vivisimo [Search Done Right™]
> 1710 Murray Avenue
> Pittsburgh, PA 15217 USA
> tel: +1.412.422.2499
> vivisimo.com
>
> ______________________________________________________________________
> OpenSSL Project                                 http://www.openssl.org
> User Support Mailing List                    openssl-users@openssl.org
> Automated List Manager                           majord...@openssl.org
>



-- 
Met vriendelijke groeten / Best regards,

Ger Hobbelt

--------------------------------------------------
web:    http://www.hobbelt.com/
       http://www.hebbut.net/
mail:   g...@hobbelt.com
mobile: +31-6-11 120 978
--------------------------------------------------
