Well, if you had, say, a single thread collecting data to feed an entropy pool, then once an attacker synchronized on that, they'd win. I'm not sure that's possible, but it's probably better for security if this is done inline by each thread as needed. (Particularly when you consider the real OpenSSL usage scenarios, i.e. web servers with a lot of running threads: good luck making a timing attack work in that use case.)
There's one more point. The upper bits of those registers are easier to guess than the lower ones; again, the 'fix' is obvious. What's more difficult is knowing which of the lower bits are actually changing. E.g. on the P4 the lower 4 bits are effectively 'stuck', as every instruction is a multiple of 16 clocks long, and quite a few processors have quirks here.

Pete

From: Andy Polyakov <ap...@openssl.org>
To: openssl-dev@openssl.org
Date: 23/01/2012 03:38
Subject: Re: OS-independent entropy source?
Sent by: owner-openssl-...@openssl.org

> HT processors are a nightmare for security yes :).

I've attempted the experiment even on a hyper-threading P4. No anomalies, in the sense that it looks pretty much like another P4. Well, one thread appears to get more interrupts, while spikes tend to be higher on the other thread. But when it comes to the "fine print", i.e. variations between interrupts, there is no essential difference, and the cross-correlation looks essentially the same as on real multi-core. No maximum at zero lag though... On second thought, why would there be a difference when every sample takes several *hundred* clock cycles to complete? Hyper-threading operates at single clock cycle resolution, not hundreds, right?

> You are assuming the target software is collecting data continuously as
> fast as it can - which I agree, simply turns it into the designated
> victim :). Don't do that - the data rate is high enough you can sample
> on demand and you can afford some delay between samples.

But data will have to be collected in "bursts", and not exactly short ones: e.g. ~700 samples or 300 microseconds are suggested on the page, and initial calibration can take tens of milliseconds... Would it be appropriate to say that these are not long enough to detect and synchronize on? [Naturally, provided that detection and synchronization can give the adversary an edge.] Assuming that collection is continuous is simply a first approximation of the problem...
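As an aside on the point about knowing which of the lower bits are actually changing, here is a minimal Python sketch of how one might check that empirically. It is only an illustration under stated assumptions: `time.perf_counter_ns` stands in for a hardware cycle counter, and the helper names (`varying_bit_mask`, `collect_deltas`, `stuck_low_bits`) are hypothetical, not anything from OpenSSL.

```python
import time


def varying_bit_mask(samples):
    """Return a mask of the bit positions that take both 0 and 1 across samples."""
    if not samples:
        return 0
    all_or = 0
    all_and = ~0  # all ones; narrows to the bits set in every sample
    for s in samples:
        all_or |= s
        all_and &= s
    # A bit varies if it is set in at least one sample but not in all of them.
    return all_or & ~all_and


def collect_deltas(n):
    """Collect n successive timer deltas.

    Assumption: perf_counter_ns is used as a portable stand-in for a
    cycle counter; its resolution differs from a real TSC.
    """
    deltas = []
    prev = time.perf_counter_ns()
    for _ in range(n):
        now = time.perf_counter_ns()
        deltas.append(now - prev)
        prev = now
    return deltas


def stuck_low_bits(samples, width=8):
    """Count consecutive low-order bits (from bit 0, up to width) that never change."""
    mask = varying_bit_mask(samples)
    count = 0
    while count < width and not (mask >> count) & 1:
        count += 1
    return count


if __name__ == "__main__":
    observed = collect_deltas(1000)
    print("stuck low bits in this run:", stuck_low_bits(observed))
```

Samples that are all multiples of 16, as described for the P4, would report 4 stuck low bits. The result on any given machine depends on the timer's resolution, so a real implementation would presumably calibrate per platform and discard the bits that never move.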
> And make sure your sample collection code is branch free - you can still
> attack it via the cache, but it's a lot harder to know exactly where the
> victim is and your attack code has to be able to get that exactly right.

Loop bodies are branch-free on all platforms. Though I don't think it matters a lot, because, once again, a sample is several *hundred* cycles, much higher than [mis-]branch penalties.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org