On Thursday 01 November 2007 20:38, Andy wrote:
> Compilation of source code is not a cryptographically secure way to
> protect data or algorithms.

Giving someone the source code is even less secure - >IF< the attacker can gain
a significant advantage by doing so. The question is therefore "can they?".

(Consider the lighthearted analogy of R2-D2 getting the source code/schematics
to the defense mechanisms of the Death Star. It's a lot easier to find
implementation attacks if you have the detailed plans - especially if you
have a strong incentive to do so.)

In the following I'm going to use standard terminology of Alice, Bob and Eve.

Alice & Bob wish to communicate over a secure channel C, without Eve getting
access to their secret. (The names come from a message being sent from A to B
via C without anyone eavesdropping.)

Consider tools like SSH, SSL & TLS. In this case A & B are chatting - eg
sending personal financial data. A & B are both users of the software & have a
vested interest in protecting the integrity of C, because E may wish to do
something nasty. Like empty their bank account.

In this use case, a common solution is public key encryption. In such a system
Eve gains no advantage in attacking C from knowing the algorithm in use. Eve
needs to gain access to the private keys in use, and since these are never
communicated, Eve has to try something else - Eve isn't hosted on the same
machines as A & B after all. (If Eve were, Eve could do things like a memory
based attack.)
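To make that concrete, here's a toy sketch of the public key idea - textbook RSA with deliberately tiny numbers (hopelessly insecure, purely illustrative; the modular-inverse form of pow needs Python 3.8+). Eve can read every line of this and still gets nowhere without the private exponent, which means factoring n:

```python
# Textbook RSA with tiny primes -- purely illustrative, utterly insecure.
# Eve sees the algorithm and the public key (n, e), but recovering the
# private exponent d means factoring n.

p, q = 61, 53              # private primes (never communicated)
n = p * q                  # 3233 -- public modulus
e = 17                     # public exponent
phi = (p - 1) * (q - 1)    # 3120
d = pow(e, -1, phi)        # private exponent (2753); Python 3.8+

m = 65                     # the "secret"
c = pow(m, e, n)           # Eve can observe c, n and e in transit...
assert pow(c, d, n) == m   # ...but only d turns c back into m
```

With realistic key sizes the same asymmetry holds: publishing the algorithm costs Alice and Bob nothing.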

The something else usually relies on looking for implementation attacks to
bypass the secure channel in order to gain further access (which, when
combined with other attacks, can lead to access to keys).
   * Example: http://www.cert.org/advisories/CA-2002-18.html

Rarely do you see attacks against the protocol itself, but this did happen
with the change from SSH 1 to SSH 2 - SSH 1 being susceptible to
man-in-the-middle attacks.

If you have access to the source it is far easier to find such attacks. Rather
than throwing random data at a piece of software or attempting decompilation,
you can look at a higher level for locations where buffer overruns, double
free()s etc. can happen. You can of course automate some of this.
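As a toy sketch of the "automate some of this" point (my own illustration - real static analysers and fuzzers are far more sophisticated), source access lets you grep directly for the classically risky C calls rather than probing a binary:

```python
import re

# Flag C library calls that classically invite buffer overruns.
# A real audit would use proper static analysis; this just shows how
# cheap the first pass becomes once you have the source.
RISKY = re.compile(r"\b(strcpy|strcat|sprintf|gets)\s*\(")

source = '''\
gets(buf);                     /* unbounded read */
strncpy(dst, src, sizeof dst); /* bounded -- not flagged */
sprintf(msg, "%s", user);      /* unbounded format */
'''

for lineno, line in enumerate(source.splitlines(), 1):
    if RISKY.search(line):
        print(lineno, line.strip())   # flags lines 1 and 3
```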

HOWEVER, E is not the only person interested in finding errors. A & B are also
interested in finding errors, in order to thwart Eve from accessing their
personal/financial/etc information. People in groups A & B also have access to
the same tools. They also have good reason to update their systems to protect
against holes. There are also *a lot* more people who are As & Bs than are Es.

Furthermore, each hole plugged pushes access to the keys further and further 
away from E.

Now consider the case where:
   * Alice is someone who has created a piece of content (the secret) which
     they wish to protect.
   * Bob is a _display device_ which Alice trusts to merely *show* the secret
     to the user.
   * Eve is in this case the user. Alice is afraid that if Eve gets the secret
     (the unencrypted a/v file), Eve will share it in an unprotected form.

Furthermore: Eve owns the system that Bob is running on. Bob is running on a 
general purpose computer. Eve therefore has access to everywhere that Bob can 
store data. Bob cannot store any secrets from Eve on the computer. Bob can 
however *try* to hide secrets. (Security through obscurity is of course no 
security.) Bob could also encrypt these secrets to try to prevent Eve 
accessing them.

If Eve finds where the secrets are, however, Eve can use a brute force attack 
against the stored data to find the keys. This is precisely what happens when 
someone uses libdvdcss to watch a DVD under Linux.
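To illustrate the brute force idea (a toy of my own, not the actual CSS attack - the real DVD crack exploits weaknesses in the 40-bit CSS cipher rather than searching the whole keyspace), here's an exhaustive search over a deliberately tiny 2-byte XOR key, using the fact that the attacker knows what valid plaintext looks like:

```python
import itertools

def xor_crypt(data, key):
    # Repeating-key XOR: encryption and decryption are the same operation.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

plaintext = b"HEADERsecret audio/video payload"
hidden_key = bytes([0x5a, 0xc3])
ciphertext = xor_crypt(plaintext, hidden_key)

# Eve knows valid plaintext starts with "HEADER", so she simply tries
# every possible key until the decryption looks right.
for candidate in itertools.product(range(256), repeat=2):
    if xor_crypt(ciphertext, bytes(candidate)).startswith(b"HEADER"):
        print("key found:", bytes(candidate).hex())   # -> key found: 5ac3
        break
```

Scale the keyspace up and raw search stops working, of course - which is why real attacks go after the cipher's weaknesses or the key's hiding place instead.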

OK, back on topic. You claim this:
> Compilation of source code is not a cryptographically secure way to
> protect data or algorithms.

Which is true. The numerous DVD players performing brute force attacks under 
Linux so that people can watch the DVDs they've paid for proves your point 
here happily.

The question here is: can you have a *completely* open source DRM mechanism 
where the keys & algorithm used are protected (ie hidden)? If you think 
about it, the answer has to be no. Consider two cases:

   * The case where you (Eve) can recompile the code yourself, and merely take
     data from Alice which includes the keys. Clearly in this case it's
     trivial to just dump the decrypted data immediately after decryption (as
     VLC can - cf that recent article :).

   * Suppose a key is embedded in the source as a literal and this is used
     during initial handshaking with Alice and to protect keys before storage
     on disk. Furthermore, suppose that you are supplied with a precompiled
     binary. Sure you can compile your own version, but you can't use your own
     compiled version to decrypt the data.

     This binary could either be the entire system or a decrypt, decode,
     display binary thunk.

This latter approach is more secure, but is susceptible to a
recompile/correlation attack. (he says coming up with the phrase) 
Specifically, if I provide you with a binary decrypter that looks like this 
as a hex representation:

68656c6c6f20746869732069732061204b455920736f207468657265

If you have the source and replace the embedded KEY with (say) "FISH", you 
would get this:
68656c6c6f20746869732069732061204649534820736f207468657265

Let's correlate these:
68656c6c6f20746869732069732061204b455920736f207468657265
68656c6c6f20746869732069732061204649534820736f207468657265

You should (with a little effort) be able to pick out the difference between 
these two - with the former having this:
4b4559
And the latter having this:
46495348

4b4559 == 4b 45 59 == [hex(ord(x))[2:] for x in "KEY"]

As a result, having access to the source gives you an attack vector for the 
hidden private key and the algorithm needed to unlock the rest.
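The eyeballed diff above is trivial to automate; here's a sketch using Python's difflib on those same two hex strings. (With real binaries you'd first need to reproduce the vendor's compiler and flags so that everything *except* the key matches.)

```python
import difflib

# The vendor binary (embedded key "KEY") and my recompile (filler "FISH"),
# as per the hex dumps above.
vendor = bytes.fromhex(
    "68656c6c6f20746869732069732061204b455920736f207468657265")
mine = bytes.fromhex(
    "68656c6c6f20746869732069732061204649534820736f207468657265")

# SequenceMatcher copes with the length shift the filler introduces;
# the single non-matching region is the hidden key.
sm = difflib.SequenceMatcher(None, vendor, mine, autojunk=False)
for op, i1, i2, j1, j2 in sm.get_opcodes():
    if op != "equal":
        print("vendor has:", vendor[i1:i2])   # b'KEY'
        print("filler was:", mine[j1:j2])     # b'FISH'
```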

Remember, unlike SSH, the sender does not trust the recipient, since the 
recipient *is* the attacker. If the user (Eve) can change the receiver tool - 
ie change Bob (as a fully open source system would demand) - then Bob is not 
trustworthy.

If Bob can contain something that Eve can't change (a binary display thunk) 
then Alice can trust it more, but if Eve has access to the binary thunk's 
source, Eve can trace where the keys are and attack those locations - either 
through guile (recompile/correlate), brute force, or simple implementation 
attacks (a la ssh etc).

The alternative is for the binary thunk to be completely proprietary (which, 
last time I looked, is what DReaM suggests, but my memory may be faulty. I 
was going to look at that this evening but was writing code instead :).

If the binary thunk is completely proprietary then yes, the keys are more
protected. However the user can always still find them. (eg on Linux run the
process under strace, or on Solaris run it under truss. Similar tools exist
for other OSs.) And again, due to access to the source for the rest of 
the system, Eve has the leisure of being able to look for implementation 
attacks.

Furthermore, unlike SSH, the number of attackers (Eves) massively outnumbers 
the number of Alices. (All the Bobs are dumb displays/code.) From the 
perspective of Eve, the protection mechanism is itself a bug. The Eves have a 
great incentive to fix the bug of not being able to get at the data. Giving 
Eve the source merely makes it easier for Eve to fix this bug.

Alice of course views the "workarounds" that Eve will find as bugs and will 
want to fix them. But there are many, many more Eves than Alices.

ie the very attribute which can make open source implementations of ssh, ssl &
tls more secure (many eyes make all bugs shallow) is what can be used to harm
an open source DRM mechanism. *because many more people have an interest
in breaking the system than a working system*

All things being equal then, neither system can be viewed as cryptographically 
secure (whether you have the source or not). However if you have complete 
access to knowing where the keys are stored and if/how they're encrypted, you 
have more attack vectors.

So, probably the most open thing you can do is to have a hybrid system:

    * Have a largely open code base
    * Have a hook for the restrictions enforcement mechanism
    * Use open source implementations of the encryption algorithms.

But then you need a binary thunk to protect the licenses on disk in a license 
vault of some kind whose structure is also not open. (eg an encrypted loopback 
device with the decryption key & code stored in the binary thunk. You could 
probably pull in some code from a *BSD here to bootstrap that process, though 
that's _possibly_ a bit extreme.)

If that on disk store isn't protected by just one key but actually a key from 
a KPI (not PKI) based system, then you can revoke keys, and potentially
even update the hidden keys used.

Again much of the basis of this can be open source, but you can't have a
completely open system (otherwise you can just pluck out the keys). (Though 
you could have a completely open example system based on the concepts, you'd 
need to have a different implementation for practical application to prevent 
a correlation attack.)

It's probably worth noting that this is actually quite simple, at least to 
prototype... (he says, thinking about it...)

But then of course Eve moves on to a different attack, such as decompilation, 
reverse engineering, or running on specialised emulation systems or with 
compromisable output systems. (cf sound cards which can provide you with a 
digital capture of what they're going to play.)

OK,  I think that's a _reasonably_ complete description (from one perspective)
of the issues involved.

If you disagree with me, I _am_ genuinely interested in hearing how you
think you can prevent an attacker from accessing the keys they need in order
to use the decryption algorithm that's protecting the content, such that the
key can't be captured and such that the decrypted data can't be siphoned off,
when I have complete access to the source code and can recompile my own
interoperable player.

Best Regards,


Michael.
--
  (actually, some of these comments are getting a bit long, maybe I should put
   them on my blog instead... :)
-
Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, please 
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.  
Unofficial list archive: http://www.mail-archive.com/[email protected]/
