Re: [tor-dev] GSoC: Tor Messenger CONIKS integration

Marcela Melara Thu, 10 Mar 2016 17:56:38 -0800

Hi Elias,

> On Mar 9, 2016, at 5:31 AM, Elias Rohrer <[email protected]> wrote:
> 
> Hello Marcela!
> Nice to meet you, too, and thank you for that interesting Paper!
> It seems I made a lot of wrong assumptions in my last mail, ugh, thats kind 
> of embarrassing..


I really appreciate your interest in our paper and CONIKS, thanks :) No 
worries, there are a lot of subtle aspects about CONIKS, so it can take some 
time to fully see and understand them.

>> Yes, we propose that the identity providers can act as auditors and 
>> cross-verify each other in this way, and clients may query any auditor when 
>> they are checking for equivocation. However, I should note that we intended 
>> this to be the *ideal deployment* (and I apologize if this doesn’t come 
>> through clearly enough in the paper). CONIKS also allows third-party 
>> auditors whose sole job it is to check that the chain of signed tree roots 
>> is linear. In practice, we expect that auditors and identity providers will 
>> be separate in most cases (at least until there is a standard for CONIKS 
>> protocols and data formats, but we’re not there yet).
>> 
>> It’s not accurate to conflate auditing with being a key server, or say that 
>> auditing is a subset of the key server’s functionality. Auditors only need 
>> to keep records of identity providers’ signed tree roots, and they never 
>> download or record any key directory contents, so it doesn’t really make 
>> sense to equip auditors with key server functionality and “turn it off” when 
>> you’re not acting as a key server.  Key servers have the more complex and 
>> rich functionality needing to maintain significantly larger data structures 
>> and perform more cryptographic computations. This means that key servers 
>> have very different space and availability requirements than auditors do.
>> 
> Ah, yes, that is true. Just turning of features of the server to get an 
> auditor might indeed be a bad idea. But wouldn't there be a lot of 
> overlapping functionality in the server, auditor and even the clients? E.g. 
> the crypto primitives for building and checking the tree? Maybe it would be 
> viable to create a libCONIKS which could be used by all three components?

So it might seem at first glance that there’s a lot of overlapping 
functionality between the server, auditor and clients, but once you start 
implementing each one, it’s not really the case. Think about the different 
tasks of each component: the server is generating a bunch of digital signatures 
(e.g. signed tree roots, lookup indices) and hashes (Merkle tree, lookup 
indices, signed tree roots) to build the Merkle trees and hash chain of signed 
tree roots. Clients, on the other hand, do a bunch of signature and hash 
verifications (which, yes to be fair requires hashing data as well, but for 
different reasons) to verify individual authentication paths and signed tree 
roots. So there’s really no overlap in functionality with the server. 
Similarly, the auditors need to verify the signed tree roots so there’s some 
signature and hash verification as well, but in a different way than clients 
need to do this. 

While generation and verification of these various crypto primitives go hand in 
hand, I would caution against putting them all in a single library if different 
components of the system will need different portions. If you look at the 
reference implementation, the only common code is the definition of the data 
formats. When I first started writing the reference implementation, I actually 
thought that it would be useful (and possible) to build a single libCONIKS, but 
sharing underlying crypto primitives isn’t reason enough to put all of the 
functionality in one place since the application of those primitives is what 
really matters. So I really think it makes more sense to make separate modules 
for server, client and auditor.

>>> On Page 4 the authors state:
>>> "We can implement a VUF using any deterministic, existentially unforgeable 
>>> signature scheme [42]. The signature scheme must be deterministic or else 
>>> (in our application) the identity provider could insert multiple bindings 
>>> for a user at different locations each with a valid authentication path. In 
>>> practice, this could be a classic RSA signature [51] (using a deterministic 
>>> padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], 
>>> which provide the best space efficiency. Many common discrete-log based 
>>> signature schemes such as DSA [30] are not immediately applicable as they 
>>> require a random nonce.
>>> In CONIKS, we generate the index for a user u as: i = h (sign_K (u))
>>> where h() is a one-way hash function and sign() is a deterministic, 
>>> existentially unforgeable signature scheme. The extra hash function is 
>>> added because some bits of sign_K (u) might otherwise leak through binding 
>>> validity proofs, and signature schemes are generally not unforgeable given 
>>> some bits of the valid signature. The hash function prevents this by 
>>> ensuring the final lookup index leaks no information about the value of 
>>> sign_K (u).”
>>> 
>>> I understand that using the signature of the user identity ensures that no 
>>> information indicating its existence on the server is leaked. But why are 
>>> they hashing the signature again? RSA-signatures normally already consist 
>>> of a signed hash-value. Why would it bad to leak bits of the signature?
>> 
>> The answer to this question is in the portion of the paper you quote: 
>> "signature schemes are generally not unforgeable given some bits of the 
>> valid signature.” There’s some crypto-jargon here, which you may not be 
>> familiar with if you haven't taken a crypto course. What this sentence is 
>> trying to say is that, given some bits of a valid signature, an adversary 
>> could be able to find some message or data and a corresponding digital 
>> signature which is valid and which has not been generated by the legitimate 
>> signer (in this case the identity provider). This is called an existential 
>> forgery, and it means that an adversary can create forged lookup indices if 
>> the signature isn’t hashed. The hash on top of the signature also ensures 
>> that the lookup index can’t be easily computed from the VUF.
>> 
> I actually had a crypto course some time ago, so crypto jargon should be 
> fine.. However it seems that part of my brain needs some more training to get 
> reactivated ;)
> I want to get this into my brain so here are some more (probably pretty 
> naive) questions and thoughts on the topic:
> - When bits of a valid signature leak, an external attacker could forge a 
> colliding index for a different user? This would allow an attacker to 
> register the same public key to a different username they control? So 
> therefore a hash is used?

So the problem with leaked bits isn’t about an attacker forging an index. The 
problem is actually about privacy. Let’s go back and think about why the server 
transforms usernames into these lookup indices. The main purpose is so that the 
proofs of inclusion (i.e. the Merkle authentication paths) for one user don’t 
leak information about *any other* users in the tree. So if I receive the proof 
of inclusion for Alice, I should not be able to figure out who else might be in 
the tree. Remember that the proof of inclusion gives us the prefix of Alice’s 
lookup index, so in other words, we want to make sure that an attacker can’t 
find a name that results in an index with a shared prefix with Alice since this 
would be a privacy violation. If you use just a hash function, an attacker can 
reasonably hash a large set of usernames until it finds one that shares its 
prefix with Alice's index and determine whether this user exists by querying 
the server for the corresponding key. 

Using only the digital signature doesn’t solve this problem since there are 
attacks that allow you to recover the rest of the signature bits given some of 
the first bits (I don’t know which specific attacks do this, this is the 
explanation I received from my collaborator). So given Alice’s index prefix, an 
attacker could try to guess if the index for a specific user (e.g. Bob) shares 
the same prefix, recover the remaining bits of the signature, and verify — 
using the provider's index public verification key -- whether this is a valid 
signature for Bob’s name. This is also a privacy violation. So by hashing the 
signature, anyone who receives the authentication path for Alice cannot 
determine based on this path, which other usernames exist in the tree since 
this would require finding the signatures whose hash shares a prefix with 
Alice’s index, a much more difficult problem.


> - What still confuses me: When I recall correctly, 'existential forgery' is 
> the lowest level of forgery since the attacker may not have control over the 
> contents of the signed message, but may forge a valid signature for that 
> message.
> You propose to use a existentially unforgeable signature scheme for your VUF. 
> But:
>       - As you explicitly use an existentially unforgeable signature scheme, 
> why can the index then be forged if bits of this (existentially unforgeable) 
> signature are leaked to an attacker? Is the hash needed when an external 
> attacker has no knowledge of the private key used to sign the index?
>       - You later propose RSA for the VUF, but isn't RSA *the* example for 
> which existential forgery can be done?

We realized the potential dangers of using RSA, so we designed a different VUF 
scheme. I think it’s best if you read the new paper to see the description.


>       - So am I right to assume that the attacker model shifts from an 
> external one to a malicious identity provider? As he knows his private key, 
> the signed data and the signature, what is the hash function hiding here?

See my explanation of the VUF above. CONIKS tries to protect against external 
*and* internal attackers. As I explained above, the hash function in the 
private lookup index is used to protect users’ privacy from external attackers. 
Like you said, the key server knows can know who all has name-to-key mappings 
in its own key directory. CONIKS deals with the internal attackers (i.e. 
malicious or compromised key server) using the tamper-evident data structure 
and signed tree roots to leave evidence of fake keys and equivocation.


>>> Also, if I see this correctly, they don't do it in their reference 
>>> implementation 
>>> (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java
>>>  
>>> <https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java><https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java
>>>  
>>> <https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java>>),
>>>  instead they just use SHA256withRSA...
>>> So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which 
>>> would be the normal 'hash and sign' operation of RSA? Or does it make sense 
>>> to rehash the signature and their own implementation is not consistent with 
>>> it?
>>> This is probably no important point, however I want to make sure I'm not 
>>> missing a piece concerning the security of the user identities.
>> 
>> Unfortunately, the reference implementation is still a work in progress 
>> (although I’ve had to take some time off from working on it due to other 
>> research projects), and is missing several features described in the paper. 
>> The SignatureOps class in the reference implementation simply implements 
>> wrappers around Java signature functions, so that’s not where the user 
>> lookup index calculation happens. The lookup index calculation happens in 
>> the unameToIndex() function in conks_server.ServerUtils. Now, I will note 
>> that the reference implementation currently *isn’t* doing the proper index 
>> calculation as it only currently hashes the username, so thank you for 
>> making me go back and double-check my code! But it would be relative 
>> straight-forward to change this and to add the digital signature computation 
>> that needs to be done before the hashing.
> Ok, good to know that, I'll keep it in mind when reading the reference code! 
> However, this is embarrassing, I only had a glance an the reference 
> implementation, sorry for linking to an unrelated spot in the code and making 
> assumptions...
> 
> I think my next steps will be to read the peer-reviewed version of your Paper 
> and to refresh my crypto-knowledge a bit. I'd appreciate if you had some 
> pointers to resources relevant to this project for me (other than classic 
> textbooks). I may even try to have a look at the Micali/Rabin paper on VRFs. 

As for good pointers for this project, I’m not sure that the Micali/Rabin paper 
on VRFs specifically is really going to be all that helpful for implementing 
CONIKS because the paper describes VRFs at a level of detail that isn’t needed 
to understand why it may be better to use an EC-based VUF construction than 
RSA. I apologize about the confusion arising from the older version of our paper

If you’re interested in the very formal crypto details of VRFs, you should read 
the paper, but for this project it’s more important to understand the security 
and privacy problems that CONIKS tries to solve and how it achieves those 
goals, and to understand how CONIKS fits into the workflow of the existing Tor 
Messenger service. Hopefully, our Usenix paper will explain most of the 
security and privacy design decisions, but definitely feel free to ask more 
questions! Another good resource might be to read about equivocation and fork 
consistency (David Mazieres’ paper on SUNDR first introduced this concept), and 
certificate transparency (there’s a good ACM Queue article) for more background 
on the problems CONIKS tries to solve and the design decisions we made. 
Goldberg et al’s paper on preventing DNS zone enumeration is a fairly recent 
paper that also shows how to use VUFs to prevent enumeration of sensitive 
information, so that might be another good resource. For figuring out how 
CONIKS can be integrated into Tor Messenger, I would look at the portions of 
the Tor Messenger code that currently handle encryption keys (e.g. key 
generation, registration, key lookups), and come up with a plan of how CONIKS 
will affect the existing workflow.

I hope this helps!
Best,
Marcela

_______________________________________________
tor-dev mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

Re: [tor-dev] GSoC: Tor Messenger CONIKS integration

Reply via email to