Great paper by Borisov, Danezis, and Goldberg: http://cacr.uwaterloo.ca/techreports/2014/cacr2014-10.pdf
I'm going to attempt a summary, it's not very good but it might help read the paper to know where it's going: Imagine you had an unobservable chat system; maybe it runs over Tor so an observer can't tell who you're talking to. You'd also like a "presence system" that informs you when your friends are available to chat. But you don't want to reveal to the presence system your list of friends, or when you're online. At a high level, this could be done if: - Each pair of friends share symmetric keys. - When users are online, they anonymously publish encrypted status messages to each of their friends through some "Private Information Retrieval" (PIR) system, tagging these messages with an ID derived from the pairwise symmetric key and the current time period (or "epoch"). - Users frequently query the PIR system to retrieve the presence messages being sent from their friends. By definition of "PIR", queries don't reveal which element is being queried, thus preserving privacy. DP5 show how to make this real by: (1) Creating a practical PIR system for lookup of messages based on their ID (i.e. a key:value store). (2) Reducing traffic volume by using pairwise-encrypted messages to distribute signature-verification keys for signed-and-broadcast messages (long-term vs short-term "epochs") (3) Using some sort of unlinkable signature scheme so that your signed messages in each short-term epoch can be retrieved by your friends but can't be linked by the server. PIR ---- Section 4.3 discusses a practical and implemented multi-server PIR system. (Recall that multi-server PIR means the client queries several servers who each have the same database. None of the servers learn which entry is being queried as long as one server is "honest" and doesn't collude with the others. Single-server PIR would be better, since it doesn't require an honest server, but multi-server PIR is more practical). Building from a simple Chor / Pynchon Gate-style multi-server PIR, there's discussion of adding robustness against misbehaving servers, then using the system as a hash table for (key:value) pairs. Long-term / Short-term epochs ---- It's inefficient to publish a separately-encrypted message to each of your friends every few minutes. Instead you could send each of them your signature-verification key and a symmetric key, and then publish a *single* signed-and-encrypted message, in each time period, that all your friends can read. But this loses the ability to revoke friends, so DP5 strikes a balance - users send pairwise-encrypted messages to each friend perhaps once a day. These messages establish the signature-verification and symmetric keys for a "long-term epoch". Within the long-term epoch, users publish signed-and-encrypted messages for each "short-term epoch". Using pairwise messages to establish keys for broadcast messages has been proposed in other systems (mpOTR, [GRP]), and seems like a good pattern. But... Signatures ---- Some trickiness creeps in. The threat model here gets complicated, but one goal is that you don't want people to interfere with your messages by publishing their own messages with colliding IDs. Since the long-term epoch messages are based on pairwise symmetric keys, this is easy: you derive each message ID by hashing the shared key and epoch number. But the short-term messages are being broadcast to all your friends, so you can't rely on a symmetric-key mechanism as any of your friends could forge it. And you can't use the signature-verification key as an ID either, as this would link your presence messages across short-term epochs, so the server could see when you are online (even if it can't tell exactly who you are). The paper has two proposals: (A) Pairing-based signatures. Roughly: If g is a Diffie-Hellman generator, given g^a and g^b you shouldn't be able to compute g^ab. But suppose g generates a special DH group, such that h is a generator of some other group, and you can compute the "pairing" function e(g^a, g^b) = h^ab. Then you can do things like BLS signatures [BLS] where you sign message m with private-key x by calculating m^x. Verification uses public key g^x and checks that e(g, m^x) = e(g^x, m). By setting m to the epoch number you can sign this number to get m^x, send the signature to the server, and it will use e(g, m^x) as the message ID. Your friends can calculate the message ID via your public-key and epoch number = e(g^x, m), but they can't construct interfering messages for new epochs, as that would require a signature forgery. The downside of this is that it only signs the epoch number, so the signature authenticates your presence but not any "additional data" you're sending in the message, which is only symmetrically encrypted/authenticated. (B) The alternative in Appendix A is to use a conventional discrete-log signature (Ed25519 is suggested), but derive a "variant of the public key" for which the signer can derive a corresponding variant private key. The variant public key is then used as the message ID, which allows the message to be signed. The use of more normal cryptography and the more complete signature makes (B) seem appealing to me, but I don't fully understand this tradeoff. Anyways, there's a lot more to this, so read the paper! Trevor [GRP] https://whispersystems.org/blog/private-groups/ [BLS] http://www.iacr.org/archive/asiacrypt2001/22480516.pdf _______________________________________________ Messaging mailing list [email protected] https://moderncrypto.org/mailman/listinfo/messaging
