Thanks Mike. I'll try to answer the parts that I can here.

> 1) Why MessagePack vs other binary serialisation formats? (i tend to use
> protobufs)

The honest answer is that we're not very familiar with protobufs. Here
are some things we like about MessagePack, though, where I'd be curious
to learn more about how protobufs handles them:

- It's really easy to implement.
- It's self-delimiting, so we don't need to do anything special to parse
  a stream of payload packets.
- If we avoid maps (because of questions around how duplicate keys and
  non-hashable keys get handled), we can be pretty sure that all
  implementations of MessagePack will parse the same blob in the same
  way. Though there's a good chance there are more problems here I
  haven't thought of...

> 2) When staying in binary, what sort of overhead does the format impose?

The biggest part of the header is the keys in the recipients list: with
packing overhead and all, that comes out to 52 bytes (anonymous) or 85
bytes (visible) per recipient. The rest of the header is ~100 bytes. In
the payload of the message, for every 1 MB payload chunk, we use 34
bytes per recipient to authenticate it, and another ~20 bytes of
per-packet overhead. Also, every message includes an empty payload
packet at the end, which is authenticated like the others.

Examples: A 1-byte message, encrypted to a single anonymous recipient,
is 262 bytes. With ten visible recipients, it's 1673 bytes. Increasing
the plaintext length doesn't add any more overhead, though, up to 1 MB.

So far we've preferred ease of implementation and sticking to NaCl's
defaults and high-level interfaces over optimizing for the size of small
messages. Here are some optimizations that so far we've deliberately
avoided:

- Defining additional non-streaming modes. We could avoid authenticating
  the empty payload packet at the end, if there were separate encryption
  and signing modes that specified a single payload packet.
- Using smaller keys. Truncating the payload key and the HMACs to 128
  bits (or using crypto_onetimeauth instead) would save a lot.
  I read that Curve25519 has 128 bits of security, so it's possible that
  our 256-bit keys/authenticators are overkill, even apart from the
  question of whether 256 bits is overkill in general.
- Using crypto_stream_xor instead of crypto_secretbox. In the places
  where we use crypto_secretbox, the 16-byte authenticator it prepends
  is redundant. (Though note that many languages, including I think Ruby
  and JS, don't have NaCl/libsodium wrappers that expose low-level
  functions like crypto_stream_xor.)

> 3) If you imagine a mix network for routing of small binary messages, is
> saltpack an appropriate format to use for protecting the messages in your
> estimation? Or are there gotchas that its replacement-for-pgp design would
> create for the case of pure machine-to-machine messaging?

I don't know anything about mix networks -- would you ever have multiple
recipients in a mix network? -- but I can point out at least a couple of
anonymity issues that could come up, in addition to what Jeff pointed
out. Trevor caught one of them in this thread: we leak when two
recipients have the same key, and we need to fix that by changing our
nonces. Another concern is that even when recipients are anonymous, the
*number* of recipients is visible. If for example a message has 37
recipient keys, and I'm the only person in the world who owns exactly 37
laptops, that could identify me.

Another general downside of using saltpack for machine-to-machine
messaging is that if you want ephemeral recipient keys for forward
secrecy, you'll have to arrange that outside the format. It might make
more sense to use a protocol that was designed for having both parties
online?

> 4) MIME type? Could you maybe forbid/strongly discourage in the spec emails
> that contain ASCII armoured saltpack messages?
> I think some clients have
> struggled in the past with the UI for showing a message that contains
> partially signed and partially not signed text, as they tend to treat the
> signedness of a message as a boolean. Formally forbidding mixing of the two
> can solve that.

The main use case we wanted saltpack for is the problematic one you're
describing, where I paste an encrypted message into GChat or reddit or
email, and the recipient decrypts it by hand. So we don't want to forbid
that. Maybe we could suggest that implementations using saltpack for
something other than decryption-by-a-human-being should tweak the format
name and the nonces to avoid compatibility with the regular format?
Also, like above, I'd worry that applications using saltpack in an
automated, online way might be missing out on forward secrecy.

> 5) The format appears to be at least partly defined through unversioned
> reference to a particular library (NaCL). In particular it does not specify
> what a "NaCL public key" actually is (curve25519 presumably). That seems like
> it should be fixed for a realistic spec.

Good point. Would explicitly referring to
http://cr.yp.to/highspeed/naclcrypto-20090310.pdf be a reasonable way to
do that?

> 6) It'd be nice if there was a way to embed X.509 cert chains (i.e. signed
> curve25519 certificate) into the headers, to allow the sender to authenticate
> themselves with a PKI instead of Keybase. Then it could act as a competitor
> to CMS.

We designed saltpack with the assumption that the client implementation
would handle the heavy lifting to figure out what real-world identity
corresponds to a given public key. In our implementation we use all the
Keybase machinery around sigchains and public proofs, but the saltpack
format itself doesn't know anything about that. Would it be reasonable
for a different implementation to do something similar with X.509? Are
there attacks that could come up if the cert chain isn't embedded
directly in the message?
Or is the idea more that you could verify a sender without talking to
any servers?

- Jack

_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging
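
For reference, the overhead figures given in the answer to question 2 can be turned into a rough size estimate. This is only a sketch under stated assumptions, not part of saltpack: the function name estimate_size and the constant names are hypothetical, the header and per-packet figures are the approximate ("~") values quoted in the message, and "1 MB" is assumed to mean 10**6 bytes.

```python
# Rough size estimate for a binary saltpack encrypted message, built only
# from the approximate figures quoted in the message above.
# All names here are hypothetical and chosen for illustration.

HEADER_BASE = 100          # "~100 bytes" for the header minus recipients (approximate)
RECIPIENT_ANONYMOUS = 52   # header bytes per anonymous recipient
RECIPIENT_VISIBLE = 85     # header bytes per visible recipient
AUTH_PER_RECIPIENT = 34    # authenticator bytes per recipient, per payload packet
PACKET_OVERHEAD = 20       # "~20 bytes" of per-packet overhead (approximate)
CHUNK_SIZE = 1_000_000     # assumption: "1 MB" read as 10**6 bytes


def estimate_size(plaintext_len, recipients, visible=False):
    """Return an approximate encrypted size in bytes."""
    per_recipient = RECIPIENT_VISIBLE if visible else RECIPIENT_ANONYMOUS
    header = HEADER_BASE + per_recipient * recipients

    # Number of non-empty payload packets (ceiling division), plus the
    # empty final packet that every message carries.
    data_packets = -(-plaintext_len // CHUNK_SIZE)
    packets = data_packets + 1

    per_packet = AUTH_PER_RECIPIENT * recipients + PACKET_OVERHEAD
    return header + packets * per_packet + plaintext_len


if __name__ == "__main__":
    print(estimate_size(1, 1))                 # one anonymous recipient
    print(estimate_size(1, 10, visible=True))  # ten visible recipients
```

With these rounded inputs the estimates come out at 261 and 1671 bytes, within a couple of bytes of the 262 and 1673 quoted in the message; the gap is presumably down to the "~" figures.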
