Hi Brett,

The problem as I see it is that we're pushing more responsibility onto the Subscribers while simultaneously assuming all Hubs will use SSL/TLS correctly. While I wish I had more faith in my fellow programmers, I don't. I do, after all, come from PHP, where the SSL context of PHP Streams (a popular alternative to curl) does not verify SSL certificates by default. Hard to keep the faith alive over here ;). I can also cite a very long list of clients/libraries which, if not using PHP Streams, will disable cert verification anyway through curl options or similar (a sketch of both footguns follows at the end of this mail). Apparently, certificates make testing really, really hard. I wish I were joking, but that is a common explanation, despite the obvious fact that end users are presumably not guinea pigs. We hope.

My point, if anything, is that we are dividing what we protect and subjecting all of it to a defence which is already implemented quite poorly in practice. On the one hand we have a signature for request bodies, and on the other hand nothing for headers. If the SSL/TLS protection held true, why rely on body signatures at all? That would put us on a footing with OAuth 2.0's bearer token and the concepts behind it. Of course, we would then end up with all the same problems. The problem with this approach is exactly what I opened with: it requires clients to obey the rules. Not all of them will. Those with sufficient security oversight will get it perfectly right, and then get completely outnumbered by the little Hubs/Subscribers springing up which value a valid response over dealing with SSL exceptions.

In this scenario, SSL/TLS becomes a single point of failure in a system where failure is not only common but sometimes encouraged. In a network of Hubs and Subscribers where parties may fail to verify certs, MITM attacks will be possible. Extending from that, not signing ALL components of the request will allow requests to be manipulated without invalidating the signature of whatever subset of the data is signed. Further, and it's a problem in the existing scheme too, replay attacks will remain possible until a non-repeating (within a reasonable time horizon) nonce value is introduced into the signing method. Otherwise you're not only subject to replay attacks, but also to attempts to craft a validly signed request via remote timing attacks (the second sketch below shows one way a subscriber could defend against both).

SSL/TLS would be perfect if not for all the programmers in the world happy to ignore it. That's why signatures remain compelling. Yes, they are a pain in the ass to develop. Yes, they can lead to incompatibilities between large poster-child implementations. Yes, programmers hate them with a vengeance. Why? Because programmers can't ignore them by setting curl's peer verification to false. They actually have to go and deal with them properly or nothing will work. In a sense, they are supposed to be a PITA.
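Since I've called PHP out by name, here is what both footguns look like in practice. A minimal sketch only, against a hypothetical hub at hub.example.com, and not aimed at any particular library:

    <?php
    // PHP Streams: peer verification is off unless you explicitly opt in,
    // so this request will accept any certificate a MITM cares to present:
    $body = file_get_contents('https://hub.example.com/notify');

    // The safe version has to be spelled out by hand:
    $context = stream_context_create(array(
        'ssl' => array(
            'verify_peer' => true,
            'cafile'      => '/etc/ssl/certs/ca-certificates.crt',
        ),
    ));
    $body = file_get_contents('https://hub.example.com/notify', false, $context);

    // And the curl "fix" that litters real-world libraries whenever a
    // certificate error makes testing inconvenient:
    $ch = curl_init('https://hub.example.com/notify');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // "works" everywhere now
    $body = curl_exec($ch);

Neither version raises so much as a warning, which is the whole problem.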
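And to make the nonce and timing-attack points concrete, a second sketch of what subscriber-side verification could look like. The 'sha1=' HMAC of the raw body is what X-Hub-Signature already means today; the X-Hub-Nonce header, folding it into the signed material, and the replay window are the hypothetical extensions I'm arguing for, and hash_equals (PHP >= 5.6) stands in for any constant-time comparison:

    <?php
    // Verify a content notification. $seenNonces maps nonce => timestamp
    // for every nonce accepted within the replay window.
    function verify_notification($rawBody, array $headers, $secret, array &$seenNonces)
    {
        $nonce = isset($headers['X-Hub-Nonce']) ? $headers['X-Hub-Nonce'] : '';
        $given = isset($headers['X-Hub-Signature']) ? $headers['X-Hub-Signature'] : '';

        // Replay defence: the nonce must be present and must not repeat
        // within a reasonable time horizon.
        if ($nonce === '' || isset($seenNonces[$nonce])) {
            return false;
        }

        // Fold the nonce into the signed material, so stripping or
        // swapping it invalidates the signature outright.
        $expected = 'sha1=' . hash_hmac('sha1', $nonce . "\n" . $rawBody, $secret);

        // Timing-attack defence: compare in constant time. A plain ===
        // can leak, byte by byte, how much of a forged signature matched.
        if (!hash_equals($expected, $given)) {
            return false;
        }

        $seenNonces[$nonce] = time();
        return true;
    }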
My 2c for what it's worth.

Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team

________________________________
From: Brett Slatkin <[email protected]>
To: pubsubhubbub <[email protected]>
Cc: Monica Keller <[email protected]>; Joseph Smarr <[email protected]>; [email protected]
Sent: Thu, October 7, 2010 7:32:03 AM
Subject: [pubsubhubbub] Solving the "turduckin" problem for PuSHing arbitrary content types

Hey all,

Today one of the GitHub guys released a new Node.js library for PuSH. His call for JSON support in the protocol is clear: http://techno-weenie.net/2010/10/5/nub-nub/

We've been wanting to add JSON support for quite a while, with notable contributions from Mart (http://martin.atkins.me.uk/specs/pubsubhubbub-json) and others. A while ago Monica wrote up this proposal about how to support arbitrary content types in PubSubHubbub: http://code.google.com/p/pubsubhubbub/wiki/ArbitraryContentTypes

I worry about option 2, translation to JSON, because I think it dictates what format the JSON payload needs to be in when served by publishers. A big benefit of JSON is making things easier to use and more ad hoc. Dictating the packaging format would harm that. I also don't see Facebook changing their JSON API (http://developers.facebook.com/docs/api/realtime) to match, and they shouldn't have to.

The core issue with option 1, the REST approach, is security (replay attacks). But I believe I've finally cracked the nut! To explain:

Feeds are good formats because they are self-describing. Update times and IDs are part of the feed body and individual entries, enabling idempotent synchronization and race-condition tie-breaks. This also means the 'X-Hub-Signature' on the body of PuSH new content notifications is sufficient for security/verification, because we can safely ignore everything besides the body.

With the REST approach to arbitrary content types, we *need* to represent the HTTP headers in the new content notifications or else we'll have no idea what order the messages came in (Date), etc. And that's the security problem. If we rely on the headers, then we also need to verify them (to prevent replay attacks). But X-Hub-Signature only signs the body, so we're stuck. Some folks have discussed signing headers too, similar to OAuth 1.0, but that led to a lot of pain nobody wants to repeat. Others have talked about putting headers into the body (using MIME multipart), but that's just another world of hurt -- the so-called Turduckin problem (thanks jsmarr for the name).

The solution is to borrow from the OAuth2 playbook: treat the X-Hub-Signature like a password/bearer token and require HTTPS for callbacks. That means for security we have these components:

1. hub.secret, used by the subscriber to verify the hub is authorized to post content (after delivery)
2. The subscriber's SSL cert, used by the hub to verify the authenticity of the subscriber endpoint (before delivery)
3. The SSL connection between hub and subscriber is encrypted, protecting the header values and preventing replay attacks (during delivery)

Thus, Monica's proposal as-is does the trick. All that's left to work out are details around verbs and Link headers.

I hope to bring everyone back together to build out a spec around this proposal now that I think the security issue has been solved. Hopefully we can convince GitHub to be the first implementors of it.

Let me know what you think!

-Brett
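For concreteness, here is roughly what Brett's three components reduce to at a subscriber's HTTPS callback. A sketch only: the endpoint logic, secret handling, and environment names are assumptions, and only the 'sha1=' form of X-Hub-Signature comes from the spec:

    <?php
    // Subscriber callback under the bearer-token proposal: HTTPS is
    // required, and X-Hub-Signature acts as the hub's credential.

    // Under the proposal, plain-HTTP callbacks simply aren't offered,
    // so refuse anything that arrives outside TLS.
    if (empty($_SERVER['HTTPS']) || $_SERVER['HTTPS'] === 'off') {
        header('HTTP/1.1 403 Forbidden');
        exit;
    }

    $secret  = getenv('HUB_SECRET');             // hub.secret, shared at subscription time
    $rawBody = file_get_contents('php://input'); // verify the raw bytes as delivered
    $given   = isset($_SERVER['HTTP_X_HUB_SIGNATURE'])
             ? $_SERVER['HTTP_X_HUB_SIGNATURE'] : '';

    // Component 1: hub.secret lets the subscriber check, after delivery,
    // that the hub was authorized to post this content. The spec has
    // subscribers acknowledge even a bad signature and quietly drop it.
    $expected = 'sha1=' . hash_hmac('sha1', $rawBody, $secret);
    if (!hash_equals($expected, $given)) {
        header('HTTP/1.1 200 OK');
        exit; // acknowledged, but discarded unprocessed
    }

    // Components 2 and 3 (the subscriber's cert and the encrypted channel)
    // live entirely in the TLS layer -- the very dependency debated above.
    // Only past this point can headers such as Date be trusted.
    header('HTTP/1.1 200 OK');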
