John Dennis wrote:
> The Problem:
> ------------
>
> I've been looking at the encoding exception which is being thrown when
> you click on the "Services" menu item in our current implementation.
> By default we seem to be using JSON as our RPC mechanism. The
> exception is being thrown when the JSON encoder hits a certificate.
> Recall that we store certificates in LDAP as binary data and in our
> implementation we distinguish binary data from text by Python object
> type: text is *always* a unicode object and binary data is *always* a
> str object. However, in Python 2.x str objects are believed to be text
> and are subject to encoding/decoding in many parts of the Python world.
>
> Unlike XML-RPC, JSON does *not* have a binary type. In JSON there are
> *only* unicode strings. So what is happening is that when the
> JSON encoder sees our certificate data in a str object it says "str
> objects are text and we have to produce a UTF-8 unicode encoding from
> that str object". There's the problem! It's completely nonsensical to
> try and encode binary data to UTF-8.
>
> The right way to handle this is to encode the binary data to base64
> ASCII text and then hand it to JSON. FWIW our XML-RPC handler does
> this already because XML-RPC knows about binary data and elects to
> encode/decode it to base64 as it's marshaled and unmarshaled. But JSON
> can't do this during marshaling and unmarshaling because the JSON
> protocol has no concept of binary data.
>
> The Python JSON encoder class does give us the option to hook into the
> encoder and check if the object is a str object and then base64
> encode. But that doesn't help us at the opposite end. How would we
> know when unmarshaling that a given string is supposed to be base64
> decoded back into binary data? We could prepend a special string and
> hope that string never gets used by normal text (yuck).
> Keeping a list of what needs base64 decoding is not an option within
> JSON because at the time of decoding we have no information available
> about the context of the JSON objects.
>
> That means if we want to use JSON we really should push the base64
> encode/decode to the parts of the code which have a priori knowledge
> about the objects they're pushing through the command interface. This
> would mean any command which passes a certificate should base64 encode
> it prior to sending it and base64 decode it after it comes back from a
> command result. Actually it would be preferable to use PEM encoding,
> and by the way, the whole reason why PEM encoding for certificates
> was developed was exactly for this scenario: transporting a
> certificate through a text based interchange mechanism!
>
> Possible Solutions:
> -------------------
>
> As I see it we have these options in front of us for how to deal with
> this problem:
>
> * Drop support for JSON, only use XML-RPC
>
> * Once we read a certificate from LDAP immediately convert it to PEM
> format. Adopt the convention that anytime we exchange certificates it
> will be in PEM format. Only convert from PEM format when the target
> demands binary (e.g. storing it in LDAP, passing it to a library
> expecting DER encoded data, etc.).
>
> * Come up with some hacky protocol on top of JSON which signals "this
> string is really binary" and check for it on every JSON encode/decode
> and cross our fingers no one tries to send a legitimate string which
> would trigger the encode/decode.
>
> Question: Are certificates the one and only example of binary data we
> exchange?
>
> Recommendation:
> ---------------
>
> My personal recommendation is we adopt the convention that
> certificates are always PEM encoded. We've already run into many
> problems trying to deduce what format a certificate is in (e.g.
> binary, base64, PEM). I think it would be good if we just put a stake
> in the ground and said "certificates are always PEM encoded" and be
> done with all these problems we keep having with the data type of
> certificates.
>
> As an aside I'm also skeptical of the robustness of allowing binary
> data at all in our implementation. Trying to support binary data has
> been nothing but a headache and a source of many, many bugs. Do we
> really need it?
>

Yeah, a good Friday afternoon problem to solve...

+1 to your recommendations. I am not a specialist, but the suggestion
seems logical.
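
To make the "always PEM" convention concrete, here is a rough sketch of
what the round trip through JSON could look like. It is only an
illustration, not our actual code: the helper names der_to_pem and
pem_to_der, the "usercertificate" key and the fake certificate bytes are
all made up for the example, and it is written against Python 3, where
bytes and str are already distinct types (there the JSON encoder rejects
raw bytes with a TypeError rather than the decoding error we see on
Python 2.x).

```python
import base64
import json

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der):
    """Hypothetical helper: wrap binary certificate bytes in PEM armor."""
    b64 = base64.b64encode(der).decode("ascii")
    # PEM bodies are conventionally split into 64-character lines.
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER] + lines + [PEM_FOOTER]) + "\n"


def pem_to_der(pem):
    """Hypothetical helper: strip PEM armor and return the binary bytes."""
    text = pem.strip()
    if not (text.startswith(PEM_HEADER) and text.endswith(PEM_FOOTER)):
        raise ValueError("not a PEM encoded certificate")
    body = text[len(PEM_HEADER):-len(PEM_FOOTER)]
    # Remove the line breaks before base64 decoding.
    return base64.b64decode("".join(body.split()))


# Stand-in for bytes read from LDAP; this is NOT a real certificate.
fake_der = bytes(bytearray(range(256)))

# Handing the raw binary straight to JSON fails, as described above.
try:
    json.dumps({"usercertificate": fake_der})
    raw_bytes_rejected = False
except TypeError:
    raw_bytes_rejected = True

# Sender: the command knows this value is a certificate, so it sends PEM.
wire = json.dumps({"usercertificate": der_to_pem(fake_der)})

# Receiver: only code that needs binary (e.g. storing back into LDAP)
# converts from PEM, using the same a priori knowledge of the field.
received = json.loads(wire)
round_tripped = pem_to_der(received["usercertificate"])
```

The point of the sketch is that the JSON layer never has to guess: it
only ever sees text, and the conversion happens in the code that already
knows the value is a certificate.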

--
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.

_______________________________________________
Freeipa-devel mailing list
https://www.redhat.com/mailman/listinfo/freeipa-devel
