John Dennis wrote:
The Problem:
I've been looking at the encoding exception which is being thrown when
you click on the Services menu item in our current implementation.
By default we seem to be using JSON as our RPC mechanism. The
exception is being thrown when the JSON encoder hits a certificate.
Recall that we store certificates in LDAP as binary data and in our
implementation we distinguish binary data from text by Python object
type, text is *always* a unicode object and binary data is *always* a
str object. However, in Python 2.x str objects are assumed to be text
and are subject to encoding/decoding in many parts of the Python world.
Unlike XML-RPC, JSON does *not* have a binary type. In JSON there are
*only* unicode strings. So what is happening is that when the
JSON encoder sees our certificate data in a str object it assumes str
objects are text and tries to interpret the bytes as UTF-8 in order to
produce a unicode string. There's the problem! It's completely
nonsensical to try and treat binary data as UTF-8 text.
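
To make the failure concrete, here is a minimal sketch. It is written for Python 3, where the bytes/str split is explicit (our code runs on Python 2.x, where the same bytes live in a str object and the exception type differs). The sample bytes are made up to resemble the start of a DER structure, they are not a real certificate, and the "usercertificate" key is only illustrative.

```python
import json

# Made-up sample resembling the start of a DER SEQUENCE, not a real certificate.
der_bytes = b"\x30\x82\x01\x0a\x02\x82\x01\x01\x00\xc3\xff"

def try_as_utf8(data):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

def try_json(obj):
    """Return the JSON text, or the name of the exception the encoder raised."""
    try:
        return json.dumps(obj)
    except (TypeError, UnicodeDecodeError) as exc:
        return type(exc).__name__

# The binary data is not valid UTF-8, so there is no text to hand to JSON.
print(try_as_utf8(der_bytes))
# The encoder refuses the raw bytes (TypeError on Python 3).
print(try_json({"usercertificate": der_bytes}))
```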
The right way to handle this is to encode the binary data to base64
ASCII text and then hand it to JSON. FWIW our XML-RPC handler does
this already because XML-RPC knows about binary data and elects to
encode/decode it to base64 as it's marshaled and unmarshaled. But JSON
can't do this during marshaling and unmarshaling because the JSON
protocol has no concept of binary data.
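
A rough sketch of doing the base64 step by hand before the data reaches JSON, again in Python 3. The helper names encode_cert and decode_cert and the "usercertificate" key are hypothetical, chosen only for this illustration, and the bytes are a made-up sample.

```python
import base64
import json

# Made-up sample bytes, not a real certificate.
der_bytes = b"\x30\x82\x01\x0a\x02\x82\x01\x01\x00\xc3\xff"

def encode_cert(der):
    """Binary -> base64 ASCII text, which is safe inside a JSON string."""
    return base64.b64encode(der).decode("ascii")

def decode_cert(text):
    """base64 ASCII text -> the original binary."""
    return base64.b64decode(text.encode("ascii"))

# The caller knows this value is a certificate, so it encodes before sending...
wire = json.dumps({"usercertificate": encode_cert(der_bytes)})

# ...and the receiver, which also knows, decodes after unmarshaling.
received = json.loads(wire)
restored = decode_cert(received["usercertificate"])
print(restored == der_bytes)
```

Note that both ends need to agree out of band that this particular value is base64; nothing in the JSON text says so.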
The Python JSON encoder class does give us the option to hook into the
encoder, check if the object is a str object, and base64 encode it.
But that doesn't help us at the opposite end. How would we
know when unmarshaling that a given string is supposed to be base64
decoded back into binary data? We could prepend a special string and
hope that string never gets used by normal text (yuck). Keeping a list
of what needs base64 decoding is not an option within JSON because at
the time of decoding we have no information available about the
context of the JSON objects.
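
Here is a small sketch of that encoder hook and of why the decoding side is stuck. It uses Python 3, where the hook fires for bytes objects; the class name Base64Encoder and the payload keys are assumptions made for the example.

```python
import base64
import json

class Base64Encoder(json.JSONEncoder):
    """Hypothetical encoder: base64 encodes any bytes value it is handed."""
    def default(self, obj):
        if isinstance(obj, bytes):
            return base64.b64encode(obj).decode("ascii")
        return super().default(obj)

payload = {
    "cert": b"\x30\x82\x01\x0a",   # binary data (made-up sample)
    "note": "MIIBCg==",            # ordinary text that happens to look like base64
}

wire = json.dumps(payload, cls=Base64Encoder, sort_keys=True)
decoded = json.loads(wire)

# After unmarshaling, both values are plain text. Nothing in the JSON
# says which one was binary before it went over the wire.
print(type(decoded["cert"]).__name__, type(decoded["note"]).__name__)
print(decoded["cert"] == decoded["note"])
```

Encoding is easy; it is the receiver that has no way to tell the two values apart.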
That means if we want to use JSON we really should push the base64
encode/decode to the parts of the code which have a priori knowledge
about the objects they're pushing through the command interface. This
would mean any command which passes a certificate should base64 encode
it prior to sending it and base64 decode it after it comes back from a
command result. Actually it would be preferable to use PEM encoding,
and by the way, the whole reason PEM encoding for certificates
was developed was exactly this scenario: transporting a
certificate through a text based interchange mechanism!
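
As an illustration of the PEM route, the Python standard library's ssl module has helpers that wrap DER bytes in the PEM header/footer and unwrap them again. This is only a sketch in Python 3: the bytes are a made-up sample rather than a real certificate (the helpers only do the base64 wrapping, they do not parse the certificate), and the "usercertificate" key is illustrative.

```python
import json
import ssl

# Made-up sample bytes, not a real certificate.
der_bytes = b"\x30\x82\x01\x0a\x02\x82\x01\x01\x00\xc3\xff"

# DER (binary) -> PEM (text with BEGIN/END CERTIFICATE lines).
pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)

# PEM is plain ASCII text, so it passes through JSON untouched.
wire = json.dumps({"usercertificate": pem_text})
pem_back = json.loads(wire)["usercertificate"]

# Convert back to DER only where the target demands binary.
der_back = ssl.PEM_cert_to_DER_cert(pem_back)

print(pem_text.splitlines()[0])
print(der_back == der_bytes)
```

Unlike bare base64, the PEM header makes the value self-describing to a human reader, though the convention that the field holds PEM still has to be agreed on by both ends.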
Possible Solutions:
---
As I see it we have these options in front of us for how to deal with
this problem:
* Drop support for JSON, only use XML-RPC
* Once we read a certificate from LDAP immediately convert it to PEM
format. Adopt the convention that anytime we exchange certificates it
will be in PEM format. Only convert from PEM format when the target
demands binary (e.g. storing it in LDAP, passing it to a library
expecting DER encoded data, etc.).
* Come up with some hacky protocol on top of JSON which signals this
string is really binary and check for it on every JSON encode/decode
and cross our fingers no one tries to send a legitimate string which
would trigger the encode/decode.
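
To show why the third option makes me nervous, here is a sketch of such a marker protocol in Python 3. The marker string "__b64__:" and the wrap/unwrap helpers are invented purely for this example.

```python
import base64

MARKER = "__b64__:"  # hypothetical marker, invented for this sketch

def wrap(value):
    """Before JSON encoding: tag binary values with the marker."""
    if isinstance(value, bytes):
        return MARKER + base64.b64encode(value).decode("ascii")
    return value

def unwrap(value):
    """After JSON decoding: anything carrying the marker is assumed binary."""
    if isinstance(value, str) and value.startswith(MARKER):
        return base64.b64decode(value[len(MARKER):])
    return value

# Binary data survives the round trip as intended.
print(unwrap(wrap(b"\x30\x82")) == b"\x30\x82")

# But a legitimate text value that starts with the marker is silently
# turned into binary on the way back.
legit_text = "__b64__:aGVsbG8="
print(unwrap(wrap(legit_text)) == legit_text)
```

Escaping the marker in ordinary text would avoid the collision, but then every string on both ends has to pass through the escaping code, which is exactly the kind of hack this option amounts to.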
Question: Are certificates the one and only example of binary data we
exchange?
Recommendation:
---
My personal recommendation is we adopt the convention that
certificates are always PEM encoded. We've already run into many
problems trying to deduce what format a certificate is in (e.g. binary,
base64, PEM). I think it would be good if we just put a stake in the
ground and said certificates are always PEM encoded and be done with
all these problems we keep having with the data type of certificates.
As an aside I'm also skeptical of the robustness of allowing binary
data at all in our implementation. Trying to support binary data has
been nothing but a headache and a source of many many bugs. Do we
really need it?
Yeah, a good Friday afternoon problem to solve...
+1 to your recommendations. I am not a specialist, but the suggestion
seems logical.
--
Thank you,
Dmitri Pal
Engineering Manager IPA project,
Red Hat Inc.
___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel