John Dennis wrote:
On 01/29/2010 09:28 AM, Rob Crittenden wrote:
John Dennis wrote:
On 01/28/2010 10:30 PM, Rob Crittenden wrote:
John Dennis wrote:
On 01/28/2010 04:15 PM, Rob Crittenden wrote:
Gah, got the description mixed up with the last patch :-(

Be a bit smarter about decoding certificates that might be base64-encoded. First check that the value contains only characters allowed in base64 before trying to decode it. This reduces the number of false positives.
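
The intent is roughly this (a sketch, not the actual patch; names are illustrative):

    import re
    import base64

    # Bytes legal in base64 output: the alphabet, the '=' padding, and
    # the whitespace that encoders commonly insert.
    BASE64_RE = re.compile(r'^[A-Za-z0-9+/=\s]+$')

    def maybe_b64decode(data):
        # Only attempt a decode if every character could be base64.
        if BASE64_RE.match(data):
            try:
                return base64.b64decode(data)
            except (TypeError, ValueError):
                pass
        return data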

I'm not sure the test is doing what you want or even if it's the right
test.

The test is saying "if there are one or more characters in the base64 alphabet, then try to decode." That means just about anything will match, which doesn't seem like a very strong test.

Why not just try to decode it and let the decoder decide if it's really base64? The decoder has much stronger rules about the input, including ensuring the padding is correct.
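
In other words, something like (a sketch):

    import base64

    def maybe_b64decode(data):
        try:
            # Let the decoder apply its own input rules (e.g. padding).
            return base64.b64decode(data)
        except (TypeError, ValueError):
            return data  # not valid base64, treat as raw DER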


The reason is I had a binary cert that was successfully decoded by the base64 decoder. I don't know the whys and wherefores, but there it is.

Then testing to see if each byte is in the base64 alphabet would not
have prevented this error.

And yet it did in practice. I think you're assuming too much about the input validation in base64.b64decode(). It gladly accepts binary data, as long as it fits the expected padding.

You're right. I just went and checked the code; it skips any char not in the base64 alphabet :-(
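
A quick illustration (using a modern Python 3, where b64decode has a validate flag; the default is the same permissive behavior):

    import base64
    import binascii

    # Non-alphabet bytes are silently discarded by default, so raw
    # binary can "decode" without any error:
    base64.b64decode(b'\xde\xad\xbe\xef')  # -> b'' (every byte skipped)
    base64.b64decode(b'cert\x00')          # -> b'q\xea\xed' ('cert' survives)

    # Asking for strict validation rejects the same input:
    try:
        base64.b64decode(b'\xde\xad\xbe\xef', validate=True)
    except binascii.Error:
        pass  # raised, as one would hope the default did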


For a while now I've been feeling like we need to associate a format
attribute to the certificate (e.g. DER, PEM, BASE64, etc.).

There is simply no good way to carry that extra data when all you have
is a blob of data. We'd still need some mechanism to look at it and ask
"what are you?" That, or we simply reject some types of input.

My concern is that correctly deducing what an object is just by scanning its contents is not robust. As you've seen, it's easy to draw the wrong conclusion. Rather, if the convention is "it must be an object in this format" (i.e. the canonical one), then there is no reason to even ask the question. It's simpler and more robust for most of our (internal) code; we only have to worry about it at the interface boundaries.

So who enforces the canonical format? The only place we have to be concerned is when it's user-provided; any item we produce will be guaranteed to be in the canonical format (hopefully :-). That just means at our interface boundaries we *must* specify the canonical format.

If we're taking input from the user on the command line, we offer them the options "input as PEM", "input as DER", or "input as base64", validate as best we can (trusting that the user has told us the correct format), and then convert to the canonical format.

Think about the openssl x509 utility: with it you must specify the input format (e.g. -inform PEM or -inform DER).

If we're taking input through an exposed API, we do essentially the same thing: require the format to be passed along with the data, validate as best we can, and convert to the canonical format as it enters our system.

BTW, having the user/caller indicate the format they're providing will make the validation more robust. For example, if the data is declared to be in DER format, there is no reason to even try a base64 decode, which might produce a false positive. Likewise, if it's declared to be in PEM format, it must have the header and footer.
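
At the boundary that could look something like this (a sketch; the function name and format labels are illustrative):

    import base64
    import re

    PEM_RE = re.compile(
        r'-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----',
        re.DOTALL)

    def to_canonical_der(data, format):
        # Trust the caller-declared format, validate what we can,
        # and return the canonical (DER) form.
        if format == 'der':
            return data
        if format == 'pem':
            match = PEM_RE.search(data)
            if match is None:
                raise ValueError('PEM input missing header/footer')
            return base64.b64decode(match.group(1))
        if format == 'base64':
            return base64.b64decode(data)  # decoder checks the padding
        raise ValueError('unknown format: %r' % (format,))

The point is that nothing in there guesses: a mismatch between the declared format and the data is an error, not a cue to try another format.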

Bottom line: I'm leery of trying to guess the format at random points. It's too easy for the guessing logic to draw the wrong conclusion; I'd much rather see it be explicit.

Perhaps, but validators take a single argument, so there is no way to pass in the type.

rob
