Re: [Pharo-users] The opposite of encodeForHTTP

Brenda Larcom Fri, 20 Jul 2012 09:25:53 -0700

I suppose I could unlurk at this point.  :)

I'm a security geek (specifically, a secure development geek focusing on 
security architecture) in my day job, and I have a long unmaintained 
architecture security analysis tool written in Squeak 
(http://www.octotrike.org/ for the curious), which I have been unmothballing.  
We are considering switching to Pharo, partly because we are planning to add 
some P2P collaboration features that we think will involve an HTTP layer 
somewhere, & partly because we like it small, tidy, and self-compatible.  Hence 
my lurking.

I've done some work on how data validation should be done for security 
purposes, for my day job.  This includes output encoding and decoding, like 
what Davide is talking about.  It's pretty tricky to get right because of the 
large number of contexts, with subtly different rules.  For example, I would 
expect encodeForHTTP to be appropriate for HTTP headers, except that two of the 
things you usually want to put in HTTP headers are URIs and cookies, each of 
which has different rules (for different subparts, even) for what should be 
encoded.  The differences don't seem like much, but in the wild, my coworkers & 
I see these sorts of differences lead to vulnerabilities on a daily basis.
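
A minimal sketch of how the rules differ by context. It is written in Python 
rather than Smalltalk only because it is easy to run; the functions are from 
Python's standard urllib.parse module, not from Pharo, and the sample strings 
are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

raw = "a/b c"

# The same input has three different "correct" encodings, depending on where
# in the URI (or form body) it is going to be placed.
as_path = quote(raw)                    # '/' kept: it is structure in a path
as_path_segment = quote(raw, safe="")   # '/' escaped: it is data in a segment
as_form_value = quote_plus(raw)         # form encoding: space becomes '+'

print(as_path)          # a/b%20c
print(as_path_segment)  # a%2Fb%20c
print(as_form_value)    # a%2Fb+c

# Decoding has the same problem: the right answer depends on the context the
# string came from, which a bare string does not carry with it.
wire = "1+1%3D2"
decoded_as_uri = unquote(wire)        # '+' is a literal plus sign
decoded_as_form = unquote_plus(wire)  # '+' means space

print(decoded_as_uri)   # 1+1=2
print(decoded_as_form)  # 1 1=2
```

A single decode method on String has to pick one of these behaviours, and it 
will be wrong for the other contexts.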

From a security architecture perspective, the absolute best way to handle 
encoding & decoding for a structured object like an HTTP request or response 
(or a URI, or a cookie, or an HTML document, or ...) is to use a validating 
parser.  Basically, when you get an HTTP request, parse it & put it in an 
object structured like the request.  At that time, you know the meaning of each 
portion of the string you are parsing, so you can interpret the bits 
correctly/safely.  The object(s) should store the individual strings that are 
actually content (vs. structure & constants) in a decoded state.  The developer 
should get everything from the objects, in decoded form, and put everything 
into the objects in decoded form.  Then, when it is time to send the response, 
the objects encode everything safely/canonically based on the exact type of 
objects they are.  This design concentrates the hard stuff (encoding, decoding, 
canonicalization, layering encodings on top of each other) near the interfaces, 
at the first/last possible moment enough context is known to interpret the 
information accurately.  It separates the mechanics of using a protocol or 
format from the intent of using the protocol.  It lets someone like me easily 
QA both the library and application code for security.  It is also simple for 
the developer to use safely (all the dev needs to think about is what 
objects/content they want to assemble, and the data validation at that layer is 
taken care of automatically) & is therefore the only design pattern I have seen 
consistently avoid all encoding-related vulnerabilities in the wild.  
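
The pattern above can be sketched roughly as follows, again in Python for ease 
of running. The Query class is hypothetical (it is not part of Pharo or of any 
library), it only covers a URI query component, and its validation is 
deliberately simplified compared with what the URI RFCs require.

```python
import re
from urllib.parse import quote, unquote


class Query:
    """Hypothetical URI query object holding decoded (name, value) pairs."""

    # A '%' that is not followed by two hex digits is a malformed escape.
    _BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")

    def __init__(self, pairs=()):
        # Application code only ever sees and supplies decoded strings.
        self.pairs = [(str(name), str(value)) for name, value in pairs]

    @classmethod
    def from_string(cls, wire):
        """Validate the encoded form, split on structure, then decode once."""
        if cls._BAD_ESCAPE.search(wire):
            raise ValueError("malformed percent-escape in query")
        fields = wire.split("&") if wire else []
        pairs = []
        for field in fields:
            name, _, value = field.partition("=")
            # errors="strict" rejects escapes that are not valid UTF-8.
            pairs.append((unquote(name, errors="strict"),
                          unquote(value, errors="strict")))
        return cls(pairs)

    def to_string(self):
        """Encode every part for this exact context, at the last moment."""
        return "&".join(
            f"{quote(name, safe='')}={quote(value, safe='')}"
            for name, value in self.pairs
        )


q = Query.from_string("q=a%26b%3Dc&lang=en")
print(q.pairs)        # [('q', 'a&b=c'), ('lang', 'en')]
print(q.to_string())  # q=a%26b%3Dc&lang=en
```

The application never calls an encode or decode function itself; it builds or 
reads Query objects, and the object applies the rules for its own context.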

So what does this mean?  Basically, from a security perspective, encoding & 
decoding methods should live in the objects they encode and decode, and never 
be called from outside code.  That is, there should be an 
HTTPHeader>>fromString: or fromStream: method, which is called from an 
HTTPResponse>>fromString: or fromStream: method, and no 
String>>decodeFromHTTP.   Adding a String>>decodeFromHTTP method is easy from 
the library maintainer's point of view, approximately correct (way more correct 
than no method at all), and it matches what most languages are doing these 
days, but it shifts the burden of all that thought about the specific HTTP 
header & context to the application developer, who is usually just trying to 
write an application, not learn every single detail of HTTP & the gazillion 
other standards they would need to know to do this safely.
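
As an illustration of the burden a string-level decode method puts on the 
caller, here is a small Python sketch. Both function names and the sample input 
are invented for this example; only unquote comes from Python's standard 
library.

```python
from urllib.parse import unquote

# One parameter whose value happens to contain an encoded '&' and '='.
wire = "note=a%26admin%3Dtrue"


def naive_params(wire):
    """Decode the whole string first, then look for structure (unsafe)."""
    decoded = unquote(wire)
    return [tuple(field.split("=", 1)) for field in decoded.split("&")]


def structured_params(wire):
    """Split on structure first, then decode each part on its own."""
    return [
        tuple(unquote(part) for part in field.split("=", 1))
        for field in wire.split("&")
    ]


print(naive_params(wire))       # [('note', 'a'), ('admin', 'true')]
print(structured_params(wire))  # [('note', 'a&admin=true')]
```

The naive version turns data into structure and invents an admin parameter. 
A parser that owns the decoding step cannot be called in the wrong order like 
this.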

Since this is a suggestion for substantial architecture change that would cause 
significant backwards compatibility issues throughout the entire Web 
application stack, and I'm new to Pharo to boot, I am expecting some 
interesting discussion to occur next.  Or maybe profound silence.  :)

In my back pocket somewhere amongst the code I am unmothballing, I have 95% of 
a thoroughly documented URI implementation and test suite that follows this 
pattern and is pedantically compliant with one or another of the URI RFCs (it's 
old, may not be the most recent).  I believe Spoon & Slate are using a previous 
version of it or its derivatives.  I'll need a fully pedantic HTTP parsing 
stack to feel comfortable releasing a P2P architecture security analysis tool 
(high value target, large attack surface, potentially very large professional 
embarrassment), so whatever isn't available, I expect we'll end up writing.  If 
Pharo folks are interested in this pattern, I would love to contribute my 
libraries/changes as I finish them, get advice on backward compatibility, 
performance, and APIs people would like to see, review whatever related code 
you'd like for security issues, and/or collaborate with any other developer who 
is interested.

Brenda

On Jul 20, 2012, at 1:47 AM, Davide Varvello <[email protected]> wrote:

> Good Stef, I opened a new feature as reminder here: 
> http://code.google.com/p/pharo/issues/detail?id=6430
>  
> Davide
> 
> From: Stéphane Ducasse [via Smalltalk] <[hidden email]>
> To: Davide Varvello <[hidden email]> 
> Sent: Thursday, July 19, 2012 10:43 PM
> Subject: Re: The opposite of encodeForHTTP
> 
> Let us fix it and propose a decodeFromHTTP method 
> 
> Stef 
> 
> On Jul 18, 2012, at 2:02 PM, Davide Varvello wrote: 
> 
> > Thanks Sven, 
> > I was looking for String>>decode..whatever... with no luck :-) 
> > Cheers 
