Re: [Pharo-users] The opposite of encodeForHTTP

Norbert Hartl Fri, 20 Jul 2012 10:50:27 -0700

Brenda,

these are all good points, as you said, from a "security architecture 
perspective", and we should improve on that. The Zinc HTTP components already 
do a good job of structuring the entities as they should be. I think security 
add-ons can hook onto what is already there. There is a huge number of things 
to consider. Even for a single URL, the different components have different 
encoding needs. 
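
To make the point about URL components concrete, here is a minimal sketch. It 
is written in Python rather than Pharo purely for illustration and is not the 
Zinc API; build_url and the example host are made-up names. It only shows that 
a path segment and a query value are encoded by different rules.

```python
from urllib.parse import quote, urlencode


def build_url(host, path_segments, query):
    """Assemble a URL from decoded parts (hypothetical helper, not Zinc)."""
    # Path rule: each segment is encoded on its own, so a "/" that is part of
    # the data becomes %2F instead of being read as a path separator.
    path = "/".join(quote(segment, safe="") for segment in path_segments)
    # Query rule: form encoding, where a space becomes "+" and the
    # delimiters "&" and "=" inside a value are percent-escaped.
    query_string = urlencode(query)
    return f"http://{host}/{path}?{query_string}"


url = build_url("example.org", ["docs", "a/b c"], {"q": "x&y=z w"})
print(url)
```

The same space character ends up as %20 in the path and as + in the query, 
which is exactly the kind of small difference that is easy to get wrong by hand.
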
On the other hand, security is not a major goal in a lot of the use cases I 
can imagine. There is (at least for me) a triangle of security - performance - 
usability that makes it hard to have a single approach that fits them all. And 
we Smalltalkers tend to value freedom very highly when it comes to 
programming. In other words, I would say we like to preserve the freedom of 
designing an insecure application at will :) The best way to solve those 
issues is by being modular, meaning a layer that can be put on top of the 
existing stuff to fulfill a particular use case.
The things you describe are present in a lot of environments. I mostly call 
this an "at the border of a system" problem. Things like strings inside of an 
environment are harmless. Problems appear when you cross system borders, 
meaning you cross interpretation schemes. And this is a topic broader than 
just HTTP. If we look at a widely known problem like SQL injection, there is 
not only the need for proper entity handling but also for stacking validators 
and converters for different problems. It is such a big thing because you have 
a URL that goes through middleware and ends in a storage system like an SQL 
database. Here you cross at least two borders: HTTP to middleware and 
middleware to database. So you need to stack up converters and validators for 
HTTP, probably shell escapes in the middleware, and finally for SQL. I think 
if you can assemble those things according to the layers you use, a security 
approach is doable. And for the same reason it goes so terribly wrong 
everywhere. 
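
The two borders can be sketched roughly like this. Again this is Python for 
illustration only, not Pharo or Zinc code; the users table, the length rule 
and the function names are assumptions made up for the example. Each border 
gets its own converter or validator, and the stack is assembled by layer.

```python
import sqlite3
from urllib.parse import parse_qs


def decode_http(query_string):
    # Border 1, HTTP -> middleware: percent-decoding happens here, once.
    return {key: values[0] for key, values in parse_qs(query_string).items()}


def validate_name(name):
    # Middleware rule (hypothetical): non-empty and at most 50 characters.
    if not name or len(name) > 50:
        raise ValueError("invalid name")
    return name


def find_user(conn, name):
    # Border 2, middleware -> SQL: parameter binding keeps the value as data,
    # so it is never interpreted as part of the SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# A classic injection attempt arriving percent-encoded in a query string.
params = decode_http("name=alice%27%20OR%20%271%27%3D%271")
rows = find_user(conn, validate_name(params["name"]))
print(params["name"], rows)
```

The decoded value still contains the quote characters, but because the SQL 
border has its own converter (the bound parameter), it matches no row instead 
of changing the query.
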
So what does this modular thing mean? Having a lot of possibilities to fulfill 
certain needs without restricting everyone to a single scheme. 
My advice would be to have a look at the Zinc components and propose things to 
improve from your perspective. Then publish your results here and there will be 
a lot of clever people finding a good way to integrate them in a modular way.


I hope this helps,

Norbert

On 20.07.2012 at 18:25, Brenda Larcom wrote:

> I suppose I could unlurk at this point.  :)
> 
> I'm a security geek (specifically, a secure development geek focusing on 
> security architecture) in my day job, and I have a long unmaintained 
> architecture security analysis tool written in Squeak 
> (http://www.octotrike.org/ for the curious), which I have been unmothballing. 
>  We are considering switching to Pharo, partly because we are planning to add 
> some P2P collaboration features we think have an HTTP layer in there 
> somewhere & partly because we like it small, tidy, and self-compatible.  
> Hence my lurking.
> 
> I've done some work on how data validation should be done for security 
> purposes, for my day job.  This includes output encoding and decoding, like 
> what Davide is talking about.  It's pretty tricky to get right because of the 
> large number of contexts, with subtly different rules.  E.g. I would expect 
> encodeForHTTP to be appropriate for HTTP headers, except that e.g. two things 
> you usually want to put in HTTP headers are URIs and cookies, each of which 
> has different rules (for different subparts, even) for what should be 
> encoded.  The differences don't seem like much, but in the wild, my coworkers 
> & I see these sorts of differences lead to vulnerabilities on a daily basis.
> 
> From a security architecture perspective, the absolute best way to handle 
> encoding & decoding for a structured object like an HTTP request or response 
> (or a URI, or a cookie, or an HTML document, or..) is to use a validating 
> parser.  Basically, when you get an HTTP request, parse it & put it in an 
> object structured like the request.  At that time, you know the meaning of 
> each portion of the string you are parsing, so you can interpret the bits 
> correctly/safely.  The object(s) should store the individual strings that are 
> actually content (vs. structure & constants) in a decoded state.  The 
> developer should get everything from the objects, in decoded form, and put 
> everything into the objects in decoded form.  Then, when it is time to send 
> the response, the objects encode everything safely/canonically based on the 
> exact type of objects they are.  This design concentrates the hard stuff 
> (encoding, decoding, canonicalization, layering encodings on top of each 
> other) near the interfaces, at the first/last possible moment enough context 
> is known to interpret the information accurately.  It separates the mechanics 
> of using a protocol or format from the intent of using the protocol.  It lets 
> someone like me easily QA both the library and application code for security. 
>  It is also simple for the developer to use safely (all the dev needs to 
> think about is what objects/content they want to assemble, and the data 
> validation at that layer is taken care of automatically) & is therefore the 
> only design pattern I have seen consistently avoid all encoding-related 
> vulnerabilities in the wild.  
> 
> So what does this mean?  Basically, from a security perspective, encoding & 
> decoding methods should live in the objects they encode and decode, and never 
> be called from outside code.  That is, there should be an 
> HTTPHeader>>fromString: or fromStream: method, which is called from an 
> HTTPResponse >>fromString: or fromStream: method, and no 
> String>>decodeFromHTTP.   Adding a String>>decodeFromHTTP method is easy from 
> the library maintainer's point of view, approximately correct (way more 
> correct than no method at all), and it matches what most languages are doing 
> these days, but it shifts the burden of all that thought about the specific 
> HTTP header & context to the application developer, who is usually just 
> trying to write an application, not learn every single detail of the HTTP & 
> gazillion other standards he would need to do this safely.
> 
> Since this is a suggestion for substantial architecture change that would 
> cause significant backwards compatibility issues throughout the entire Web 
> application stack, and I'm new to Pharo to boot, I am expecting some 
> interesting discussion to occur next.  Or maybe profound silence.  :)
> 
> In my back pocket somewhere amongst the code I am unmothballing, I have 95% 
> of a thoroughly documented URI implementation and test suite that follows 
> this pattern and is pedantically compliant with one or another of the URI 
> RFCs (it's old, may not be the most recent).  I believe Spoon & Slate are 
> using a previous version of it or its derivatives.  I'll need a fully 
> pedantic HTTP parsing stack to feel comfortable releasing a P2P architecture 
> security analysis tool (high value target, large attack surface, potentially 
> very large professional embarrassment), so whatever isn't available, I expect 
> we'll end up writing.  If Pharo folks are interested in this pattern, I would 
> love to contribute my libraries/changes as I finish them, get advice on 
> backward compatibility, performance, and APIs people would like to see, 
> review whatever related code you'd like for security issues, and/or 
> collaborate with any other developer who is interested.
> 
> Brenda
> 
> 
> On Jul 20, 2012, at 1:47 AM, Davide Varvello <[email protected]> wrote:
> 
>> Good Stef, I opened a new feature as reminder here: 
>> http://code.google.com/p/pharo/issues/detail?id=6430
>>  
>> Davide
>> 
>> From: Stéphane Ducasse [via Smalltalk] <[hidden email]>
>> To: Davide Varvello <[hidden email]> 
>> Sent: Thursday, July 19, 2012 10:43 PM
>> Subject: Re: The opposite of encodeForHTTP
>> 
>> Let us fix it and propose a decodeFromHTTP method 
>> 
>> Stef 
>> 
>> On Jul 18, 2012, at 2:02 PM, Davide Varvello wrote: 
>> 
>> > Thanks Sven, 
>> > I was looking for String>>decode..whatever... with no luck :-) 
>> > Cheers 
>> > 
