On Jul 20, 2012, at 6:25 PM, Brenda Larcom wrote:
> I suppose I could unlurk at this point. :)
>
> I'm a security geek (specifically, a secure development geek focusing on
> security architecture) in my day job, and I have a long-unmaintained
> architecture security analysis tool written in Squeak
> (http://www.octotrike.org/ for the curious), which I have been unmothballing.
> We are considering switching to Pharo, partly because we are planning to add
> some P2P collaboration features we think have an HTTP layer in there
> somewhere & partly because we like it small, tidy, and self-compatible.
> Hence my lurking.
Welcome! I would love to have more people working on these areas :).
> I've done some work on how data validation should be done for security
> purposes, for my day job. This includes output encoding and decoding, like
> what Davide is talking about. It's pretty tricky to get right because of the
> large number of contexts, with subtly different rules. E.g. I would expect
> encodeForHTTP to be appropriate for HTTP headers, except that e.g. two things
> you usually want to put in HTTP headers are URIs and cookies, each of which
> has different rules (for different subparts, even) for what should be
> encoded. The differences don't seem like much, but in the wild, my coworkers
> & I see these sorts of differences lead to vulnerabilities on a daily basis.
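
To make the "subtly different rules" point concrete, here is a minimal sketch
in Python (chosen only because its standard library makes the contrast easy
to run; it is not Pharo code and not a proposal for the Pharo API). The
sample strings and variable names are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

raw = "a/b c&d=e"  # illustrative input only

# Same input, three different "correct" encodings depending on where it goes.
as_path = quote(raw)                   # default keeps "/" as a path separator
as_path_segment = quote(raw, safe="")  # a single segment must escape "/"
as_form_value = quote_plus(raw)        # form/query style: space becomes "+"

# Decoding is just as context-dependent: "+" means a space only in form data.
plus_in_path = unquote("1+1")       # stays "1+1"
plus_in_form = unquote_plus("1+1")  # becomes "1 1"
```

A single context-free "decode" method has to pick one of these behaviours,
and it will be the wrong one for some callers.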
>
> From a security architecture perspective, the absolute best way to handle
> encoding & decoding for a structured object like an HTTP request or response
> (or a URI, or a cookie, or an HTML document, or...) is to use a validating
> parser. Basically, when you get an HTTP request, parse it & put it in an
> object structured like the request. At that time, you know the meaning of
> each portion of the string you are parsing, so you can interpret the bits
> correctly/safely. The object(s) should store the individual strings that are
> actually content (vs. structure & constants) in a decoded state. The
> developer should get everything from the objects, in decoded form, and put
> everything into the objects in decoded form. Then, when it is time to send
> the response, the objects encode everything safely/canonically based on the
> exact type of objects they are. This design concentrates the hard stuff
> (encoding, decoding, canonicalization, layering encodings on top of each
> other) near the interfaces, at the first/last possible moment enough context
> is known to interpret the information accurately. It separates the mechanics
> of using a protocol or format from the intent of using the protocol. It lets
> someone like me easily QA both the library and application code for security.
> It is also simple for the developer to use safely (all the dev needs to
> think about is what objects/content they want to assemble, and the data
> validation at that layer is taken care of automatically) & is therefore the
> only design pattern I have seen consistently avoid all encoding-related
> vulnerabilities in the wild.
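
The pattern described above could be sketched roughly as follows, again in
Python and only as an illustration. The `Query` class, its method names, and
the strict character policy in `_COMPONENT` are all assumptions made for this
example; a real implementation would follow the relevant RFC grammar.

```python
import re
from dataclasses import dataclass
from urllib.parse import quote, unquote

# Hypothetical strict policy: unreserved characters or well-formed %XX escapes.
_COMPONENT = re.compile(r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})*")


@dataclass
class Query:
    """Holds decoded (name, value) pairs; encoding happens only at the edges."""
    pairs: list

    @classmethod
    def from_string(cls, text):
        pairs = []
        fields = text.split("&") if text else []
        for field in fields:
            name, sep, value = field.partition("=")
            if (not sep or not _COMPONENT.fullmatch(name)
                    or not _COMPONENT.fullmatch(value)):
                raise ValueError(f"malformed query field: {field!r}")
            # The structure is already known here, so each part can be decoded.
            pairs.append((unquote(name), unquote(value)))
        return cls(pairs)

    def to_string(self):
        # Canonical output: escape everything outside the unreserved set.
        return "&".join(
            f"{quote(n, safe='')}={quote(v, safe='')}" for n, v in self.pairs
        )
```

Application code only ever touches `pairs`, which holds decoded content.
For example, `Query.from_string("note=1%26b%3D2&lang=en").pairs` is
`[("note", "1&b=2"), ("lang", "en")]`, and `Query.from_string("a=%zz")`
raises `ValueError` instead of passing a malformed escape along.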
>
> So what does this mean? Basically, from a security perspective, encoding &
> decoding methods should live in the objects they encode and decode, and never
> be called from outside code. That is, there should be an
> HTTPHeader>>fromString: or fromStream: method, which is called from an
> HTTPResponse>>fromString: or fromStream: method, and no
> String>>decodeFromHTTP. Adding a String>>decodeFromHTTP method is easy from
> the library maintainer's point of view, approximately correct (way more
> correct than no method at all), and it matches what most languages are doing
> these days, but it shifts the burden of all that thought about the specific
> HTTP header & context to the application developer, who is usually just
> trying to write an application, not learn every single detail of the HTTP &
> gazillion other standards he would need to do this safely.
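
One way to see why a string-level decode helper pushes the burden onto the
caller is the following small Python illustration. The `wire` value is a
made-up example, and Python's `unquote` merely stands in for a hypothetical
String>>decodeFromHTTP.

```python
from urllib.parse import unquote

wire = "comment=hi%26admin%3Dtrue"  # one field whose value contains "&" and "="

# Helper style: decode the whole string, then split.
# The escaped "&" and "=" are turned into structure.
decoded_first = [f.partition("=")[::2] for f in unquote(wire).split("&")]

# Parser style: split on structure first, then decode each piece of content.
parsed_first = [
    tuple(unquote(part) for part in f.partition("=")[::2])
    for f in wire.split("&")
]
```

Here `decoded_first` contains a second field, `("admin", "true")`, that was
never sent as a field, while `parsed_first` keeps the single field
`("comment", "hi&admin=true")`. The helper is not wrong in itself; the caller
simply has to know to split before decoding, which is exactly the knowledge
the object-based design keeps inside the library.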
>
> Since this is a suggestion for substantial architecture change that would
> cause significant backwards compatibility issues throughout the entire Web
> application stack, and I'm new to Pharo to boot, I am expecting some
> interesting discussion to occur next. Or maybe profound silence. :)
Thanks for the explanation. It makes sense. String is a dead object, just
counting and assembling characters.
Now, if you are interested, what I would love to see is:
- how we can improve the infrastructure of Pharo,
step by step or via a big refactoring :)
- I would add a simple decodeFromHTTP as a convenience method for now and, in
the future, point to the validators.
> In my back pocket somewhere amongst the code I am unmothballing, I have 95%
> of a thoroughly documented URI implementation and test suite that follows
> this pattern and is pedantically compliant with one or another of the URI
> RFCs (it's old, may not be the most recent).
Bring it to life. We were discussing internally that we would like to have a
decent URI implementation, and we would like to massively clean up
the URL/URI …. with ZnURL whatever. So it would be great to have a good piece
of it.
Now what I see from your mail :) is that you are a kind of perfectionist
(I know some of them), so pay attention:
you should force yourself to be happy with 80% and release it.
- 1 your 80% may be the 95% of somebody else
- 2 releasing often and making progress is the best way to finish. :)
> I believe Spoon & Slate are using a previous version of it or its
> derivatives. I'll need a fully pedantic HTTP parsing stack to feel
> comfortable releasing a P2P architecture security analysis tool (high value
> target, large attack surface, potentially very large professional
> embarrassment), so whatever isn't available, I expect we'll end up writing.
> If Pharo folks are interested in this pattern,
Yes, I am. I will let the others reply to you because I'm far down in the
south of France, but I'm quite sure that we are all interested.
> I would love to contribute my libraries/changes as I finish them, get advice
> on backward compatibility, performance, and APIs people would like to see,
> review whatever related code you'd like for security issues, and/or
> collaborate with any other developer who is interested.
I would love to learn from your expertise.
Stef
>
> Brenda
>
>
> On Jul 20, 2012, at 1:47 AM, Davide Varvello <[email protected]> wrote:
>
>> Good Stef, I opened a new feature as reminder here:
>> http://code.google.com/p/pharo/issues/detail?id=6430
>>
>> Davide
>>
>> From: Stéphane Ducasse [via Smalltalk] <[hidden email]>
>> To: Davide Varvello <[hidden email]>
>> Sent: Thursday, July 19, 2012 10:43 PM
>> Subject: Re: The opposite of encodeForHTTP
>>
>> Let us fix it and propose a decodeFromHTTP method
>>
>> Stef
>>
>> On Jul 18, 2012, at 2:02 PM, Davide Varvello wrote:
>>
>> > Thanks Sven,
>> > I was looking for String>>decode..whatever... with no luck :-)
>> > Cheers
>> >
>> > --
>> > View this message in context:
>> > http://forum.world.st/The-opposite-of-encodeForHTTP-tp4640491p4640510.html
>> > Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
>> >