Thanks, Norbert; I'll take a look at Zinc, see how my existing code might integrate, and propose something specific. I personally think that offering an insecure option for things like URIs and HTTP, which sit on borders almost all of the time, is unwise, but I'm happy to resolve my personal issues via documentation. :)

One reason validating parsers are so powerful is that, when layers stack as you mentioned, the security starts working as soon as the functional part does. I agree, such a parser definitely belongs at the borders of interpretation schemes, not inside them. Inside them it'll just use up time without providing value. Conveniently, the tool people naturally reach for at interpretation borders usually has a parser in it someplace.
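To make the stacking concrete, here is a toy Pharo sketch of "one converter per border, applied in the order the borders are crossed". Everything in it is invented for illustration: the %27-only decode is a stand-in for real percent-decoding, and real database code should use bind parameters rather than string escaping.

	| httpDecode sqlEscape borders value |
	"Each border contributes one converter; the order mirrors the borders crossed."
	httpDecode := [ :s | s copyReplaceAll: '%27' with: '''' ].	"stand-in for percent-decoding at the HTTP border"
	sqlEscape := [ :s | s copyReplaceAll: '''' with: '''''' ].	"quote doubling at the SQL border"
	borders := OrderedCollection new.
	borders add: httpDecode; add: sqlEscape.
	value := borders inject: 'O%27Reilly' into: [ :string :convert | convert value: string ].
	"value now holds O''Reilly: decoded once at the HTTP border, escaped once at the SQL border."

The point is only that each conversion is owned by the layer that understands its border, and that forgetting or reordering one of them is exactly where injection bugs come from.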
And yes, URIs do seem to have a particularly large number of fiddly bits; so fiddly that a few of the examples in the RFCs (as usual) don't match the rest of the spec.

Brenda

On Jul 20, 2012, at 10:50 AM, Norbert Hartl <[email protected]> wrote:

> Brenda,
>
> these are all good points, as you said, from a "security architecture
> perspective", and we should improve on that. The Zinc HTTP components
> already do a good job of structuring the entities as they should be. I think
> security add-ons can hook onto what is already there. There is a huge
> amount of things to consider. Even within a single URL, the different
> components have different encoding needs.
> On the other hand, security is not a major target in a lot of use cases I
> can imagine. There is at least (for me) a triangle of security - performance -
> usability that makes it hard to have a single approach that fits them all.
> And we Smalltalkers tend to value freedom very highly when it comes to
> programming. In other words, I would say we like to preserve the freedom
> to design an insecure application at will :) The best way to solve those
> issues is by being modular, meaning a layer that can be put on top of the
> existing stuff to fulfill a particular use case.
> The things you describe are present in a lot of environments. I mostly call
> this an "at the border of a system" problem. Things like strings inside of
> an environment are harmless. Problems appear if you cross system borders,
> meaning you cross interpretation schemes. And this is a topic broader than
> HTTP alone.
> If we look at a widely known problem like SQL injection, there is not only
> the need for proper entity handling but for stacking validators and
> converters for different problems. It is such a big thing because you have
> a URL that goes through middleware and ends in a storage system like an
> SQL database. Here you cross at least two borders: HTTP to middleware and
> middleware to database. So you need to stack up converters and validators
> for HTTP, probably shell escapes in the middleware, and finally for SQL. I
> think if you can assemble those things from the layers you use, a security
> approach is doable. And for the same reason it goes so terribly wrong
> everywhere.
> So what does this modular thing mean? To have a lot of possibilities to
> fulfill certain needs without restricting everyone to a single scheme.
> My advice would be to have a look at the Zinc components and propose
> things to improve from your perspective. Then publish your results here and
> there will be a lot of clever people finding a good way to integrate it in a
> modular way.
>
> I hope this helps,
>
> Norbert
>
> On Jul 20, 2012, at 6:25 PM, Brenda Larcom wrote:
>
>> I suppose I could unlurk at this point. :)
>>
>> I'm a security geek (specifically, a secure development geek focusing on
>> security architecture) in my day job, and I have a long-unmaintained
>> architecture security analysis tool written in Squeak
>> (http://www.octotrike.org/ for the curious), which I have been
>> unmothballing.
>> We are considering switching to Pharo, partly because we are planning to
>> add some P2P collaboration features that we think have an HTTP layer in
>> there somewhere, & partly because we like it small, tidy, and
>> self-compatible. Hence my lurking.
>>
>> I've done some work on how data validation should be done for security
>> purposes, for my day job. This includes output encoding and decoding, like
>> what Davide is talking about. It's pretty tricky to get right because of
>> the large number of contexts, with subtly different rules. E.g. I would
>> expect encodeForHTTP to be appropriate for HTTP headers, except that e.g.
>> two things you usually want to put in HTTP headers are URIs and cookies,
>> each of which has different rules (for different subparts, even) for what
>> should be encoded. The differences don't seem like much, but in the wild,
>> my coworkers & I see these sorts of differences lead to vulnerabilities on
>> a daily basis.
>>
>> From a security architecture perspective, the absolute best way to handle
>> encoding & decoding for a structured object like an HTTP request or
>> response (or a URI, or a cookie, or an HTML document, or...) is to use a
>> validating parser. Basically, when you get an HTTP request, parse it & put
>> it in an object structured like the request. At that time, you know the
>> meaning of each portion of the string you are parsing, so you can interpret
>> the bits correctly/safely. The object(s) should store the individual strings
>> that are actually content (vs. structure & constants) in a decoded state.
>> The developer should get everything from the objects in decoded form, and
>> put everything into the objects in decoded form. Then, when it is time to
>> send the response, the objects encode everything safely/canonically based
>> on the exact type of objects they are. This design concentrates the hard
>> stuff (encoding, decoding, canonicalization, layering encodings on top of
>> each other) near the interfaces, at the first/last possible moment enough
>> context is known to interpret the information accurately. It separates the
>> mechanics of using a protocol or format from the intent of using the
>> protocol. It lets someone like me easily QA both the library and
>> application code for security. It is also simple for the developer to use
>> safely (all the dev needs to think about is what objects/content they want
>> to assemble, and the data validation at that layer is taken care of
>> automatically) & is therefore the only design pattern I have seen
>> consistently avoid all encoding-related vulnerabilities in the wild.
>>
>> So what does this mean? Basically, from a security perspective, encoding &
>> decoding methods should live in the objects they encode and decode, and
>> never be called from outside code. That is, there should be an
>> HTTPHeader>>fromString: or fromStream: method, which is called from an
>> HTTPResponse>>fromString: or fromStream: method, and no
>> String>>decodeFromHTTP. Adding a String>>decodeFromHTTP method is easy
>> from the library maintainer's point of view, approximately correct (way
>> more correct than no method at all), and it matches what most languages
>> are doing these days, but it shifts the burden of all that thought about
>> the specific HTTP header & context to the application developer, who is
>> usually just trying to write an application, not learn every single detail
>> of the HTTP & gazillion other standards he would need to do this safely.
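(Annotating my own words inline: a rough Pharo sketch of the pattern above might look like the following. HTTPHeader here is only a sketch, not an existing Pharo or Zinc class, and an HTTPResponse>>fromString: would simply call HTTPHeader fromString: on each header line. A cookie- or URI-valued header subclass would override decodeValue: and encodedValue with its own rules.)

	Object subclass: #HTTPHeader
		instanceVariableNames: 'name value'
		classVariableNames: ''
		poolDictionaries: ''
		category: 'HTTP-Sketch'

	HTTPHeader class >> fromString: aLine
		"Border: split 'Name: value' once, here, and keep the value in decoded form."
		| colon |
		colon := aLine indexOf: $:.
		^ self new
			setName: (aLine copyFrom: 1 to: colon - 1) withBlanksTrimmed
			value: (self decodeValue: (aLine copyFrom: colon + 1 to: aLine size) withBlanksTrimmed)

	HTTPHeader class >> decodeValue: aString
		"A plain header value is kept as-is; context-aware subclasses decode here."
		^ aString

	HTTPHeader >> setName: aString value: anotherString
		name := aString.
		value := anotherString

	HTTPHeader >> encodedValue
		"A plain header value needs no escaping; subclasses re-encode their own way."
		^ value

	HTTPHeader >> writeOn: aStream
		"Encode only on the way out, using this object's own rules."
		aStream nextPutAll: name; nextPutAll: ': '; nextPutAll: self encodedValue; nextPutAll: String crlf

	"Usage: (HTTPHeader fromString: 'Host: forum.world.st') writeOn: Transcript"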
>>
>> Since this is a suggestion for a substantial architecture change that would
>> cause significant backwards compatibility issues throughout the entire Web
>> application stack, and I'm new to Pharo to boot, I am expecting some
>> interesting discussion to occur next. Or maybe profound silence. :)
>>
>> In my back pocket somewhere amongst the code I am unmothballing, I have
>> 95% of a thoroughly documented URI implementation and test suite that
>> follows this pattern and is pedantically compliant with one or another of
>> the URI RFCs (it's old, so it may not be the most recent). I believe Spoon
>> & Slate are using a previous version of it or its derivatives. I'll need a
>> fully pedantic HTTP parsing stack to feel comfortable releasing a P2P
>> architecture security analysis tool (high value target, large attack
>> surface, potentially very large professional embarrassment), so whatever
>> isn't available, I expect we'll end up writing. If Pharo folks are
>> interested in this pattern, I would love to contribute my libraries/changes
>> as I finish them, get advice on backward compatibility, performance, and
>> APIs people would like to see, review whatever related code you'd like for
>> security issues, and/or collaborate with any other developer who is
>> interested.
>>
>> Brenda
>>
>>
>> On Jul 20, 2012, at 1:47 AM, Davide Varvello <[email protected]> wrote:
>>
>>> Good Stef, I opened a new feature as a reminder here:
>>> http://code.google.com/p/pharo/issues/detail?id=6430
>>>
>>> Davide
>>>
>>> ----
>>> - Looking for a good dentist, lawyer, or accountant? A good hotel,
>>> restaurant, or pizzeria? I found mine on Oltre il Passaparola
>>>
>>> - Blog: Cambia il Tempo
>>>
>>> From: Stéphane Ducasse [via Smalltalk] <[hidden email]>
>>> To: Davide Varvello <[hidden email]>
>>> Sent: Thursday, July 19, 2012 10:43 PM
>>> Subject: Re: The opposite of encodeForHTTP
>>>
>>> Let us fix it and propose a decodeFromHTTP method
>>>
>>> Stef
>>>
>>> On Jul 18, 2012, at 2:02 PM, Davide Varvello wrote:
>>>
>>> > Thanks Sven,
>>> > I was looking for String>>decode..whatever... with no luck :-)
>>> > Cheers
>>> >
>>> > --
>>> > View this message in context:
>>> > http://forum.world.st/The-opposite-of-encodeForHTTP-tp4640491p4640510.html
>>> > Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
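P.S. For the issue Davide opened, a first cut at the decodeFromHTTP that Stef suggests could look like the sketch below. The selector name comes from this thread; the body is only a rough inverse of encodeForHTTP's %XX escaping, and, per the above, a String extension like this is the approximate fix rather than the structured one.

	String >> decodeFromHTTP
		"Answer a copy of the receiver with %XX escapes turned back into characters,
		 as a rough inverse of encodeForHTTP."
		| in out |
		in := self readStream.
		out := WriteStream on: String new.
		[ in atEnd ] whileFalse: [
			| c |
			c := in next.
			out nextPut: (c = $%
				ifTrue: [ Character value: (Integer readFrom: (in next: 2) asUppercase base: 16) ]
				ifFalse: [ c ]) ].
		^ out contents

	"Example: 'hello%20world%21' decodeFromHTTP  ->  'hello world!'"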
