Hi Andrew, On Wed, Sep 19, 2012 at 4:39 PM, Andrew Faulds <a...@ajf.me> wrote: >> Than I suggest including "For" in all of them. escapeForHtml, >> escapeForUrl, etc. That should clear it up somewhat that we're not >> targeting whole blocks of HTML/JS/CSS. > > That still isn't clear enough, I think. escapeHTMLAttributeValue and > escapeHTMLText. It needs to be clear what HTML context you're dealing with.
I think we're running into being overly prescriptive. Escaping can never, by definition, apply to anything that isn't a value string or integer, i.e. anything that is capable of altering the meaning of the HTML/Javascript or CSS into which its inserted. The original function names applied to specific escaping "strategies" rather than the actual locations that that strategy was useful for. There is only one HTML escaping strategy, one Javascript escaping strategy, etc. >>>> For example, I'd prefer escapeForCss vs escapeCSSStringLiteral though >>>> both would be valid English literal alternatives to escapeCss. >>> >>> You can't just have escapeForCSS, you need two functions: one for CSS >>> identifier names (.identifier, #identifier, etc.), and one for CSS >>> strings >>> (background-image: url('string'); or content: 'string') >> >> Not really, the target here is breaking out of a CSS or HTML context. >> If you allow users to alter identifiers or properties than escaping is >> just wrong - you should be sanitising instead to make sure the CSS is >> still well formed and agrees to a whitelist of allowed ids/props. > > If property values or identifiers can't be escaped, what can? What do you > mean? > > Are you meaning in style="" or <style></style>? In which case, why have it? > You can just use a bog-standard HTML escaping function. <style> body { background-color: <? echo $e->escapeCss('white'); ?> } </style> In the above we escape the value so that an attacker cannot "breakout" into the CSS propery setting context to create new styles or identifier blocks. So it applies only to the values of properties and nothing else. If anything else is allowed to be subject to user input - then escaping becomes moot because you now need to perform CSS sanitisation to a whitelist to prevent the injection of phishing or clickjacking attacks. The documentation (if not the RFC) should be the place to emphasise when and where to use escaping functions. >>> Also, escapeForJS isn't very clear, you should explicitly specify you're >>> escaping a string of text for a JavaScript string literal. I don't think >>> you >>> can escape JS identifier names. >> >> JS is purely for literal values and not any JS variables/statements or >> anything else. Those can never ever be subject to any form of >> untrusted input. > > It needs to be clear it's a string literal though, and a literal at that. > Otherwise it's a little unclear. Still, I'm more worried about the CSS. Again, I think being overly prescriptive is unnecessary. It has semantic value, of course, but the game is already lost if the user isn't aware of where to safely apply escaping (a different problem to applying the correct encoding over the wrong encoding). I think we're bumping into a slightly different area of education here. Once users know where escaping applies, the names even in their shorter forms are fairly obvious as to which context they apply to. I think that specific education is better served with good quality documentation and examples (I'm all for docs with a dose of reality). Paddy -- Pádraic Brady http://blog.astrumfutura.com http://www.survivethedeepend.com Zend Framework Community Review Team -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php