Hi Andrew,

On Wed, Sep 19, 2012 at 4:39 PM, Andrew Faulds <a...@ajf.me> wrote:
>> Than I suggest including "For" in all of them. escapeForHtml,
>> escapeForUrl, etc. That should clear it up somewhat that we're not
>> targeting whole blocks of HTML/JS/CSS.
>
> That still isn't clear enough, I think. escapeHTMLAttributeValue and
> escapeHTMLText. It needs to be clear what HTML context you're dealing with.

I think we're running into being overly prescriptive. Escaping can
never, by definition, apply to anything that isn't a value string or
integer, i.e. anything that is capable of altering the meaning of the
HTML/Javascript or CSS into which its inserted. The original function
names applied to specific escaping "strategies" rather than the actual
locations that that strategy was useful for. There is only one HTML
escaping strategy, one Javascript escaping strategy, etc.

>>>> For example, I'd prefer escapeForCss vs escapeCSSStringLiteral though
>>>> both would be valid English literal alternatives to escapeCss.
>>>
>>> You can't just have escapeForCSS, you need two functions: one for CSS
>>> identifier names (.identifier, #identifier, etc.), and one for CSS
>>> strings
>>> (background-image: url('string'); or content: 'string')
>>
>> Not really, the target here is breaking out of a CSS or HTML context.
>> If you allow users to alter identifiers or properties than escaping is
>> just wrong - you should be sanitising instead to make sure the CSS is
>> still well formed and agrees to a whitelist of allowed ids/props.
>
> If property values or identifiers can't be escaped, what can? What do you
> mean?
>
> Are you meaning in style="" or <style></style>? In which case, why have it?
> You can just use a bog-standard HTML escaping function.

<style>
body {
 background-color: <? echo $e->escapeCss('white'); ?>
}
</style>

In the above we escape the value so that an attacker cannot "breakout"
into the CSS propery setting context to create new styles or
identifier blocks. So it applies only to the values of properties and
nothing else. If anything else is allowed to be subject to user input
- then escaping becomes moot because you now need to perform CSS
sanitisation to a whitelist to prevent the injection of phishing or
clickjacking attacks.

The documentation (if not the RFC) should be the place to emphasise
when and where to use escaping functions.

>>> Also, escapeForJS isn't very clear, you should explicitly specify you're
>>> escaping a string of text for a JavaScript string literal. I don't think
>>> you
>>> can escape JS identifier names.
>>
>> JS is purely for literal values and not any JS variables/statements or
>> anything else. Those can never ever be subject to any form of
>> untrusted input.
>
> It needs to be clear it's a string literal though, and a literal at that.
> Otherwise it's a little unclear. Still, I'm more worried about the CSS.

Again, I think being overly prescriptive is unnecessary. It has
semantic value, of course, but the game is already lost if the user
isn't aware of where to safely apply escaping (a different problem to
applying the correct encoding over the wrong encoding). I think we're
bumping into a slightly different area of education here. Once users
know where escaping applies, the names even in their shorter forms are
fairly obvious as to which context they apply to. I think that
specific education is better served with good quality documentation
and examples (I'm all for docs with a dose of reality).

Paddy

-- 
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to