On Mon, 2010-03-15 at 12:48 +0000, Colin Guthrie wrote:

> 'Twas brillig, and Jochem Maas at 14/03/10 23:56 did gyre and gimble:
> > Op 3/14/10 11:45 AM, Ashley Sheridan schreef:
> >> On Sun, 2010-03-14 at 12:25 +0100, Rene Veerman wrote:
> >>
> >>> On Sun, Mar 14, 2010 at 12:24 PM, Rene Veerman <rene7...@gmail.com> wrote:
> >>>>
> >>>> I'd love to have a copy of whatever function you use to filter out bad
> >>>> HTML/js/flash for use cases where users are allowed to enter html.
> >>>> I'm aware of strip_tags() "allowed tags" param, but haven't got a good
> >>>> list for it.
> >>>>
> >>>
> >>> oh, and even <img> tags can be used for cookie-stuffing on many browsers..
> >>>
> >>
> >>
> >> Yes, and you call strip_tags() before the data goes to the browser for
> >> display, not before it gets inserted into the database. Essentially, you
> >> need to keep as much original information as possible.
> > 
> > I disagree with both of you. I'm like that :)
> > 
> > let's assume we're not talking about data that is allowed to contain HTML,
> > in such cases I would do a strip_tags() on the incoming data then compare
> > the output of strip_tags() to the original input ... if they don't match then
> > I would log the problem and refuse to input the data at all.
> > 
> > using strip_tags() on a piece of data every time you output it, if you know
> > that it shouldn't contain any tags in the first place, is a waste of
> > resources ... this does assume that you can trust the data source ... which
> > in the case of a database that you control should be the case.
> I used to think like that too, but I've relatively recently changed my
> position.
> While it's not as extreme an example, I used to keep data in the
> database *after* I had processed it with htmlspecialchars() (not quite
> the same as strip_tags, but the principle is the same).
> The issue I had was that over time, I've found the need to output to
> other formats - e.g. spread sheets, plain text emails, PDFs etc. in
> which case this pre-encoded format is a pain and I have to call
> html_entity_decode() to reverse the htmlspecialchars() I did in the
> first place. This is a royal pain in the bum and it's really ugly in the
> code, remembering what format the data is in in order to process it
> appropriately at the right points.
> Nowadays I work rather differently and always escape at the point of
> output (this does not exclude filtering at the point of input too, but I
> do not keep things encoded any longer - I keep it raw).
> Any halfway decently designed caching layer will prevent any major
> impact from escaping at the point of output anyway.
> Now you could argue that encoding at the save point and reversing the
> encoding when needed is still a better approach and I won't argue too
> heavily, but for the sake of my sanity I'm much happier working the way
> I do now. The view layers are very clearly escaping everything that
> needs escaping and no logic for the "is it or is it not already escaped"
> leaks into this layer.
> (I appreciate strip_tags() and htmlspecialchars() are not the same and my
> general usage may not apply to a pure strip_tags() usage).
> > at any rate, strip_tags() doesn't belong in an 'anti-sql-injection' routine
> > as it has nothing to do with sql injection at all.
> Indeed, it's more about XSS and CSRF rather than SQL injection.
> Col
> -- 
> Colin Guthrie
> gmane(at)colin.guthr.ie
> http://colin.guthr.ie/
> Day Job:
>   Tribalogic Limited [http://www.tribalogic.net/]
> Open Source:
>   Mandriva Linux Contributor [http://www.mandriva.com/]
>   PulseAudio Hacker [http://www.pulseaudio.org/]
>   Trac Hacker [http://trac.edgewall.org/]

You could run the content through strip_tags() and insert both copies
into the database if you're really worried about wasted resources. That
way, you keep a copy of the original data as well as the version you're
most likely going to display in a web page.
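
As a rough sketch of the "store both copies" idea, something like the
following could work. The function name prepare_for_storage() and the
keys body_raw and body_stripped are made up for illustration and are not
from anyone's real schema. Note that the stripped copy would still need
htmlspecialchars() when it is printed, since strip_tags() does not
encode characters such as & or quotes.

```php
<?php
// Hypothetical helper: build the two values to store for one user-submitted field.
// The key names are assumptions; adapt them to your own table columns.
function prepare_for_storage(string $input): array
{
    return [
        'body_raw'      => $input,             // original input, kept untouched
        'body_stripped' => strip_tags($input), // tag-free copy intended for web display
    ];
}

$row = prepare_for_storage('Hello <b>world</b>');
```

The array would then be bound to a prepared INSERT statement in the
usual way; that part is left out here because it depends on your
database layer.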

It's like the whole argument about modifying textarea content to replace
newlines with <br/> tags. At some point, you might need that content for
another use, and when you do, you'll wish you had the original. Just
because you don't see that use in your immediate future, it doesn't mean
it won't occur.
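
To illustrate the textarea case, here is a minimal sketch that keeps the
stored text raw and only converts it at output time. The function names
render_html() and render_plain() are hypothetical, and the UTF-8 charset
is an assumption about your setup.

```php
<?php
// Hypothetical output helper for HTML pages: escape first, then add <br /> tags.
function render_html(string $raw): string
{
    // htmlspecialchars() neutralises markup; nl2br() inserts <br /> before each newline.
    return nl2br(htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'));
}

// Hypothetical output helper for plain-text targets (emails, exports): no conversion needed.
function render_plain(string $raw): string
{
    return $raw;
}

// The stored value is exactly what the user typed into the textarea.
$stored = "1 < 2\nsecond line";
$html   = render_html($stored);
$plain  = render_plain($stored);
```

Because the database holds the original text, the plain-text path needs
no html_entity_decode() or <br/> removal at all.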

