Found one bug already. The regex should be as follows, since all HTML 
tag names start with a letter but *CAN* contain digits, e.g. h1-h6:

        loc = REFindNoCase("<[A-Z][A-Z0-9]*\s+[^>]*#att#=.*?>",str);

In English: find any tag whose name starts with a letter and is followed by 
zero or more alphanumeric characters, then one or more whitespace characters, 
any amount of text that doesn't include a >, the specified attribute and 
an equal sign, and then the shortest run of text ending with a >.

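There does seem to be a flaw in it, though. A tag with a > inside a 
quoted attribute value, such as

<a ... someatt=">" onmouseover="foo">

would not be detected, because [^>]* stops at the > inside the quotes 
and never reaches the attribute.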
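
To make the flaw easy to reproduce, here is a minimal sketch of the same pattern in Python's re module rather than ColdFusion (an assumption on my part that the two engines agree for this pattern). The helper name find_attribute_tag and the sample strings are made up for illustration, and unlike REFindNoCase it returns a 0-based index, or -1 when nothing matches.

```python
import re

def find_attribute_tag(att, text):
    """Return the 0-based start of the first tag carrying `att`, or -1.

    Hypothetical helper: same pattern as the REFindNoCase call above,
    with the attribute name spliced in where #att# appears.
    """
    pattern = r"<[A-Z][A-Z0-9]*\s+[^>]*" + re.escape(att) + r"=.*?>"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() if match else -1

# A tag name containing a digit is found by the corrected pattern.
plain = '<h1 class="x" onmouseover="foo">'
# A > inside a quoted value stops [^>]* before it reaches the attribute.
quoted_gt = '<a href="x" someatt=">" onmouseover="foo">'

print(find_attribute_tag("onmouseover", plain))      # found at index 0
print(find_attribute_tag("onmouseover", quoted_gt))  # not found: -1
```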

If I try *THIS* regex

        loc = REFindNoCase("<[A-Z][A-Z0-9]*\s+.*#att#=.*?>",str);

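it works with the above case, but *NOT* with this one, where 
onmouseover= appears only in the text between tags and the .* runs 
straight past the end of the first tag, giving a false match: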

<a href="http://www.cflib.org">Click here!</b> if you
don't like onmouseover= in your code.<a 
href="http://www.opensourcecf.com">foo</a></cfsavecontent>
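
The false match is easier to see when the matched text is printed. Again a rough sketch in Python's re module rather than ColdFusion, with a hypothetical helper name and the sample text kept on one line so that line breaks don't affect what . matches:

```python
import re

def find_attribute_tag_loose(att, text):
    """Return the text matched by the looser pattern, or None.

    Hypothetical helper: identical to the stricter pattern except that
    [^>]* has been swapped for .* before the attribute name.
    """
    pattern = r"<[A-Z][A-Z0-9]*\s+.*" + re.escape(att) + r"=.*?>"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(0) if match else None

# The quoted > no longer hides the attribute.
quoted_gt = '<a href="x" someatt=">" onmouseover="foo">'
# Here onmouseover= is plain text between two harmless tags.
text_only = ('<a href="http://www.cflib.org">Click here!</b> if you '
             "don't like onmouseover= in your code."
             '<a href="http://www.opensourcecf.com">foo</a>')

print(find_attribute_tag_loose("onmouseover", quoted_gt))
# The match starts at the first <a and swallows the text after it.
print(find_attribute_tag_loose("onmouseover", text_only))
```

Neither tag in the second sample has an onmouseover attribute, yet the pattern still reports a hit that spans from the first tag into the second.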

So again, there seems to be no single regex that reliably detects or 
strips unwanted attributes.

However, I could convert the = sign after the attribute name to an HTML 
entity, which would stop the browser from treating it as an attribute:

<a href="foo.html" onmouseover&#61;"alert('hi')">foo</A>

Where it occurs in page text, the &#61; would still display as an equal 
sign, but inside a tag it would prevent the attribute from working.

So I would use a regex replace like this:

reReplaceNoCase(str,"onmouseover\s*=","onmouseover&##61;","ALL")
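
For anyone who wants to try the idea outside ColdFusion, here is a small sketch of the same replacement in Python's re module. The function name neutralize_attribute is hypothetical; like the reReplaceNoCase call, it matches case-insensitively and always writes the attribute name back as given.

```python
import re

def neutralize_attribute(att, html):
    """Replace `att`, optional whitespace, and = with `att` plus &#61;.

    Hypothetical helper mirroring the reReplaceNoCase call above. A
    function is used as the replacement so that nothing in `att` is
    interpreted as a backreference.
    """
    pattern = re.escape(att) + r"\s*="
    return re.sub(pattern, lambda m: att + "&#61;", html, flags=re.IGNORECASE)

tag = '<a href="foo.html" onmouseover="alert(\'hi\')">foo</A>'
prose = "if you don't like onmouseover= in your code"

print(neutralize_attribute("onmouseover", tag))
print(neutralize_attribute("onmouseover", prose))
```

Because the replacement does not care whether it is inside a tag, it avoids the tag-matching problem entirely, at the cost of also rewriting harmless occurrences in text.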

Of course, the output wouldn't be very xhtml compliant ;)

Rick

Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:262014