Found one bug already. The regex should be as follows, since all HTML
tags start with a letter but *CAN* contain numbers, e.g. h1-h6:
loc = REFindNoCase("<[A-Z][A-Z0-9]*\s+[^>]*#att#=.*?>",str);
In English: find any tag that starts with a letter and is followed by
zero or more alphanumeric characters, one or more whitespace characters,
any amount of text that doesn't include a >, the specified attribute and
an equal sign, then any text up to the next >.
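There does seem to be a flaw in it, though. A tag with a > inside a
quoted attribute value that comes before the target attribute, such as
<a ... someatt=">" onmouseover="foo">
would not be detected, because [^>]* stops at that quoted >.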
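To make the flaw concrete, here is a minimal sketch that runs the first regex against two test strings. Both sample strings and the variable names plain, tricky and pattern are made up for illustration; the second string mirrors the problem case with a hypothetical href filled in.

```cfml
<cfscript>
// Attribute to look for (same variable name the regex interpolates).
att = "onmouseover";

// Hypothetical sample: no > inside any attribute value.
plain = '<h1 class="x" onmouseover="foo">Title</h1>';

// Hypothetical sample: a quoted > appears before the attribute.
tricky = '<a href="foo.html" someatt=">" onmouseover="foo">';

pattern = "<[A-Z][A-Z0-9]*\s+[^>]*#att#=.*?>";

// REFindNoCase returns the position of the first match, or 0 if none.
writeOutput("plain: " & REFindNoCase(pattern, plain) & "<br>");
writeOutput("tricky: " & REFindNoCase(pattern, tricky) & "<br>");
</cfscript>
```

The first call should report a non-zero position and the second should report 0, since [^>]* cannot step over the quoted > to reach onmouseover=.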
If I try *THIS* regex
loc = REFindNoCase("<[A-Z][A-Z0-9]*\s+.*#att#=.*?>",str);
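it works with the above case, but *NOT* with this case, where the greedy
.* runs past the end of the first tag and matches the onmouseover= in
the plain text: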
<a href="http://www.cflib.org">Click here!</a> if you
don't like onmouseover= in your code.<a
href="http://www.opensourcecf.com">foo</a>
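A rough sketch comparing the two regexes on that kind of input, assuming a single-line, lightly cleaned-up version of the text above (the variable names strict and loose are just labels for this example). No tag in the string actually carries the attribute, so a correct detector should return 0.

```cfml
<cfscript>
att = "onmouseover";

// Hypothetical single-line sample: onmouseover= appears only in plain text.
str = '<a href="http://www.cflib.org">Click here!</a> if you do not like onmouseover= in your code. <a href="http://www.opensourcecf.com">foo</a>';

strict = "<[A-Z][A-Z0-9]*\s+[^>]*#att#=.*?>"; // first regex
loose  = "<[A-Z][A-Z0-9]*\s+.*#att#=.*?>";    // second regex

// REFindNoCase returns the position of the first match, or 0 if none.
writeOutput("strict: " & REFindNoCase(strict, str) & "<br>");
writeOutput("loose: " & REFindNoCase(loose, str) & "<br>");
</cfscript>
```

The strict pattern should return 0 here, while the loose one should report a match starting at the first tag: a false positive, because .* is free to cross the > that closes the tag.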
So again, there seems to be no reliable regex for either detecting or
stripping unwanted attributes.
However, I could convert the = sign to an HTML entity, which would
render the attribute invalid. Take this tag:
<a href="foo.html" onmouseover="alert('hi')">foo</A>
Wherever the text shows up in page content, the = would still display
as an equal sign, but in the source code it would prevent the attribute
from working.
So I would use a regex like this:
reReplaceNoCase(str,"onmouseover\s*=","onmouseover&##61;","ALL")
Of course, the output wouldn't be very xhtml compliant ;)
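Here is a small sketch of that replacement applied to the sample tag, just to show what ends up in the source. The variable name safe and the HTMLEditFormat call for display are additions for this example only.

```cfml
<cfscript>
// Sample tag from above; double quotes are doubled inside the CFML string.
str = "<a href=""foo.html"" onmouseover=""alert('hi')"">foo</A>";

// ## escapes the pound sign, so the replacement text is onmouseover&#61;
safe = reReplaceNoCase(str, "onmouseover\s*=", "onmouseover&##61;", "ALL");

// HTMLEditFormat escapes the result so the markup can be read in the browser.
writeOutput(HTMLEditFormat(safe));
</cfscript>
```

The tag in the result should read onmouseover&#61;"alert('hi')", which a browser no longer treats as an event handler. Two side effects to keep in mind: the replacement also rewrites any onmouseover= that appears in ordinary text (harmless, since the entity displays as =), and it lowercases the attribute name no matter how it was originally typed.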
Rick