Hmmm, although it works, that code is not quite correct - there are a few 
issues with it.


>> If you don't mind characters like ñ, then just use \w instead of A-Za-z0-9_

This is *incorrect* - in ColdFusion regex, \w does NOT include accented 
characters. There are other regex engines where it does, but the Apache ORO 
used by CF doesn't. (Unless that's changed with CF9 anyhow, but I suspect not..)
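A quick way to check this on your own CF version is a sketch like the one 
below. The variable names are just for illustration, and chr(241) is used for 
ñ so the result doesn't depend on the file encoding:

```cfml
<cfscript>
// Hypothetical sample: "ni" + ñ + "o", built with chr() to avoid encoding issues.
sample = "ni" & chr(241) & "o";

// reFind returns the position of the first match, or 0 if nothing matches.
// The pattern looks for any character that \w does NOT cover.
pos = reFind("[^\w]", sample);

// A non-zero position means \w did not match the accented character.
writeOutput("First non-\w character at position: " & pos);
</cfscript>
```

If that prints 0 on your server then \w does cover ñ there, and the character 
needs no special handling.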


There are several unnecessary escapes, since ".?()" do not need escaping 
inside character classes.
(However, if '-' needs to be included (it's not currently) then it should be 
escaped as '\-' so it's not treated as a range.)
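As a rough sketch of both points (the sample strings and variable names are 
made up for the example), the escaped and unescaped classes below should give 
the same result, while the escaped hyphen is treated as a literal character:

```cfml
<cfscript>
// Hypothetical sample text containing the punctuation in question.
sample = "Is this (really) ok? Yes.";

// Same class written with and without the unnecessary escapes.
escaped   = reReplace(sample, "[^\w \.\?\(\)]", "*", "all");
unescaped = reReplace(sample, "[^\w .?()]", "*", "all");

// Both patterns describe the same set of characters, so the results should match.
writeOutput("Same result: " & (escaped EQ unescaped) & "<br>");

// With '\-' the hyphen is a literal, so the class allows only 'a', '-' and 'z';
// without the escape, 'a-z' would be a range covering every lowercase letter.
writeOutput(reReplace("a-m-z", "[^a\-z]", "*", "all"));
</cfscript>
```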


By including \s you're not just saying space, you're *also* including \r and \n 
and \t and \v. So either use a literal space (to exclude tabs), or, if tabs 
should be allowed, use \s on its own and drop the \r and \n, since \s already 
covers them and they're just adding noise.

[^\w\r \n!.?''"()&,;:] or [^\w\s!.?''"()&,;:]
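To see the difference between the two, here is a small sketch using a 
hypothetical sample string with a tab in it. The classes are shortened to the 
relevant parts to keep it readable:

```cfml
<cfscript>
// Hypothetical sample: "a", a tab, then "b c" with an ordinary space.
sample = "a" & chr(9) & "b c";

// Literal space only: the tab is outside the class, so it gets replaced.
spaceOnly = reReplace(sample, "[^\w ]", "*", "all");

// \s covers space, tab, \r and \n, so nothing in the sample gets replaced.
anyWhitespace = reReplace(sample, "[^\w\s]", "*", "all");

writeOutput("Literal space: " & spaceOnly & "<br>");
writeOutput("Using \s: " & anyWhitespace);
</cfscript>
```

Which one you want depends on whether a tab should count as an invalid 
character or not.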


Outside of the character class, the outer group is redundant - regex already 
captures the match to \0 so just do:

rereplace( str , '[^\w\r \n!.?''"()&,;:]' , '<span class="highlight">\0</span>' , 'all' )


And finally, one more optimisation - to avoid a long series of HTML spans, just 
add a + to collect multiple characters together:

rereplace( str , '[^\w\r \n!.?''"()&,;:]+' , '<span class="highlight">\0</span>' , 'all' )
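A minimal sketch of what the + changes, using a made-up sample string with a 
run of three disallowed characters (htmlEditFormat is only there so the spans 
are visible in the browser):

```cfml
<cfscript>
// Hypothetical sample with three consecutive characters the class rejects.
sample = "cost @@@ 5";

// Without +, each disallowed character is wrapped in its own span.
single = reReplace(sample, '[^\w\r \n!.?''"()&,;:]', '<span class="highlight">\0</span>', 'all');

// With +, the whole run is matched at once and wrapped in a single span.
grouped = reReplace(sample, '[^\w\r \n!.?''"()&,;:]+', '<span class="highlight">\0</span>', 'all');

// Escape the markup so the generated spans can be compared by eye.
writeOutput(htmlEditFormat(single) & "<br>");
writeOutput(htmlEditFormat(grouped));
</cfscript>
```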


Hope this helps. :)

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:337854