> All, thanks for the help with this issue. Using CDATA
> didn't work as apparently CF does parse whatever's
> within the CDATA block and was still choking on the
> bad character. I ultimately implemented the rereplace
> suggested by S. Isaac, and after some tweaking it
> worked. Here's the final code:
>
> <cfset itemNameClean =
> rereplace(itemName,"[#chr(1)#-#chr(8)#-#chr(11)#-#chr(12)#-#chr(14)#-#chr(28)#-#chr(29)#-#chr(31)#-#chr(38)#]","","ALL")>
> <title>#itemNameClean#</title>
>
> I had to add in chr(28) and chr(29) into the rereplace
> as those are the equivalents to Unicode 0x1c and 0x1d
> which were the bad characters that the user had entered
> somehow. Also added chr(38) which is the '&' character,
> also a baddie.
> -- Josh

Hi Josh,

Without wanting to sound critical, I think you may need a little more testing before declaring this issue resolved. There are two issues. First, you seem to have some extra hyphens in the expression (you just need a bit of a primer on regex, I think). Second, you're handling the & character without also handling the > (&gt;), < (&lt;) or " (&quot;) characters, which means that if a user enters any of those into the string, they will also cause problems. This was the reason why my implementation used both XMLFormat() and the regular expression: neither one of them independently solved the whole problem. I'll let you decide about the extra special XML characters. :)

As to the regular expression, here's the explanation of where I see the problem. The original expression here:

    [#chr(1)#-#chr(8)##chr(11)#-#chr(12)##chr(14)#-#chr(31)#]

is similar to

    [a-zA-Z0-9]

Notice that in this expression there are two places where there is no hyphen between two characters, at "zA" and at "Z0". This is because of the way the hyphen is interpreted within the class designated by the [ and ] characters. The class itself tells the regular expression engine to match any one character within the class, so [ab] will match the letter "a" or the letter "b". The hyphen then allows you to specify a range of characters (in ASCII or Unicode numeric order), so that [a-c] will match the letter "a" or the letter "b" or the letter "c".
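
To make the difference between a plain class and a range concrete, here is a minimal sketch. The sample string and variable name are made up for illustration; only rereplace() itself comes from the code under discussion.

```cfml
<!--- Hypothetical sample string, chosen only to show the two cases --->
<cfset sample = "abcabc-xyz">

<cfoutput>
    <!--- [ab] lists two characters: only "a" and "b" are removed --->
    #rereplace(sample, "[ab]", "", "ALL")#<br>    <!--- cc-xyz --->

    <!--- [a-c] is a range: "a", "b" and "c" are all removed --->
    #rereplace(sample, "[a-c]", "", "ALL")#<br>   <!--- -xyz --->
</cfoutput>
```

Neither class mentions the hyphen in the sample string itself, so it survives both replacements.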

The reason why many people use a-zA-Z instead of A-z is that when you look at an ASCII table, there are several non-alpha characters ([, \, ], ^, _ and `) between the letter "Z" (90) and the letter "a" (97). Upper case comes before lower case in ASCII.

Now, in your expression above, you've added several hyphens, so your expression is roughly equivalent to

    [a-z-A-Z-0-9]

Offhand, since I haven't tested it, I don't know if this will produce the same result. At a minimum, my expectation would be that it adds the hyphen to the list of characters being removed, because I've been able to include hyphens in a character class before, such as [-0-9]. Since I'm guessing you want to allow users to use hyphens, I'm thinking you don't want that to happen. On the other hand, if the engine reads the extra hyphens as ranges (chr(8)-chr(11), chr(12)-chr(14) and chr(31)-chr(38)), it could potentially add other legal characters (9, 10, 13 and 32-37) to the list of characters that are removed. You'll have to test it to know exactly how it behaves.

hth

s. isaac dealey
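
One way to run that test is sketched below. It is only a rough harness and has not been run against your server: it feeds every ASCII code point from 1 to 127 through each expression and lists the ones that get stripped. The variable names (original, hyphenated, removed) are made up for illustration, and the output will depend on how your CF version's regex engine treats the extra hyphens.

```cfml
<!--- The expression from my earlier post, and the one with the extra hyphens --->
<cfset original = "[#chr(1)#-#chr(8)##chr(11)#-#chr(12)##chr(14)#-#chr(31)#]">
<cfset hyphenated = "[#chr(1)#-#chr(8)#-#chr(11)#-#chr(12)#-#chr(14)#-#chr(28)#-#chr(29)#-#chr(31)#-#chr(38)#]">

<cfloop list="original,hyphenated" index="name">
    <cfset pattern = variables[name]>
    <cfset removed = "">
    <cfloop from="1" to="127" index="code">
        <!--- A character counts as removed if nothing is left after the replace --->
        <cfif len(rereplace(chr(code), pattern, "", "ALL")) eq 0>
            <cfset removed = listAppend(removed, code)>
        </cfif>
    </cfloop>
    <!--- Print the code points this expression strips --->
    <cfoutput>#name#: #removed#<br></cfoutput>
</cfloop>
```

If 45 (the hyphen), or any of 9, 10, 13 and 32-37, shows up in the second list, the extra hyphens are doing something you probably don't want.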