Josh, I think the point that Rob and others were making is that your data should be validated and cleaned up BEFORE being inserted into the database - whether it's inserted as XML or not is completely and utterly irrelevant. If you didn't have invalid data in the database, then you wouldn't have invalid data in your XML. But, since the data obviously is NOT being validated and cleaned up before db entry, the best, most scalable, and most widely accepted "good practice" would be to use CDATA in your XML.
Again though, what you're doing is just a bandaid that covers up the real issue, which is invalid data being entered into the database.

Thanks,
Matt

-----Original Message-----
From: Josh Nathanson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 07, 2006 1:14 PM
To: CF-Talk
Subject: Re: Cleaning XML - Unicode 0x0 SOLVED sorta

OK, I added this to my regex:

\x00

Which is a hex representation of the character 0. And it worked. Not sure why chr(0) didn't work.

Yes it's non scalable...but, since the data is not going into the database as xml, just plain old form fields, I can't use CDATA on the way in anyway, correct? I would have to run the same regex on each of the incoming form fields that are text...so, this way is more scalable than that I guess.

-- Josh

----- Original Message -----
From: "Rob Wilkerson" <[EMAIL PROTECTED]>
To: "CF-Talk" <[email protected]>
Sent: Tuesday, November 07, 2006 10:19 AM
Subject: Re: Cleaning XML - Unicode 0x0

> On 11/7/06, Josh Nathanson <[EMAIL PROTECTED]> wrote:
>> Thanks for your help Rob. I just don't know which field is the culprit
>> as far as the null character (there's no description field or anything
>> obvious like that), and I'm hesitant to CDATA every single field that's
>> going into the db, unless I've exhausted every possible other option.
>
> I wouldn't apply a CDATA block to every field indiscriminately, but I
> would apply it to varchar and text fields where the data is likely to
> be quite variable.
>
>> I'll keep grinding on trying to regex the null character out of there and
>> let the list know if I figure anything out.
>
> The problem with this approach is that while it's currently the null
> character, next time it might be something else and then something
> else. Your regex could just continue to grow. I guess what I'm
> saying is that it's not really a scalable solution.
>
> Handling invalid characters in a batch manner by including them in a
> CDATA block or by understanding how those characters are being
> inserted is a more workable long term solution.
>
> That said, adding this final character may turn out to be the last you
> ever hear of this particular problem. :-)
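
For anyone reading this thread later, the "clean the data before it reaches the database" idea discussed above could be sketched roughly like this. It is only an illustration, written in Python rather than CFML, and the function name strip_invalid_xml_chars is made up for the example. Instead of adding one bad character at a time to the regex, it removes everything outside the character ranges that XML 1.0 permits. One caveat worth checking against the spec: as far as I can tell, XML 1.0 does not allow U+0000 even inside a CDATA section, so CDATA on its own may not have fixed the original null character.

```python
import re

# Matches any character XML 1.0 does not allow in a document, i.e. anything
# outside: tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every character that XML 1.0 forbids removed.

    Hypothetical helper for illustration; it does not escape markup
    characters such as '<' or '&', which still need normal XML escaping.
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # A form field value containing a null and a vertical tab.
    dirty = "Widget\x00 name\x0b with <markup> & more"
    print(repr(strip_invalid_xml_chars(dirty)))
```

Running the same function over each incoming text field (or over the values just before building the XML) keeps the rule in one place, so the pattern does not need to grow every time a new control character shows up.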

