Have you tried XMLFormat() around the content?
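
Something along these lines is what I mean. It is only a rough sketch: stories.body and request.bodynohtml are taken from your post, the <item>/<description> layout is my assumption about how your feed is built, and the control-character step is a guess at one possible culprit in Word-pasted text, not a confirmed diagnosis.

```cfml
<!--- Sketch only. stories.body and request.bodynohtml come from the original
      post; the <item>/<description> layout is an assumed RSS 2.0 structure. --->
<cfoutput query="stories">
  <!--- 1. Strip the HTML tags, as the original post already does. --->
  <cfset request.bodynohtml = REReplaceNoCase(stories.body, "<[^>]*>", "", "all")>

  <!--- 2. Assumption: drop control characters that XML 1.0 does not allow
        (everything below chr(32) except tab, line feed and carriage return). --->
  <cfset request.bodynohtml = REReplace(request.bodynohtml,
      "[#chr(1)#-#chr(8)##chr(11)##chr(12)##chr(14)#-#chr(31)#]", "", "all")>

  <!--- 3. Escape the remaining XML special characters on output. --->
  <item>
    <description>#XMLFormat(request.bodynohtml)#</description>
  </item>
</cfoutput>
```

Note that XMLFormat() escapes ampersands, so any HTML entities left behind after the tags are stripped (&nbsp; and friends) will show up literally in the feed unless you replace them first.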



-----Original Message-----
From: Les Mizzell
To: CF-Talk
Sent: Fri May 04 02:30:03 2007
Subject: Cleaning stored text to get valid XML

I've had an application set up for a while now that allows a user to post 
and email newsletters from their site.

The body text is entered on a form using fckeditor, and since these 
folks are lawyers, almost everything is pasted from Word and fckeditor 
is handling whatever is thrown at it.

Now, they wish to create an RSS feed from all their newsletters. Oh boy. 
There's all kinds of crap in the data - curly quotes, apostrophes, HTML 
tags, and gawd knows what else.

I've been going nutz trying to clean the existing text enough to create 
valid XML text so it will display.

I start by getting rid of all the HTML junk, which is working fine:

<cfset request.bodynohtml = REReplaceNoCase(stories.body, "<[^>]*>", "", "all")>

After that, it gets a little weird. I've tried all sorts of functions,
xmlFormat2.cfm, ConvertSpecialChars ... chaining rereplacenocase to get 
rid of left and right quotes, apostrophes, whatever other junk I keep 
finding ...

Nothing seems to be getting rid of everything, and the feed still isn't 
displaying correctly. I know my code base is OK because I've created two 
other feeds that are working. There's *something* in the text that's 
still stopping a correct display.

How are you folks handling this sort of thing?

This one is working:
http://www.nelsonmullins.com/rss/rss_press.cfm

This one ain't - there's something in the body text somewhere I'm not 
stripping out...
http://www.nelsonmullins.com/rss/rss_newsletters.cfm

Suggestions?




Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:276985