> I'm trying to parse XML and I've hit a problem where in between my XML of
> <abstract></abstract>
>
> There are occasionally some math values or things like
>
> <abstract>
>     the dog is > than the cat
> </abstract>
>
> I've got 3500 of these documents to run through an insert into a DB.  Does
> anyone know how or in what program I could issue a Find and Replace between
> the <ABSTRACT></ABSTRACT> for any instances of >< and replace them with
> the proper & code.

First, if you have an unescaped < or & in a document's text, it's not XML. By
definition, an XML document is well-formed - it can be parsed by an
XML parser. These documents can't. I'm not bringing this up simply to
be annoying - if you're getting these documents from somewhere else,
you need to get them to fix their document generation process.
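
If you want to find out which of the 3500 files are actually broken before
you touch anything, a rough sketch like this would do it. I'm using Python
here only because it's handy for batch jobs; is_well_formed is a name I made
up, and the sample strings are illustrative, not taken from your files.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as an XML document, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A bare "<" in text content breaks parsing; the escaped form does not.
broken = "<abstract>the dog is < than the cat</abstract>"
fixed = "<abstract>the dog is &lt; than the cat</abstract>"
print(is_well_formed(broken), is_well_formed(fixed))
```

Note that a lone > in text is usually tolerated by parsers; it's < and &
that make a document unparseable.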

Second, within any XML element, you can wrap the text in a CDATA
section, which tells the parser to treat the contents as literal
character data rather than markup:

<abstract><![CDATA[ ... contents ... ]]></abstract>
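
For the find-and-replace you asked about, something along these lines could
add the CDATA wrapper to files that already exist. Treat it as a sketch: it
assumes the abstract tags carry no attributes, that abstracts don't nest, and
that nothing inside an abstract is meant to be real markup. The function name
wrap_abstracts_in_cdata is hypothetical.

```python
import re

# Matches an <abstract> element, case-insensitively, across line breaks.
ABSTRACT_RE = re.compile(r"(<abstract>)(.*?)(</abstract>)", re.IGNORECASE | re.DOTALL)


def wrap_abstracts_in_cdata(text: str) -> str:
    """Wrap the contents of every <abstract> element in a CDATA section."""

    def _wrap(match: re.Match) -> str:
        open_tag, body, close_tag = match.groups()
        if body.lstrip().startswith("<![CDATA["):
            return match.group(0)  # already wrapped; leave untouched
        # "]]>" cannot appear inside CDATA, so split it across two sections.
        body = body.replace("]]>", "]]]]><![CDATA[>")
        return f"{open_tag}<![CDATA[{body}]]>{close_tag}"

    return ABSTRACT_RE.sub(_wrap, text)
```

You would read each file, pass its text through the function, and write the
result back out, ideally to a copy so the originals survive if the
assumptions turn out not to hold.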

Third, if you're creating these documents yourself, you can use the
XMLFormat function in CF to escape XML metacharacters before you
output them to the document.
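
In CF that's the XMLFormat call. Just to show the idea of escaping at
generation time outside of CF, here is a small sketch using the escape
function from Python's standard library, which covers &, < and >. The
helper abstract_element is a made-up name for illustration.

```python
from xml.sax.saxutils import escape


def abstract_element(text: str) -> str:
    """Build an <abstract> element with XML metacharacters in `text` escaped."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    return f"<abstract>{escape(text)}</abstract>"


print(abstract_element("the dog is > than the cat & < the horse"))
# prints: <abstract>the dog is &gt; than the cat &amp; &lt; the horse</abstract>
```

The point is the same in either language: escape the text once, at the
moment you write it into the document, and never store pre-escaped text.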

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:342002