An alternative approach: try using the MS Office 2000 HTML filter:
http://download.microsoft.com/download/office2000prem/msohtmf/2000/WIN98/EN-US/msohtmf2.exe
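
If you would rather stay with the regex, the likely culprit in the quoted expression is that [[:print:]]* is greedy, and both > and < are themselves printable characters, so the match runs on to the last > it can reach rather than stopping at the first one. A minimal sketch of one common fix follows, using a negated character class in place of [[:print:]]. The sample value of mystring is copied from the question purely for illustration, and the pattern assumes no attribute value contains a literal > character.

```cfm
<!--- Sample input taken from the question below; illustrative only --->
<cfset mystring = '<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"> </P><P class=MsoNormal style="MARGIN: 0cm 0cm 0pt">Blah Blah Blah</P>'>

<!--- [^>]* matches anything except a closing bracket, so the match
      cannot run past the end of the tag it started in --->
<cfset cleanme = REReplaceNoCase(mystring, "<[^>]*>", " | ", "ALL")>

<cfoutput>#cleanme#</cfoutput>
```

With this pattern each of the four tags is replaced separately, so every opening and closing P tag yields its own pipe. Any run of extra spaces between pipes can be collapsed with a second replace if needed.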
Steve

> -----Original Message-----
> From: Kay Smoljak [mailto:[EMAIL PROTECTED]]
> Sent: 12 December 2001 14:36
> To: CF-Talk
> Subject: RE help needed: stripping tags from a string
>
> Hi all,
>
> I'm trying to get a regular expression working to replace some
> particularly horrible markup (yes it's Microsoft-generated) with
> something manageable. I'm trying to turn this string:
>
> <P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"> </P>
> <P class=MsoNormal style="MARGIN: 0cm 0cm 0pt">Blah Blah Blah</P>
>
> Into this:
>
> | | | Blah Blah Blah |
>
> So far I have this:
>
> <cfset cleanme = REREplaceNoCase(mystring,"[<][[:print:]]*[>]"," | ","ALL")>
>
> to try and turn all paragraph tags into pipe characters flanked by
> spaces. But, instead of it matching any printable character from the
> first < to the first >, which is what I thought it should do, it's
> ignoring the first > and the second < and skipping to the second >. So
> the first two paragraph tags, instead of being converted to two pipes,
> are being converted to one. Argh!
>
> If anyone can point out where I'm going wrong, I'd really
> appreciate it.
>
> K.