Urk (thump)

Ian keels over.

I guess, but the HTML Word generates is only slightly uglier than the RTF it
generates. I guess I'd probably do some search and destroy first, or use
Dreamweaver's Clean Up Word HTML function (yes, I admit it - I use
Dreamweaver), and then filter it.
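
A minimal sketch of what a "search and destroy" pass might look like before
handing the file to the parser. This is not Dreamweaver's cleanup routine. The
file name worddoc.htm is hypothetical, and the three patterns are only
assumptions about the kind of markup Word tends to emit (o:p tags, Mso*
classes, inline style attributes); a real Word export will likely need more.

```cfml
<!--- Sketch only: the path and file name are placeholders --->
<cffile action="READ" variable="coder" file="fullpath/worddoc.htm">

<!--- drop Word's <o:p> and </o:p> tags, keeping whatever text they wrap --->
<cfset coder = REReplaceNoCase(coder, "</?o:p[^>]*>", "", "ALL")>

<!--- drop class attributes that start with Mso, quoted or unquoted --->
<cfset coder = REReplaceNoCase(coder, ' class="?Mso[A-Za-z0-9]*"?', "", "ALL")>

<!--- drop inline style attributes, double-quoted and then single-quoted --->
<cfset coder = REReplaceNoCase(coder, ' style="[^"]*"', "", "ALL")>
<cfset coder = REReplaceNoCase(coder, " style='[^']*'", "", "ALL")>
```

The result stays in the coder variable, so the parser loop quoted further down
could run on it unchanged.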

The other trick I've used with Word is to import the Word doc into Adobe
FrameMaker, and then save it as XML. Kind of convoluted, but it works great.

If you wanted to filter RTF, that might actually be easier, because RTF is
fairly structured even if it is really complex.

We've generated RTF, but I've never worked backwards, I must admit.

Ian

-----Original Message-----
From: Ian Vaughan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 17, 2002 7:00 AM
To: CF-Talk
Subject: Re: HTML Parser - SOLUTION


Do you think this process could work with Microsoft Word files?
----- Original Message -----
From: "Ian Lurie" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Thursday, January 17, 2002 2:43 PM
Subject: HTML Parser - SOLUTION


> Hi all,
>
> Here's what I finally came up with for an HTML parser. It lets you target
> one or more tags in an HTML file and pull related content as name/value
> pairs, in this format:
>
> Tag ::: Content
>
> Seems to work pretty well, and it let me whip through a bunch of files,
> inserting headings into a Heading field, and paragraphs where they need to
> go...
>
> -----------------------------------
> <cffile action="READ" variable="coder" file="fullpath/careers.htm">
> <cfset elementlist="p,h1,h2,h3,h4,h5,font style=me,div class=head1">
> <cfset ignorelist = "a,b">
> <cfloop list="#coder#" delimiters="<" index="word">
> <!--- grab only opening tags --->
> <cfif find("/",word,1) IS 0>
> <cfoutput>
> <!--- figure out what the element is --->
> <cfset elementstart= 1>
> <cfset elementend = find(">",word,1)>
> <cfset elementend2 = find(" ",word, 1)>
> <cfset elementend3 = len(word)>
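> <!--- find() returns 0 when there is no space, so only use the space position if one was found --->
> <cfif elementend2 GT 0 AND elementend2 LT elementend>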
> <cfset elementend = elementend2>
> </cfif>
> <cfif elementend IS 0>
> <cfset elementend = elementend3>
> </cfif>
> <cfset element = replace(mid(word,elementstart,elementend),">","")>
>
> <!--- now get the content for that element --->
> <cfset contentstart = find(">",word,1)+1>
>
> <!--- contentstart is 1 when find() returned 0, meaning no ">" and so no content --->
> <cfif contentstart GT 1>
> <cfset content = mid(word,contentstart,elementend3)>
> <cfelse><cfset content="">
> </cfif>
> <cfif find(element,elementlist,1) AND len(content) GT 0>
> #element# ::: #content#<br>
> </cfif>
> </cfoutput>
> </cfif>
> </cfloop>
>
> Ian
>
