Do you think this process could work with Microsoft Word files?
----- Original Message -----
From: "Ian Lurie" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Thursday, January 17, 2002 2:43 PM
Subject: HTML Parser - SOLUTION
> Hi all,
>
> Here's what I finally came up with for an HTML parser. It lets you target
> one or more tags in an HTML file and pull related content as name/value
> pairs, in this format:
>
> Tag ::: Content
>
> Seems to work pretty well, and it let me whip through a bunch of files,
> inserting headings into a Heading field, and paragraphs where they need to
> go...
>
> -----------------------------------
> <cffile action="READ" variable="coder" file="fullpath/careers.htm">
> <cfset elementlist="p,h1,h2,h3,h4,h5,font style=me,div class=head1">
> <cfset ignorelist = "a,b">
> <cfloop list="#coder#" delimiters="<" index="word">
> <!--- grab only opening tags --->
> <cfif find("/",word,1) IS 0>
> <cfoutput>
> <!--- figure out what the element is --->
> <cfset elementstart= 1>
> <cfset elementend = find(">",word,1)>
> <cfset elementend2 = find(" ",word, 1)>
> <cfset elementend3 = len(word)>
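> <!--- use the space position only if a space was found and it comes first --->
> <cfif elementend2 GT 0 AND (elementend IS 0 OR elementend2 LT elementend)>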
> <cfset elementend = elementend2>
> </cfif>
> <cfif elementend IS 0>
> <cfset elementend = elementend3>
> </cfif>
> <cfset element = replace(mid(word,elementstart,elementend),">","")>
>
> <!--- now get the content for that element --->
> <cfset contentstart = find(">",word,1)+1>
>
> <!--- contentstart is 1 when no ">" was found, so there is no content --->
> <cfif contentstart GT 1>
> <cfset content = mid(word,contentstart,elementend3)>
> <cfelse><cfset content="">
> </cfif>
> <cfif find(element,elementlist,1) AND len(content) GT 0>
> #element# ::: #content#<br>
> </cfif>
> </cfoutput>
> </cfif>
> </cfloop>
>
> Ian
>
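
For anyone adapting the quoted loop, the tag-name and content steps can be sketched more compactly. This is only a minimal sketch under a few assumptions, not Ian's code: the `sample` string is a made-up stand-in for the file read by cffile, the variable `tagend` is introduced here for illustration, and names are matched exactly with ListFindNoCase instead of by substring with Find (so "h" alone would not match "h1").

```cfml
<!--- hypothetical sample input; the original reads a file with cffile --->
<cfset sample = "<h1>Careers</h1><p class=""intro"">We are hiring.</p><a href=""x.htm"">link</a>">
<cfset elementlist = "p,h1,h2,h3,h4,h5">

<cfoutput>
<cfloop list="#sample#" delimiters="<" index="word">
  <!--- skip closing tags, which begin with a slash --->
  <cfif Left(word, 1) IS NOT "/">
    <cfset tagend = Find(">", word)>
    <!--- need at least one character before ">" and one after it --->
    <cfif tagend GT 1 AND tagend LT Len(word)>
      <!--- tag name = first space-delimited token before ">" --->
      <cfset element = ListFirst(Left(word, tagend - 1), " ")>
      <!--- content = everything after ">" --->
      <cfset content = Trim(Mid(word, tagend + 1, Len(word) - tagend))>
      <cfif ListFindNoCase(elementlist, element) AND Len(content) GT 0>
        #element# ::: #content#<br>
      </cfif>
    </cfif>
  </cfif>
</cfloop>
</cfoutput>
```

With this sample, only the h1 and p chunks pass the filter; the a chunk is skipped because "a" is not in elementlist. Checking only the first character for a slash also means content that happens to contain a "/" (a URL, for example) is no longer thrown away. Exact matching on the bare tag name does mean entries such as "font style=me" from the original list would need separate handling.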