Hi all,
Here's what I finally came up with for an HTML parser. It lets you target
one or more tags in an HTML file and pull related content as name/value
pairs, in this format:
Tag ::: Content
Seems to work pretty well, and it let me whip through a bunch of files,
inserting headings into a Heading field, and paragraphs where they need to
go...
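
To make the format concrete, here is a cut-down sketch of the same idea, run on an inline string instead of a file. The sample markup is made up for illustration, and this version leans on listFirst/listRest rather than the find/mid arithmetic in the full code, so treat it as a rough illustration only:

```cfm
<!--- hypothetical sample markup, standing in for the contents of a file --->
<cfset coder = "<h1>Careers</h1><p>We are hiring.</p>">
<cfoutput>
<cfloop list="#coder#" delimiters="<" index="word">
<!--- skip closing tags such as /h1> --->
<cfif left(word,1) NEQ "/">
<!--- tag name: everything before the first ">" or space --->
<cfset element = listFirst(word,"> ")>
<!--- content: everything after the first ">" --->
<cfset content = listRest(word,">")>
#element# ::: #content#<br>
</cfif>
</cfloop>
</cfoutput>
```

With that sample string, this should print "h1 ::: Careers" and "p ::: We are hiring.", one pair per line.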
-----------------------------------
<cffile action="READ" variable="coder" file="fullpath/careers.htm">
<cfset elementlist="p,h1,h2,h3,h4,h5,font style=me,div class=head1">
<cfset ignorelist = "a,b">
<cfloop list="#coder#" delimiters="<" index="word">
<!--- grab only opening tags: skip chunks that start with "/" (a "/" later in the chunk, e.g. in the content, is fine) --->
<cfif left(word,1) NEQ "/">
<cfoutput>
<!--- figure out what the element is --->
<cfset elementstart= 1>
<cfset elementend = find(">",word,1)>
<cfset elementend2 = find(" ",word, 1)>
<cfset elementend3 = len(word)>
<cfif elementend2 GT 0 AND elementend2 LT elementend>
<cfset elementend = elementend2>
</cfif>
<cfif elementend IS 0>
<cfset elementend = elementend3>
</cfif>
<cfset element = trim(replace(mid(word,elementstart,elementend),">",""))>
<!--- now get the content for that element --->
<cfset contentstart = find(">",word,1)>
<cfif contentstart GT 0>
<cfset content = trim(mid(word,contentstart+1,elementend3))>
<cfelse><cfset content="">
</cfif>
<cfif find(element,elementlist,1) AND NOT listFindNoCase(ignorelist,trim(element)) AND len(content) GT 0>
#element# ::: #content#<br>
</cfif>
</cfoutput>
</cfif>
</cfloop>
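
If you want to do something with the pairs other than print them (for example, send headings to one field and paragraphs to another), one option is to collect them in a query object first. This is only a sketch: the sample markup and the column names "tag" and "body" are made up, and it uses the simpler listFirst/listRest split rather than the code above:

```cfm
<!--- hypothetical sample markup, standing in for the file contents --->
<cfset coder = "<h1>Careers</h1><p>We are hiring.</p>">
<!--- query object to hold the pairs; column names are invented for this sketch --->
<cfset pairs = queryNew("tag,body")>
<cfloop list="#coder#" delimiters="<" index="word">
<!--- keep opening tags that actually have a closing ">" --->
<cfif left(word,1) NEQ "/" AND find(">",word,1) GT 0>
<cfset temp = queryAddRow(pairs)>
<cfset temp = querySetCell(pairs,"tag",listFirst(word,"> "))>
<cfset temp = querySetCell(pairs,"body",trim(listRest(word,">")))>
</cfif>
</cfloop>
<!--- h1 through h6 are treated as headings, everything else as a paragraph --->
<cfoutput query="pairs">
<cfif reFindNoCase("^h[1-6]$",tag)>Heading: #body#<cfelse>Paragraph: #body#</cfif><br>
</cfoutput>
```

From there, the cfoutput loop could be swapped for whatever insert or update logic your own tables need.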
Ian