Hi all,

Here's what I finally came up with for an HTML parser. It lets you target
one or more tags in an HTML file and pull the text that follows each
opening tag (up to the next "<") as name/value pairs, in this format:

Tag ::: Content

Seems to work pretty well, and it let me whip through a bunch of files,
inserting headings into a Heading field, and paragraphs where they need to
go...

-----------------------------------
<cffile action="READ" variable="coder" file="fullpath/careers.htm">
<cfset elementlist="p,h1,h2,h3,h4,h5,font style=me,div class=head1">
<cfset ignorelist = "a,b">
<cfloop list="#coder#" delimiters="<" index="word">
<!--- grab only opening tags: skip closing tags, comments and doctype --->
<cfif left(word,1) NEQ "/" AND left(word,1) NEQ "!">
<cfoutput>
        <!--- the tag text (name plus any attributes) ends at the first ">" --->
                <cfset tagend = find(">",word,1)>
                <cfset wordlen = len(word)>
                <cfif tagend IS 0>
                        <!--- no ">" at all: treat the whole chunk as the tag --->
                        <cfset tagend = wordlen>
                </cfif>
                <cfset tag = trim(replace(mid(word,1,tagend),">",""))>
                <!--- drop double quotes so a quoted class value still matches "div class=head1" --->
                <cfset tag = replace(tag,'"','',"ALL")>
                <!--- the element name is the first word of the tag --->
                <cfset element = listFirst(tag," ")>

                <!--- now get the content for that element: everything after the ">" --->
                <cfset content = "">
                <cfif find(">",word,1) GT 0 AND tagend LT wordlen>
                        <cfset content = trim(mid(word,tagend+1,wordlen-tagend))>
                </cfif>

                <!--- match whole list items, so "a" cannot match inside "class" --->
                <cfif len(content) GT 0 AND NOT listFindNoCase(ignorelist,element)
                        AND (listFindNoCase(elementlist,element) OR listFindNoCase(elementlist,tag))>
                        #element# ::: #content#<br>
                </cfif>
</cfoutput>
</cfif>
</cfloop>
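
For anyone who wants to route the pairs into fields instead of printing them, here is a rough sketch of that step. It is only an illustration: the `samples` list, the "|" and "~" separators, and the `heading` and `body` variable names are made up for the example and are not part of the code above. It assumes each pair carries a tag name and its content, like the values the loop above works with.

```cfml
<!--- hypothetical sample pairs standing in for what the parser loop finds --->
<cfset samples = "h1~Open Positions|p~We are hiring.|p~Send us your details.">
<cfset heading = "">
<cfset body = "">
<cfloop list="#samples#" delimiters="|" index="pair">
        <cfset element = listFirst(pair,"~")>
        <cfset content = listLast(pair,"~")>
        <!--- route by tag name: headings to one field, paragraphs to another --->
        <cfswitch expression="#lcase(element)#">
                <cfcase value="h1,h2,h3,h4,h5">
                        <cfset heading = content>
                </cfcase>
                <cfcase value="p">
                        <!--- keep paragraphs in order, one per line --->
                        <cfset body = listAppend(body,content,chr(10))>
                </cfcase>
        </cfswitch>
</cfloop>
<cfoutput>
Heading: #heading#<br>
Body: #replace(body,chr(10),"<br>","ALL")#
</cfoutput>
```

In a real run the two variables would presumably feed whatever insert or update you use for the Heading field and the paragraph text; that part depends on your own table layout, so it is left out here.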

Ian