On Jun 18, 2006, at 6:27 AM, Terry Vogelaar wrote:

Hi all,

I've got a scripting problem that seems ridiculously simple.

I have to find the corresponding end tag in an HTML file with nested <SPAN> tags:

<SPAN class="FOOTNOTE">
<SPAN Class="reference">
origin
</SPAN>
<SPAN Class="quote">
<SPAN Class="smallcaps">
Text
</SPAN>
more text
</SPAN>
</SPAN>

Some of the tags are nested. How do I find the end tag of each SPAN tag? The tag on line 1 has its end tag on line 11, but the first time it sees </SPAN> is on line 4...

Terry,

A few months ago I was trying to emulate browser-like HTML rendering and I needed a function like this. The function I wrote returns a list of the positions of all instances of the tag you tell it to search for. Once I got the list back I could do whatever I wanted with the tag pairs: delete them, substitute formatting for tags, etc.

-- This function takes three parameters: (1) a string that describes the
-- opening tag, e.g., "<img", (2) a string describing the closing tag, e.g.,
-- "</img>", and (3) a string containing the HTML code to be searched.
-- It returns a comma-delimited, numerically sorted list of the character
-- offsets of every opening and closing tag found. Each offset is followed
-- by "o" for an opening tag or "e" for a closing (end) tag.
function extractTag openTag, closeTag, htmlCode
  put offset(openTag,htmlCode) into openTagPosList
  if openTagPosList = 0 then return 0
  put offset(closeTag,htmlCode) into closeTagPosList
  put openTagPosList into openTagSearchStart
  put closeTagPosList into closeTagSearchStart
  put 1 into i
  repeat -- find all opening and closing tags
    put offset(openTag,htmlCode,openTagSearchStart) into openTagFoundChar
    if openTagFoundChar = 0 then exit repeat
    put openTagSearchStart + openTagFoundChar into openTagSearchStart
    put offset(closeTag,htmlCode,closeTagSearchStart) into closeTagFoundChar
    put closeTagSearchStart + closeTagFoundChar into closeTagSearchStart
    add 1 to i
    put openTagSearchStart into item i of openTagPosList
    put closeTagSearchStart into item i of closeTagPosList
  end repeat
  repeat with i = 1 to number of items in openTagPosList
    put "o" after item i of openTagPosList
    put "e" after item i of closeTagPosList
  end repeat
  put openTagPosList & comma & closeTagPosList into tagsList
  sort items of tagsList numeric
  return tagsList
end extractTag
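
To get from that list to actual tag pairs, something like the following rough sketch might do. Note that pairTags is a hypothetical helper, not part of the function above, and it is untested. It assumes the list is in offset order and that every item ends in "o" or "e", as extractTag is meant to produce. It treats the list as a stack: each closing tag belongs to the most recently opened tag that has not been closed yet.

```livecode
-- Hypothetical helper (an assumption, not from the original function):
-- takes the list returned by extractTag and returns one line per tag pair,
-- in the form "openOffset,closeOffset".
function pairTags tagsList
  put empty into openStack  -- offsets of opening tags not yet closed
  put empty into pairList   -- matched pairs, one per line
  repeat for each item thisTag in tagsList
    put char 1 to -2 of thisTag into tagPos  -- strip the "o"/"e" marker
    if char -1 of thisTag is "o" then
      -- opening tag: push its offset onto the top of the stack
      put tagPos & return before openStack
    else
      -- closing tag: it closes the most recently opened tag
      if openStack is empty then next repeat  -- stray closing tag, skip it
      put line 1 of openStack & comma & tagPos & return after pairList
      delete line 1 of openStack
    end if
  end repeat
  -- list the pairs in the order their opening tags appear
  sort lines of pairList numeric by item 1 of each
  return pairList
end pairTags
```

You would call it as pairTags(extractTag("<SPAN", "</SPAN>", htmlCode)). For the example in your message, the first line of the result should then pair the outermost <SPAN> with the last </SPAN>, not with the first one it encounters.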

Hope this works for you.

Devin

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
