> There is a line of text that I want to extract. It is found in some HTML
> that follows this pattern:
>
> <a href=...><img src=.../></a><br/> <!-- An image link -->
> Some text
> <br/>
> <a href=...>Another link (variable text)</a>
> <br/>
> Text I want to extract.
> <br/>
> Updated by <a href=...>name (variable)</a> X hours ago [<a href=...>Comment</a>]
>
> I have the impression that I'd use XPath somehow, but am unsure how to
> proceed since the text I want isn't in any page element.
>
> - Aaron

Here's how I would do it with a regular expression:

    $ie.html =~ /<br\/>\s*(.*)\s*<br\/>\s*Updated/
    puts $1

That doesn't pull the tab before 'Text', and it assumes that 'Updated' appears only once in the HTML. I can't guarantee that this regexp will work with any text other than what you provided above, where it captures "Text I want to extract." in $1 and prints it to the screen.

Hope this gives you a starting place.

/\/\ark
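
For anyone who wants to try the idea without a browser attached, here is a minimal sketch of the same approach run against a plain string. The sample markup, the href/src values and the helper name text_before_updated are all made up for illustration. The pattern is a slightly more tolerant variant of the one above, not the original: it accepts <br>, <br/> and <BR> (on the assumption that the browser may hand back normalized markup), lets the wanted text span more than one line, and anchors on "Updated by" rather than just "Updated".

```ruby
# Sample markup following the pattern from the question. The href/src values
# and the user name are placeholders, not taken from a real page.
SAMPLE_HTML = <<~HTML
  <a href="/item/1"><img src="/img/1.png"/></a><br/> <!-- An image link -->
  Some text
  <br/>
  <a href="/item/1">Another link</a>
  <br/>
    Text I want to extract.
  <br/>
  Updated by <a href="/user/1">someone</a> 3 hours ago [<a href="/c/1">Comment</a>]
HTML

# Hypothetical helper: returns the bare text sitting between the two <br/>
# tags that come right before "Updated by", or nil if the pattern is absent.
def text_before_updated(html)
  # [^<]*? keeps the capture from running across any tag, and the lazy
  # quantifier leaves trailing whitespace to the \s* that follows it.
  html[%r{<br\s*/?>\s*([^<]*?)\s*<br\s*/?>\s*Updated by}i, 1]
end

puts text_before_updated(SAMPLE_HTML)  # prints: Text I want to extract.
```

With Watir you would pass the page source (for example $ie.html, as above) instead of SAMPLE_HTML. Because the helper returns nil rather than raising when nothing matches, the caller can decide what to do if the page layout changes.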