> There is a line of text that I want to extract.  It is found in some HTML
> that follows this pattern:
> <a href=...><img src=.../></a><br/> <!-- An image link -->
> Some text
> <br/>
> <a href=...>Another link (variable text)</a>
> <br/>
>       Text I want to extract.
> <br/>
> Updated by <a href=...>name (variable)</a> X hours ago [<a
> href=...>Comment</a>]
> I have the impression that I'd use XPath somehow, but am unsure how to
> proceed since the text I want isn't in any page element.
> - Aaron

Here's how I would do it with a regular expression:

$ie.html =~ /<br\/>\s*(.*)\s*<br\/>\s*Updated/
puts $1

The capture leaves out the tab before 'Text', because the \s* ahead of the
group consumes it. It also assumes that the first "<br/>, one line of text,
<br/>, Updated" sequence in the HTML is the one you want. I can't guarantee
that this regexp will work with any text other than what you provided above,
where it puts "Text I want to extract." in $1 and prints it to the screen.
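
If you want to try the pattern without a browser, here is a rough sketch that runs it against a string. SAMPLE_HTML is a made-up stand-in for what $ie.html returns (the href and src values are placeholders), and extract_text is just a helper name I picked. It returns nil instead of printing a blank line when nothing matches, and strips any stray whitespace from the capture.

```ruby
# Hypothetical sample standing in for the value of $ie.html.
SAMPLE_HTML = <<'HTML'
<a href="#"><img src="x.png"/></a><br/> <!-- An image link -->
Some text
<br/>
<a href="#">Another link (variable text)</a>
<br/>
      Text I want to extract.
<br/>
Updated by <a href="#">name (variable)</a> X hours ago [<a
href="#">Comment</a>]
HTML

# Returns the single line of text sitting between two <br/> tags
# directly before 'Updated', or nil when the pattern does not match.
def extract_text(html)
  match = html.match(/<br\/>\s*(.*)\s*<br\/>\s*Updated/)
  match && match[1].strip
end

puts extract_text(SAMPLE_HTML)
```

With the real page you would call extract_text($ie.html) and check for nil before using the result.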

Hope this gives you a starting place.

