> There is a line of text that I want to extract.  It is found in some HTML
> that follows this pattern:
> 
> <a href=...><img src=.../></a><br/> <!-- An image link -->
> Some text
> <br/>
> <a href=...>Another link (variable text)</a>
> <br/>
>       Text I want to extract.
> <br/>
> Updated by <a href=...>name (variable)</a> X hours ago [<a
> href=...>Comment</a>]
> 
> I have the impression that I'd use XPath somehow, but am unsure how to
> proceed since the text I want isn't in any page element.
> 
> - Aaron

Here's how I would do it with a regular expression:

$ie.html =~ /<br\/>\s*(.*)\s*<br\/>\s*Updated/
puts $1

The capture leaves out the whitespace before 'Text', and the pattern assumes
that 'Updated' appears only once in the HTML.  I can't guarantee that this
regexp will work on anything other than the sample you provided above, where
it captures "Text I want to extract." in $1 and prints it to the screen.
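
To try the idea without a browser attached, here is a minimal sketch that runs
the same kind of pattern against a string. The `html` sample, its placeholder
href/src values, and the `extract_text` helper name are made up for
illustration; in a real script the string would come from $ie.html. As a small
variation, the capture is non-greedy, `(.*?)`, so trailing spaces on the line
are not included.

```ruby
# Sample HTML modeled on the pattern from the question; the href/src
# values are placeholders, not real URLs.
html = <<~HTML
  <a href="#"><img src="x.png"/></a><br/> <!-- An image link -->
  Some text
  <br/>
  <a href="#">Another link (variable text)</a>
  <br/>
        Text I want to extract.
  <br/>
  Updated by <a href="#">name (variable)</a> X hours ago [<a
  href="#">Comment</a>]
HTML

# Hypothetical helper: returns the line of text that sits between two
# <br/> tags immediately before "Updated", or nil if nothing matches.
def extract_text(html)
  match = html.match(/<br\/>\s*(.*?)\s*<br\/>\s*Updated/)
  match && match[1]
end

puts extract_text(html)
```

Returning nil on a failed match, rather than relying on $1, makes it easier to
notice when the page layout has changed and the pattern no longer applies.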

Hope this gives you a starting place.
                        /\/\ark

 

 



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Watir General" group.
To post to this group, send email to watir-general@googlegroups.com
Before posting, please read the following guidelines: 
http://wiki.openqa.org/display/WTR/Support
To unsubscribe from this group, send email to 
watir-general-unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/watir-general
-~----------~----~----~----~------~----~------~--~---
