On Sep 21, 2008, at 5:03 PM, blasterpal wrote:
> my_string="blablablabla<coordinates>substring</coordinates>blabla"
> # the parentheses below define the actual match for the overall regex pattern
> sub_string = /.*<coordinates>(.*)<\/coordinates>.*/.match(my_string)
> puts sub_string[1]
>
> Regex is the fastest/most effective for one-off text parsing. Another
> good option is Whytheluckystiff's Hpricot:
> http://code.whytheluckystiff.net/hpricot/
>
> Hank


You probably want the regexp to be:
        /<coordinates>(.*)<\/coordinates>/
so there's less backtracking when the .* first tries to gobble  
everything.
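
As a small sketch of the shorter pattern in use (the variable names
whole_match and sub_string are just for illustration): index 0 of the
MatchData is the whole matched text, tags included, and index 1 is what
the parentheses captured.

```ruby
my_string = "blablablabla<coordinates>substring</coordinates>blabla"

# Unanchored pattern: no leading or trailing .* is needed,
# because match() already searches anywhere in the string.
match = /<coordinates>(.*)<\/coordinates>/.match(my_string)

whole_match = match[0]  # the entire matched text, tags included
sub_string  = match[1]  # only what the first pair of parentheses captured

puts whole_match
puts sub_string
```

Note that match() returns nil when nothing matches, so real code would
want to check for that before indexing.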

You might also need something like:
        /<coordinates\b[^>]*>(.*)<\/coordinates>/
if there can be any attributes on the coordinates tag.  Of course, if
you really do have XML in my_string, a true parser like Hpricot or  
REXML will be more reliable than regular expressions.  For example, if  
you had to match against:
        "blahblah<coordinates>first one</coordinates>yadayadayada<coordinates>oops! another one</coordinates>yakyakyak"
would you want the substring to be:
        "first one</coordinates>yadayadayada<coordinates>oops! another one"
(yeah, I didn't think so ;-)
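
If you do stay with a regexp, a rough sketch of one way around that is a
non-greedy (.*?) in place of (.*), plus String#scan if you want every
occurrence. The names text, greedy, lazy and all_coords below are only
for illustration, and this still assumes the tags never nest or span
lines:

```ruby
text = "blahblah<coordinates>first one</coordinates>yadayadayada" \
       "<coordinates>oops! another one</coordinates>yakyakyak"

# Greedy: (.*) runs to the LAST closing tag, swallowing everything between.
greedy = text[/<coordinates\b[^>]*>(.*)<\/coordinates>/, 1]

# Non-greedy: (.*?) stops at the FIRST closing tag it can reach.
lazy = text[/<coordinates\b[^>]*>(.*?)<\/coordinates>/, 1]

# scan returns one array per match (one element per capture group),
# so flatten gives a plain list of the captured strings.
all_coords = text.scan(/<coordinates\b[^>]*>(.*?)<\/coordinates>/).flatten

puts greedy
puts lazy
puts all_coords.inspect
```

Here greedy comes back as the long string above, lazy is just
"first one", and all_coords holds both values. A real parser is still
the safer choice for anything beyond a one-off.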


-Rob

Rob Biedenharn          http://agileconsultingllc.com



