RE: regexp & html-tags

Dan Muey Mon, 07 Jul 2003 21:36:16 -0700

> Hello Beginners,

Howdy!


> 
> I was looking for a mailing list about REGEXP but I didn't 
> find it. Maybe there is somebody here that can help me...
> 
> Suppose you have the following string
> 
>     $str = "... <b>John <i>Smith</i> (male)</b> and 
> <b>Elisabeth <i>Jones</i> (female)</b> ..."
> 
> and that you want to find out the container of the first 
> B-tag, that is you want to get the string "John <i>Smith</i> 
> (male)". Now, if I use the following regexp
> 
>     $str =~ m/<b>(.*)<\/b>/i;
> 
> I get as result, i.e. $1:
> 
>     "John <i>Smith</i> (male)</b> and <b>Elisabeth 
> <i>Jones</i> (female)"

It's because .* is greedy: it will match as much as it can, so it runs on to the last </b>.
You could do two things:
1) Change .* to something more specific, say <b>(.*\(male\))<\/b>,
        which may be bad if the first one isn't male or doesn't even have the (...) part.

2) Use a module to parse the html since html can change.
Very good idea since it is very easy to parse out specific tag 
sets in the order they appear.

Check out search.cpan.org for HTML::Parser (I think it's that anyway) or similar.
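
For what it's worth, a smaller change than hard-coding "(male)" would be the non-greedy quantifier .*?, which stops at the first </b> it can reach. The following is only a minimal sketch using the string from the question; the variable names ($greedy, $lazy, @all) are made up for illustration, and it assumes the <b> tags are never nested and never span a newline:

```perl
use strict;
use warnings;

# The sample string from the original question, joined onto one line.
my $str = '... <b>John <i>Smith</i> (male)</b> and '
        . '<b>Elisabeth <i>Jones</i> (female)</b> ...';

# Greedy: .* runs on to the LAST </b> in the string.
my ($greedy) = $str =~ m/<b>(.*)<\/b>/i;

# Non-greedy: .*? stops at the FIRST </b> it can reach.
my ($lazy) = $str =~ m/<b>(.*?)<\/b>/i;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";

# With /g in list context the same pattern returns every <b>...</b> body.
my @all = $str =~ m/<b>(.*?)<\/b>/gi;
print "all:    $_\n" for @all;
```

This still breaks as soon as a <b> sits inside another <b> or the markup changes shape, which is why a real parser module is the safer bet for anything beyond a quick one-off.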

HTH

DMuey
> 
> and not what I wanted!!
> 
> Thanks & Cheers,
> Michele

