Maybe I didn't make myself clear enough. I don't want to catch every A tag in
the HTML, but all the mail addresses that are not already inside an A tag.

The expression you suggest will catch every A tag, but it fails to catch mail
addresses outside the scope of an A tag.
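
To make it concrete, something along these lines is what I am after. This is
only a rough sketch: the function name findUnlinkedEmails() is made up for
illustration, the address pattern is a simplified assumption (not the one from
my first message), and it assumes a recent PHP version and anchors that are
properly closed with </a>. It first throws away every complete A element and
then looks for addresses in what is left:

```php
<?php
// Hypothetical helper: returns the mail addresses that appear in $html
// outside of any <a ...>...</a> element.
function findUnlinkedEmails(string $html): array
{
    // Step 1: blank out every complete anchor element, so that addresses
    // which are already linked (in the href or in the link text) disappear.
    $outside = preg_replace('~<a\b[^>]*>.*?</a>~is', ' ', $html);
    if ($outside === null) {
        return []; // the regex engine reported an error
    }

    // Step 2: collect addresses from the remaining markup.
    // Simplified pattern, assumed for this sketch only.
    preg_match_all('~[\w.+-]+@\w[\w-]*(?:\.\w[\w-]*)+~', $outside, $matches);

    return array_values(array_unique($matches[0]));
}

$html = '<p>Write to joe@example.com or '
      . '<a href="mailto:ann@example.org">ann@example.org</a>.</p>';
print_r(findUnlinkedEmails($html));
```

With the sample string above, only joe@example.com should be reported, because
ann@example.org sits inside an A tag. Doing it in two passes avoids having to
squeeze the "not inside an A tag" condition into a single lookahead.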

Thanks anyway.
Manu.


"Burhan Khalid" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Manuel Vázquez Acosta wrote:
> > Hi all:
> >
> > I'm trying to find every simple mail address in an HTML that is not
> > inside an A tag.
> >
> > I have tried this regexp:
> > (?<!mailto\:)([EMAIL PROTECTED](?:\.\w+)+)(?![^<]*?</a>)
>
> Try this (a little more comprehensive) :
>
> preg_match_all("|<a(.*?)href=[\"'](.*?)[\"'](.*?)>(.*?)</a>|i",
> $rawHTML, $arrayoflinks);
> $links = array_unique($arrayoflinks[0]);
> $href = array_unique($arrayoflinks[2]); //href=
> $text = array_unique($arrayoflinks[4]); //link text
>
> $text, $href, etc. are arrays. You can print_r() to find out what they
> contain.
>
> -- 
> Burhan Khalid
> phplist[at]meidomus[dot]com
> http://www.meidomus.com

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
