Hi Roman,

> But on those phpbuilder's page there are also similar questions but
> scripts don't work, for example:
> ...
> preg_match_all("|href=\"?([^\"' >]+)|i(+[ >])", $text,$ar);
> I really don't know :(

There's definitely something wrong with the RegEx above. The pattern
starts with the delimiter |, so it must end with the same delimiter,
optionally followed by modifier letters, in this case "i" meaning
case-insensitive. For some reason the |i is not at the end of the string,
so either you miscopied it or they misprinted it.
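
To illustrate the delimiter point, here is a minimal sketch of how the quoted pattern might look once the |i is moved to the end. It assumes the trailing "(+[ >])" fragment was a copying error and simply drops it; the sample $text is made up for the demonstration.

```php
<?php
// Assumed reconstruction: the "|" delimiter closes the pattern and the
// "i" modifier follows it. The stray "(+[ >])" fragment is dropped.
$pattern = '|href="?([^"\' >]+)|i';

// Hypothetical input, not taken from the original script.
$text = '<a HREF="http://www.php.net/">PHP</a>';

$count = preg_match_all($pattern, $text, $ar);

echo $count, "\n";   // number of full matches found
print_r($ar[1]);     // first capture group: the URL text
```

With the delimiters balanced, PHP accepts the pattern and $ar[1] holds the captured URL.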

However, the RegEx doesn't strike me as handling real HTML correctly anyway.
First, there can be spaces between elements, eg between "href" and "=", as
in <a href = www.homepage.com>. Secondly, if the URL is enclosed in
quotation marks, either single or double quotes may be used (' or ").

Unfortunately my RegEx skills are self-taught, so who knows how good or
useless my advice is! Here is my current best attempt (abstracted from my
web site links checker routine):

$RegEx = "/(" . "href *" . "= *['\"]?)([^'\" >]*)(['\" >])/i";

if ( DEBUG ) echo "<br>RegEx=$RegEx~";

$bValidity = $iFound = preg_match_all( $RegEx, $HTML, $aRegExOut );
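
For anyone wanting to try the pattern in isolation, the following is a small sketch of how it behaves on three link styles (spaced and double-quoted, single-quoted, unquoted). The $HTML sample is invented for the test; only $RegEx comes from the routine above.

```php
<?php
// Pattern as given above: group 1 = "href=" plus optional opening quote,
// group 2 = the URL, group 3 = whatever character ended the URL.
$RegEx = "/(" . "href *" . "= *['\"]?)([^'\" >]*)(['\" >])/i";

// Hypothetical sample covering the three cases discussed.
$HTML = '<a href = "http://www.php.net/">PHP</a> '
      . '<a HREF=\'/docs.php\'>Docs</a> '
      . '<a href=faq.php>FAQ</a>';

$iFound = preg_match_all($RegEx, $HTML, $aRegExOut);

echo $iFound, "\n";        // how many links were matched
print_r($aRegExOut[2]);    // the URLs themselves (second capture group)
```

With the default flags, preg_match_all groups results by capture group, so $aRegExOut[2] is the list of URLs and $aRegExOut[3] shows which character terminated each one.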

An improvement might be for the closing quotes to refer back to (any)
opening quotes. I am willing to watch, listen, and learn, if anyone can
offer improvements/wisdom.
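
One possible way to make the closing quote refer back to the opening quote is a backreference. This is only a sketch, not a tested replacement for the routine above: the pattern, the use of \s for whitespace, and the sample $HTML are all assumptions made for illustration.

```php
<?php
// (["']?) captures the optional opening quote as group 1;
// \1 then requires the very same character (or nothing) to close the URL.
$RegEx = '/href\s*=\s*(["\']?)([^"\'\s>]+)\1/i';

// Hypothetical sample: three well-formed links and one with mismatched quotes.
$HTML = '<a href="one.html">1</a> '
      . '<a href=\'two.html\'>2</a> '
      . '<a href=three.html>3</a> '
      . '<a href="bad.html\'>4</a>';

$iFound = preg_match_all($RegEx, $HTML, $aRegExOut);

echo $iFound, "\n";        // the mismatched link is not counted
print_r($aRegExOut[2]);    // URLs from the well-formed links only
```

The trade-off is that a link whose quotes do not match is skipped entirely, which may or may not be what a links checker wants.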


PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
