PHP.net has some good examples if you search under the regex functions. Or
you might use something like the function below, which I wrote for a search
engine spider. It returns a list of the local HTML links found on the given
page. In my spider I used it to build a master list of local links and
checked that against a separate array of visited links. Combine with
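// Function name is a placeholder; the original wrapper was lost in the archive.
function get_local_links($url)
{
    $links = array();
    // Fetch the page once; file_get_contents() returns false on failure.
    $contents = @file_get_contents($url);
    if ($contents === false) {
        return $links;
    }
    // Capture every href value, whether quoted or not.
    preg_match_all("|href=[\"']?([^\"' >]+)|i", $contents, $arrayoflinks);
    // Index 1 holds the captured URLs (index 0 holds the full matches).
    foreach ($arrayoflinks[1] as $link) {
        // Trim out any links with http:// (ereg() was removed in PHP 7).
        if (stripos($link, 'http://') === false) {
            // Make sure the links are html files.
            if (stripos($link, '.htm') !== false) {
                // Build array of local links on this page.
                $links[] = $link;
            }
        }
    }
    // Drop duplicates, then reindex from 0.
    $links = array_unique($links);
    $links = array_values($links);
    return $links;
}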
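
The master-list / visited-list idea mentioned above could look roughly like the sketch below. This is not the original spider code: crawl(), $getLinks and $maxPages are made-up names for illustration, and the link extractor is passed in as a callback so the example runs against an in-memory "site" without touching the network. It also assumes the links can be used as page identifiers as-is (no resolving of relative paths).

```php
<?php
// Hypothetical helper: crawl() and its parameters are illustrative names,
// not part of the original spider.
function crawl($startPage, callable $getLinks, $maxPages = 100)
{
    $queue   = array($startPage);   // master list of pages still to visit
    $visited = array();             // pages already fetched

    while (count($queue) > 0 && count($visited) < $maxPages) {
        $page = array_shift($queue);
        if (in_array($page, $visited, true)) {
            continue;               // already fetched, skip it
        }
        $visited[] = $page;

        // Queue every link we have neither visited nor queued yet.
        foreach ($getLinks($page) as $link) {
            if (!in_array($link, $visited, true) && !in_array($link, $queue, true)) {
                $queue[] = $link;
            }
        }
    }
    return $visited;
}

// Usage with an in-memory "site" standing in for real pages.
$site = array(
    'index.html' => array('a.html', 'b.html'),
    'a.html'     => array('index.html', 'b.html'),
    'b.html'     => array(),
);
$visited = crawl('index.html', function ($page) use ($site) {
    return isset($site[$page]) ? $site[$page] : array();
});
print_r($visited);
```

In a real spider the callback would be whatever function pulls the links out of a fetched page, and the $maxPages cap is only there so a link loop cannot run forever.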
----- Original Message -----
From: "Nick Wilson" <[EMAIL PROTECTED]>
To: "php-general" <[EMAIL PROTECTED]>
Sent: Friday, June 21, 2002 3:15 PM
Subject: [PHP] getting anchor tags
> In theory I can work out how to get <a href= tags from a page. Before I
> start messing with regexp though I thought I'd see if there were any
> pre-built functions or ways of doing this?
> I'm building a site search and have not found anything in the docs but
> am guessing there might be an easier way of proceeding?
> Many thanks...
> --
> Nick Wilson // www.explodingnet.com
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php