has some good examples if you search under the regex functions.  Or
you might use something like the function below.  I wrote this for a search
engine spider.  It returns a list of the local HTML links found on the given
page, or false if the page cannot be read.  In my spider I used it to build a
master list of local links and checked that against a separate array of
visited links.  Combined with a little JavaScript, you can index an entire
website with visual feedback.

function extract_links($url)
{
 // Read the whole page; give up if it cannot be fetched.
 $contents = @file_get_contents($url);
 if ($contents === false)
 {
  return false;
 }

 // Capture every href value, whether quoted or not.
 preg_match_all("|href=[\"']?([^\"' >]+)|i", $contents, $arrayoflinks);

 $links = array();
 foreach ($arrayoflinks[1] as $link)
 {
  // Trim out any links with http://
  if (stripos($link, 'http://') === false)
  {
   // Make sure the links are html files.
   if (stripos($link, '.htm') !== false)
   {
    // Build array of local links on this page.
    $links[] = $link;
   }
  }
 }
 // Drop duplicates and reindex from zero.
 $links = array_unique($links);
 $links = array_values($links);
 return $links;
}
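
The master list / visited list idea mentioned above could look roughly like
the sketch below.  This is not the code from my spider, just a minimal
illustration: crawl_site(), $extract and $max_pages are made-up names, and it
assumes the callback returns links that can be fetched as-is (relative links
would first have to be resolved against the site's base URL).

```php
<?php
// Hypothetical helper: crawl_site() and its parameters are illustrative
// names, not part of the original spider.
function crawl_site($start, $extract, $max_pages = 100)
{
    $queue   = array($start); // master list of links still to visit
    $visited = array();       // links that have already been fetched

    while (!empty($queue) && count($visited) < $max_pages) {
        $page = array_shift($queue);
        if (in_array($page, $visited)) {
            continue; // already handled
        }
        $visited[] = $page;

        // $extract is any callable that returns an array of links,
        // or false when the page could not be read.
        $found = call_user_func($extract, $page);
        if ($found === false) {
            continue;
        }

        foreach ($found as $link) {
            // Only queue links we have not seen yet.
            if (!in_array($link, $visited) && !in_array($link, $queue)) {
                $queue[] = $link;
            }
        }
    }

    return $visited;
}
```

Passing the extractor in as a callback keeps the loop independent of how the
links are found, so it can be tried with a fake extractor before pointing it
at a real site.  $max_pages is only a safety limit so a large site does not
keep the loop running indefinitely.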


----- Original Message -----
From: "Nick Wilson" <[EMAIL PROTECTED]>
To: "php-general" <[EMAIL PROTECTED]>
Sent: Friday, June 21, 2002 3:15 PM
Subject: [PHP] getting anchor tags

> Hi
> In theory I can work out how to get <a href= tags from a page. Before I
> start messing with regexp though I thought I'd see if there were any
> pre-built functions or ways of doing this?
> I'm building a site search and have not found anything in the docs but
> am guessing there might be an easier way of proceeding?
> Many thanks...
> - --
> Nick Wilson     //
> --
> PHP General Mailing List (
> To unsubscribe, visit:
