Re: [PHP-DB] URL checking

2001-09-10 Thread Justin Buist

You could use fopen() to retrieve the HTML of the page, then parse it,
taking your "best guess" at a 404 message.  That's not very robust, though.

It'd be much better to open up a socket to the site itself, then issue a
"HEAD" request for the page (for example "HEAD /path/to/page HTTP/1.0").
If the page is not found, the first line of the response will say
something like "HTTP/1.0 404 Not Found".  A HEAD response carries only
that status line and the headers (Content-type: and so on); the actual
HTML for the 404 page is not sent, which keeps the check cheap.
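
The status line check described above could be sketched roughly like this
in PHP. The function name status_code_from_line() is made up for
illustration, and the sketch assumes you already have the first line of the
server's response in a string:

```php
<?php
// Hypothetical helper: pull the numeric status code out of an HTTP
// status line such as "HTTP/1.0 404 Not Found".
// Returns null when the line does not look like a status line.
function status_code_from_line(string $line): ?int
{
    // Protocol version, whitespace, then exactly three digits.
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', trim($line), $m)) {
        return (int) $m[1];
    }
    return null;
}
```

A link would then count as "bad" when the returned code is 404. You may
also want to treat null (no usable response at all) as a failure.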

You'll have to parse the domain name off the entire URL that's stored in
your database.  That can be done by chopping off a leading "http://", if
any is provided, then splitting at the first / mark: everything before it
is the host name, and the rest is the path to ask for.  Use the experimental
socket() function (http://www.php.net/manual/en/function.socket.php) to
make your connection and issue the HEAD request.
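
The steps above might look something like the following. This is only a
sketch under a few assumptions: it uses fsockopen() rather than the
experimental socket() function, it always connects to port 80, it ignores
https:// URLs and explicit port numbers, and the function names
(split_url, build_head_request, fetch_status_line) are invented for the
example.

```php
<?php
// Split a stored URL into [host, path]: drop a leading "http://" if
// present, then split at the first "/".
function split_url(string $url): array
{
    $rest = preg_replace('#^http://#i', '', trim($url));
    $slash = strpos($rest, '/');
    if ($slash === false) {
        return [$rest, '/'];   // no path given, so ask for the root
    }
    return [substr($rest, 0, $slash), substr($rest, $slash)];
}

// Build the text of a HEAD request for the given host and path.
function build_head_request(string $host, string $path): string
{
    return "HEAD $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n\r\n";
}

// Send the HEAD request and return the first response line,
// or null if the connection or the read fails.
// Assumption: plain HTTP on port 80 with a 10 second timeout.
function fetch_status_line(string $url, int $timeout = 10): ?string
{
    [$host, $path] = split_url($url);
    $fp = @fsockopen($host, 80, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null;
    }
    stream_set_timeout($fp, $timeout);
    fwrite($fp, build_head_request($host, $path));
    $line = fgets($fp, 1024);
    fclose($fp);
    return $line === false ? null : rtrim($line);
}
```

Calling fetch_status_line() on each URL from the database and looking for
"404" in the returned line would give you the list of bad links to write
out. PHP's built-in parse_url() is probably a sturdier way to do the
splitting if your URLs include ports or other schemes.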

Hope that helps...


Justin Buist
Trident Technology, Inc.
4700 60th St. SW, Suite 102
Grand Rapids, MI  49512
Ph. 616.554.2700
Fx. 616.554.3331
Mo. 616.291.2612

On Sat, 8 Sep 2001, Larry "RedCobra" Linthicum wrote:

> I have a database with lots of URLs stored.
>
> I would like to build a script that retrieves them, then somehow checks
> each one to see if the link produces an "error 404" and is no longer good,
> then ideally writes a file of "bad links" for me.
>
> This seems like it would be possible with PHP. Could someone give me some
> hints on where to start? I know very little about HTTP headers and such.
>
> Any information will be appreciated.
>
>
>
> --
> PHP Database Mailing List (http://www.php.net/)
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> To contact the list administrators, e-mail: [EMAIL PROTECTED]
>






[PHP-DB] URL checking

2001-09-08 Thread Larry "RedCobra" Linthicum

I have a database with lots of URLs stored.

I would like to build a script that retrieves them, then somehow checks
each one to see if the link produces an "error 404" and is no longer good,
then ideally writes a file of "bad links" for me.

This seems like it would be possible with PHP. Could someone give me some
hints on where to start? I know very little about HTTP headers and such.

Any information will be appreciated.


