* Thus wrote Kevin Stone ([EMAIL PROTECTED]):
>
> "Paul Van Schayck" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> > [EMAIL PROTECTED] (Kevin Stone) wrote
> > Hello Kevin.
> >
> > > This is just a thought.. I have never employed this method
> > > personally.. but I suppose you could read the page into a string and
> > > use the md5() function to generate a hexadecimal value based on that
> > > string. Store the hex value in a database and compare it against the
> > > new value generated the next day. If anything on the page has been
> > > modified the values should not match. Even the most minute change
> > > should trigger a new value. Obviously you won't know *what* has been
> > > modified only that the page *has* been modified
> > >
[...]
> >
> > Too many pitfalls and too slow! Really, a socket connection that only
> > retrieves the headers is the best way.
> >
> > This function is what you need:
> >
> > function fileStamp($domain, $file)
> > {
[...]
> > return strtotime($time);
> > }
> >
>
> Slow? Hogwash. You're pining over microseconds. Besides most of the time is taken
> opening the file which you're doing anyway. Except that the socket method relies on
> header information that may or may not be there. I agree it would be ideal if you
> could use that information but your fileStamp() function isn't going to work for all
> files on all servers.

Ok, no need to argue here. Neither method is correct or the most
efficient. If you want to keep a copy of the file, use the GET
method; otherwise, for just checking modification state, use the HEAD
method.

First time getting a page, there are some headers you want to pay
attention to:

  ETag:
  Last-Modified:
  [cache directives]
  Expires:
  Content-Length:

And keep these values stored somewhere.

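A minimal sketch of that bookkeeping, assuming you already have the raw
response headers as one string. The function name and the array layout
are made up for illustration; Date is stored as well (my addition)
because it carries the server's clock, which is needed later for the
Expires calculation.

```php
<?php
// Hypothetical helper: pull the headers worth remembering out of a raw
// HTTP response header block. Header names are case-insensitive, so
// they are lowercased before being stored.
function extractValidators(string $rawHeaders, int $localNow): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n/', trim($rawHeaders)) as $line) {
        // The status line ("HTTP/1.1 200 OK") has no colon and is skipped.
        if (strpos($line, ':') === false) {
            continue;
        }
        // Limit of 2 keeps colons inside the value (dates) intact.
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    $wanted = ['etag', 'last-modified', 'cache-control',
               'expires', 'content-length', 'date'];
    // 'fetched_at' is the local Unix time of the request (an assumption
    // of this sketch, used later to work out the age of the copy).
    $stored = ['fetched_at' => $localNow];
    foreach ($wanted as $name) {
        $stored[$name] = $headers[$name] ?? null;
    }
    return $stored;
}
```

The result is a plain array, so it can go into a database row or a
serialized file, whichever you already have.
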
Then when checking to see if the document is changed or should be
re-requested:

  - If the document has expired, it should be re-requested to see
    if it has changed; otherwise you are safe to assume that it is
    the same. Do note that the Expires time is in server time, so
    when retrieving the information you should keep note of the
    difference between the server's clock and your own and use it
    in the calculation.

  - If you don't have a Last-Modified or ETag, the document *should*
    be considered modified!

  - Observe the cache directives.

  - If the document has a query string, you must check whether it
    has been modified rather than assuming it is the same.

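Those rules can be sketched roughly as below. This is only an
illustration: the function name and the $stored layout (lowercased
header names as keys, plus 'fetched_at' as the local Unix time of the
first request) are assumptions, and only the no-cache and no-store
directives are looked at. The server's Date header stands in for "the
time difference", so your own clock never gets compared directly with
the server's.

```php
<?php
// Hypothetical helper: true means "ask the server again".
function needsRecheck(array $stored, string $url, int $localNow): bool
{
    // No validator at all: nothing to compare with, treat as modified.
    if (empty($stored['etag']) && empty($stored['last-modified'])) {
        return true;
    }
    // A query string means the stored copy is never assumed unchanged.
    if (strpos($url, '?') !== false) {
        return true;
    }
    // Observe cache directives that forbid reuse without asking.
    $cc = strtolower($stored['cache-control'] ?? '');
    if (strpos($cc, 'no-cache') !== false || strpos($cc, 'no-store') !== false) {
        return true;
    }
    $expires    = isset($stored['expires']) ? strtotime($stored['expires']) : false;
    $serverDate = isset($stored['date'])    ? strtotime($stored['date'])    : false;
    if ($expires === false || $serverDate === false) {
        return true; // freshness cannot be proven, so check
    }
    // Expires and Date are both server time, so their difference is a
    // lifetime that does not depend on the local clock.
    $lifetime = $expires - $serverDate;
    // The age is measured purely on the local clock.
    $age = $localNow - ($stored['fetched_at'] ?? 0);
    return $age >= $lifetime;
}
```
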
To get the document:

  (HEAD|GET) $url_value HTTP/1.1
  Host: $host
  ...other misc headers

If you have an ETag, add the request header:

  If-None-Match: $etag_value

If you have a Last-Modified, add the request header:

  If-Modified-Since: $last_modified_value

If a Content-Length is available, don't send it back as a request
header (on a request, Content-Length describes the request body, and
a HEAD or GET has none). Compare the stored value with the
Content-Length of the response instead.

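Put together, the request could be built something like this. It is a
sketch only: the function name and the $stored array (lowercased header
names as keys) are assumptions. It sends the stored ETag in
If-None-Match, which is the header RFC 2616 defines for "only answer in
full if the entity tag no longer matches", and it leaves Content-Length
out, since on a request that header describes the request body.

```php
<?php
// Hypothetical helper: build a conditional HEAD or GET request as a
// string that can be written to a socket.
function buildConditionalRequest(string $method, string $host,
                                 string $path, array $stored): string
{
    $lines = [
        "$method $path HTTP/1.1",
        "Host: $host",          // mandatory in HTTP/1.1
        "Connection: close",    // so reading until EOF terminates
    ];
    if (!empty($stored['etag'])) {
        // The ETag is sent back exactly as received, quotes included.
        $lines[] = 'If-None-Match: ' . $stored['etag'];
    }
    if (!empty($stored['last-modified'])) {
        // Likewise the date string is echoed back unchanged.
        $lines[] = 'If-Modified-Since: ' . $stored['last-modified'];
    }
    // A blank line ends the header block.
    return implode("\r\n", $lines) . "\r\n\r\n";
}
```

The returned string is what you would fwrite() to a connection opened
with fsockopen($host, 80).
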
The response:

  HTTP/1.1 304 Not Modified
    - woot.. it isn't modified.

  HTTP/1.1 200 OK
    - it should be considered modified

  [other responses could be returned]

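Reading the answer boils down to looking at the status code on the
first line. A small sketch, with a made-up function name and made-up
return labels; anything other than 304 or 200 is simply reported as
'other' and left for you to handle.

```php
<?php
// Hypothetical helper: classify the first line of the response.
function classifyResponse(string $statusLine): string
{
    // Only the three-digit code matters; the reason phrase is free text.
    if (!preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $statusLine, $m)) {
        return 'invalid';
    }
    $code = (int) $m[1];
    if ($code === 304) {
        return 'not-modified';
    }
    if ($code === 200) {
        return 'modified';
    }
    return 'other'; // redirects, errors, and so on
}
```
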
If a GET was requested, the document will follow the headers. And,
well, that's all there is to it. Assignment is due next week :)

Reference: [1] http://www.w3.org/Protocols/rfc2616/rfc2616.html
HTH.
Curt
--
"My PHP key is worn out"
PHP List stats since 1997:
http://zirzow.dyndns.org/html/mlists/