RE: [PHP] Finding out when a Web page has changed

2002-09-26 Thread Vikram Vaswani

Yup, thought of that one - but it just seems a little sub-optimal ;) Any
way to do this without resorting to a brute-force "read the entire file
stream over HTTP"?

I can't think of one, so I guess I may do it this way after all - but if
something occurs to you, or anyone else on the list, please let me know.

Thanks for the help :)

Vikram
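One way to avoid downloading an unchanged page is an HTTP conditional request: send back the validators saved from the previous visit and let the server answer 304 Not Modified with no body. This only helps on servers that honour If-Modified-Since or If-None-Match, so it is a partial answer at best. The sketch below shows just the two pure pieces; the function names are hypothetical and nothing here comes from the thread itself.

```php
<?php
// Sketch only: conditionalHeaders() and notModified() are made-up names.

/** Build request headers from the validators saved on the previous visit. */
function conditionalHeaders(?string $lastModified, ?string $etag): array
{
    $headers = [];
    if ($lastModified !== null) {
        $headers[] = 'If-Modified-Since: ' . $lastModified;
    }
    if ($etag !== null) {
        $headers[] = 'If-None-Match: ' . $etag;
    }
    return $headers;
}

/** True when the final status line among the response headers is 304. */
function notModified(array $responseHeaders): bool
{
    $status = null;
    foreach ($responseHeaders as $line) {
        // Several status lines can appear when redirects were followed;
        // keep overwriting so that the last one wins.
        if (preg_match('~^HTTP/\S+\s+(\d{3})~', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status === 304;
}
```

The array from conditionalHeaders() would go into the 'header' option of an HTTP stream context, and notModified() would be fed the response header lines; anything other than a 304 means falling back to a full fetch and comparison.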

>You could cache/save the actual contents of the file, then when you read
>it next time, compare it to what you saved and see if it changed. You
>may want to filter out everything but what's between <body> and </body>,
>so you're not thinking it changed just b/c of something in the
>headers...
>
>---John Holmes...
>
>> -----Original Message-----
>> From: Vikram Vaswani [mailto:[EMAIL PROTECTED]]
>> Sent: Thursday, September 26, 2002 7:04 AM
>> To: [EMAIL PROTECTED]
>> Subject: [PHP] Finding out when a Web page has changed
>>
>> Hi all,
>>
>> I need to write an application that accepts a list of URLs and checks
>> them on a daily basis (via cron) to see if the pages have changed in
>> the past day.
>>
>> I need some help with this. Does anyone know the most optimal way to
>> find out when a particular Web page has been modified? I am thinking
>> about using the Last-Modified: HTTP header - however, all servers do
>> not return this header - any ideas on what the fallback should be?
>>
>> TIA,
>>
>> Vikram

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP] Finding out when a Web page has changed

2002-09-26 Thread Vikram Vaswani

>> I need to write an application that accepts a list of URLs and checks them
>> on a daily basis (via cron) to see if the pages have changed in the past
>> day.
>> 
>> I need some help with this. Does anyone know the most optimal way to find
>> out when a particular Web page has been modified? I am thinking about using
>> the Last-Modified: HTTP header - however, all servers do not return this
>> header - any ideas on what the fallback should be?
>
>You could calculate and store the MD5 hash of the page. If the hash is
>different the next day, you know the page has been modified.

Does this mean that I need to read the entire contents of the HTTP stream
into a variable, calculate the hash and store it for comparison? Or is
there an easier way to get the MD5 hash?

Basically, I'm wondering if there is a way to do this without having to
read the entire URL contents via HTTP.

Vikram
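A minimal sketch of the stored-hash approach suggested above. An MD5 hash can only be computed from the complete content, so this still downloads every page in full; it saves storage and comparison work, not bandwidth. The helper names and the state-file path are assumptions made for illustration.

```php
<?php
// Sketch only: loadHashes() and pageChanged() are hypothetical helpers.

/** Load the URL => hash map saved by the previous run (empty if none). */
function loadHashes(string $stateFile): array
{
    if (!is_file($stateFile)) {
        return [];
    }
    $data = json_decode(file_get_contents($stateFile), true);
    return is_array($data) ? $data : [];
}

/**
 * Record the MD5 hash of $content for $url and report whether it differs
 * from the stored one. A URL seen for the first time counts as unchanged.
 */
function pageChanged(array &$hashes, string $url, string $content): bool
{
    $new = md5($content);
    $old = $hashes[$url] ?? null;
    $hashes[$url] = $new;
    return $old !== null && $old !== $new;
}

// Possible daily cron usage (needs allow_url_fopen; path is an assumption):
// $hashes = loadHashes('/tmp/page-hashes.json');
// foreach ($urls as $url) {
//     $html = @file_get_contents($url);
//     if ($html !== false && pageChanged($hashes, $url, $html)) {
//         echo "$url changed\n";
//     }
// }
// file_put_contents('/tmp/page-hashes.json', json_encode($hashes));
```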





RE: [PHP] Finding out when a Web page has changed

2002-09-26 Thread John Holmes

Yeah, true. Maybe you could just ereg() out the content. Each URL would
need its own ereg, though, so it won't be as easy to set up.

But, technically, if the quote changes, then the page has been updated,
even if it's dynamic. How do you define "updated"?

---John Holmes...
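A rough sketch of the per-URL pattern idea. ereg() was removed in PHP 7, so this swaps in preg_match() instead; the URL, the markup, and the function name are all made up for illustration.

```php
<?php
// Sketch only: contentHash() is a hypothetical helper.

/**
 * Hash only the part of $html captured by group 1 of $pattern.
 * Falls back to hashing the whole page when the pattern does not match.
 */
function contentHash(string $html, string $pattern): string
{
    if (preg_match($pattern, $html, $m) === 1 && isset($m[1])) {
        return md5($m[1]);
    }
    return md5($html);
}

// One hand-written pattern per URL (example URL and markup are invented).
$patterns = [
    'http://example.com/news' => '~<div id="content">(.*?)</div>~s',
];
```

Anything outside the captured region, such as a rotating quote or today's date, no longer affects the hash. The cost is the one John mentions: every URL needs its own pattern, and a pattern silently stops matching when the site's markup changes.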

> -----Original Message-----
> From: Justin French [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, September 26, 2002 10:25 AM
> To: Marek Kilimajer; PHP
> Subject: Re: [PHP] Finding out when a Web page has changed
>
> Same with sites that have negligible daily changes (like today's date
> dynamically inserted) or random changes (a random quote, tip, stock
> quote, product, image, etc.): they would all screw that up.
>
> Justin
>
> on 26/09/02 11:03 PM, Marek Kilimajer ([EMAIL PROTECTED]) wrote:
>
> > Hope the sites have no banners :), they change all the time




Re: [PHP] Finding out when a Web page has changed

2002-09-26 Thread Justin French

Same with sites that have negligible daily changes (like today's date
dynamically inserted) or random changes (a random quote, tip, stock quote,
product, image, etc.): they would all screw that up.

Justin


on 26/09/02 11:03 PM, Marek Kilimajer ([EMAIL PROTECTED]) wrote:

> Hope the sites have no banners :),  they change all the time




Re: [PHP] Finding out when a Web page has changed

2002-09-26 Thread Erwin

Marek Kilimajer wrote:
> Hope the sites have no banners :),  they change all the time

But the URL to the banners will be the same, so that's no change in the HTML
code ;-))

> [SNIP]






Re: [PHP] Finding out when a Web page has changed

2002-09-26 Thread Marek Kilimajer

Hope the sites have no banners :),  they change all the time

John Holmes wrote:

>You could cache/save the actual contents of the file, then when you read
>it next time, compare it to what you saved and see if it changed. You
>may want to filter out everything but what's between <body> and </body>,
>so you're not thinking it changed just b/c of something in the
>headers...
>
>---John Holmes...




RE: [PHP] Finding out when a Web page has changed

2002-09-26 Thread John Holmes

You could cache/save the actual contents of the file, then when you read
it next time, compare it to what you saved and see if it changed. You
may want to filter out everything but what's between <body> and </body>,
so you're not thinking it changed just b/c of something in the
headers...

---John Holmes...
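A minimal sketch of the save-and-compare idea, assuming the part worth comparing is whatever sits between the opening and closing body tags. The function names are hypothetical, and a regular expression is only a rough stand-in for a real HTML parser.

```php
<?php
// Sketch only: bodyOnly() and bodyChanged() are made-up names.

/**
 * Return the markup between <body ...> and </body>, or the whole
 * document when no body element is found.
 */
function bodyOnly(string $html): string
{
    // i = case-insensitive tags, s = let "." span newlines.
    if (preg_match('~<body\b[^>]*>(.*?)</body>~is', $html, $m)) {
        return $m[1];
    }
    return $html;
}

/** Compare the saved copy with the fresh copy, ignoring the head section. */
function bodyChanged(string $savedHtml, string $currentHtml): bool
{
    return bodyOnly($savedHtml) !== bodyOnly($currentHtml);
}
```

Storing only bodyOnly($html), or a hash of it, instead of the full page would keep the saved copies smaller.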





[PHP] Finding out when a Web page has changed

2002-09-26 Thread Vikram Vaswani

Hi all,

I need to write an application that accepts a list of URLs and checks them
on a daily basis (via cron) to see if the pages have changed in the past day.

I need some help with this. Does anyone know the most optimal way to find
out when a particular Web page has been modified? I am thinking about using
the Last-Modified: HTTP header - however, all servers do not return this
header - any ideas on what the fallback should be?

TIA,

Vikram
--
"I find your lack of faith disturbing." 
--Darth Vader 
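A rough sketch of the Last-Modified check described in the question, with an explicit "don't know" result for servers that omit the header so the caller can switch to a fallback. The function names are hypothetical; get_headers() and stream_context_create() are standard PHP, and the HEAD request assumes allow_url_fopen is enabled.

```php
<?php
// Sketch only: the three function names below are made up.

/**
 * Pull the Last-Modified time out of raw header lines (the list shape
 * returned by get_headers()). Returns a Unix timestamp, or null when the
 * header is absent or cannot be parsed.
 */
function lastModifiedFromHeaders(array $headerLines): ?int
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Last-Modified:') === 0) {
            $ts = strtotime(trim(substr($line, strlen('Last-Modified:'))));
            return $ts === false ? null : $ts;
        }
    }
    return null;
}

/**
 * true/false when the header settles the question; null means
 * "no usable Last-Modified, use a fallback such as comparing content".
 */
function changedSince(array $headerLines, int $lastCheck): ?bool
{
    $modified = lastModifiedFromHeaders($headerLines);
    return $modified === null ? null : $modified > $lastCheck;
}

/** Fetch only the response headers by issuing a HEAD request. */
function fetchHeaders(string $url): array
{
    $context = stream_context_create(['http' => ['method' => 'HEAD']]);
    $headers = @get_headers($url, false, $context);
    return $headers === false ? [] : $headers;
}
```

A cron job could call changedSince(fetchHeaders($url), $yesterday) and download the page for comparison only when the result is null.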
