You could take an MD4 hash of the parts you care about, store it, then fetch the stored hash on the next pass and compare.
If there is a reliable timestamp, you could use that. But that would be

In general, you need to store some info about each source document
and figure out whether it is new. This gets much hairier with a web
spider when you have dupes and servers that go away then come back.
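
Here is a rough sketch of the hash-and-compare idea in Python. It is only an illustration under a few assumptions: the names fingerprint and needs_reindex are made up, the "store" is a plain dict standing in for whatever persistent storage you use, and it uses SHA-256 from hashlib in place of MD4, since MD4 is not guaranteed to be available in every Python build.

```python
import hashlib


def fingerprint(parts):
    """Hash only the fields that matter for indexing (hypothetical helper)."""
    h = hashlib.sha256()  # swapped in for MD4; any stable digest would do
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        h.update(str(len(data)).encode("ascii") + b":")
        h.update(data)
    return h.hexdigest()


def needs_reindex(store, doc_id, parts):
    """Return True if the document is new or changed, and record its hash.

    `store` is assumed to be a dict-like mapping of doc_id -> last seen digest.
    """
    digest = fingerprint(parts)
    if store.get(doc_id) == digest:
        return False  # same content as last time, skip it
    store[doc_id] = digest
    return True
```

Calling needs_reindex(store, "doc1", ["Title", "Body"]) twice in a row with the same parts would return True the first time and False the second. This does nothing about dupes under different ids or servers that vanish and return; those need extra bookkeeping.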


On 9/14/07 6:08 PM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote:

> How do you know that the document hasn't changed since the last time you
> indexed it?
