Hi Karl

I have a basic question about how the web crawler crawls content after
the first crawl.

Does it crawl and index all pages from the root every time, or does it
crawl only pages that have been modified?

If it crawls only modified pages, how does it figure out which pages have
been modified?
By checking the size of the pages? By hash?

How about document files, such as PDFs, linked from web pages?
If those documents are modified, how does MCF figure out that they have
been modified?

I am using an old version, MCF 1.4.1.

Best regards,


Shigeki
