Armel T. Nene wrote:
Andrzej, the feature that I am after can be implemented by this patch if I
adapt it correctly. I am not sure, but the patch seems a little too old to
apply to the latest release, Nutch 0.8.1.

Right, that's why I wrote that it needs to be brought up to date with the current trunk/.

I want to implement a feature where the fetcher fetches files but only
adds them if they have been modified since the last fetch time. I
want to implement that for the filesystem first and extend it later to
network fetching. I would like to have a look at the full source code for
your patch in a zip file if possible. Once the feature is implemented, I will
post it back here. I'd like to start working from your code first. You can
either make the source code available here or mail it to me at armel dot
nene @ idna-solutions dot com.

Patches attached to the JIRA issue already support this. Please bear in mind that the notion of "change" is dependent on how you compare the content of old and new pages, especially if you lack the Last-Modified header from the server.
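
To illustrate the two notions of "change" mentioned above, here is a minimal sketch in plain Java. It is not taken from the patch and does not use any Nutch API; the class and method names (ChangeCheck, modifiedSince, digest, contentChanged) are hypothetical, and MD5 is only an assumed choice of digest. The first check relies on the file's modification time, the second compares a stored digest against the digest of the newly fetched content, for cases where no reliable modification time is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of Nutch or the patch.
final class ChangeCheck {

    // True if the file's modification time is later than the previous
    // fetch time (both expressed in milliseconds since the epoch).
    static boolean modifiedSince(Path file, long lastFetchMillis) {
        try {
            return Files.getLastModifiedTime(file).toMillis() > lastFetchMillis;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // MD5 digest of the raw content (assumed digest; any stable hash would do).
    static byte[] digest(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Fallback when no modification time is available: the content counts
    // as changed if its digest differs from the one stored at the last fetch.
    static boolean contentChanged(byte[] oldDigest, byte[] newContent) {
        return !Arrays.equals(oldDigest, digest(newContent));
    }
}
```

Note that a byte-level digest treats any difference, even whitespace or a changing timestamp in the page, as a change; a more tolerant comparison would need a different signature.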


--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

