Hi,

In the fetcher, at line 192, when the status is NOTMODIFIED we collect null as the content, even though we already have the content. I'm worried about what happens to a page that does not change for 60 days, since the concept of Nutch is to delete segments that are older than "db.default.fetch.interval", isn't it?

If this is true, maybe someone with write access can change null to content, so that the copy we already hold is written to the new segment.
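To make the suggestion concrete, here is a rough, self-contained sketch of the idea. It is not the actual fetcher code: the class, the `collect` method, the `Status` enum and the map standing in for a segment are all hypothetical names I made up for illustration. The only point is that on NOTMODIFIED the previously held content is stored instead of null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Nutch.
public class NotModifiedSketch {
    enum Status { SUCCESS, NOTMODIFIED }

    // Stand-in for a newly written segment: url -> stored content.
    static final Map<String, String> segment = new HashMap<>();

    // Stand-in for the fetcher's output step.
    static void collect(String url, Status status, String fetched, String previous) {
        // On NOTMODIFIED the server sends no body, so reuse the copy we
        // already have rather than storing null in the new segment.
        String content = (status == Status.NOTMODIFIED) ? previous : fetched;
        segment.put(url, content);
    }

    public static void main(String[] args) {
        collect("http://example.com/", Status.NOTMODIFIED, null, "old page body");
        System.out.println(segment.get("http://example.com/"));
    }
}
```

With something like this, the newest segment would still carry the page body even if the older segment holding the original copy is deleted later.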
Thanks for any comments.
Stefan





_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
