Rod Taylor wrote:
> Doing the actual expunging during updatedb is better for performance
> than doing it as a separate command. As a periodic option (scrubbing
> content generation or abuse sites in my case), combining it with
> updatedb will reduce the I/O and CPU requirements. Updatedb already
> reads in the DB, cycles through every entry, sorts it, and writes it out.
>
> Doing this in a separate command would kill 4 to 8 hours of otherwise
> usable time. Doing it as a part of updatedb probably costs about 1 hour
> of work (CPU time to apply filters only).

That's true. A separate tool might be useful anyway, but if you have
some spare cycles and could provide a patch to updatedb that implements
this, it would be a great addition.
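
To make the idea concrete, here is a rough sketch of filtering during the
same pass that merges, sorts, and writes the entries. It is plain Python
rather than Nutch code: update_db, make_host_filter, the dict-based "DB",
and the example hosts are all hypothetical names chosen for illustration,
not anything that exists in updatedb today.

```python
from urllib.parse import urlsplit


def make_host_filter(blocked_hosts):
    """Return a predicate that is True for URLs whose host is blocked.

    Hypothetical helper: a real filter might match patterns instead.
    """
    blocked = frozenset(h.lower() for h in blocked_hosts)

    def should_expunge(url):
        host = urlsplit(url).hostname or ""
        return host.lower() in blocked

    return should_expunge


def update_db(existing, updates, should_expunge):
    """Merge updates into the existing entries, dropping expunged URLs.

    Both inputs are dicts mapping URL -> metadata (a stand-in for the
    real DB records). The filter runs in the same loop that visits every
    entry, so no extra read or write of the DB is needed.
    """
    merged = dict(existing)
    merged.update(updates)
    kept = [(url, meta) for url, meta in merged.items()
            if not should_expunge(url)]
    # Sort by URL before "writing out", as the update pass already does.
    kept.sort(key=lambda item: item[0])
    return kept


if __name__ == "__main__":
    db = {
        "http://spam.example/a": {"score": 1.0},
        "http://b.example/": {"score": 2.0},
    }
    new = {"http://a.example/": {"score": 3.0}}
    for url, meta in update_db(db, new, make_host_filter(["spam.example"])):
        print(url, meta)
```

The only added cost over a normal update in this sketch is one predicate
call per entry, which matches the "CPU time to apply filters only" estimate.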
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers