Aled Jones wrote:
> Hi
> Is there a way to remove certain URLs from a crawled set of data?

Please see the PruneIndexTool. It removes only the index entries and
leaves the content in the segments untouched. This means that you will
no longer see hits for these URLs, but nothing stops the same URLs from
being collected again in the next round of fetching. To prevent that,
you also need to modify your URLFilters.
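
To illustrate the second point, here is a rough sketch of what a URL
filter does during fetch-list generation. This is not Nutch's code, just
the general idea in Python; the rule list, the url_filter function and
the example.com patterns are all made up for illustration. It assumes
ordered rules where '-' rejects, '+' accepts, and the first match wins.

```python
import re

# Hypothetical rule list: '-' rejects, '+' accepts; the first matching rule wins.
RULES = [
    ("-", re.compile(r"^http://www\.example\.com/private/")),  # the URLs to keep out
    ("-", re.compile(r"\.(gif|jpg|png)$")),                    # skip images
    ("+", re.compile(r".")),                                   # accept everything else
]

def url_filter(url):
    """Return the URL if it is accepted, or None if it should be dropped."""
    for sign, pattern in RULES:
        if pattern.search(url):
            return url if sign == "+" else None
    return None  # no rule matched: reject

candidates = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
    "http://www.example.com/logo.png",
]

# Only the accepted URLs would make it into the next fetch round.
kept = [u for u in candidates if url_filter(u) is not None]
```

Pruning the index and adding a rejecting rule are complementary: the
first hides what is already indexed, the second keeps it from coming
back.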
--
Best regards,
Andrzej Bialecki <><
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com