Thanks, Andrzej!

> -----Original Message-----
> From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] 
> Sent: 04 January 2006 12:10
> To: [email protected]
> Subject: Re: Remove links from index
> 
> Aled Jones wrote:
> 
> >Hi
> >
> >Is there a way to remove certain urls from a crawled set of data?
> >  
> >
> 
> Please see the PruneIndexTool. This removes just the index 
> entries, without actually removing the content from segments. 
> This means that you will no longer see the hits from these 
> urls, but it doesn't prevent you from collecting the same 
> urls in the next round of fetching. To prevent that, you need 
> to modify your URLFilters.
> 
> -- 
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
> 
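
For anyone finding this thread later: the URL-filter side of the advice above can be sketched roughly as below. This is not Nutch code and not the PruneIndexTool; it is only a small standalone Python illustration of the idea of rejecting URLs by pattern before they are fetched again. The rule syntax (a leading `+` to accept, `-` to reject, first match wins, no match means reject) is an assumption modelled on regex-style URL filter files, and the names `RULES`, `compile_rules` and `accepts`, as well as the example.com patterns, are made up for illustration.

```python
import re

# Hypothetical rules: '+' accepts, '-' rejects. The hosts and paths are
# invented examples, not taken from any real configuration.
RULES = [
    r"-^http://www\.example\.com/private/",   # drop an unwanted section
    r"-\.(gif|jpg|png)$",                     # drop image URLs
    r"+^http://www\.example\.com/",           # keep the rest of the site
]


def compile_rules(lines):
    """Turn '+regex' / '-regex' lines into (accept, compiled_pattern) pairs."""
    compiled = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        compiled.append((sign == "+", re.compile(pattern)))
    return compiled


def accepts(url, rules):
    """The first matching rule decides; a URL matching no rule is rejected."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False


if __name__ == "__main__":
    rules = compile_rules(RULES)
    for url in ("http://www.example.com/index.html",
                "http://www.example.com/private/a.html"):
        print(url, accepts(url, rules))
```

Putting the reject rules before the broad accept rule matters here, because the first match decides. Pruning the index hides the existing hits; a filter along these lines is what would keep the same URLs from being collected in the next fetch round.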
