I know this is off on a tangent, but:
 
One huge advantage of filtering in the FetchListTool (or is that the
Generator? I'm still on 0.7) is that you can generate separate fetch
lists for separate "scopes", or subsets of your crawl data.  You can
then give your users some control over which of several scopes they're
actually searching in; all while having a single URL database.  I
suspect many people who are using Nutch over one or a small number of
sites are actually doing this.
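
To make the idea concrete, here is a minimal sketch of splitting one URL set into per-scope fetch lists. It is not Nutch code: the class ScopedFetchLists, the partition method, the scope names, and the example.org URLs are all made up for illustration, and I am assuming each scope is defined by a regular expression over the URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: builds one fetch list per scope
// from a single shared set of URLs.
class ScopedFetchLists {

    // Assumption: each scope is described by a regex matched against the URL.
    // A URL that matches several scopes is added to each matching list;
    // a URL that matches none is simply left out.
    static Map<String, List<String>> partition(List<String> urls,
                                               Map<String, Pattern> scopes) {
        Map<String, List<String>> lists = new LinkedHashMap<>();
        for (String scope : scopes.keySet()) {
            lists.put(scope, new ArrayList<>());
        }
        for (String url : urls) {
            for (Map.Entry<String, Pattern> e : scopes.entrySet()) {
                if (e.getValue().matcher(url).find()) {
                    lists.get(e.getKey()).add(url);
                }
            }
        }
        return lists;
    }

    public static void main(String[] args) {
        // Illustrative scopes and URLs only.
        Map<String, Pattern> scopes = new LinkedHashMap<>();
        scopes.put("docs", Pattern.compile("^http://example\\.org/docs/"));
        scopes.put("news", Pattern.compile("^http://example\\.org/news/"));

        List<String> urls = Arrays.asList(
            "http://example.org/docs/intro.html",
            "http://example.org/news/2006/03.html",
            "http://example.org/about.html");

        System.out.println(partition(urls, scopes));
    }
}
```

Each resulting list could then be fetched and indexed on its own, so the search front end can offer a choice of scope while the URL database stays shared.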
 
Regards,
David.
 

Date: Wed, 08 Mar 2006 10:42:50 -0800
From: Doug Cutting <[EMAIL PROTECTED]>
To: [email protected]
Subject: [Nutch-dev] Re: svn commit: r384219 -
/lucene/nutch/trunk/src/java/org/apache/nutch/crawl/Generator.java
Reply-To: [EMAIL PROTECTED]

Andrzej Bialecki wrote:
> IMHO doing this here has a minimal impact while preventing a common
> problem, but if you think this would harm many users then we should
> of course make it optional.

Let's just leave it as-is for now.  Thanks!

Doug


