+1

--
 Sami Siren


Doug Cutting wrote:
I propose we clean up Nutch's tools as follows.

First, some definitions:

1. An "action" is an operation on Nutch data. For example, GenerateSegmentFromDB, FetchSegment, UpdateDB, IndexSegment, MergeIndexes, SearchServer, etc. are all actions.

2. A "tool" invokes an action from the command line.

The proposal:

1. Actions and tools should be separate classes, in separate files.

2. A tool class should define no methods other than a main() and perhaps those required to parse the command line. All application logic should be in the action class.

3. All actions must implement the following interface:

  public interface NutchConfigurable {
    void setConf(NutchConf conf);
    NutchConf getConf();
  }

4. Most actions should implement this by extending:

  public class NutchConfigured implements NutchConfigurable {
    private NutchConf conf;
    public NutchConfigured(NutchConf conf) { setConf(conf); }
    public void setConf(NutchConf conf) { this.conf = conf; }
    public NutchConf getConf() { return conf; }
  }

5. All plugins must implement NutchConfigurable.

6. Plugin factory methods must accept a NutchConf.

For example:

  public static Protocol ProtocolFactory.getProtocol(String url);

will become:

  public static Protocol ProtocolFactory.getProtocol(NutchConf conf, String url);
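
A minimal sketch of how points 1 through 4 might fit together, for illustration only. The NutchConf class here is a stand-in with assumed set/get methods, since its real API is not shown in this message. DescribeSegmentAction, DescribeSegmentTool, and the "http.agent.name" key are hypothetical names chosen for the example. The action holds all the logic and reads its settings through getConf(); the tool has only a main() that parses arguments and delegates.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Nutch's configuration class (assumed API, not the real one).
class NutchConf {
  private final Map<String, String> props = new HashMap<>();
  void set(String name, String value) { props.put(name, value); }
  String get(String name, String defaultValue) {
    return props.getOrDefault(name, defaultValue);
  }
}

// Interface and base class as given in the proposal.
interface NutchConfigurable {
  void setConf(NutchConf conf);
  NutchConf getConf();
}

class NutchConfigured implements NutchConfigurable {
  private NutchConf conf;
  public NutchConfigured(NutchConf conf) { setConf(conf); }
  public void setConf(NutchConf conf) { this.conf = conf; }
  public NutchConf getConf() { return conf; }
}

// Hypothetical action: all application logic lives here.
class DescribeSegmentAction extends NutchConfigured {
  DescribeSegmentAction(NutchConf conf) { super(conf); }

  // Builds a one-line description using a value from the injected configuration.
  String run(String segment) {
    String agent = getConf().get("http.agent.name", "unset");
    return "segment=" + segment + " agent=" + agent;
  }
}

// Hypothetical tool: only main() and command-line parsing, no application logic.
class DescribeSegmentTool {
  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("Usage: DescribeSegmentTool <segment>");
      return;
    }
    NutchConf conf = new NutchConf();
    System.out.println(new DescribeSegmentAction(conf).run(args[0]));
  }
}
```

Because the action takes its NutchConf explicitly, other code can construct it with a different configuration without going through the command line.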

Comments?

Doug





_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
