[ 
https://issues.apache.org/jira/browse/NUTCH-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188473#comment-13188473
 ] 

Edward Drapkin commented on NUTCH-1201:
---------------------------------------

I've been thinking about this some more and I'd like to suggest the following:

1) Break fetcher into a plugin with an extension point of "fetch" or "fetcher".

2) The Fetcher interface should define a single method, 
`void fetch(Path segment)`, and extend Pluggable and Configurable.

3) We provide a fetcher-base (or maybe fetcher-abstract?) plugin that other 
plugins may optionally extend; this plugin should include abstract classes 
that provide a skeletal framework to build a fetcher on.  At least to start 
with, I would roughly copy the structure of the current fetcher, although I'd 
expect fetcher-base to need some changes as other implementations are built.

4) Provide a fetcher-default (or fetcher-bc) plugin that extends the 
fetcher-base plugin and provides a full implementation of the fetcher that is 
backwards compatible (this is where the current fetcher would be moved to).

5) Once here, we can start creating other fetcher plugins, like 
fetcher-depthfirst, fetcher-scorepriority, fetcher-singlethread, etc.  
Eventually we should be able to offer a plethora of fetching strategies and 
options for people who want to configure their fetching behavior.
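
A rough sketch of how items 2 through 4 might fit together. This is not existing Nutch code: `Pluggable` and `Configurable` are local stand-ins for the Nutch and Hadoop interfaces, `java.nio.file.Path` stands in for Hadoop's `Path`, and `AbstractFetcher`, `InMemoryFetcher`, `ReverseOrderFetcher`, the hook methods, and the segment path are all hypothetical names chosen for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Demo entry point; declared first so single-file launch picks it up.
class FetcherSketch {
    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://a.example/", "http://b.example/");
        InMemoryFetcher fetcher = new ReverseOrderFetcher(urls);
        fetcher.fetch(Paths.get("segments", "20120118")); // placeholder segment path
        System.out.println(fetcher.fetched);
    }
}

// Stand-ins (assumptions) so the sketch compiles without Nutch or Hadoop.
interface Pluggable {}

interface Configurable {
    void setConf(Map<String, String> conf);
    Map<String, String> getConf();
}

// Item 2: the extension point exposes a single method.
interface Fetcher extends Pluggable, Configurable {
    void fetch(Path segment);
}

// Item 3: a skeletal base class that other fetchers may extend.
abstract class AbstractFetcher implements Fetcher {
    private Map<String, String> conf = new HashMap<>();

    @Override public void setConf(Map<String, String> conf) { this.conf = conf; }
    @Override public Map<String, String> getConf() { return conf; }

    @Override
    public void fetch(Path segment) {
        // Template method: read the fetch list, order it, fetch each entry.
        for (String url : order(readFetchList(segment))) {
            fetchOne(url);
        }
    }

    protected abstract List<String> readFetchList(Path segment);
    protected List<String> order(List<String> urls) { return urls; } // default: keep order
    protected abstract void fetchOne(String url);
}

// Item 4 analogue: a "default" implementation; here it only records URLs.
class InMemoryFetcher extends AbstractFetcher {
    private final List<String> fetchList;
    final List<String> fetched = new ArrayList<>();

    InMemoryFetcher(List<String> fetchList) { this.fetchList = fetchList; }

    @Override protected List<String> readFetchList(Path segment) { return fetchList; }
    @Override protected void fetchOne(String url) { fetched.add(url); }
}

// Item 5 analogue: an alternative strategy overrides only the ordering hook.
class ReverseOrderFetcher extends InMemoryFetcher {
    ReverseOrderFetcher(List<String> fetchList) { super(fetchList); }

    @Override
    protected List<String> order(List<String> urls) {
        List<String> copy = new ArrayList<>(urls);
        Collections.reverse(copy);
        return copy;
    }
}
```

The point of the hook methods is that a new strategy only overrides the step it changes, rather than copying the whole fetch loop.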

I can start working on this sometime this week, but it's a large undertaking: 
a significant refactoring of a major component of Nutch.  I don't want to 
invest the hours it will take unless you guys who run this project are 
comfortable with the level of refactoring I'm suggesting.
                
> Allow for different FetcherThread impls
> ---------------------------------------
>
>                 Key: NUTCH-1201
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1201
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.5
>
>
> For certain cases we need to modify parts in FetcherThread and make it 
> pluggable. This introduces a new config directive fetcher.impl that takes a 
> FQCN and uses that setting in Fetcher.fetch to load a class to use for 
> job.setMapRunnerClass(). This new class has to extend Fetcher and its inner 
> class FetcherThread. This allows for overriding methods in FetcherThread but 
> also methods in Fetcher itself if required.
> A follow-up on this issue would be to refactor parts of FetcherThread to make 
> it easier to override small sections instead of copying the entire method 
> body for a small change, which is now the case.
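
For illustration, the fetcher.impl lookup described in the issue could look roughly like the sketch below. It uses plain reflection and a `Map` in place of the Hadoop configuration, and it does not reproduce the actual patch; `BaseFetcher`, `CustomFetcher`, and `FetcherImplLoader` are hypothetical stand-ins, and only the key name fetcher.impl comes from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Resolves the class named by fetcher.impl; declared first so single-file launch picks it up.
class FetcherImplLoader {
    static final String FETCHER_IMPL_KEY = "fetcher.impl";

    // Returns the configured class, or the default when the key is unset.
    static Class<? extends BaseFetcher> resolve(Map<String, String> conf) {
        String fqcn = conf.get(FETCHER_IMPL_KEY);
        if (fqcn == null || fqcn.isEmpty()) {
            return BaseFetcher.class;
        }
        try {
            // asSubclass rejects classes that do not extend the base fetcher.
            return Class.forName(fqcn).asSubclass(BaseFetcher.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown " + FETCHER_IMPL_KEY + ": " + fqcn, e);
        }
    }

    // Instantiates the resolved class through its no-argument constructor.
    static BaseFetcher create(Map<String, String> conf) {
        try {
            return resolve(conf).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate fetcher", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FETCHER_IMPL_KEY, CustomFetcher.class.getName());
        System.out.println(create(conf).describe());
    }
}

// Stand-in (assumption) for the existing Fetcher class.
class BaseFetcher {
    String describe() { return "default"; }
}

// Hypothetical custom implementation selected through configuration.
class CustomFetcher extends BaseFetcher {
    @Override
    String describe() { return "custom"; }
}
```

In the real job setup, the resolved class would be the one handed to job.setMapRunnerClass(), as the issue describes.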

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
