[
https://issues.apache.org/jira/browse/NUTCH-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188473#comment-13188473
]
Edward Drapkin commented on NUTCH-1201:
---------------------------------------
I've been thinking about this some more and I'd like to suggest the following:
1) Break fetcher into a plugin with an extension point of "fetch" or "fetcher".
2) The Fetcher interface should define a single method,
`void fetch(Path segment)`, and extend Pluggable and Configurable.
3) We provide a fetcher-base (or maybe fetcher-abstract?) plugin that other
plugins may or may not extend; this plugin should include abstract classes
that provide a skeletal framework to build a fetcher on. At least to start
with, I would roughly copy the structure of the current fetcher, although I'd
expect fetcher-base to need some changes as other implementations are built.
4) Provide a fetcher-default (or fetcher-bc) plugin that extends the
fetcher-base plugin and provides a full implementation of the fetcher that is
backwards compatible (this is where the current fetcher would be moved to).
5) Once here, we can start creating other fetcher plugins, like
fetcher-depthfirst, fetcher-scorepriority, fetcher-singlethread, etc.
Eventually we should be able to offer a plethora of fetching strategies and
options for people who want to configure their fetching behavior.
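To make points 2) through 4) concrete, here is a rough sketch of how the pieces might fit together. It is only an illustration under stated assumptions: `Pluggable` and `Configurable` below are local stand-ins for Nutch's and Hadoop's interfaces, `java.util.Properties` and `java.nio.file.Path` stand in for Hadoop's `Configuration` and `Path`, and `AbstractFetcher`, `RecordingFetcher`, `readFetchList` and `fetchOne` are hypothetical names that do not exist in Nutch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Stand-in (assumption) for Nutch's plugin marker interface.
interface Pluggable {}

// Stand-in (assumption) for Hadoop's Configurable; Properties replaces Configuration.
interface Configurable {
    void setConf(Properties conf);
    Properties getConf();
}

// Point 2: the extension point exposes a single method.
interface Fetcher extends Pluggable, Configurable {
    void fetch(Path segment);
}

// Point 3: a hypothetical fetcher-base skeleton that owns the overall loop
// and leaves the strategy-specific steps to subclasses.
abstract class AbstractFetcher implements Fetcher {
    private Properties conf = new Properties();

    @Override
    public void setConf(Properties conf) { this.conf = conf; }

    @Override
    public Properties getConf() { return conf; }

    @Override
    public void fetch(Path segment) {
        for (String url : readFetchList(segment)) {
            fetchOne(url);
        }
    }

    // Hypothetical hooks; a real fetcher-base would likely expose different ones.
    protected abstract List<String> readFetchList(Path segment);
    protected abstract void fetchOne(String url);
}

// Points 4 and 5: a concrete plugin only fills in the hooks. This toy version
// records URLs instead of fetching them.
class RecordingFetcher extends AbstractFetcher {
    final List<String> fetched = new ArrayList<>();

    @Override
    protected List<String> readFetchList(Path segment) {
        // A real implementation would read the segment's fetch list here.
        return Arrays.asList("http://example.org/a", "http://example.org/b");
    }

    @Override
    protected void fetchOne(String url) { fetched.add(url); }
}
```

Calling `new RecordingFetcher().fetch(Paths.get("segments/example"))` would walk the two hard-coded URLs through `fetchOne`; a depth-first or score-priority plugin would differ only in how it implements the hooks.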
I can start working on this sometime this week, but it's a large undertaking:
a significant refactoring of a major component of Nutch. I don't want to
invest the hours it will take unless you guys who run this project are
comfortable with the level of refactoring I'm suggesting.
> Allow for different FetcherThread impls
> ---------------------------------------
>
> Key: NUTCH-1201
> URL: https://issues.apache.org/jira/browse/NUTCH-1201
> Project: Nutch
> Issue Type: New Feature
> Components: fetcher
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.5
>
>
> For certain cases we need to modify parts of FetcherThread and make it
> pluggable. This introduces a new config directive, fetcher.impl, that takes a
> FQCN; Fetcher.fetch uses that setting to load the class to use for
> job.setMapRunnerClass(). This new class has to extend Fetcher and its inner
> class FetcherThread. This allows for overriding methods in FetcherThread but
> also methods in Fetcher itself if required.
> A follow up on this issue would be to refactor parts of FetcherThread to make
> it easier to override small sections instead of copying the entire method
> body for a small change, which is now the case.