[
https://issues.apache.org/jira/browse/NUTCH-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-3034:
-------------------------------------------
Assignee: Lewis John McGibbney
> Evolve the legacy Nutch plugin framework to use PF4J
> ----------------------------------------------------
>
> Key: NUTCH-3034
> URL: https://issues.apache.org/jira/browse/NUTCH-3034
> Project: Nutch
> Issue Type: Improvement
> Components: pf4j, plugin
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Labels: gsoc2024
> Fix For: 1.22
>
>
> h1. Motivation
> Plugins provide a large part of the functionality of Nutch. Although the
> legacy plugin framework continues to offer lots of value i.e.,
> # some aspects, e.g. examples, are [fairly well
> documented|https://cwiki.apache.org/confluence/display/NUTCH/PluginCentral]
> # it is generally stable, and
> # offers reasonable test coverage (on a plugin-by-plugin basis)
> # … probably loads more positives which I am overlooking...
> … there are also several aspects which could be improved
> # the [core framework is sparsely
> documented|https://cwiki.apache.org/confluence/display/NUTCH/WhichTechnicalConceptsAreBehindTheNutchPluginSystem],
> this extends to very important aspects like the {*}plugin lifecycle{*},
> {*}classloading{*}, {*}packaging{*}, {*}thread safety{*}, and lots of other
> topics which are of intrinsic value to developers and maintainers.
> # the core framework is somewhat [sparsely
> tested|https://github.com/apache/nutch/blob/master/src/test/org/apache/nutch/plugin/TestPluginSystem.java],
> with 7 tests at the time of writing. Traditionally, developers have focused
> on providing unit tests at the plugin level rather than for the legacy plugin
> framework itself.
> # it sees very little maintenance/attention. It is my gut feeling (and I may
> be totally wrong here) that not many people know much about the core legacy
> plugin framework.
> # writing plugins is clunky. This largely has to do with the legacy Ant +
> Ivy build and dependency management system, but it is clunky nonetheless.
> # generally speaking, any reduction of code in the Nutch codebase through
> careful selection of, and dependence on, well maintained, well tested 3rd
> party libraries would be a good thing.
> *This issue therefore proposes to overhaul the legacy Nutch plugin
> framework and replace it with Plugin Framework for Java (PF4J).*
> h1. Task Breakdown
> The following is a proposed breakdown of this overall initiative into Epics.
> These Epics should likely be decomposed further, but that will be left to
> the implementer(s).
> # {*}document the legacy Nutch plugin lifecycle{*}; taking inspiration from
> [PF4J’s plugin lifecycle
> documentation|https://pf4j.org/doc/plugin-lifecycle.html], provide both
> documentation and a diagram which clearly outline how the legacy plugin
> lifecycle works. It might also be a good idea to make a contribution to PF4J
> and provide them with a diagram to accompany their documentation :).
> Generally speaking, just familiarize oneself with the legacy plugin framework
> and understand where the gaps are.
> # {*}study the PF4J framework and perform a feasibility study{*}; this
> will provide an opportunity to identify gaps between what the legacy plugin
> framework does (and what Nutch needs) vs. what PF4J provides. Touch base with
> the PF4J community and describe the intention to replace the legacy Nutch
> plugin framework with PF4J. Obtain guidance on how to proceed. Document this
> all in the Nutch wiki. Create a mapping of [legacy
> classes|https://github.com/apache/nutch/tree/master/src/java/org/apache/nutch/plugin]
> to [PF4J
> equivalents|https://github.com/pf4j/pf4j/tree/master/pf4j/src/main/java/org/pf4j].
> # {*}Restructure the legacy Nutch plugin package{*}:
> [https://github.com/apache/nutch/tree/master/src/java/org/apache/nutch/plugin]
> # {*}Restructure each plugin in the plugins directory{*}:
> [https://github.com/apache/nutch/tree/master/src/plugin]
> # *Update Nutch plugin documentation*
> # {*}Create/propose plugin utility tooling{*}: #4 in the motivation section
> states that developing plugins is clunky. A utility tool which streamlines
> the creation of new plugins would be ideal. For example, this could take the
> form of a [new bash
> script|https://github.com/apache/nutch/tree/master/src/bin] which prompts
> the developer for input and then generates the plugin skeleton. {*}This is a
> nice to have{*}.
> h1. Google Summer of Code Details
> This initiative is being proposed as a GSoC 2024 project.
> {*}Proposed Mentor{*}: [~lewismc]
> {*}Proposed Co-Mentor{*}:
>
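The plugin skeleton generator suggested in the last item of the task breakdown could be prototyped along the following lines. This is only a minimal sketch, not a proposed implementation: it uses Python rather than the bash script the issue mentions, purely to keep the example short, and the function name `create_plugin_skeleton`, the directory layout (`src/java`, `src/test`) and the stub `plugin.xml` attributes are assumptions loosely modelled on the existing legacy plugins. A PF4J-based layout may well look different.

```python
from pathlib import Path
from xml.sax.saxutils import quoteattr

# Hypothetical template: attribute names are modelled on the plugin.xml
# descriptors of existing legacy plugins; a PF4J descriptor may differ.
PLUGIN_XML = """<?xml version="1.0" encoding="UTF-8"?>
<plugin id={id} name={name} version={version} provider-name={provider}>
  <!-- TODO: declare runtime libraries, required plugins and extensions -->
</plugin>
"""


def create_plugin_skeleton(root, plugin_id, name,
                           version="1.0.0", provider="nutch.org"):
    """Create <root>/<plugin_id>/ with source directories and a stub plugin.xml."""
    plugin_dir = Path(root) / plugin_id
    if plugin_dir.exists():
        # Refuse to overwrite an existing plugin directory.
        raise FileExistsError(f"{plugin_dir} already exists")
    for sub in ("src/java", "src/test"):
        (plugin_dir / sub).mkdir(parents=True)
    descriptor = PLUGIN_XML.format(
        id=quoteattr(plugin_id),      # quoteattr escapes and quotes the value
        name=quoteattr(name),
        version=quoteattr(version),
        provider=quoteattr(provider),
    )
    (plugin_dir / "plugin.xml").write_text(descriptor, encoding="utf-8")
    return plugin_dir


def main():
    """Prompt the developer for input, then generate the skeleton."""
    plugin_id = input("Plugin id (e.g. urlfilter-example): ").strip()
    name = input("Human-readable name: ").strip()
    print("Created", create_plugin_skeleton("src/plugin", plugin_id, name))
```

Calling `main()` from the root of a Nutch checkout would prompt for the two values and create the directory under `src/plugin`. Build files (for example `build.xml` and `ivy.xml` in the legacy layout) are deliberately left out, since their shape depends on the outcome of the restructuring tasks above.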
--
This message was sent by Atlassian Jira
(v8.20.10#820010)