[ 
https://issues.apache.org/jira/browse/NUTCH-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned NUTCH-3034:
-------------------------------------------

    Assignee: Lewis John McGibbney

> Evolve the legacy Nutch plugin framework to use PF4J
> ----------------------------------------------------
>
>                 Key: NUTCH-3034
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3034
>             Project: Nutch
>          Issue Type: Improvement
>          Components: pf4j, plugin
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>              Labels: gsoc2024
>             Fix For: 1.22
>
>
> h1. Motivation
> Plugins provide a large part of the functionality of Nutch. Although the 
> legacy plugin framework continues to offer lots of value, i.e.,
>  # some aspects, e.g. examples, are [fairly well 
> documented|[https://cwiki.apache.org/confluence/display/NUTCH/PluginCentral]],
>  # it is generally stable, and
>  # offers reasonable test coverage (on a plugin-by-plugin basis)
>  # … probably loads more positives which I am overlooking...
> … there are also several aspects which could be improved:
>  # the [core framework is sparsely 
> documented|[https://cwiki.apache.org/confluence/display/NUTCH/WhichTechnicalConceptsAreBehindTheNutchPluginSystem]];
>  this extends to very important aspects like the {*}plugin lifecycle{*}, 
> {*}classloading{*}, {*}packaging{*}, {*}thread safety{*}, and many other 
> topics which are of intrinsic value to developers and maintainers. 
>  # the core framework is somewhat [sparsely 
> tested|[https://github.com/apache/nutch/blob/master/src/test/org/apache/nutch/plugin/TestPluginSystem.java]]…
>  7 tests at the time of writing. Traditionally, developers have focused on 
> providing unit tests at the plugin level rather than for the legacy plugin 
> framework itself.
>  # the core framework sees very little maintenance/attention. My gut feeling 
> (and I may be totally wrong here) is that not many people know much about the 
> core legacy plugin framework.
>  # writing plugins is clunky. This largely has to do with the legacy Ant + 
> Ivy build and dependency management system, but that being said, it is clunky 
> nonetheless.
>  # generally speaking, any reduction of code in the Nutch codebase through 
> careful selection of, and dependence on, well-maintained, well-tested 
> third-party libraries would be a good thing.
> *This issue therefore proposes to overhaul the legacy Nutch plugin 
> framework and replace it with Plugin Framework for Java (PF4J).*
> h1. Task Breakdown
> The following is a proposed breakdown of this overall initiative into Epics. 
> These Epics should likely be decomposed further, but that will be left to 
> the implementer(s).
>  # {*}document the legacy Nutch plugin lifecycle{*}; taking inspiration from 
> [PF4J’s plugin lifecycle 
> documentation|[https://pf4j.org/doc/plugin-lifecycle.html]], provide both 
> documentation and a diagram which clearly outline how the legacy plugin 
> lifecycle works. It might also be a good idea to make a contribution to PF4J and 
> provide them with a diagram to accompany their documentation :). Generally 
> speaking, just familiarize oneself with the legacy plugin framework and 
> understand where the gaps are.
>  # {*}study the PF4J framework and perform a feasibility study{*}; this 
> will provide an opportunity to identify gaps between what the legacy plugin 
> framework does (and what Nutch needs) vs. what PF4J provides. Touch base with 
> the PF4J community and describe the intention to replace the legacy Nutch plugin 
> framework with PF4J. Obtain guidance on how to proceed. Document this all in 
> the Nutch wiki. Create a mapping of [legacy 
> Classes|[https://github.com/apache/nutch/tree/master/src/java/org/apache/nutch/plugin]]
>  to [PF4J 
> equivalents|[https://github.com/pf4j/pf4j/tree/master/pf4j/src/main/java/org/pf4j]].
>  # {*}Restructure the legacy Nutch plugin package{*}: 
> [https://github.com/apache/nutch/tree/master/src/java/org/apache/nutch/plugin]
>  # {*}Restructure each plugin in the plugins directory{*}: 
> [https://github.com/apache/nutch/tree/master/src/plugin]
>  # *Update Nutch plugin documentation* 
>  # {*}Create/propose plugin utility tooling{*}: #4 in the motivation section 
> states that developing plugins is clunky. A utility tool which streamlines 
> the creation of new plugins would be ideal. For example, this could take the 
> form of a [new bash 
> script|[https://github.com/apache/nutch/tree/master/src/bin]] which prompts 
> the developer for input and then generates the plugin skeleton. {*}This is a 
> nice-to-have{*}.
> h1. Google Summer of Code Details
> This initiative is being proposed as a GSoC 2024 project. 
> {*}Proposed Mentor{*}: [~lewismc] 
> {*}Proposed Co-Mentor{*}:
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
