Lewis John McGibbney created NUTCH-3125:
-------------------------------------------
Summary: Replace retired MRUnit dependency with Mockito + JUnit 5
Key: NUTCH-3125
URL: https://issues.apache.org/jira/browse/NUTCH-3125
Project: Nutch
Issue Type: Task
Components: build, dependency, test
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 1.22
[Apache MRUnit|https://mrunit.apache.org/] was a specialized Java library for
unit testing MapReduce components (mappers, reducers, and drivers) in
isolation, without needing a full Hadoop cluster. Since its retirement in 2016
due to inactivity, the community has shifted toward more general-purpose
testing tools that can handle Hadoop's unique architecture, for example by
mocking contexts, writables, and static methods.
I forgot that Nutch depends on MRUnit until I revisited NUTCH-288.
I propose we replace the MRUnit test dependency with
[Mockito|https://site.mockito.org/], a popular mocking framework that would
allow us to mock Hadoop's Mapper.Context, Reducer.Context, and other framework
elements. This directly replicates MRUnit's ability to test mappers/reducers in
isolation by simulating input/output without a cluster. Mockito is lightweight,
actively maintained, and doesn't require Hadoop-specific jars beyond the
existing project dependencies defined in ivy.xml. Mockito can be combined with
JUnit 5 for assertions. For static method mocking (e.g., Hadoop counters),
apparently we can even pair it with
[PowerMock|https://github.com/powermock/powermock]!
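To make the idea concrete, here is a minimal sketch of what a mapper test might look like with Mockito and JUnit 5. It assumes Hadoop's MapReduce classes, Mockito, and JUnit Jupiter are on the test classpath. TokenCountMapper and the test class name are hypothetical and exist only for illustration; they are not Nutch classes. The mapper widens map() to public so the test can call it directly.

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.verifyNoMoreInteractions;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.junit.jupiter.api.Test;

public class TokenCountMapperTest {

  /** Hypothetical mapper used only for illustration; not part of Nutch. */
  static class TokenCountMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Overridden as public (Mapper declares it protected) so the test
    // can invoke it without a running job.
    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          context.write(new Text(token), new IntWritable(1));
        }
      }
    }
  }

  @Test
  @SuppressWarnings("unchecked")
  void emitsOnePairPerToken() throws Exception {
    // Mockito stands in for the framework-provided context,
    // so no cluster or job setup is needed.
    Mapper<LongWritable, Text, Text, IntWritable>.Context context =
        mock(Mapper.Context.class);

    new TokenCountMapper()
        .map(new LongWritable(0), new Text("foo bar"), context);

    // Text and IntWritable implement equals(), so verify() can match
    // the emitted pairs by value.
    verify(context).write(new Text("foo"), new IntWritable(1));
    verify(context).write(new Text("bar"), new IntWritable(1));
    verifyNoMoreInteractions(context);
  }
}
```

The same pattern should carry over to Reducer.Context. Whether it maps cleanly onto the two existing MRUnit-based tests would need to be checked against their actual drivers.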
Currently MRUnit is used in the following test classes:
./org/apache/nutch/crawl/CrawlDbUpdateTestDriver.java
./org/apache/nutch/indexer/TestIndexerMapReduce.java