Sebastian Nagel created NUTCH-3127:
--------------------------------------
Summary: Deprecate or remove DmozParser
Key: NUTCH-3127
URL: https://issues.apache.org/jira/browse/NUTCH-3127
Project: Nutch
Issue Type: Improvement
Components: tool
Reporter: Sebastian Nagel
Fix For: 1.22
The tool
[DmozParser|https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/tools/DmozParser.java]
imports links from [DMOZ|https://en.wikipedia.org/wiki/DMOZ] RDF dumps. However:
- DMOZ was closed in 2017
- The "successor" Curlie does not provide RDF dumps, although they "are busy on
preparing a clean download" ([Curlie Data -
RDF|https://curlie.org/docs/en/rdf.html])
We should either deprecate the tool, adding a notice about the state of DMOZ, or
simply remove it.
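For the deprecation option, a minimal sketch of what the notice could look like. It assumes the standard Java @Deprecated annotation plus a Javadoc @deprecated tag. The class name DmozParserSketch and the constant DEPRECATION_NOTICE are hypothetical placeholders, not the actual Nutch code, and the notice wording is only a suggestion.

```java
/**
 * Hypothetical stand-in for org.apache.nutch.tools.DmozParser. Only the
 * deprecation markers are shown; the real parsing logic is omitted.
 *
 * @deprecated DMOZ was closed in 2017 and its successor Curlie does not
 *             provide RDF dumps.
 */
@Deprecated
final class DmozParserSketch {

  /** Assumed notice text; the wording is illustrative only. */
  static final String DEPRECATION_NOTICE =
      "DmozParser is deprecated: DMOZ was closed in 2017 and Curlie does not provide RDF dumps.";

  public static void main(String[] args) {
    // Warn on stderr so the notice does not mix with regular tool output.
    System.err.println(DEPRECATION_NOTICE);
  }
}
```

The annotation makes the compiler warn callers, while the runtime message would reach users who invoke the tool from the command line.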
--
This message was sent by Atlassian Jira
(v8.20.10#820010)