Sebastian Nagel created NUTCH-3127:
--------------------------------------

             Summary: Deprecate or remove DmozParser
                 Key: NUTCH-3127
                 URL: https://issues.apache.org/jira/browse/NUTCH-3127
             Project: Nutch
          Issue Type: Improvement
          Components: tool
            Reporter: Sebastian Nagel
             Fix For: 1.22


The tool 
[DmozParser|https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/tools/DmozParser.java]
 imports links from [DMOZ|https://en.wikipedia.org/wiki/DMOZ] RDF dumps. However:
- DMOZ was closed in 2017
- The "successor" Curlie does not provide RDF dumps, although they "are busy on 
preparing a clean download" ([Curlie Data - 
RDF|https://curlie.org/docs/en/rdf.html])

We should either deprecate the tool, adding a notice about the state of DMOZ, 
or simply remove it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
