Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND
-------------------------------------------------------------------------

                 Key: NUTCH-979
                 URL: https://issues.apache.org/jira/browse/NUTCH-979
             Project: Nutch
          Issue Type: New Feature
          Components: indexer
    Affects Versions: 2.0
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
            Priority: Minor
             Fix For: 2.0


During a recrawl, some URLs may turn out to have expired (i.e. they no longer 
exist and return 404).
This issue adds a new indexer command that scans for WebPages with 
ProtocolStatusCodes.NOTFOUND and issues delete commands to Solr for them.
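
The scan-and-delete step described above could be sketched roughly as follows.
This is only an illustration, not the patch itself: the Status enum, the
in-memory map of pages and the class name are hypothetical stand-ins for the
Nutch storage layer, and it assumes the Solr document id is the page URL. The
only external fact relied on is Solr's XML update message for deleting by id
(<delete><id>...</id></delete>). Sending the message to Solr is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NotFoundDeleteSketch {

    // Hypothetical stand-in for the protocol status stored with each WebPage.
    enum Status { SUCCESS, NOTFOUND, MOVED }

    // Escape the characters that would break the XML element content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Scan all pages and collect the URLs whose last fetch returned NOTFOUND.
    static List<String> urlsToDelete(Map<String, Status> pages) {
        List<String> gone = new ArrayList<>();
        for (Map.Entry<String, Status> e : pages.entrySet()) {
            if (e.getValue() == Status.NOTFOUND) {
                gone.add(e.getKey());
            }
        }
        return gone;
    }

    // Build one Solr XML delete-by-id message (assumption: id == URL).
    static String buildDeleteXml(List<String> ids) {
        StringBuilder sb = new StringBuilder("<delete>");
        for (String id : ids) {
            sb.append("<id>").append(escapeXml(id)).append("</id>");
        }
        return sb.append("</delete>").toString();
    }

    public static void main(String[] args) {
        Map<String, Status> pages = new LinkedHashMap<>();
        pages.put("http://example.com/", Status.SUCCESS);
        pages.put("http://example.com/old?a=1&b=2", Status.NOTFOUND);
        System.out.println(buildDeleteXml(urlsToDelete(pages)));
    }
}
```

Batching all ids into a single delete message keeps the number of requests
low; a real command would presumably also need to commit afterwards.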

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
