Sebastian Nagel created NUTCH-2008:
--------------------------------------
Summary: IndexerMapReduce to use single instance of NutchIndexAction for deletions
Key: NUTCH-2008
URL: https://issues.apache.org/jira/browse/NUTCH-2008
Project: Nutch
Issue Type: Improvement
Components: indexer
Affects Versions: 1.10
Reporter: Sebastian Nagel
Priority: Trivial
Fix For: 1.11
For every URL/document to be deleted, IndexerMapReduce creates a new instance
of NutchIndexAction (in several places):
{code}
NutchIndexAction action = new NutchIndexAction(null, NutchIndexAction.DELETE);
output.collect(key, action);
{code}
Since the index action does not hold any data specific to a URL/document, it
would be more efficient to reuse a single instance.
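
A minimal sketch of the proposed reuse, not the actual patch. IndexAction and
DeleteActionReuseSketch are hypothetical stand-ins so the example compiles
without Nutch or Hadoop on the classpath, and the list of entries stands in for
output.collect(key, action). It assumes the collector never mutates the action
it is handed.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NutchIndexAction; not the real Nutch class.
final class IndexAction {
    static final byte DELETE = 1; // illustrative constant value

    final Object doc;  // always null for deletions
    final byte action;

    IndexAction(Object doc, byte action) {
        this.doc = doc;
        this.action = action;
    }
}

final class DeleteActionReuseSketch {
    // Created once and shared: a delete action carries no per-URL state.
    private static final IndexAction DELETE_ACTION =
        new IndexAction(null, IndexAction.DELETE);

    // Stand-in for the output.collect(key, action) calls: pairs every key
    // with the same shared delete action instead of allocating a new one.
    static List<Map.Entry<String, IndexAction>> collectDeletes(List<String> keys) {
        List<Map.Entry<String, IndexAction>> output = new ArrayList<>();
        for (String key : keys) {
            output.add(new AbstractMap.SimpleImmutableEntry<>(key, DELETE_ACTION));
        }
        return output;
    }
}
```

Each key is still emitted separately; only the allocation of the action object
is saved, so the gain is likely small, in line with the Trivial priority.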