[ https://issues.apache.org/jira/browse/NUTCH-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Claudiu Chis updated NUTCH-1294:
--------------------------------
Attachment: NUTCH-1294-v3.patch
- no changes to Java files
- added logging for IndexCleanerJob
- the patch now deploys fully on its own (in v2, the changes to src/bin/nutch
and conf/log4j.properties had to be applied manually)
> IndexClean job with solr implementation.
> ----------------------------------------
>
> Key: NUTCH-1294
> URL: https://issues.apache.org/jira/browse/NUTCH-1294
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: nutchgora
> Reporter: Dan Rosher
> Priority: Minor
> Fix For: 2.3
>
> Attachments: NUTCH-1294.patch, NUTCH-1294-v2.patch,
> NUTCH-1294-v3.patch
>
>
> I started by copying and altering the trunk version of SolrClean, though it
> was inadequate for our needs: we needed to mark particular pages as gone even
> though they might still be visible on the web. This implementation abstracts
> the index cleaning process, provides a Solr implementation, and adds a clean
> index plugin extension that allows others to tailor how pages are removed
> from their store.