[ https://issues.apache.org/jira/browse/NUTCH-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-1958:
----------------------------------------
Fix Version/s: 1.11 (was: 1.10)
> Remove scoring-opic from nutch-default.xml
> ------------------------------------------
>
> Key: NUTCH-1958
> URL: https://issues.apache.org/jira/browse/NUTCH-1958
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 2.3, 1.9
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 2.4, 1.11
>
>
> I propose we remove scoring-opic from nutch-default.xml. We all know it is
> flawed for any kind of incremental crawl, which is what most of us run. It is
> also useless for a single crawl: if you must crawl all records of a domain,
> using OPIC to prioritize URLs makes no sense. It also confuses users, as we
> have seen in the past and again recently [1].
> What do you think?
> [1]: http://lucene.472066.n3.nabble.com/Nutch-documents-have-huge-scores-in-Solr-td4192064.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)