[ https://issues.apache.org/jira/browse/NUTCH-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Kubes updated NUTCH-614:
-------------------------------
Patch Info: [Patch Available]
> Order Inlinks by OPIC score of parent page
> ------------------------------------------
>
> Key: NUTCH-614
> URL: https://issues.apache.org/jira/browse/NUTCH-614
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Environment: All
> Reporter: Dennis Kubes
> Assignee: Dennis Kubes
> Fix For: 0.9.0, 1.0.0
>
> Attachments: NUTCH-614-1-20080219.patch
>
>
> Currently, when inlinks are saved, only a configurable maximum number of
> them is kept, and very little logic goes into deciding which ones are
> kept. This patch uses the OPIC score of the page containing each link to
> assign a score to that inlink. Inlinks are then sorted by score in
> descending order and the highest-scoring inlinks are saved first. The
> rationale is that pages with higher OPIC scores should provide
> better-quality links.
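
The selection step described above can be sketched roughly as follows. This is an illustration only, not the code in the attached patch: the class InlinkRanker, the ScoredInlink type, and the keepBest method are hypothetical names, and the score is assumed to be a plain float copied from the OPIC score of the linking page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; none of these names come from Nutch or the patch. */
public class InlinkRanker {

    /** Hypothetical stand-in for an inlink carrying the score of its parent page. */
    public static final class ScoredInlink {
        public final String fromUrl;
        public final String anchor;
        public final float score; // assumed: OPIC score of the page containing the link

        public ScoredInlink(String fromUrl, String anchor, float score) {
            this.fromUrl = fromUrl;
            this.anchor = anchor;
            this.score = score;
        }
    }

    /**
     * Returns at most maxInlinks inlinks, highest score first.
     * The input list is left unmodified.
     */
    public static List<ScoredInlink> keepBest(List<ScoredInlink> inlinks, int maxInlinks) {
        List<ScoredInlink> sorted = new ArrayList<>(inlinks);
        // Reverse sort: inlinks from the highest-scoring parent pages come first.
        sorted.sort(Comparator.comparingDouble((ScoredInlink l) -> l.score).reversed());
        int limit = Math.max(0, Math.min(maxInlinks, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }
}
```

With a limit of 2 and three inlinks scored 0.2, 0.9 and 0.5, this sketch keeps the 0.9 and 0.5 inlinks, in that order, and drops the 0.2 one.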