[
https://issues.apache.org/jira/browse/NUTCH-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106667#comment-15106667
]
Dennis Kubes commented on NUTCH-2201:
-------------------------------------
+1 on this.
The loops program, iirc, is a factorial-time algorithm. Beyond a depth of
around 3, depending on resources and input, its run time becomes excessive.
It does find cycles in the webgraph, which can be useful since link cycles
are one way people try to game search, but there have to be better algorithms.
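
As a rough illustration of what a cheaper approach could look like (this is
not the Nutch Loops code and not something proposed in this issue), the
sketch below finds every node that lies on at least one directed cycle by
computing strongly connected components with Tarjan's algorithm, which runs
in time linear in the number of nodes plus edges. The function name
nodes_on_cycles and the edge-list input format are assumptions made up for
this example.

```python
from collections import defaultdict


def nodes_on_cycles(edges):
    """Return the set of nodes that lie on at least one directed cycle.

    `edges` is an iterable of (source, target) pairs; this input format is
    an assumption for the example, not the Nutch WebGraph format.
    """
    graph = defaultdict(list)
    self_loops = set()
    for src, dst in edges:
        graph[src].append(dst)
        graph[dst]  # register nodes that only appear as link targets
        if src == dst:
            self_loops.add(src)

    index = {}  # discovery order of each visited node
    low = {}    # lowest discovery index reachable from the node
    stack, on_stack = [], set()
    result = set(self_loops)
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:
            # node is the root of a strongly connected component
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            # a component with more than one node always contains a cycle
            if len(component) > 1:
                result.update(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return result


if __name__ == "__main__":
    links = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "e")]
    print(sorted(nodes_on_cycles(links)))
```

The sketch is recursive, so a very deep graph would need an iterative
variant, and a real webgraph would not fit in memory on one machine. It only
shows that membership in a cycle can be decided without enumerating paths
up to a fixed depth.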
> Remove loops program from webgraph package
> ------------------------------------------
>
> Key: NUTCH-2201
> URL: https://issues.apache.org/jira/browse/NUTCH-2201
> Project: Nutch
> Issue Type: Task
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Minor
>
> Recently Dennis mentioned that the loops program is a bad program. As the
> developer of the package, he recommends not using it.
> {quote}
> 2. Crawl the pages for 1 shard. Update the WebGraph and Linkrank as
> described here. https://wiki.apache.org/nutch/NewScoring. Don't use
> Loops. It was a bad program with a bad algorithm and I never should
> have put it in. Live and learn.
> {quote}
> See: https://www.mail-archive.com/[email protected]/msg14164.html