[ https://issues.apache.org/jira/browse/NUTCH-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593357#comment-13593357 ]
Markus Jelsma commented on NUTCH-1539:
--------------------------------------
Thanks, this is very interesting. I've checked the sources and read the PDF, but
I have a question: where is the root set of URLs (ExpandRoot -rootset
crawl/rootUrls.txt) that ExpandRoot is going to expand supposed to come from?
> Implement the Hypertext Induced Topic Search (HITS) algorithm in Nutch
> ----------------------------------------------------------------------
>
> Key: NUTCH-1539
> URL: https://issues.apache.org/jira/browse/NUTCH-1539
> Project: Nutch
> Issue Type: Bug
> Components: linkdb
> Environment: CSCI 572: Search Engines and Information Retrieval @
> USC, http://sunset.usc.edu/classes/cs572_2010/
> Nutch 1.1
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Fix For: 1.7
>
> Attachments: CS572CourseProjectReport_Yongqiang.pdf,
> csci572CourseProject_Yongqiang.rar,
> NUTCH-1538.yongqiang.Mattmann.030413.patch.txt
>
>
> In my Summer 2010 CSCI 572: Search Engines and Information Retrieval class,
> my student Yongqiang Li and I implemented the HITS algorithm in Nutch based
> on Jon Kleinberg's paper:
> Authoritative Sources in a Hyperlinked Environment
> http://dl.acm.org/citation.cfm?id=324140
> I'll put up the code we had shortly.