(moved from nutch-user) Nicolás Lichtmaier wrote:
> Should I post these kind of questions to the dev list instead?

Yes :)

> Hi, I'm working in a fixed set of URLs and I'd like to replace the
> standard OPIC score plugin with something different. I'd like to
> create a scoring plugin which entirely bases its score on the document
> parsed data (yes, I will trust the document text itself to decide its
> relevance).
>
> I've been reading the code and the ScoringFilter interface seems to be
> targeted for use by OPIC-like algorithms. For example, the step called
> after parsing is called "passScoreAfterParsing()", telling me what am
> I supposed to do in that method, and the method setting the scores is
> called "distributeScoreToOutlink()". All of this scares me... would it
> be safe to use these methods differently and, e.g., modify the
> document score in "passScoreAfterParsing()" instead of just "passing it"?

You can modify the score whichever way you want - it's up to you. These
methods simply ensure that the score data (not just the
CrawlDatum.getScore(), but possibly a multitude of metadata collected on
the way) is passed to the appropriate segment parts. E.g. in
distributeScoreToOutlink() you could simply set the default score for
new pages to a fixed value, without actually using the score information
from the source page.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
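To make the suggestion above concrete, here is a rough sketch of a scorer that derives a page's score from its parsed text alone and gives every newly discovered outlink a fixed default. Note that this is not the real Nutch API: PageRecord, the simplified method signatures, the term-fraction formula and DEFAULT_OUTLINK_SCORE are all hypothetical stand-ins so that the example compiles on its own. An actual plugin would implement the ScoringFilter interface and work with CrawlDatum and the parse data instead.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/**
 * Sketch only: mirrors the idea of two ScoringFilter steps with
 * simplified, hypothetical types instead of the real Nutch classes.
 */
final class ContentScoringSketch {

    /** Hypothetical stand-in for the per-page record that carries a score. */
    static final class PageRecord {
        float score;
    }

    /** Assumed fixed score for newly discovered pages (arbitrary value). */
    static final float DEFAULT_OUTLINK_SCORE = 1.0f;

    private final Set<String> relevantTerms = new HashSet<>();

    ContentScoringSketch(String... terms) {
        for (String term : terms) {
            relevantTerms.add(term.toLowerCase(Locale.ROOT));
        }
    }

    /** Fraction of tokens in the text that are relevant terms (0 if no tokens). */
    float scoreFromText(String parsedText) {
        int total = 0;
        int hits = 0;
        for (String token : parsedText.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            total++;
            if (relevantTerms.contains(token)) {
                hits++;
            }
        }
        return total == 0 ? 0.0f : (float) hits / total;
    }

    /** Overwrites the page score from the parsed text instead of passing it on. */
    void passScoreAfterParsing(PageRecord page, String parsedText) {
        page.score = scoreFromText(parsedText);
    }

    /** Ignores the source page's score; new pages get the fixed default. */
    void distributeScoreToOutlink(PageRecord source, PageRecord target) {
        target.score = DEFAULT_OUTLINK_SCORE;
    }

    public static void main(String[] args) {
        ContentScoringSketch scorer = new ContentScoringSketch("nutch", "crawler");
        PageRecord page = new PageRecord();
        scorer.passScoreAfterParsing(page, "Nutch is a crawler");
        PageRecord outlink = new PageRecord();
        scorer.distributeScoreToOutlink(page, outlink);
        System.out.println("page score = " + page.score);
        System.out.println("outlink score = " + outlink.score);
    }
}
```

The point of the sketch is only that neither method has to propagate an inherited score: the first replaces whatever value the page arrived with, and the second never reads the source page at all.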