Hi, sorry to reopen this thread, but I am facing the same problem as Nicolás. I like both your idea (Doğacan) and Nicolás's, yours more, since I think abstract classes are not good extension points. Anyway, has either of these been implemented? I really need it!

Also, I can't understand from the docs what it means that the adjust datum will update the score of the original datum in updatedb. Updated or adjusted in what way? I am getting strange values.

Thanks!
Lorenzo

> Hi,
>
> On 2/27/07, Nicolás Lichtmaier <[EMAIL PROTECTED]> wrote:
> [snip]
> >
> > It doesn't seem a good way to do it. What if there are no outlinks? This
> > method won't be called at all. And anyway, it would be called once per
> > each outlink, which would multiply the work.
>
> Multiplication is easy to solve, but you are right that it won't work
> if there are no outlinks.
>
> Maybe the scoring filter API should change? A distributeScoreToOutlinks
> method, which would be called even if there are no outlinks, may be more
> useful than the current one:
>
> CrawlDatum distributeScoreToOutlinks(Text fromUrl, List<String> toUrlList,
>     List<CrawlDatum> datumList, ParseData parseData, CrawlDatum adjust)
>
> This method gives more control to the plugin, since knowing all the
> outlinks the plugin can make more informed decisions. Right now, for
> example, there is no way a scoring filter can be sure that it has
> distributed all its cash (e.g. if db.score.internal.link is 0.5 and
> db.score.external.link is 1.0, the filter will almost always distribute
> less than its cash).
>
> This will also work for your case, since you will just ignore the
> outlinks and return the adjust datum based on information in parse
> metadata.
>
> What do you (and others) think?
>
> > Thanks!
>
> --
> Doğacan Güney

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
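
To make the quoted proposal concrete, here is a minimal sketch of how a plugin might use a batch-style distributeScoreToOutlinks. It is not Nutch code: SketchDatum stands in for CrawlDatum, a plain Map stands in for the parse metadata, and the "cash" and "boost" keys are hypothetical names chosen only for this illustration. It assumes the plugin wants to split its cash evenly over all outlinks and, as in Nicolás's case, derive the adjust datum from parse metadata alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for CrawlDatum: it only carries a score.
class SketchDatum {
    float score;
    SketchDatum(float score) { this.score = score; }
}

class EvenSplitScoring {
    // Same shape as the proposed method, with simplified stand-in types.
    // Because all outlinks are visible at once, the whole cash amount can be
    // handed out, and the method is useful even when the list is empty.
    static SketchDatum distributeScoreToOutlinks(String fromUrl,
            List<String> toUrlList, List<SketchDatum> datumList,
            Map<String, String> parseMeta, SketchDatum adjust) {
        // "cash" is an assumed metadata key holding the score to distribute.
        float cash = Float.parseFloat(parseMeta.getOrDefault("cash", "0"));
        if (!toUrlList.isEmpty()) {
            float share = cash / toUrlList.size();
            for (SketchDatum d : datumList) {
                d.score += share;
            }
        }
        // "boost" is an assumed metadata key: the adjustment depends only on
        // parse metadata, so it is produced with or without outlinks.
        String boost = parseMeta.get("boost");
        if (boost != null) {
            if (adjust == null) {
                adjust = new SketchDatum(0f);
            }
            adjust.score += Float.parseFloat(boost);
        }
        return adjust;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("cash", "1.0");
        meta.put("boost", "2.0");
        List<String> urls = Arrays.asList("http://a.example/", "http://b.example/");
        List<SketchDatum> data = new ArrayList<>();
        data.add(new SketchDatum(0f));
        data.add(new SketchDatum(0f));
        SketchDatum adjust = distributeScoreToOutlinks(
                "http://src.example/", urls, data, meta, null);
        System.out.println("outlink scores: " + data.get(0).score + ", " + data.get(1).score);
        System.out.println("adjust score: " + adjust.score);
    }
}
```

How the returned adjust datum is then folded into the original datum during updatedb is outside this sketch, and that is exactly the part the question above asks about.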