Perfect! Now I have it working, and it performs quite well for a focused search engine like ours! Do you think it could be an interesting plug-in to add to Nutch?
Lorenzo

Doğacan Güney wrote:
> On 4/21/07, Lorenzo <[EMAIL PROTECTED]> wrote:
>>
>> Uhmm... so, suppose I decided, from its content, that the current page
>> http://foo/bar.htm is really desirable.
>> I have put in ParseData's metadata a flag to mark it.
>> In distributeScoreToOutlink(s) I read it from the ParseData param, and
>> put it in the adjust CrawlData metadata
>>
>> MapWritable adjustMap = adjust.getMetaData();
>> adjustMap.put(key, new FloatWritable(bootsValue));
>> return adjust;
>>
>> So in updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List
>> inlinked)
>> the adjust CrawlData will be among the inlinked List. Is it right? How
>> do I distinguish it?
>> I can put the URL in metadata too, and scroll through the list, but
>> maybe there is a better method?
>
> Best approach is yours, you should put a flag in adjust datum's metadata to
> mark it, then process it in updateDbScore.
>
>> Also, this CrawlDatum will be the same that is passed to indexerScore?
>
> You get 2 CrawlDatum's in indexerScore. First is fetchDatum which is the one
> in crawl_fetch that contains the fetching status. Second is dbDatum which
> comes from crawldb. This dbDatum is the one that you set in
> updateDbScore (the 'datum' argument of updateDbScore).
>
>> Thanks a lot!
>>
>> Lorenzo

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
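For anyone reading the thread later, here is a rough, self-contained sketch of the flag-in-metadata idea discussed above. It does not use the real Nutch classes: `Datum` is a hypothetical stand-in for CrawlDatum, a plain `Map<String, Float>` stands in for the MapWritable metadata, and the key name and the "add the boost to the score" policy are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BoostFlagSketch {
    // Hypothetical metadata key; any key unique to the plug-in would do.
    static final String BOOST_KEY = "sketch.boost";

    /** Simplified stand-in for a CrawlDatum: a score plus a metadata map. */
    static class Datum {
        float score;
        final Map<String, Float> meta = new HashMap<>();
        Datum(float score) { this.score = score; }
    }

    /** Plays the role of the distributeScoreToOutlinks step: build the
     *  adjust datum and mark it by storing the boost under BOOST_KEY. */
    static Datum makeAdjust(float boostValue) {
        Datum adjust = new Datum(0.0f);
        adjust.meta.put(BOOST_KEY, boostValue);
        return adjust;
    }

    /** Plays the role of the updateDbScore step: walk the inlinked list and
     *  tell the adjust datum apart from ordinary inlinks by its flag. */
    static void updateScore(Datum datum, List<Datum> inlinked) {
        float sum = datum.score;
        for (Datum in : inlinked) {
            Float boost = in.meta.get(BOOST_KEY);
            if (boost != null) {
                // Flagged entry: this is the adjust datum, not a real inlink.
                datum.meta.put(BOOST_KEY, boost); // keep the flag on the db datum
                sum += boost;                     // assumed policy: add the boost
            } else {
                sum += in.score;                  // ordinary inlink contribution
            }
        }
        datum.score = sum;
    }

    public static void main(String[] args) {
        Datum dbDatum = new Datum(1.0f);
        List<Datum> inlinked = new ArrayList<>();
        inlinked.add(new Datum(0.5f));
        inlinked.add(new Datum(0.25f));
        inlinked.add(makeAdjust(2.0f));
        updateScore(dbDatum, inlinked);
        System.out.println(dbDatum.score); // prints 3.75
    }
}
```

Because the flag travels in the datum's own metadata, there is no need to store the URL and search the list for it; the flagged entry identifies itself.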