Doğacan Güney wrote:
> On 4/19/07, Lorenzo <[EMAIL PROTECTED]> wrote:
>>
>> Hi,
>> sorry to re-open this thread, but I am facing the same problem as
>> Nicolás.
>> I like both your (Doğacan's) and Nicolás' ideas, yours more, as I think
>> abstract classes are not good extension points.
>> Anyway, is either of these implemented? I really need it!
>
> Well, I have implemented a subset of what we discussed in NUTCH-468
> <https://issues.apache.org/jira/browse/NUTCH-468>. There is a lot
> more to be done but IMHO, NUTCH-468 may be a good starting point.
>
>> Also, I can't understand from the docs what it means that the adjust
>> datum will update the score of the original datum in updatedb.
>> Updated or adjusted in which way? I obtain strange values.
>
> In ScoringFilter.updateDbScore you get a list of inlinked datums that you
> can use to change the score. Now, if in distributeScoreToOutlink(s) you
> return a datum with a status of STATUS_LINKED, you will get this datum as
> one of the inlinked datums in updateDbScore.
>
> I hope this clears it up a bit.

Uhmm... so, suppose I decided, from its content, that the current page
http://foo/bar.htm is really desirable. I have put a flag in ParseData's
metadata to mark it.
In distributeScoreToOutlink(s) I read it from the ParseData param and put
it in the adjust CrawlDatum's metadata:

    MapWritable adjustMap = adjust.getMetaData();
    adjustMap.put(key, new FloatWritable(boostValue));
    return adjust;

So in updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List inlinked)
the adjust CrawlDatum will be among the entries of the inlinked list. Is that
right? How do I distinguish it? I could put the URL in the metadata too and
scan through the list, but maybe there is a better method?
Also, is this CrawlDatum the same one that is passed to indexerScore?

Thanks a lot!
Lorenzo
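
On the question above of how to pick the adjust datum out of the inlinked list, one possible approach is to treat the presence of the metadata key itself as the marker, so no URL needs to be stored. The sketch below is not Nutch code: `Datum` is a simplified stand-in for CrawlDatum, `BOOST_KEY` is a made-up key, and the way the boost is folded into the score is only an assumption for illustration. It also assumes the adjust datum really does arrive in the inlinked list, as described in the quoted reply.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AdjustDatumSketch {

    // Hypothetical metadata key; not a constant defined by Nutch.
    public static final String BOOST_KEY = "x-boost";

    /** Simplified stand-in for CrawlDatum: a score plus a metadata map. */
    public static class Datum {
        public float score;
        public final Map<String, Float> meta = new HashMap<>();

        public Datum(float score) {
            this.score = score;
        }
    }

    /** Mirrors the snippet in the mail: tag the adjust datum with the boost. */
    public static Datum markAdjust(Datum adjust, float boostValue) {
        adjust.meta.put(BOOST_KEY, boostValue);
        return adjust;
    }

    /**
     * Stand-in for the updateDbScore step: walk the inlinked list once,
     * treat any datum carrying BOOST_KEY as the adjust datum, and treat
     * everything else as an ordinary inlink contribution.
     */
    public static float updateScore(Datum old, List<Datum> inlinked) {
        float score = old.score;
        for (Datum in : inlinked) {
            Float boost = in.meta.get(BOOST_KEY);
            if (boost != null) {
                score += boost;      // tagged adjust datum
            } else {
                score += in.score;   // ordinary inlink
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<Datum> inlinked = new ArrayList<>();
        inlinked.add(new Datum(0.25f));
        inlinked.add(markAdjust(new Datum(0.0f), 2.0f));
        System.out.println(updateScore(new Datum(1.0f), inlinked));
    }
}
```

With a real CrawlDatum the same check would go through its MapWritable metadata rather than a plain Map; whether that datum is also the one handed to indexerScore is not something this sketch can answer.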