Hi,

As I understand it, link analysis needs to be run
whenever a new fetch is updated into the webdb.
Because the link graph has changed (new links may
have been added and old links deleted), the score
for each node changes, so a recalculation is
necessary.

Link analysis updates the score for each node (that
is, each page) in the webdb; updatesegmentfromdb then
needs to run to copy the recalculated scores to the
segments.
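
To illustrate the ordering, here is a minimal Python sketch. It is not
Nutch code: the function names, the in-memory link list, and the use of a
plain inlink count as the score are all assumptions made for illustration.
It only shows that a changed link graph changes the scores, and that a
segment keeps stale scores until they are copied over again.

```python
from collections import Counter

def inlink_scores(links):
    """Hypothetical scoring: score of a page = number of distinct inlinks."""
    counts = Counter()
    for src, dst in set(links):
        counts[src] += 0   # make sure pages with no inlinks still appear
        counts[dst] += 1
    return dict(counts)

def copy_scores_to_segment(segment_urls, db_scores):
    """Hypothetical stand-in for copying scores from the db to a segment."""
    return {url: db_scores.get(url, 0) for url in segment_urls}

# Link graph before the new fetch.
links_before = [("a", "b"), ("c", "b"), ("a", "c")]
# After the fetch: c->b was deleted, d->c and d->a were added.
links_after = [("a", "b"), ("a", "c"), ("d", "c"), ("d", "a")]

segment = ["a", "b", "c"]
stale = copy_scores_to_segment(segment, inlink_scores(links_before))
fresh = copy_scores_to_segment(segment, inlink_scores(links_after))

print(stale)
print(fresh)
```

The two printed dictionaries differ for every page in the segment, which
is why the scores have to be recalculated first and copied second.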

I can't see a point at which we can skip link analysis.
Am I missing something important? Let me know.

Thanks,

Michael Ji


--- AJ Chen <[EMAIL PROTECTED]> wrote:

> I assume you mean UpdateSegmentFromDB, and there is no need to run link
> analysis tool if I want to use the number of inlinks for nutch score.
> Right? I tried to find your patch, but couldn't find it. How to find it?
> -AJ
> 
> Piotr Kosiorowski wrote:
> 
> > UpdateDB copies link information and score from the WebDB to segments
> > so it is important to have score calculated before updatedb is run.
> > One can use current standard nutch score (based on number of inlinks)
> > or try to use analyze - I have committed a patch for it some time ago
> > that might help a bit with its disk space requirements so the best
> > approach would be to test it (it worked ok for me) and if it is ok for
> > you - report it so others can also try it out.
> > Regards
> > Piotr
> > AJ Chen wrote:
> >
> >> In a whole-web or vertical crawling setting, is it right that link
> >> analysis and update segment from DB should be performed in right
> >> order before indexing the segments?
> >>
> >> There's not much talk about update segment from DB. I think it should
> >> be an important step. Could someone point out when it should be run
> >> and what the benefits are?
> >>
> >> I remember it was mentioned sometime ago that the link analysis tool
> >> does not work yet and the number of in-links should be used instead.
> >> Any update? If it's still not working, how to set it to use link
> >> numbers?
> >>
> >> Thanks,
> >> AJ
> >>
> >>
> >
> >
> 
> 



                