Re: link analysis

Andrzej Bialecki Tue, 19 Apr 2005 11:03:31 -0700

Doug Cutting wrote:

Are many folks successfully using the link analysis implementation? I

I stopped using it a long time ago, due to performance problems even with moderately sized databases.

had problems with it the last time I tried, and others have reported problems. Since no one is currently maintaining this code, I propose that we:
1. Remove mention of it from the tutorial; and
2. Change the defaults for fetchlist.score.by.link.count and indexer.boost.by.link.count to true.

Objections?

No objections. However, we need to think carefully about what the recommended procedure for maintaining scoring quality then becomes. Originally, the analysis step was intended to do this. Now, the parameters in item 2 above are supposed to affect scoring in the right way. It would be useful to write a short explanation of why this is so - but even more important, IMHO, would be to study the shortcomings of this approach (link spamming) and how to combat them.

This is probably the most difficult, but also the most important, step to ensure high-quality scoring... It would be great to compile a list of suggestions for this - initially this could take the form of reports about problems with scoring, endless looping, or similar. Then we could think of solutions, and how/where they should be implemented.
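
To make the link-spamming concern concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the boost formula log(e + inlinks), the function names, and the example URLs are all made up for this note and are not taken from the actual code behind fetchlist.score.by.link.count or indexer.boost.by.link.count. It compares a page with ten independent inlinks against a page with a thousand inlinks from a single host, first counting raw links and then counting distinct linking hosts (one possible countermeasure, again hypothetical).

```python
import math
from urllib.parse import urlparse


def link_count_boost(inlink_count):
    # Assumed formula, for illustration only: log(e + n),
    # which yields 1.0 for a page with no inlinks.
    return math.log(math.e + inlink_count)


def distinct_host_count(inlink_urls):
    # Hypothetical countermeasure: count each linking host only once,
    # so many links from one host carry the weight of a single link.
    hosts = {urlparse(url).hostname for url in inlink_urls}
    hosts.discard(None)
    return len(hosts)


# Ten inlinks from ten different hosts (made-up URLs).
genuine = ["http://site%d.example/page" % i for i in range(10)]
# One thousand inlinks, all from the same host (made-up URLs).
spam = ["http://linkfarm.example/p%d" % i for i in range(1000)]

# Boost based on raw link counts: the link farm wins.
raw_genuine = link_count_boost(len(genuine))
raw_spam = link_count_boost(len(spam))

# Boost based on distinct linking hosts: the link farm collapses to one host.
host_genuine = link_count_boost(distinct_host_count(genuine))
host_spam = link_count_boost(distinct_host_count(spam))

print("raw counts:     genuine=%.3f spam=%.3f" % (raw_genuine, raw_spam))
print("distinct hosts: genuine=%.3f spam=%.3f" % (host_genuine, host_spam))
```

With raw counts the spammed page gets the larger boost; with distinct hosts it drops below the genuinely linked page. Per-host counting is of course easy to defeat with many hosts, so it is meant as a starting point for the list of suggestions, not as a solution.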

--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
