Hi.

I did a test crawl with the seed URL http://www.aalto.fi. When the
crawling and indexing process was over, I opened the index in Luke and
browsed the documents. Every one of them had a score of 0.0f (and thus
the same boost value). I doubt that this is the expected result.

The problem seems to be related to the fact that http://www.aalto.fi
redirects to http://www.aalto.fi/fi/ (in my case; probably to .../en/
or .../sv/ in some other cases). This behavior also showed up when
http://www.muropaketti.com was used as a seed URL. The URL
http://www.muropaketti.com is redirected to http://plaza.fi/muropaketti/.

Is this a flaw in Nutch? If not, then why was every document's boost
value zero? I have been under the impression that a document's boost
value is supposed to describe its relevance. I did the same tests using
versions 1.2 and 1.3, and the problem appeared in both cases.
