I mentioned earlier
(http://lucene.472066.n3.nabble.com/URL-redirection-and-zero-scores-td3085311.html)
that I have run into a problem where redirected URLs and, depending on
the topology of the link graph, possibly all URLs linked from them end up
with zero scores.

For instance, on line 818 of Fetcher.java
(http://svn.apache.org/viewvc/nutch/branches/branch-1.3/src/java/org/apache/nutch/fetcher/Fetcher.java?view=markup)
a new CrawlDatum is created for the redirected URL, but the score of the
original URL's CrawlDatum is never passed to the new one. The
ScoringFilter interface's initialScore() method is called for the new
CrawlDatum, but it only sets the score to zero.

Is this how it was meant to be, or is there a flaw?

I started a crawl from http://www.aalto.fi, which is redirected to
http://www.aalto.fi/fi/ (in my case). The URL http://www.aalto.fi had
1.0f as its score, but every other URL had 0.0f, which in my opinion
indicates that there's a problem. Adding
"newDatum.setScore(datum.getScore());" after the call to initialScore()
resulted in a situation where none of the URLs' scores is zero.
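
To make the effect easier to see outside of Nutch, here is a minimal,
self-contained sketch of the behaviour described above. It does not use
the real Nutch classes: Datum, initialScore() and redirect() are
hypothetical stand-ins that I made up for illustration, and the only
thing modelled is the score. The assumption is simply that initialScore()
resets the new datum's score to zero, as I observed.

```java
public class RedirectScoreSketch {

    // Hypothetical stand-in for CrawlDatum; only the score is modelled.
    static class Datum {
        private float score;

        Datum(float score) {
            this.score = score;
        }

        float getScore() {
            return score;
        }

        void setScore(float score) {
            this.score = score;
        }
    }

    // Stand-in for the initialScore() behaviour I observed:
    // the new datum's score is simply set to zero.
    static void initialScore(Datum datum) {
        datum.setScore(0.0f);
    }

    // Creates the datum for the redirect target. With inheritScore set,
    // the original datum's score is copied over after initialScore(),
    // which mirrors the one-line change I tried.
    static Datum redirect(Datum datum, boolean inheritScore) {
        Datum newDatum = new Datum(0.0f);
        initialScore(newDatum);
        if (inheritScore) {
            newDatum.setScore(datum.getScore());
        }
        return newDatum;
    }

    public static void main(String[] args) {
        Datum seed = new Datum(1.0f);
        System.out.println("without change: " + redirect(seed, false).getScore());
        System.out.println("with change:    " + redirect(seed, true).getScore());
    }
}
```

Without the extra setScore() call the redirect target starts at zero, so
it has nothing to pass on to the pages it links to; with it, the target
starts from the score of the URL that redirected to it.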
