Hi John,

FreeGenerator, unlike Injector, does not use db.score.injected (default = 1.0)
but sets the initial score to 0.0. If all URLs stem from FreeGenerator, the total
score in the link graph is also 0.0, and no linked document can get a score
higher than 0.0.
As possible solutions:
- use FreeGenerator with an initial score > 0.0
  (but don't add thousands of URLs with a score of 1.0:
   if the total score is too high, some pages may get unreasonably
   high scores)
- use linkrank (https://wiki.apache.org/nutch/NewScoring) to get the scores:
  the default scoring, OPIC, has the advantage of calculating scores online
  while following links. It gives good and plausible scores if the crawl is
  started from a few authoritative seeds. But sometimes, esp. in continuous
  crawls, OPIC scores run out of control.
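
To illustrate why a total score of 0.0 can never produce a non-zero score, here
is a rough toy model in Python. It is not Nutch's OPIC code: the function
propagate(), the three-page graph and the "split the score evenly among
outlinks" rule are all simplifying assumptions made up for this sketch.

```python
# Toy model of link-based score propagation (an assumption-laden sketch,
# NOT the actual OPIC implementation in Nutch).

def propagate(links, seeds, initial_score, rounds=2):
    """Return page scores after a few rounds of passing score along links.

    links:         dict mapping a page to the list of pages it links to
    seeds:         set of pages that start with initial_score
    initial_score: score given to every seed (all other pages start at 0.0)
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    scores = {p: (initial_score if p in seeds else 0.0) for p in pages}
    for _ in range(rounds):
        gained = {p: 0.0 for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page hands an equal share of its score to every outlink.
                share = scores[page] / len(targets)
                for target in targets:
                    gained[target] += share
        for p in pages:
            scores[p] += gained[p]
    return scores


# Hypothetical graph: one seed linking to two pages, one of which links on.
LINKS = {"seed": ["a", "b"], "a": ["b"], "b": []}

if __name__ == "__main__":
    print(propagate(LINKS, {"seed"}, 0.0))  # every score stays 0.0
    print(propagate(LINKS, {"seed"}, 1.0))  # linked pages pick up score
```

With an initial score of 0.0 there is nothing to pass along, so every page
stays at 0.0 no matter how many rounds are run; with 1.0 the linked pages
receive a share.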

Sebastian

On 03/25/2014 08:31 PM, John Lafitte wrote:
> I setup a script that uses freegen to manually index new/updated URLs.  I
> thought it was working great, but now I'm just realizing that Solr returns
> a score of 0 for these new documents.  I thought the score was calculated
> independent from what Nutch does, just uses the content and other metadata
> to calculate it, however that doesn't seem to be the case.  Anyone have a
> clue what might be causing this?  The content and other metadata look
> normal and I reloaded the core to no avail.
> 
