Hi,
My understanding is that the Nutch "Generate" job filters all available URLs 
in the Nutch database by two criteria:
1. fetchTime + fetchInterval should be less than the current time.
2. The number of URLs selected for the "Fetch" job should not exceed the -topN 
parameter value. The "score" field from the database is used for this selection.
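
To make the two criteria concrete, here is a minimal sketch of the selection as I picture it. It is not Nutch's actual code: the UrlRecord type, the select_for_fetch function, and the use of plain seconds for the time fields are my own assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class UrlRecord:
    # Hypothetical record; field names mirror the ones used in this post.
    url: str
    fetch_time: int      # assumed unit: seconds
    fetch_interval: int  # assumed unit: seconds
    score: float


def select_for_fetch(records, now, top_n):
    """Return at most top_n due records, highest score first."""
    # Criterion 1: keep only URLs whose fetchTime + fetchInterval is in the past.
    due = [r for r in records if r.fetch_time + r.fetch_interval < now]
    # Criterion 2: rank by score and cut the list off at topN.
    due.sort(key=lambda r: r.score, reverse=True)
    return due[:top_n]


records = [
    UrlRecord("http://a.example/", 0, 100, 1.0),
    UrlRecord("http://b.example/", 0, 100, 0.0),
    UrlRecord("http://c.example/", 950, 100, 1.0),  # not yet due at now=1000
]
selected = select_for_fetch(records, now=1000, top_n=1)
print([r.url for r in selected])
```

With only the scores 0 and 1 available, every URL sharing a score is tied in this ranking, which is why I am asking about other score values.
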
During crawling I can see only two values for the "score" field: 1 is always 
set by the "Inject" job, and 0 is always set by the "Parse" job. Looking at 
the code, I see that the OPIC scoring plugin is used to define these values.

Is my understanding correct? How can a "score" value other than 0 or 1 be 
defined?

Regards,
Vladimir. 
