Hello,

I'm using Nutch 1.9.
I want to alter the score used to sort the topN pages for the next crawl cycle.
I got it working by changing the value returned by generatorSortValue in a 
ScoringFilter plugin.
But this function doesn't receive the anchor text as input...
I wrote some inelegant and inefficient code that puts an "ismarked" key in the 
CrawlDatum to record whether the anchor text or the URL contains certain words... 
In which function should I do this?
Is there a complete diagram of the data path through each plugin type's functions?
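
As a rough sketch of the two rules quoted below (halve the sort score for pages not in "fr", multiply it by ten when the "ismarked" flag is set), here is the arithmetic in plain Java, independent of the Nutch API. The class and method names (SortScoreSketch, adjust, containsAnyTerm) are hypothetical. It assumes the detected language and the flag have already been made available to the scoring step, and that a page with no detected language keeps its score. In a real plugin this logic would presumably be called from generatorSortValue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: only the scoring arithmetic.
class SortScoreSketch {

    // True if the text (an anchor or a URL) contains any of the terms,
    // compared case-insensitively. A null text never matches.
    static boolean containsAnyTerm(String text, List<String> terms) {
        if (text == null) {
            return false;
        }
        String lower = text.toLowerCase(Locale.ROOT);
        for (String term : terms) {
            if (lower.contains(term.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Applies both rules to the initial sort value.
    // Assumption: a null language (not detected) leaves the score unchanged.
    static float adjust(float initSort, String lang, boolean marked) {
        float score = initSort;
        if (lang != null && !"fr".equalsIgnoreCase(lang)) {
            score /= 2.0f;   // rule 1: page is not in French
        }
        if (marked) {
            score *= 10.0f;  // rule 2: anchor text or URL matched a term
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("emploi", "recrutement");
        boolean marked = containsAnyTerm("Offres d'EMPLOI", terms);
        System.out.println(adjust(1.0f, "en", marked));
    }
}
```

Computing the flag once and storing it keeps the sort-time function cheap, since it then only reads two values and does a multiplication or division.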

Benjamin.

> On 10 Sep 2014, at 04:02, Jorge Luis Betancourt Gonzalez 
> <[email protected]> wrote:
> 
> You’ll need to write a couple of plugins to accomplish this. Which version of 
> Nutch are you using? In the first case, is the score you want to alter the 
> score that’s indexed into Solr (i.e., your backend)?
> 
> Regards,
> 
>> On Sep 9, 2014, at 2:38 PM, Benjamin Derei <[email protected]> wrote:
>> 
>> Hi,
>> 
>> I'm a beginner in Java and Nutch.
>> 
>> I want to steer the crawl with two rules:
>> - If the language identifier plugin detects that a page is not "fr", the
>> score used for sorting should be divided by two.
>> - If an anchor text or link targeting the page contains certain terms, the
>> score used for sorting should be multiplied by ten.
>> 
>> Any help?
>> 
>> Benjamin.
> 
