prioritised/scored fetching

Stefan Scheffler Tue, 02 Oct 2012 00:20:39 -0700

Hi.

I crawl a web database for *.html, *.pdf and *.doc documents with a given topN. I want Nutch to fetch all of the HTML documents first, then the PDF documents, and finally the DOC documents, because HTML is more important than PDF, and so on. Is there a way to make Nutch follow such rules (maybe with a scoring algorithm)?
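
To make the ordering I have in mind concrete, here is a minimal sketch in plain Python. It is not Nutch code and uses no Nutch API; the names EXTENSION_PRIORITY, priority and select_top_n are made up for illustration, and the numeric priorities are only an assumption about how such a rule could be expressed.

```python
from urllib.parse import urlparse
import posixpath

# Hypothetical priorities (assumption): a lower number means "fetch earlier".
EXTENSION_PRIORITY = {".html": 0, ".pdf": 1, ".doc": 2}


def priority(url):
    """Return the fetch priority of a URL based on its file extension."""
    path = urlparse(url).path                  # drops query string and fragment
    ext = posixpath.splitext(path)[1].lower()  # e.g. ".PDF" -> ".pdf"
    # Unknown or missing extensions are placed after all known ones.
    return EXTENSION_PRIORITY.get(ext, len(EXTENSION_PRIORITY))


def select_top_n(urls, top_n):
    """Pick the top_n URLs, HTML first, then PDF, then DOC."""
    # sorted() is stable, so URLs with equal priority keep their input order.
    return sorted(urls, key=priority)[:top_n]


if __name__ == "__main__":
    candidates = [
        "http://example.org/a.doc",
        "http://example.org/b.pdf",
        "http://example.org/c.html",
    ]
    for url in select_top_n(candidates, 2):
        print(url)
```

With a topN smaller than the number of candidates, the DOC files would only be selected once no HTML or PDF candidates are left, which is the behaviour I am looking for.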


Regards
Stefan

--
Stefan Scheffler
Avantgarde Labs GbR
Löbauer Straße 19, 01099 Dresden
Telefon: + 49 (0) 351 21590834
Email: [email protected]
