On 2/19/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:

[snip]

>
> You could drop the HtmlParseFilter part and simply write the post
> crawl/index MR job after to update the CrawlDatum based on your lucene
> queries.  You would still need to write the second part that does the
> generation based on a different sort value.

The second part can be handled by a different scoring plugin. Put
whatever you need into the CrawlDatum's metadata, then implement
ScoringFilter.generatorSortValue so that it looks up that value and
returns a high or low sort value accordingly.
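
A rough sketch of the lookup logic, in case it helps. This is not the
real Nutch API: Datum is a stand-in for CrawlDatum and its metadata,
and the key name "my.generate.priority" is made up for illustration.
In an actual plugin the method would receive the URL, the CrawlDatum
and an initial sort value, and the other ScoringFilter methods would
have to be implemented as well.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch only: none of these types are the real Nutch classes.
class GeneratorSortSketch {

    // Hypothetical metadata key, assumed to be written by the post-crawl job.
    static final String PRIORITY_KEY = "my.generate.priority";

    // Minimal stand-in for CrawlDatum: only the metadata map matters here.
    static class Datum {
        final Map<String, String> metaData = new HashMap<>();
    }

    // Plays the role of ScoringFilter.generatorSortValue: scale the initial
    // sort value by whatever priority was stored in the metadata.
    static float generatorSortValue(String url, Datum datum, float initSort) {
        String raw = datum.metaData.get(PRIORITY_KEY);
        if (raw == null) {
            // Nothing recorded for this URL: leave the sort value untouched.
            return initSort;
        }
        try {
            return initSort * Float.parseFloat(raw);
        } catch (NumberFormatException e) {
            // Unreadable value: fall back to the initial sort value.
            return initSort;
        }
    }
}
```

With this, a datum carrying priority "10" sorts ten times higher than
its initial value, a priority below 1 pushes it down, and a datum
without the key keeps the initial value.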

[snip]

-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
