On 2/19/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
[snip]
You could drop the HtmlParseFilter part and simply write a MapReduce job that runs after the crawl/index step and updates the CrawlDatum based on your Lucene queries. You would still need to write the second part, which does the generation based on a different sort value.
The second part can be written with a different scoring plugin. Simply put whatever you need in the CrawlDatum's metadata, then change ScoringFilter.generatorSortValue to look up that value and return a good or bad score.
[snip]
-- Doğacan Güney
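
A minimal sketch of the lookup described above, in plain Java. This is not a real Nutch plugin: the Map stands in for the CrawlDatum's metadata, and the key name "recrawl.priority", the "good"/"bad" values and the two factors are all made up for illustration. In an actual scoring plugin the same logic would live in generatorSortValue and read the value from the datum's metadata instead.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone illustration only; none of these names come from Nutch.
final class SortValueSketch {

    // Hypothetical metadata key that the post-crawl job would have written.
    static final String META_KEY = "recrawl.priority";

    // Hypothetical factors applied to the incoming sort value.
    static final float GOOD_FACTOR = 10.0f;
    static final float BAD_FACTOR = 0.1f;

    // Looks up the stored flag and raises or lowers the sort value.
    // URLs without a flag (or with an unknown one) keep their initial value.
    static float generatorSortValue(Map<String, String> metadata, float initSort) {
        String flag = metadata.get(META_KEY);
        if ("good".equals(flag)) {
            return initSort * GOOD_FACTOR;
        }
        if ("bad".equals(flag)) {
            return initSort * BAD_FACTOR;
        }
        return initSort;
    }

    public static void main(String[] args) {
        Map<String, String> wanted = new HashMap<>();
        wanted.put(META_KEY, "good");

        Map<String, String> unwanted = new HashMap<>();
        unwanted.put(META_KEY, "bad");

        Map<String, String> untouched = new HashMap<>();

        System.out.println("good:      " + generatorSortValue(wanted, 1.0f));
        System.out.println("bad:       " + generatorSortValue(unwanted, 1.0f));
        System.out.println("untouched: " + generatorSortValue(untouched, 1.0f));
    }
}
```

Multiplying the incoming value rather than replacing it keeps whatever ordering the existing scoring already produced among URLs that carry the same flag. That is a design choice of this sketch, and returning fixed values would work too.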
