Eric Osgood wrote:
Andrzej,

Based on what you suggested below, I have begun to write my own scoring plugin:

Great!


In distributeScoreToOutlinks(), if the link contains the string I'm looking for, I set its score to kept_score and add a flag to the metaData in parseData ("KEEP", "true"). How do I check for this flag in generatorSortValue()? I only see a way to check the score, not a flag.

The flag should have been automagically added to the target CrawlDatum metadata after you have updated your crawldb (see the details in CrawlDbReducer). Then in generatorSortValue() you can check for the presence of this flag by calling datum.getMetaData() and looking up the "KEEP" key.
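
Roughly, the check could look like the sketch below. This is not actual Nutch code: to stay self-contained it models the CrawlDatum metadata as a plain java.util.Map instead of the Hadoop MapWritable that datum.getMetaData() returns, and the class name and method shape are assumptions for illustration only. Returning Float.MIN_VALUE for unflagged URLs assumes the Generator has been changed to skip that value, as described further down in this message.

```java
import java.util.HashMap;
import java.util.Map;

public class KeepFlagSketch {
    // Flag key and value as set in distributeScoreToOutlinks() in the question.
    static final String KEEP_KEY = "KEEP";
    static final String KEEP_VALUE = "true";

    // Stand-in for generatorSortValue(): the Map plays the role of
    // datum.getMetaData(). In a real plugin the keys and values would be
    // Hadoop Text objects rather than Strings (assumption of this sketch).
    static float generatorSortValue(Map<String, String> metaData, float initSort) {
        if (KEEP_VALUE.equals(metaData.get(KEEP_KEY))) {
            // Flag present: leave the sort value unchanged.
            return initSort;
        }
        // Flag absent: return the sentinel that a patched Generator would skip.
        return Float.MIN_VALUE;
    }

    public static void main(String[] args) {
        Map<String, String> flagged = new HashMap<>();
        flagged.put(KEEP_KEY, KEEP_VALUE);
        Map<String, String> unflagged = new HashMap<>();

        System.out.println("flagged:   " + generatorSortValue(flagged, 2.5f));
        System.out.println("unflagged: " + generatorSortValue(unflagged, 2.5f));
    }
}
```

Putting the constant first in KEEP_VALUE.equals(...) means a missing key (null lookup) is simply treated as "not flagged".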

BTW - you are right, the Generator doesn't treat Float.MIN_VALUE in any special way ... I thought it did. It's easy to add this, though - in Generator.java:161 just add this:

if (sort == Float.MIN_VALUE) {
        return;
}
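
To illustrate what that check does, here is a small standalone sketch; the class and method names are made up for this example and it does not reproduce the Generator code. One thing to keep in mind is that Float.MIN_VALUE in Java is the smallest positive float, not the most negative one, so the test only matches entries whose sort value is exactly that sentinel. Zero and negative sort values still pass through.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipSentinelSketch {
    // Hypothetical helper: keeps every sort value except the exact sentinel,
    // mirroring the "if (sort == Float.MIN_VALUE) return;" check.
    static List<Float> selectable(float[] sortValues) {
        List<Float> kept = new ArrayList<>();
        for (float sort : sortValues) {
            if (sort == Float.MIN_VALUE) {
                continue; // skipped, like the early return in the map step
            }
            kept.add(sort);
        }
        return kept;
    }

    public static void main(String[] args) {
        float[] sorts = {1.0f, Float.MIN_VALUE, 0.0f, -2.0f};
        System.out.println(selectable(sorts));
    }
}
```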


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
