It looks like your syntax is correct (category:video searchString). Try
adding a LOG.info line to
org.apache.nutch.searcher.LuceneQueryOptimizer (line 178), right at the
beginning of the optimize method:
  public TopDocs optimize(BooleanQuery original,
                          Searcher searcher, int numHits,
                          String sortField, boolean reverse)
    throws IOException {
    LOG.info("Query -> " + original.toString());
Recompile Nutch and run a query, for example category:video funny. If your
category plugin works fine, you'll get an info line in hadoop.log similar
to this:
+(url:funny^0.0 anchor:funny^0.0 content:funny title:funny^0.0
host:funny^0.0) +category:video
The first part, +(url:funny^0.0 anchor:funny^0.0 content:funny
title:funny^0.0 host:funny^0.0), means that funny must appear in at least
one of those fields (url, anchor...). The second part, +category:video,
filters the results to keep only the ones tagged as video.
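To make the two parts concrete, here is a minimal, self-contained sketch in plain Java. It is not Nutch code: the class QuerySplitSketch and its render method are hypothetical names, and the field list and ^0.0 boosts are simply copied from the log line above. It only illustrates the expected shape, where plain terms are expanded over the default fields and a category: clause is kept apart as a required filter.

```java
import java.util.ArrayList;
import java.util.List;

public class QuerySplitSketch {
    // Fields a plain term is expanded over, in the order shown in the log line.
    static final String[] DEFAULT_FIELDS = {"url", "anchor", "content", "title", "host"};

    // Assumption: "category" is the only raw field, mirroring raw-fields in plugin.xml.
    static final String RAW_FIELD = "category";

    // Builds a string shaped like the logged query (illustration only).
    static String render(String query) {
        List<String> clauses = new ArrayList<>();
        List<String> filters = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            int colon = token.indexOf(':');
            if (colon > 0 && RAW_FIELD.equals(token.substring(0, colon))) {
                // Raw field clause: kept as a required filter, never expanded.
                filters.add("+" + token);
            } else {
                // Plain term: required in at least one of the default fields.
                StringBuilder sb = new StringBuilder("+(");
                for (int i = 0; i < DEFAULT_FIELDS.length; i++) {
                    if (i > 0) {
                        sb.append(' ');
                    }
                    sb.append(DEFAULT_FIELDS[i]).append(':').append(token);
                    // Boosts copied from the log line: every field but content has ^0.0.
                    if (!DEFAULT_FIELDS[i].equals("content")) {
                        sb.append("^0.0");
                    }
                }
                clauses.add(sb.append(')').toString());
            }
        }
        clauses.addAll(filters);
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(render("category:video funny"));
    }
}
```

If the line you see in hadoop.log has video inside the parenthesised group instead of in its own +category:video clause, the query filter is most likely not being applied.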
In your case it looks like the word video is being included in the first
part. Check that your plugin implementation is correct, and that plugin.xml
and build.xml are correct. Your plugin.xml should look similar to this:
...
<extension id="..."
           name="...."
           point="org.apache.nutch.searcher.QueryFilter">
  <implementation id="..." class="....">
    <parameter name="raw-fields" value="category"/>
  </implementation>
</extension>
Hope it helps.
2006/10/3, Dima Gritsenko < [EMAIL PROTECTED]>:
Hi,
I have categorized web sites during the crawl to provide filtered results
similar to Google's Video and Images tabs.
But when I enter
category:video MySearchString
Nutch matches both video and MySearchString as terms (though it
filters results correctly and displays links only to pages categorized as
video), so the search is not relevant, since the string "video" is matched
as well.
How do I keep the category string out of the term matching during search?
Great thanks.
Dima.