Doug,
I'm actually implementing a QueryFilter directly instead of
extending one of the others. I'm setting the boost to 2.0. Here's the
code:
// Imports assume the Nutch 0.7.x package layout (org.apache.nutch.*).
import java.util.logging.Logger;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
import org.apache.nutch.searcher.Query;
import org.apache.nutch.searcher.Query.Clause;
import org.apache.nutch.searcher.QueryFilter;
import org.apache.nutch.util.LogFormatter;
import org.apache.nutch.util.NutchConf;

public class MetaQueryFilter implements QueryFilter {

    private static final Logger LOG =
        LogFormatter.getLogger(MetaQueryFilter.class.getName());

    /**
     * Need to pull out the list of meta tags from the configuration
     */
    private static String[] META_TAGS =
        NutchConf.get().getStrings("meta.names");

    /**
     * We're going to go through and create search filters for each of the
     * meta-tags we were asked to index.
     */
    public BooleanQuery filter(Query input, BooleanQuery output) {
        // If no meta-tags were specified in the conf file,
        // then don't bother wasting cycles.
        // (Compared with == because equals(null) would throw on a null array.)
        if (META_TAGS == null) {
            return output;
        }
        addTerms(input, output);
        return output;
    }

    private static void addTerms(Query input, BooleanQuery output) {
        Clause[] clauses = input.getClauses();
        for (int x = 0; x < clauses.length; x++) {
            Clause c = clauses[x];

            if (!c.getField().equals(Clause.DEFAULT_FIELD))
                continue; // skip non-default fields

            // These are the fields we're interested in indexing
            String[] tagsToIndex = META_TAGS;
            for (int i = 0; i < tagsToIndex.length; ++i) {
                LOG.info("Meta Query Filter: Adding a search for " + tagsToIndex[i]);

                Term term = new Term(tagsToIndex[i], c.getTerm().toString());

                // add a lucene PhraseQuery for this tag
                PhraseQuery metaQuery = new PhraseQuery();
                metaQuery.setSlop(0);
                metaQuery.add(term);

                // set boost
                metaQuery.setBoost(2.0f);

                // add it as a specified query
                output.add(metaQuery, false, false);
            }
        }
    }
}
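
A side note on the guard at the top of filter(): a null check on the configured array has to be written with == null. Calling equals(null) on a null reference throws a NullPointerException, and on a non-null reference it simply returns false, so it can never act as a guard. A minimal sketch of the check in isolation, using a hypothetical hasTags helper that is not part of the filter or of Nutch:

```java
// Hypothetical demo class, for illustration only; not Nutch code.
final class NullGuardDemo {

    // True only when at least one meta-tag name is configured.
    static boolean hasTags(String[] tags) {
        // tags.equals(null) would throw NullPointerException when tags is null
        // and return false otherwise, so the comparison must use == / !=.
        return tags != null && tags.length > 0;
    }

    public static void main(String[] args) {
        System.out.println(hasTags(null));                          // false
        System.out.println(hasTags(new String[0]));                 // false
        System.out.println(hasTags(new String[] {"keywords"}));     // true
    }
}
```

The extra length check is a choice made for this sketch; it also skips the work when the property is present but empty.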
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 23, 2006 5:09 PM
To: [email protected]
Subject: Re: Search Particulars
Vanderdray, Jacob wrote:
> I'm not sure I understand what you're getting at. In this case
> I've added a comma-separated list of names of meta tags that I want to
> index and search against. I've written a parse filter, an index filter,
> and this query filter that all read in that list of meta tags from the
> nutch-site.xml file.
>
> That much seems to work. In the explain link I can see that the
> fields are in the index and the ranking of pages is affected by them,
> but if I search for a term which is in one of the meta tags but not in
> any other fields, I get 0 results.
Are you using RawFieldQueryFilter? If so, are you specifying a non-zero
boost to the constructor? RawFieldQueryFilter defaults to a zero boost.
Query terms with a zero boost are automatically converted into
filters. And filters cannot select documents, only remove them.
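
The difference between selecting and filtering can be shown with a toy model. This is a sketch with plain Java collections, not Nutch or Lucene code; the class name, the search method, and the document ids are all made up for illustration, and treating scoring clauses as a simple union is a simplification. Scoring clauses decide which documents are candidates, and filters can only narrow that set:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model, for illustration only; not Nutch or Lucene code.
final class FilterVsSelectDemo {

    // Candidates are the union of the scoring clauses' matches;
    // each filter then removes candidates it does not match.
    static Set<Integer> search(List<Set<Integer>> scoringClauses,
                               List<Set<Integer>> filters) {
        Set<Integer> hits = new HashSet<>();
        for (Set<Integer> clause : scoringClauses) {
            hits.addAll(clause);
        }
        for (Set<Integer> filter : filters) {
            hits.retainAll(filter);
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<Integer> defaultFieldMatches = new HashSet<>(); // term is in no default field
        Set<Integer> metaTagMatches = new HashSet<>();
        metaTagMatches.add(7);                              // doc 7 has the term only in a meta tag

        // Zero boost: the meta-tag clause acts as a filter and cannot add doc 7.
        List<Set<Integer>> scoring = new ArrayList<>();
        scoring.add(defaultFieldMatches);
        List<Set<Integer>> filters = new ArrayList<>();
        filters.add(metaTagMatches);
        System.out.println(search(scoring, filters));       // []

        // Non-zero boost: the meta-tag clause scores, so it can select doc 7.
        scoring.add(metaTagMatches);
        filters.clear();
        System.out.println(search(scoring, filters));       // [7]
    }
}
```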
Doug
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general