On Sat, Feb 11, 2012 at 10:03:37PM +0100, Nick Wellnhofer wrote:
> What's the best way to apply a boost factor dynamically to a (small)
> subset of documents?

I would suggest using a RequiredOptionalQuery. Let the required_query
determine which documents match, and let the optional_query add to the score
of the matches it also hits.

    # Parse the user's query as usual; this decides which documents match.
    my $parsed_query = $query_parser->parse($user_query_string);

    # Matches the documents to boost; the boost scales this clause's score.
    my $user_id_boost_query = Lucy::Search::TermQuery->new(
        field => 'user_id',
        term  => $user_id,
    );
    $user_id_boost_query->set_boost($arbitrary_boost);

    # Required clause selects the hits; optional clause only adds to scores.
    my $req_opt_query = Lucy::Search::RequiredOptionalQuery->new(
        required_query => $parsed_query,
        optional_query => $user_id_boost_query,
    );

If the query to identify the subset of documents is very expensive, you might
look into using LucyX::Search::Filter to cache the results (but note that
Filter does not cache in a clustered environment).

> Is there a better way than to simply retrieve all the results, apply the
> boost factor manually to the scores and sort the results again?

I hope you don't have to resort to post-search filtering. That's slow to
begin with and it doesn't scale very well because of the costs of retrieving
so many documents. You also have to resort to non-idiomatic sorting code
(using a priority queue rather than the Perl sort() function) if you don't
want memory usage to balloon.
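
To illustrate the kind of non-idiomatic code I mean, here is a rough sketch in plain Perl of re-scoring a stream of hits and keeping only the best N in a small min-heap, rather than collecting everything and calling sort(). The top_n() helper, the hash layout (doc_id, score, user_id) and the sample data are all made up for illustration; none of it is part of Lucy. In a real application the iterator would presumably wrap something like $hits->next and $hit->get_score.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): keep the $limit highest-scoring
# items from an iterator without holding every item in memory.
sub top_n {
    my ( $limit, $next_item ) = @_;
    return if $limit < 1;
    my @heap;    # binary min-heap by score; $heap[0] is the weakest kept item
    while ( defined( my $item = $next_item->() ) ) {
        if ( @heap < $limit ) {
            # Heap not full yet: append, then sift the new item up.
            push @heap, $item;
            my $i = $#heap;
            while ( $i > 0 ) {
                my $parent = int( ( $i - 1 ) / 2 );
                last if $heap[$parent]{score} <= $heap[$i]{score};
                @heap[ $parent, $i ] = @heap[ $i, $parent ];
                $i = $parent;
            }
        }
        elsif ( $item->{score} > $heap[0]{score} ) {
            # Better than the weakest kept item: replace it, then sift down.
            $heap[0] = $item;
            my $i = 0;
            while (1) {
                my $smallest = $i;
                for my $child ( 2 * $i + 1, 2 * $i + 2 ) {
                    $smallest = $child
                        if $child < @heap
                        && $heap[$child]{score} < $heap[$smallest]{score};
                }
                last if $smallest == $i;
                @heap[ $smallest, $i ] = @heap[ $i, $smallest ];
                $i = $smallest;
            }
        }
    }
    # Only $limit items remain, so a final sort() is cheap.
    my @sorted = sort { $b->{score} <=> $a->{score} } @heap;
    return @sorted;
}

# Made-up sample data standing in for search results.
my @raw = (
    { doc_id => 1, score => 0.9, user_id => 7 },
    { doc_id => 2, score => 0.5, user_id => 42 },
    { doc_id => 3, score => 0.8, user_id => 7 },
    { doc_id => 4, score => 0.3, user_id => 42 },
);
my ( $user_id, $arbitrary_boost ) = ( 42, 2 );

# Iterator that applies the boost by hand to one hit at a time.
my $pos  = 0;
my $next = sub {
    return if $pos >= @raw;
    my %hit = %{ $raw[ $pos++ ] };
    $hit{score} *= $arbitrary_boost if $hit{user_id} == $user_id;
    return \%hit;
};

my @top = top_n( 2, $next );
print join( ' ', map { $_->{doc_id} } @top ), "\n";
```

Memory stays bounded by N, but every hit still has to be fetched and re-scored, which is why doing the boost inside the query is preferable.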
Marvin Humphrey