On Mon, Aug 27, 2012 at 8:09 PM, Peter Karman <[email protected]> wrote:
> * I am intrigued by MatchEngine. Would it make what I'm trying to do above
> any easier?
While it wouldn't diminish the raw volume of code, I think it would make
things easier to understand.
* It allows us to eliminate Compiler.
* People creating custom query subclasses would have the option of writing
only a Query/Matcher pair rather than a Query/weighted-Query/Matcher trio.
Here's a rapid prototype for TFIDFMatchEngine:
    package Lucy::TFIDF::TFIDFMatchEngine;
    use base qw( Lucy::Search::MatchEngine );

    sub weight_query {
        my ($self, %args) = @_;
        my ($query, $searcher) = @args{qw( query searcher )};
        my $query_class = ref($query);
        if ($query_class eq 'Lucy::Search::TermQuery') {
            return Lucy::TFIDF::TFIDFTermQuery->new(
                parent   => $query,
                searcher => $searcher,
            );
        }
        elsif ($query_class eq 'Lucy::Search::PhraseQuery') {
            return Lucy::TFIDF::TFIDFPhraseQuery->new(
                parent   => $query,
                searcher => $searcher,
            );
        }
        # ...and so on for all core Query classes.
        else {
            # Unknown Query class, so fall back to delegating weighting to the
            # Query itself.  (Since TFIDFMatchEngine doesn't know how to
            # produce a weighted query for this Query class, we have to hope
            # that this is a custom Query class written with TFIDFMatchEngine
            # as a target and that it knows how to create a weighted query
            # appropriate for TFIDFMatchEngine.)
            return $query->make_weighted_query(
                searcher     => $searcher,
                match_engine => $self,
            );
        }
    }
> I started thinking of Compiler as WeightedQuery, and its relationship to Query
> as similar to the relationship between Doc and HitDoc. I imagined code like:
>
>     my $query = $queryparser->parse('foo');       # $query isa Query
>     $query->apply_weight(searcher => $searcher);  # $query isa WeightedQuery
>     if (!$query->weighted) {
>         die "can't score a search without weighting the query";
>     }
FWIW, mutating Query objects is what Lucene originally did, back in
version 1.3 IIRC. :P
Here's something very similar:
    my $query = $queryparser->parse('foo');
    my $weighted_query = $match_engine->weight_query(
        searcher => $searcher,
        query    => $query,
    );
    if (!$weighted_query->weighted) {
        die "can't score a search without weighting the query";
    }
The difference is that while your sample code mutates $query, this version
uses a factory method which generates a new $weighted_query and leaves the
original $query untouched.
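To make the factory-versus-mutation distinction concrete, here is a minimal, self-contained sketch. ToyQuery, ToyWeightedQuery and ToyMatchEngine are hypothetical stand-ins invented for illustration, not Lucy classes, and the weight computed here is a placeholder rather than anything a real engine would use.

```perl
use strict;
use warnings;

# Toy stand-ins, not Lucy classes: they only illustrate factory vs. mutation.
package ToyQuery;
sub new      { my ($class, %args) = @_; return bless { term => $args{term} }, $class }
sub term     { $_[0]{term} }
sub weighted { 0 }

package ToyWeightedQuery;
sub new {
    my ($class, %args) = @_;
    return bless { parent => $args{parent}, weight => $args{weight} }, $class;
}
sub parent   { $_[0]{parent} }
sub weight   { $_[0]{weight} }
sub weighted { 1 }

package ToyMatchEngine;
sub new { return bless {}, shift }

# Factory method: builds a new weighted object and leaves $query alone.
sub weight_query {
    my ($self, %args) = @_;
    my ($query, $searcher) = @args{qw( query searcher )};
    # Placeholder weight; a real engine would consult $searcher for stats.
    my $weight = length $query->term;
    return ToyWeightedQuery->new(parent => $query, weight => $weight);
}

package main;

my $query    = ToyQuery->new(term => 'foo');
my $weighted = ToyMatchEngine->new->weight_query(
    query    => $query,
    searcher => undef,    # unused by this toy engine
);

# The original object keeps its class and stays unweighted.
print ref($query),    " weighted? ", $query->weighted,    "\n";
print ref($weighted), " weighted? ", $weighted->weighted, "\n";
```

Because the weighted object only holds a reference to its parent, the same $query can be handed to another searcher or engine later without any cleanup.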
There are a lot of reasons that we want to avoid mutating the original Query
object. For starters, we have long wanted to sever the connection between our
Query classes and TFIDF (or any other scoring model) -- but if Query objects
must hold weighting information, they have to know about the scoring model
that produced it.
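The decoupling argument can be sketched the same way: when weights live outside the Query, two scoring models can weight one Query object without it knowing about either. Everything below is hypothetical (ToyQuery, ToyIDFEngine, ToyConstantEngine, the $stats hash standing in for a searcher, and the IDF formula itself); it is an illustration under those assumptions, not Lucy's API.

```perl
use strict;
use warnings;

# Toy stand-ins, not Lucy classes. The query stores only what to match.
package ToyQuery;
sub new  { my ($class, %args) = @_; return bless { term => $args{term} }, $class }
sub term { $_[0]{term} }

# One hypothetical scoring model: weight by inverse document frequency.
package ToyIDFEngine;
sub new { return bless {}, shift }
sub weight_query {
    my ($self, %args) = @_;
    my ($query, $stats) = @args{qw( query stats )};
    my $doc_freq = $stats->{doc_freq}{ $query->term } || 0;
    my $idf      = 1 + log( $stats->{doc_max} / ( 1 + $doc_freq ) );
    return { parent => $query, weight => $idf };
}

# Another hypothetical model: every term gets the same weight.
package ToyConstantEngine;
sub new { return bless {}, shift }
sub weight_query {
    my ($self, %args) = @_;
    return { parent => $args{query}, weight => 1 };
}

package main;

# Made-up index statistics standing in for a searcher.
my $stats = { doc_max => 1000, doc_freq => { foo => 9 } };
my $query = ToyQuery->new( term => 'foo' );

my $by_idf      = ToyIDFEngine->new->weight_query( query => $query, stats => $stats );
my $by_constant = ToyConstantEngine->new->weight_query( query => $query );

printf "idf weight: %.3f, constant weight: %d\n",
    $by_idf->{weight}, $by_constant->{weight};
```

ToyQuery never gains a weight field, so adding a third scoring model means writing another engine rather than touching the query class.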
Marvin Humphrey