On Mar 10, 2008, at 8:43 PM, Chris Hostetter wrote:
: Does it make (any) sense to try implementing this within Solr or should I
: just forget about this?
: As a more general note, does it make sense to try to use Solr as a
: "research" playground for similarities instead of Lucene? Or is this the
: "wrong" level (aka Lucene being a better one)?
If I were going to sit down and really research alternate Similarity
systems -- I would use Lucene directly. Solr adds a lot of nice features
and abstractions, but for experimentation like this, those features and
abstractions can get in the way. In addition, the benchmarking contrib in
Lucene is designed to make it really easy to run lots of repeatable tests
changing small variables -- I believe Grant already did some work to
support evaluating "quality" metrics, so you just have to decide what
"good" is and then you can run lots of tests where you change lots of
variables in your custom similarity to see which combination of variables
gets you the closest to "good".
FYI, it was Doron who hooked in the quality stuff, but the point is
valid. Lucene's contrib/benchmark is a better place for doing low-level
similarity testing.
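
To make the "decide what good is, then vary the variables" loop concrete,
here is a rough sketch in plain Python. It does not use Lucene or
contrib/benchmark at all: the toy corpus, the tf_exponent parameter, and
toy_run_query are hypothetical stand-ins for a custom Similarity, and
"good" is assumed to mean mean average precision over a handful of
judged queries.

```python
from typing import Callable, Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against judged-relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant hit
    return total / len(relevant)


def sweep(
    run_query: Callable[[str, float], List[str]],
    judgments: Dict[str, Set[str]],
    values: List[float],
) -> Dict[float, float]:
    """Return {parameter value: mean average precision} over all judged queries."""
    results = {}
    for value in values:
        aps = [
            average_precision(run_query(query, value), relevant)
            for query, relevant in judgments.items()
        ]
        results[value] = sum(aps) / len(aps)
    return results


def toy_run_query(query: str, tf_exponent: float) -> List[str]:
    """Hypothetical stand-in for a search using a custom similarity.

    Each document is (term frequency of the query term, document length);
    the score is tf ** tf_exponent divided by sqrt(length).
    """
    docs = {"d1": (1, 10), "d2": (9, 300), "d3": (3, 20)}
    scored = {
        doc_id: (tf ** tf_exponent) / (length ** 0.5)
        for doc_id, (tf, length) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical relevance judgments: query -> set of relevant doc ids.
judgments = {"q1": {"d1", "d3"}}

scores = sweep(toy_run_query, judgments, [0.5, 1.0])
best = max(scores, key=scores.get)
print(scores, "best tf_exponent:", best)
```

With real data, run_query would be replaced by whatever actually executes
the search with the similarity under test, and the judgments would come
from a proper relevance set rather than a three-document toy.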