Re: [Solr Wiki] Update of "MoreLikeThis" by ryan

Ryan McKinley Thu, 24 May 2007 09:00:52 -0700

Chris Hostetter wrote:

: + If termVectors are not stored, !MoreLikeThis will generate terms from
: stored fields.  If multiple fields are used for similarity, solr will
: use the default Analyzer -- NOTE: this may or ''may not'' match the
: Analyzer used to index the field.  If only one field is used for
: similarity, solr will use the Analyzer defined in schema.xml


what do you mean by the "default Analyzer" .. is that StandardAnalyzer,
IndexSchema.getAnalyzer(), or IndexSchema.getQueryAnalyzer() ? ... in the
case of the latter two they will automatically pick the correct Analyzer for
the FieldType.

Ahhh! I didn't realize that is how those worked. Currently I am only setting the analyzer if there is only one field and using fieldType.getAnalyzer() -- a better solution is to use: searcher.getSchema().getAnalyzer()
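
To illustrate why a schema-level analyzer works for any number of fields while a single fieldType analyzer does not, here is a minimal sketch in plain Java with no Solr dependency. The class SchemaAnalyzerSketch, the field names, and the two toy analyzers are hypothetical stand-ins, not Solr's implementation; the only point is the lookup by field name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a schema-level analyzer: it selects the
// analysis chain by field name, so one object can serve every field.
public class SchemaAnalyzerSketch {
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    public SchemaAnalyzerSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Associate a field name with its own analysis function.
    public void register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Dispatch on the field name; unknown fields use the fallback.
    public List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    // Toy analyzer (assumption): split on whitespace and lowercase.
    public static List<String> lowercaseWhitespace(String text) {
        List<String> out = new ArrayList<>();
        for (String tok : text.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                out.add(tok.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Toy analyzer (assumption): keep the whole value as one token.
    public static List<String> keyword(String text) {
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        SchemaAnalyzerSketch schema = new SchemaAnalyzerSketch(SchemaAnalyzerSketch::keyword);
        schema.register("title", SchemaAnalyzerSketch::lowercaseWhitespace);
        System.out.println(schema.analyze("title", "More Like This")); // [more, like, this]
        System.out.println(schema.analyze("sku", "AB 12"));            // [AB 12]
    }
}
```

Because the dispatch happens on every call, the caller never needs to know how many fields are involved or which analyzer belongs to which one.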


In that case, the comment should read something like:

"If termVectors are not stored, !MoreLikeThis will generate terms from stored fields using the Analyzer defined in schema.xml."

(although an interesting question is what happens if i want to find
similar docs based on a field that is stored but not indexed so it *really*
has no analyzer)

I think the MLT implementation would need some modification to support that -- what you are suggesting is to get the top tf/idf terms for a stored but not indexed field then query against a different field (that is indexed). As is, it compares like fields to one another...
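
A rough sketch of what that modification might look like, in plain Java with no Solr or Lucene dependency. Everything here is hypothetical: the class CrossFieldTermsSketch, the field name "text", and the idf formula 1 + ln(numDocs / (docFreq + 1)) are assumptions chosen for illustration, and this is not how the current MLT code behaves. The idea is to count terms from the stored-only field's text, weight them by document frequencies taken from a different, indexed field, and query that indexed field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CrossFieldTermsSketch {

    // tokens: terms taken from the stored-only field of the source document.
    // docFreq / numDocs: statistics of the indexed field that will be queried.
    public static List<String> topTerms(List<String> tokens,
                                        Map<String, Integer> docFreq,
                                        int numDocs,
                                        int limit) {
        // Term frequency within the source text.
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1, Integer::sum);
        }

        // Score each term; skip terms the indexed field has never seen,
        // since they cannot match any document there.
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 0);
            if (df == 0) {
                continue;
            }
            double idf = 1.0 + Math.log((double) numDocs / (df + 1)); // assumed formula
            score.put(e.getKey(), e.getValue() * idf);
        }

        // Highest score first; ties broken alphabetically for stable output.
        List<Map.Entry<String, Double>> entries = new ArrayList<>(score.entrySet());
        entries.sort((a, b) -> {
            int c = Double.compare(b.getValue(), a.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        });

        List<String> out = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < limit; i++) {
            out.add(entries.get(i).getKey());
        }
        return out;
    }

    // Build a simple "field:term field:term" query string against the indexed field.
    public static String toQuery(String indexedField, List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String t : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(indexedField).append(':').append(t);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("solr", "solr", "search", "the");
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("solr", 2);
        docFreq.put("search", 50);
        docFreq.put("the", 100);

        List<String> top = topTerms(tokens, docFreq, 100, 2);
        System.out.println(toQuery("text", top)); // text:solr text:search
    }
}
```

The open question from the thread remains: the stored-only field has no analyzer of its own, so the tokens would have to be produced with the analyzer of the indexed field being queried.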
