Chris Hostetter wrote:
: + If termVectors are not stored, !MoreLikeThis will generate terms from
: stored fields.  If multiple fields are used for similarity, solr will
: use the default Analyzer -- NOTE: this may or ''may not'' match the
: Analyzer used to index the field.  If only one field is used for
: similarity, solr will use the Analyzer defined in schema.xml

what do you mean by the "default Analyzer" .. is that StandardAnalyzer,
IndexSchema.getAnalyzer(), or IndexSchema.getQueryAnalyzer() ? ... in the
case of the latter two they will automatically pick the correct Analyzer for
the FieldType.


Ahhh! I didn't realize that is how those worked. Currently I only set the analyzer when there is exactly one field, using fieldType.getAnalyzer() -- a better solution is to use searcher.getSchema().getAnalyzer(), which picks the correct Analyzer for each field and so also covers the multiple-field case.
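
To make the difference concrete, here is a minimal sketch of the per-field dispatch idea. It does not use Solr's or Lucene's classes: PerFieldAnalyzerSketch, register, analyze, and the two tokenizers are hypothetical stand-ins, and the field names are made up. It only illustrates how one schema-level object can pick the analysis chain by field name, so the caller does not have to special-case a single field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a schema-level analyzer. This is NOT Solr's
// IndexSchema or Lucene's Analyzer, only a model of per-field dispatch.
class PerFieldAnalyzerSketch {
    // Analysis chain registered for each field name.
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();
    // Chain used when a field has no registered analyzer.
    private final Function<String, List<String>> fallback;

    PerFieldAnalyzerSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    void register(String field, Function<String, List<String>> analyzer) {
        byField.put(field, analyzer);
    }

    // One entry point for any number of fields: the field name selects the chain.
    List<String> analyze(String field, String text) {
        return byField.getOrDefault(field, fallback).apply(text);
    }

    // Splits on whitespace and keeps case as-is.
    static List<String> whitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Lowercases the text, then splits on whitespace.
    static List<String> lowercase(String text) {
        return whitespace(text.toLowerCase(Locale.ROOT));
    }
}
```

With "title" registered to the lowercasing chain and "sku" left unregistered, analyze("title", ...) and analyze("sku", ...) go through different chains from the same call site, which is the behaviour the schema-level analyzer is described as providing above.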

In that case, the comment should read something like:

"If termVectors are not stored, !MoreLikeThis will generate terms from stored fields using the Analyzer defined in schema.xml."


(although an interesting question is what happens if I want to find
similar docs based on a field that is stored but not indexed so it *really*
has no analyzer)


I think the MLT implementation would need some modification to support that -- what you are suggesting is to get the top tf/idf terms for a stored but not indexed field then query against a different field (that is indexed). As is, it compares like fields to one another...

