Hi,
I am implementing a SearchComponent for Solr that returns the main keywords
of a document, using the MoreLikeThis "interesting terms". The core of the
component calls mlt.retrieveInterestingTerms with a Lucene docID, but it
does not work for all documents. For some documents, Solr's MLT
interesting terms returns useful terms (the top tf-idf terms), whereas my
method returns null. For other documents, both results (Solr's MLT
interesting terms and mlt.retrieveInterestingTerms(docId)) are the same.
Could you please help me solve this issue? My method is:
public List<String> getKeywords(int docId) throws SyntaxError {
    // Fields whose terms are candidates for keywords.
    String[] fields = keywordSourceFields.toArray(new String[0]);
    mlt.setFieldNames(fields);
    mlt.setAnalyzer(indexSearcher.getSchema().getIndexAnalyzer());
    // Terms that fall below these thresholds are ignored by MoreLikeThis.
    mlt.setMinTermFreq(minTermFreq);
    mlt.setMinDocFreq(minDocFreq);
    mlt.setMinWordLen(minWordLen);
    // Upper bounds on returned terms and on tokens analyzed per field.
    mlt.setMaxQueryTerms(maxNumKeywords);
    mlt.setMaxNumTokensParsed(maxTokensParsed);
    try {
        return Arrays.asList(mlt.retrieveInterestingTerms(docId));
    } catch (IOException e) {
        // Log with the stack trace and keep the cause when rethrowing.
        LOGGER.error("Failed to retrieve interesting terms for docId " + docId, e);
        throw new RuntimeException(e);
    }
}
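
To make clear which settings I suspect, here is a minimal standalone sketch
of how I understand the threshold filtering behind the setters above
(minTermFreq, minDocFreq, minWordLen, then a cap on the number of terms).
It does not use Lucene at all: the class InterestingTermsSketch, the method
select, and the idf formula are my own assumptions for illustration, and
the real scoring depends on the configured Similarity. It only shows that a
document can end up with no terms when every candidate falls below a
threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified, Lucene-free model of threshold-based term selection (illustrative only). */
class InterestingTermsSketch {

    /**
     * Returns up to maxTerms terms ordered by descending tf*idf.
     * termFreqs: term -> frequency in the document; docFreqs: term -> number of documents containing it.
     */
    static List<String> select(Map<String, Integer> termFreqs,
                               Map<String, Integer> docFreqs,
                               int numDocs, int minTermFreq, int minDocFreq,
                               int minWordLen, int maxTerms) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            String term = e.getKey();
            int tf = e.getValue();
            int df = docFreqs.getOrDefault(term, 0);
            if (term.length() < minWordLen) continue;   // too short
            if (tf < minTermFreq) continue;             // too rare in this document
            if (df == 0 || df < minDocFreq) continue;   // too rare in the index
            // Assumed idf formula, for illustration only.
            double idf = 1.0 + Math.log((numDocs + 1.0) / (df + 1.0));
            scored.add(Map.entry(term, tf * idf));
        }
        // Highest score first; ties broken alphabetically for a stable order.
        scored.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(maxTerms, scored.size()); i++) {
            result.add(scored.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> tf = Map.of("solr", 5, "lucene", 3, "the", 1, "is", 9);
        Map<String, Integer> df = Map.of("solr", 10, "lucene", 20, "the", 100, "is", 100);
        // Moderate thresholds keep "solr" and "lucene".
        System.out.println(select(tf, df, 100, 2, 5, 3, 10));
        // A minTermFreq of 10 filters out every candidate, so the list is empty.
        System.out.println(select(tf, df, 100, 10, 5, 3, 10));
    }
}
```

If the same effect applies in my component, the documents with no result
would be the ones where no term passes the configured thresholds.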
*Note:*
I have set termVectors=true on all of the fields used to generate the
interesting terms (the fields array in the method above).
Best regards.
--
A.Nazemian