Hi everyone,

I currently use Lucene's MoreLikeThis feature through Solr to find documents that are related to one another. A single call, however, takes around 4 seconds to complete, and I would like to reduce this. It occurred to me that I might be able to use Mahout to generate a document similarity matrix offline, which could then be looked up in real time when serving requests. Is this a reasonable use of Mahout? If so, which functions will generate a document similarity matrix? I would also like to keep the text-processing advantages that Lucene provides, so it would help if I could still use my existing Lucene index. If this is not a good fit, could you please recommend any alternative solutions?
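
To make the offline idea concrete, here is a rough sketch of the kind of precomputation I have in mind. It is plain Python, not Mahout or Lucene code: the function names (tfidf_vectors, build_related_table) and the TF-IDF weighting are my own assumptions for illustration, and in practice the tokens would come from the Lucene analysis chain rather than a hand-built dictionary.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn {doc_id: [token, ...]} into L2-normalised sparse TF-IDF vectors."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    vectors = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        # Smoothed IDF (an assumption, chosen so every weight stays positive).
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors[doc_id] = {t: w / norm for t, w in vec.items()}
    return vectors


def build_related_table(docs, top_k=10):
    """Offline step: for each document, keep its top_k most similar documents."""
    vectors = tfidf_vectors(docs)
    related = {}
    for a, va in vectors.items():
        scores = []
        for b, vb in vectors.items():
            if a == b:
                continue
            # Cosine similarity is a dot product because vectors are normalised.
            small, large = (va, vb) if len(va) <= len(vb) else (vb, va)
            score = sum(w * large.get(t, 0.0) for t, w in small.items())
            if score > 0.0:
                scores.append((b, score))
        scores.sort(key=lambda pair: pair[1], reverse=True)
        related[a] = scores[:top_k]
    return related


if __name__ == "__main__":
    # Hypothetical toy corpus, already tokenised.
    docs = {
        "a": "solr lucene index search".split(),
        "b": "lucene index search engine".split(),
        "c": "cooking pasta recipe".split(),
    }
    table = build_related_table(docs, top_k=2)
    # Serving step: a plain dictionary lookup instead of a live query.
    print(table["a"])
```

At serving time the lookup would just be table[doc_id]. The sketch compares every pair of documents, so the cost grows quadratically with the number of documents, and that is the part I am hoping Mahout could handle at scale.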
Many thanks, Kris
