Re: MoreLikeThisQuery term frequency caching

Michael McCandless Tue, 07 Apr 2009 01:05:26 -0700

I don't have direct experience with MLT, but this sounds like a great
improvement, so in answer to (3) I would say "definitely!".


Mike

On Tue, Apr 7, 2009 at 2:28 AM, Richard Marr <richard.m...@gmail.com> wrote:
> Hi all,
>
> I've been exploring MoreLikeThisQuery as part of a recent project and
> something that came out of that might be useful to others here.
>
> I found that MoreLikeThisQuery could be quite slow for my use
> case, and that most of the time was spent looking up term
> frequencies to calculate weightings. Since those term frequencies
> usually don't need to be anywhere near real-time, I found that
> caching them in a HashMap had a very good cost/benefit ratio for my
> application, speeding up MLT queries by an order of magnitude.
>
> My use case was possibly unusual in that I was looking at a limited
> vocabulary rather than full English, but in theory other applications
> that make use of the MLT class could benefit.
>
> So at this point I have some questions: (1) Have others experienced
> similar performance characteristics for MLT code? (2) Am I missing
> some fatal flaw in this approach? (3) Are the modifications worth
> sharing?
>
> Cheers,
>
> Rich
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>
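
For readers following the thread, the caching idea described in the quoted
message could look roughly like the sketch below. This is not Richard's
actual patch, which is not shown in the thread. The class name
TermFrequencyCache and the indexLookup function are hypothetical; in a
real setup the lookup would presumably delegate to something like
IndexReader.docFreq(Term), but that wiring is an assumption and is left
out so the example runs without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/**
 * Hypothetical helper (not part of Lucene): memoizes per-term frequency
 * lookups so that repeated MLT-style weighting passes do not hit the
 * index for terms that have already been seen.
 */
class TermFrequencyCache {
    // Plain HashMap as in the original description; not thread-safe.
    private final Map<String, Integer> cache = new HashMap<>();

    // Stand-in for the expensive index lookup (assumed, supplied by the caller).
    private final ToIntFunction<String> indexLookup;

    TermFrequencyCache(ToIntFunction<String> indexLookup) {
        this.indexLookup = indexLookup;
    }

    /** Returns the cached frequency, consulting the index only on a miss. */
    int frequency(String term) {
        Integer cached = cache.get(term);
        if (cached != null) {
            return cached;
        }
        int freq = indexLookup.applyAsInt(term);
        cache.put(term, freq);
        return freq;
    }

    /** Drops all entries, e.g. after the index has been reopened. */
    void clear() {
        cache.clear();
    }

    /** Number of distinct terms currently cached. */
    int size() {
        return cache.size();
    }
}
```

Since the frequencies are only allowed to be stale, not wrong forever,
calling clear() on some schedule (or whenever the reader is reopened) is
one simple way to bound the staleness. A limited vocabulary keeps the map
small; with an open vocabulary a size bound would probably be needed.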

