Yeah... you keep wanting to solve fun problems. The OP had a very simple problem.
On Mon, Apr 25, 2011 at 3:17 PM, Dmitriy Lyubimov <[email protected]> wrote:

> But then again i think i misunderstood the problem the second time...
>
> On Mon, Apr 25, 2011 at 2:59 PM, Dmitriy Lyubimov <[email protected]> wrote:
> > oh never mind.
> >
> > that's trivial. As Ted mentioned, i perhaps by mistake assumed the
> > problem is to find most frequently used queries.
> >
> > if he just needs top N with the highest similarity score...well...
> > that's kind of a problem i am solving for LSI over hbase right now...
> > I don't want to disclose exactly how, or Ted will say that's not the
> > way :) But, there are definitely ways to organize the vector space
> > model to find N closest without scanning the entire vector set.
