Jonathan Ariel wrote:
Smart idea, but it won't help me. I have almost 50 categories and eventually
I would like to "filter" not just on category but maybe also on language,
etc.
Karl: what do you mean by measure the distance between the term vectors and
cluster them in real time?

I mean exactly what I say: if your subsets are small enough, you could evaluate the cosine coefficient between the documents' term vectors and group the documents accordingly.

2 million documents is, however, way too much data to do that in real time.
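
To make the idea concrete, here is a rough sketch of grouping a small subset by cosine coefficient. It is only an illustration under assumptions of mine: the term vectors are plain term-count dictionaries (in practice they would come from the index), the names cosine and group_documents are hypothetical, and the greedy grouping with a threshold of 0.5 is just one simple way to do it, not a recommended algorithm. Comparing every document against the others grows roughly quadratically with the number of documents, which is why it only works for small subsets.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine coefficient between two sparse term-count vectors (dict-like)."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_documents(vectors, threshold=0.5):
    """Greedy single-pass grouping (an assumed, simplistic scheme).

    Each document joins the first group whose first member is at least
    `threshold` similar to it; otherwise it starts a new group.
    """
    groups = []
    for doc_id, vec in vectors.items():
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])
    return groups


# Made-up example documents, represented as raw term counts.
docs = {
    "a": Counter("lucene index search".split()),
    "b": Counter("lucene search query".split()),
    "c": Counter("football match score".split()),
}
groups = group_documents(docs)
print(groups)
```

With these three toy documents, "a" and "b" share two of their three terms and end up in the same group, while "c" shares none and forms its own.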

I would probably create one index for each "filter" you want to use.


        karl
