Re: setMaxClauseCount ??

Andrzej Bialecki Wed, 21 Jan 2004 03:49:50 -0800

Karl Koch wrote:

Hi Doug,

thank you for the answer so far.

I actually wanted to add a large amount of text from an existing document to
find a close related one. Can you suggest another good way of doing this. A
direct match will not occur anyway. How can I make a most Vector Space Model
(VSM) like query (each word a dimension value - find documents close to
that)? You know as good as I that the standard VSM does not have any Boolean logic
inside... how do I need to formuate the query to make it as much similar to
a vector in order to find similar document in the vector space of the Lucene
index?

You should try to reduce the dimensionality by reducing the number of unique features. In this case, you could for example use only keywords (or key phrases) instead of the full content of documents.

--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: setMaxClauseCount ??

Reply via email to