Re: Vector Space Model in Lucene?

Erik Hatcher Fri, 14 Nov 2003 13:11:21 -0800

On Friday, November 14, 2003, at 02:54 PM, Chong, Herb wrote:

it solves one part of the problem, but there are a lot of sentences in a typical document. you'll need to composite a rank of a document from its constituent sentences then. there are less drastic ways to solve the problem. the other problem is that Lucene doesn't consider the term order in the query unless the query is formulated as a phrase. a simple bag-of-words query doesn't make use of the ordering of terms that apply in a given language.

BooleanQuery _could_ take the order of terms into account for weighting and scale the weights accordingly, I believe.
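
A rough sketch of that idea, outside of Lucene itself: the snippet below turns an ordered list of terms into a query string using the `term^boost` query syntax, giving earlier terms a larger boost. The class name `OrderWeightedQuery`, the helper `boostFor`, the field name `contents`, and the linear decay are all assumptions made for illustration. This is not something BooleanQuery does on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class OrderWeightedQuery {

    // Hypothetical weighting scheme: the first term gets the largest boost,
    // decreasing linearly to 1.0 for the last term.
    static float boostFor(int position, int termCount) {
        return (float) (termCount - position);
    }

    // Builds a query string such as "contents:vector^3.0 contents:space^2.0 ...".
    // Terms are assumed to be already analyzed and free of special characters.
    static String build(String field, String[] terms) {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            float boost = boostFor(i, terms.length);
            clauses.add(String.format(Locale.ROOT, "%s:%s^%.1f", field, terms[i], boost));
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(build("contents", new String[] {"vector", "space", "model"}));
    }
}
```

For the three terms above this prints `contents:vector^3.0 contents:space^2.0 contents:model^1.0`. The resulting string could be handed to a query parser, or the same boosts could be set on individual term clauses before adding them to a BooleanQuery.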

I get the feeling you're looking for reasons that Lucene is inadequate. This may be the case for the uses you're speaking of, but there is quite a bit of flexibility with Lucene in terms of Analysis, scoring, and custom Query implementations that all relate to what you've been speaking of. And, of course, Lucene is a low-level component around which a higher-level piece could be built.

Erik

