On Mar 17, 2009, at 5:44 AM, liat oren wrote:
Thanks for all the answers.
I am new to Lucene, and these emails are the first time I have heard of
bigrams, so I read up on them a bit.
Question - if I query for "cat animal" - or use boosting - "cat^2
animal^0.5" - will the results return ONLY documents that contain
both?
From what I have seen so far, it can also return documents that contain
only one of them, no?
I think if you are using bigrams, then you would only match on one,
but if you do the prefix/wildcard approach you could match on either.
I'm not sure if you will be able to pull off doing the individual term
boosting and the bigrams. You will likely need to write your own
Query classes to do that.
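To make the difference concrete: as far as I know, the stock QueryParser
defaults to OR, so "cat animal" matches documents containing either term,
and you would write "+cat +animal" to require both. Below is a toy sketch
of term matching versus bigram matching. It is plain Java, not Lucene
code; the class and method names are made up for illustration, and the
whitespace split is only a stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BigramSketch {

    // Lowercase and split on whitespace (assumption: stand-in for an analyzer).
    static List<String> terms(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Build the set of adjacent word pairs, e.g. "cat animal".
    static Set<String> bigrams(String text) {
        List<String> t = terms(text);
        Set<String> out = new HashSet<>();
        for (int i = 0; i + 1 < t.size(); i++) {
            out.add(t.get(i) + " " + t.get(i + 1));
        }
        return out;
    }

    // OR-style matching: one shared term is enough.
    static boolean matchesAnyTerm(String doc, String query) {
        Set<String> docTerms = new HashSet<>(terms(doc));
        for (String q : terms(query)) {
            if (docTerms.contains(q)) {
                return true;
            }
        }
        return false;
    }

    // Bigram matching: the two words must appear together, adjacent and in order.
    static boolean matchesBigram(String doc, String query) {
        Set<String> docBigrams = bigrams(doc);
        for (String b : bigrams(query)) {
            if (docBigrams.contains(b)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String both = "my cat animal shelter";
        String onlyCat = "a cat sat here";
        System.out.println(matchesAnyTerm(both, "cat animal"));    // true
        System.out.println(matchesAnyTerm(onlyCat, "cat animal")); // true
        System.out.println(matchesBigram(both, "cat animal"));     // true
        System.out.println(matchesBigram(onlyCat, "cat animal"));  // false
    }
}
```

Note that the bigram is a single token here, so there is no obvious place
to hang a separate boost for "cat" and for "animal", which is why I
suspect custom Query classes would be needed.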
If you don't mind me asking, what is the problem you are trying to
solve? I know the solution you want (I think; namely, boosted bigrams
of some sort), but I'm still clueless on the problem, and I think that
is really hindering my ability to help. It sounds like some type of
co-occurrence problem, but I'm not sure. Is there a broader category
that what you are doing fits into? If you can't say, that is fine, too.
It may be some proprietary thing.
Can you please elaborate a bit more on your suggestion?
I read a bit on synonyms and the WordNet package.
Isn't there a way to use an index that is structured in the same way as
the WordNet index (any idea how this index is built?), but that stores
other values?