Hmmm, I have some inklings of an idea, but can we take a step back? Can you explain the problem you are trying to solve at a higher level (instead of the current solution)? I imagine it is something related to co-occurrence analysis.

On Mar 8, 2009, at 8:05 AM, liat oren wrote:

Hi Grant,

No, you can only have two words - the score is between two words.

"cat dog" and "dog cat" are equivalent; it will actually always be stored as
"cat dog", going by alphabetical order.
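
For illustration, the alphabetical-order convention described above could be sketched as follows. This is only a sketch: the class name PairKey and the method of are hypothetical names (not part of Lucene), and lower-casing the words first is an assumption that the thread does not state.

```java
import java.util.Locale;

// Hypothetical helper, not a Lucene class: one canonical key per unordered word pair.
final class PairKey {
    private PairKey() {}

    static String of(String a, String b) {
        // Assumption: words are compared case-insensitively.
        String x = a.toLowerCase(Locale.ROOT);
        String y = b.toLowerCase(Locale.ROOT);
        // Put the alphabetically smaller word first so both orders give the same key.
        return x.compareTo(y) <= 0 ? x + " " + y : y + " " + x;
    }
}
```

With this, PairKey.of("dog", "cat") and PairKey.of("cat", "dog") both give "cat dog", so a pair only has to be stored and looked up once.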

About the boosting, I read a bit about it, but couldn't find how it can help
me, unless I change every appearance of the word dog to also include cat
and animal, weighted by their scores.
So, for example, every word would be repeated 10 times: if apple appears
once, I would boost it so that it appears 10 times.
If dog appears, then cat would also be added twice (0.2*10) and animal 5
times (0.5*10).
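
One possible alternative to repeating words is to expand the term at query time and attach the pair score as a boost, using the term^boost form of the classic Lucene query syntax. The sketch below only builds the query string; QueryExpansion, expand and the in-memory related map are hypothetical names, and it assumes the pair scores have already been loaded from the index into that map. A boost scales a term's contribution to the Lucene score; it does not replace the rest of the scoring formula, so the result would not be identical to the repetition scheme above.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not a Lucene class: builds an expanded query string.
final class QueryExpansion {
    private QueryExpansion() {}

    // related: word -> (related word -> score), an assumed in-memory copy of the pair scores.
    static String expand(String word, Map<String, Map<String, Double>> related) {
        StringBuilder sb = new StringBuilder(word);
        Map<String, Double> neighbours =
                related.getOrDefault(word, Collections.<String, Double>emptyMap());
        for (Map.Entry<String, Double> e : neighbours.entrySet()) {
            // Append each related word with its score as a query-time boost.
            sb.append(' ').append(e.getKey()).append('^').append(e.getValue());
        }
        return sb.toString();
    }
}
```

For dog, with animal (0.5) and cat (0.2) as related words, this yields a string of the form "dog animal^0.5 cat^0.2", which could then be handed to a query parser.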

But I hope there is a better solution.


Thanks
2009/3/8 Grant Ingersoll <gsing...@apache.org>

Hi Liat,

Some questions inline below.

On Mar 8, 2009, at 5:49 AM, liat oren wrote:

Hi,

I have scores between words, for example - dog and animal have a score of
0.5 (and not 0), dog and cat have a score of 0.2, etc.
These scores are stored in an index:
Doc1: field words: dog animal
      field score: 0.5
Doc2: field words: dog cat
      field score: 0.2

If the user searches for the word dog, I would like documents that contain
the word animal or cat to also get a good score (one that takes into
account the 0.5 and 0.2).


Is it always the case that these come in pairs? In other words, would you
ever have:
field words: dog cat animal
score: 0.9

Also, is the following equivalent, or would it have a different score:
field words: cat dog
score: 0.2




Basically what I do is: for every document in the database, I loop over the
words that appear in the query (the query is long, about the size of an
article), and for every word that appears in the document I take the score
from the index mentioned above and calculate a score between the query and
the document.
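
The loop described above might look roughly like the sketch below. The thread does not say how the per-pair scores are combined, so summing them is an assumption, as is counting an identical word as 1.0. RelatednessScorer, pairScore, score and the pairScores map are hypothetical names; the map stands in for the score index and is keyed by the two words in alphabetical order.

```java
import java.util.List;
import java.util.Map;

// Hypothetical brute-force scorer, not a Lucene class.
final class RelatednessScorer {
    private RelatednessScorer() {}

    // pairScores: "a b" key with the words in alphabetical order -> score.
    static double pairScore(String a, String b, Map<String, Double> pairScores) {
        if (a.equals(b)) {
            return 1.0; // assumption: an exact match counts as 1.0
        }
        String key = a.compareTo(b) <= 0 ? a + " " + b : b + " " + a;
        return pairScores.getOrDefault(key, 0.0); // unknown pairs contribute nothing
    }

    // Assumption: the query/document score is the plain sum over all word pairs.
    static double score(List<String> queryWords, List<String> docWords,
                        Map<String, Double> pairScores) {
        double total = 0.0;
        for (String q : queryWords) {
            for (String d : docWords) {
                total += pairScore(q, d, pairScores);
            }
        }
        return total;
    }
}
```

The cost grows with (query words x document words) for every document, which is why a query-time approach inside the searcher would be attractive for article-sized queries.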

Any suggestions on how to do this using Lucene search? How can I add these
values to the searcher?


Thinking...



I looked at the boosting option, but couldn't really see how it helps me
in that matter.


What "boosting option" did you look at?  Can you explain a bit more?


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



