Re: Scores between words. Boosting?

Grant Ingersoll Mon, 09 Mar 2009 07:28:36 -0700

Hmmm, I have some inklings of an idea, but can we take a step back? Can you explain the problem you are trying to solve at a higher level (instead of the current solution)? I imagine it is something related to co-occurrence analysis.


On Mar 8, 2009, at 8:05 AM, liat oren wrote:

Hi Grant,

No, you can only have two words - the score is between two words.
"cat dog" and "dog cat" are equivalent; it will actually always be "cat dog",
going by alphabetical order.
About the boosting, I read a bit about it but couldn't find how it can help me, unless I change every appearance of the word dog to also have cat
and animal, using the weight of the score.
So, for example, every word will appear 10 times as often as it actually does - if apple
appears once, I will do the boosting so it appears 10 times.
If dog appears, then it will also have cat twice (0.2*10) and animal 5
times (0.5*10).
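
The arithmetic of that term-repetition workaround can be sketched as follows. This is only an illustration: the `RELATED` table, the `BASE_COPIES` constant and the `expand_terms` function are hypothetical names, and the scores are the example values from this thread, not real data.

```python
from collections import Counter

# Hypothetical pair scores, copied from the examples in this thread.
RELATED = {"dog": {"cat": 0.2, "animal": 0.5}}

# Assumption: each original word is repeated this many times.
BASE_COPIES = 10


def expand_terms(words, related=RELATED, base=BASE_COPIES):
    """Return how many copies of each term would be written to the document."""
    counts = Counter()
    for word in words:
        # The word itself is repeated `base` times.
        counts[word] += base
        # Each related word is repeated in proportion to its pair score.
        for other, score in related.get(word, {}).items():
            counts[other] += round(score * base)
    return counts


if __name__ == "__main__":
    print(dict(expand_terms(["apple"])))
    print(dict(expand_terms(["dog"])))
```

Note that this inflates every document by at least a factor of ten, which is one reason to look for a different approach.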

But I hope there is another, better solution.


Thanks
2009/3/8 Grant Ingersoll <[email protected]>
Hi Liat,

Some questions inline below.

On Mar 8, 2009, at 5:49 AM, liat oren wrote:

Hi,
I have scores between words, for example - dog and animal have a score of
0.5 (and not 0), dog and cat have a score of 0.2, etc.
These scores are stored in an index:
Doc1: field words: dog animal
      field score: 0.5
Doc2: field words: dog cat
      field score: 0.2
If the user searches for the word dog, I would like documents that contain the word animal or cat to also get a good score (one that will take
into account the 0.5 and 0.2).
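
One way the requirement above might be expressed at query time is to expand the searched word into a query string in which each related word carries its pair score as a boost, using the `term^boost` syntax of the classic Lucene query parser. The sketch below is an assumption-laden illustration, not a solution proposed in this thread: `pair_key`, `PAIR_SCORES` and `expand_query` are hypothetical names, the pair scores are held in a dictionary rather than read from the index, and whether the resulting ranking matches what is wanted is untested.

```python
def pair_key(a, b):
    """Order the two words alphabetically so 'dog cat' and 'cat dog' match."""
    return tuple(sorted((a, b)))


# Hypothetical in-memory copy of the pair index described in the message.
PAIR_SCORES = {
    pair_key("dog", "animal"): 0.5,
    pair_key("dog", "cat"): 0.2,
}


def expand_query(word, pair_scores=PAIR_SCORES):
    """Build a query string: the word itself plus its boosted related words."""
    parts = [word]
    for (a, b), score in sorted(pair_scores.items()):
        if word in (a, b):
            # Pick the member of the pair that is not the searched word.
            other = b if a == word else a
            parts.append(f"{other}^{score}")
    return " ".join(parts)


if __name__ == "__main__":
    print(expand_query("dog"))
```

For a query as long as an article, the expanded string could become very large, so this would need care before being used as-is.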
Is it always the case that these come in pairs? In other words, would you
ever have:
field words: dog cat animal
score: 0.9
Also, is the following equivalent, or would it have a different score:
field words: cat dog
score: 0.2
Basically what I do is: for every document in the database, I loop over
the words that appear in the query (the query is long, about the size of an
article), and for every word that appears in each document I take the score
from the index mentioned above and calculate a score between the query and each
document.
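
The loop described above can be sketched roughly as follows. The message does not say how the individual pair scores are combined, so this sketch assumes a plain sum over every (query word, document word) combination and assumes that identical words score 1.0; `pair_score`, `score_document` and `rank` are hypothetical names, and the data is illustrative only.

```python
def pair_score(a, b, pair_scores):
    """Look up the score between two words; pairs are keyed alphabetically."""
    if a == b:
        return 1.0  # assumption: an exact match counts as 1.0
    return pair_scores.get(tuple(sorted((a, b))), 0.0)


def score_document(query_words, doc_words, pair_scores):
    """Sum the pair scores between each query word and each distinct doc word."""
    distinct = set(doc_words)
    return sum(pair_score(q, d, pair_scores) for q in query_words for d in distinct)


def rank(query_words, docs, pair_scores):
    """Score every document against the query and sort by score, best first."""
    scored = [
        (score_document(query_words, words, pair_scores), doc_id)
        for doc_id, words in docs.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    scores = {("animal", "dog"): 0.5, ("cat", "dog"): 0.2}
    docs = {"d1": ["animal", "farm"], "d2": ["cat"], "d3": ["dog", "cat"]}
    for total, doc_id in rank(["dog"], docs, scores):
        print(doc_id, round(total, 2))
```

The cost grows with the number of documents times the number of query words times the document length, which is why moving the work into the searcher is attractive.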
Any suggestion how to do it using Lucene search? How to add these values
to the searcher?
Thinking...
I looked at the boosting option, but couldn't really see how it helps me
in that matter.
What "boosting option" did you look at?  Can you explain a bit more?


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



