Hello everyone,

I am Dwaipayan, a research scholar at the Indian Statistical Institute,
Kolkata, working in the field of Information Retrieval.
For my research, I use Lucene (4.10.4).

I have a question about how to boost query terms in Lucene at search
time. Specifically, I am implementing a paper on query expansion
(Relevance Based Language Model - Victor Lavrenko, Bruce Croft,
SIGIR-2001). In the paper, the expanded query is formed from terms taken
from the initially retrieved documents. The expansion terms are selected
and weighted according to a probability, so the weights of the expansion
terms are probability values normalized to sum to one. As a result, each
term weight is a small fraction: in most cases it is somewhere near 0.1
when 10 expansion terms are added, and the weights keep shrinking as more
expansion terms are considered.
When I use these fractional values as the expansion term weights in a
Lucene BooleanQuery, I do not get the expected results. I think the
problem lies in the weight that is applied with setBoost() on the Lucene
boolean query. Following the paper exactly, I set these weights to the
normalized probability values.
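
To make the setup concrete, here is a minimal sketch in plain Java (no
Lucene dependency) of the normalization step described above. The class
name ExpansionWeights, the three terms and their raw scores are all made
up for illustration and are not taken from the paper or from my real
code. It scales the scores so that they sum to one and prints them in the
term^boost form of the classic query syntax; the same values are the ones
that would be handed to setBoost().

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ExpansionWeights {

    // Scale the raw term scores so that the resulting weights sum to one.
    static Map<String, Double> normalize(Map<String, Double> raw) {
        double total = 0.0;
        for (double v : raw.values()) {
            total += v;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("scores must have a positive sum");
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            out.put(e.getKey(), e.getValue() / total);
        }
        return out;
    }

    // Render the weighted terms as "term^boost", separated by spaces.
    static String toBoostedQuery(Map<String, Double> weights) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey())
              .append('^')
              .append(String.format(Locale.ROOT, "%.4f", e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical expansion terms with made-up raw scores.
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("retrieval", 3.0);
        raw.put("language", 2.0);
        raw.put("model", 1.0);
        // prints: retrieval^0.5000 language^0.3333 model^0.1667
        System.out.println(toBoostedQuery(normalize(raw)));
    }
}
```

With 10 terms of roughly equal score, each weight produced this way is
close to 0.1, which matches the values I described above.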

Could anyone please help me with this problem?

Thanks,
Dwaipayan Roy.
Research Scholar
Indian Statistical Institute
Kolkata, India
