Re: indexing fields with multiplicity

Karl Wettin Wed, 29 Aug 2007 12:59:24 -0700


On 29 Aug 2007, at 21:37, Tim Sturge wrote:

That's exactly my question. I feel like

for (int i = 0; i < XXXX; i++) {
    document.add(new Field("anchor", "USA"));
}

is exactly equivalent to

Field field = new Field("anchor", "USA");
field.setBoost(YYYY);
document.add(field);

but I don't know the function that relates XXXX and YYYY. I feel like there's a correct information-theoretical answer and I'd like to know what it is.


You would have to refactor norm(t,d) in this computation:

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html

However, the field boost is merged into the document boost, so it might not translate as easily as you want. Perhaps payloads and BoostingTermQuery fit your needs better.
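
To make the XXXX/YYYY relation concrete, here is a rough sketch of the arithmetic. It assumes the stock DefaultSimilarity, where tf(freq) = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), and it ignores the lossy single-byte encoding of the norm. idf, queryNorm and coord are left out because they are the same in both cases. The class and method names are made up for illustration and do not call Lucene; otherTokens is a hypothetical count of the other tokens in the same field.

```java
public class BoostEquivalence {
    // Mirrors DefaultSimilarity.tf(float freq): square root of the term frequency.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Mirrors DefaultSimilarity.lengthNorm(fieldName, numTerms): 1 / sqrt(numTerms).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // tf * norm when the term is added n times, unboosted, next to otherTokens other tokens.
    static double repeatedFactor(int n, int otherTokens) {
        return tf(n) * lengthNorm(n + otherTokens);
    }

    // tf * norm when the term is added once with the given field boost.
    static double boostedFactor(double boost, int otherTokens) {
        return tf(1) * boost * lengthNorm(1 + otherTokens);
    }

    // The boost (YYYY) at which one boosted occurrence matches n (XXXX) repetitions.
    static double equivalentBoost(int n, int otherTokens) {
        return repeatedFactor(n, otherTokens) / (tf(1) * lengthNorm(1 + otherTokens));
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 100}) {
            System.out.printf("n=%d, alone: %.3f, with 9 other tokens: %.3f%n",
                    n, equivalentBoost(n, 0), equivalentBoost(n, 9));
        }
    }
}
```

Under these assumptions YYYY = sqrt(XXXX * (1 + m) / (XXXX + m)), where m is the number of other tokens in the field. If the field holds nothing but the repeated term (m = 0), the length norm cancels the tf gain and the equivalent boost stays at 1, so repetition alone would buy nothing. A custom Similarity changes the relation.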



--
karl

Tim

Karl Wettin wrote:
On 29 Aug 2007, at 19:13, Tim Sturge wrote:
I'm looking for a boost when the anchor text is more commonly associated with one topic than another. For example, the United States of America is called "USA" by a lot of people. The United Space Alliance is also called "USA", but by far fewer people.
If I just index them both with "USA" once, they will rank equally. I want the United States of America to rank higher.
Why not use Field#setBoost(float)?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


