Erik Hatcher writes:
> > There are some information retrieval settings which tend to say that
> > things that appear early in the document should be considered with
> > greater score... is there nothing such in Lucene's scoring ?
>
> No, Lucene doesn't have that feature, at least not explicitly.... it
> could be hacked, sort of, by injecting multiple of the same term in the
> same position (to get a higher term frequency) for the earlier terms.
> Back to the original question - the position information will not
> adversely affect scoring.

Wouldn't it be easier to fake that by using a proximity query and a
document start marker?

E.g. index `xxxstartxxx some text other text' and search for
  "xxxstartxxx some"~10000000
or
  "xxxstartxxx other"~10000000

If I understand proximity queries correctly, the latter should get a lower
score (given that 'some' and 'other' otherwise have equal scores).
Untested though.

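To make the expected ordering concrete, here is a minimal Python sketch of the start-marker idea. It is not Lucene code. It assumes a sloppy phrase match is weighted by 1/(distance + 1), which I believe is how Lucene's default similarity weights sloppy matches; a real score also includes other factors (idf, norms, and so on) that are ignored here. The function name start_weight and the whitespace tokenization are made up for illustration.

```python
def start_weight(text, term, marker="xxxstartxxx"):
    """Approximate weight of the sloppy phrase "<marker> <term>" in one document.

    Assumption: the weight is 1 / (distance + 1), where distance is how many
    positions the term must move to sit directly after the start marker.
    """
    # Index the document with the artificial start marker at position 0.
    tokens = [marker] + text.lower().split()
    if term not in tokens[1:]:
        return 0.0  # the term does not occur, so the phrase cannot match
    position = tokens.index(term, 1)  # first occurrence after the marker
    distance = position - 1           # 0 means the term is the first word
    return 1.0 / (distance + 1)


doc = "some text other text"
print(start_weight(doc, "some"))   # term right after the marker
print(start_weight(doc, "other"))  # term two positions further in
```

Under that assumption, 'some' gets a higher weight than 'other', which is the ordering the two proximity queries are meant to produce.
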
Alternatively, it should be possible to write a query that does such
scoring directly (without the document start anchor), by the same means
the proximity query uses. The proximity query uses positional information,
so it should be possible to use that information for scoring based on
document position as well.

Morus