You could stuff your custom weights into a payload and index that, but a payload is stored per term, per document, per position, while it sounds like you just want one float for each term, regardless of the documents or positions in which that term occurred?
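To make the payload idea concrete: a payload is just an opaque byte array attached to one term position, so a float weight has to be packed into bytes at index time and unpacked at search time. Below is a minimal plain-Java sketch of that packing step only. The class and method names (WeightPayload, encode, decode) are made up for illustration and are not part of Lucene; I believe Lucene ships a comparable helper (PayloadHelper in org.apache.lucene.analysis.payloads), but check the javadocs for your version. Wiring the bytes into a TokenFilter and reading them back in a query is not shown.

```java
// Hypothetical helper, not a Lucene class: packs one float weight into the
// 4 bytes a payload would carry, and unpacks it again.
final class WeightPayload {

    private WeightPayload() {
        // static helpers only
    }

    // Encode the weight as 4 bytes, most significant byte first.
    static byte[] encode(float weight) {
        int bits = Float.floatToIntBits(weight);
        return new byte[] {
            (byte) (bits >>> 24),
            (byte) (bits >>> 16),
            (byte) (bits >>> 8),
            (byte) bits
        };
    }

    // Decode a 4-byte payload produced by encode() back into the weight.
    static float decode(byte[] payload) {
        if (payload == null || payload.length != 4) {
            throw new IllegalArgumentException("expected a 4-byte payload");
        }
        int bits = ((payload[0] & 0xFF) << 24)
                 | ((payload[1] & 0xFF) << 16)
                 | ((payload[2] & 0xFF) << 8)
                 | (payload[3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }
}
```

With this, WeightPayload.encode(0.7f) would be what gets attached to the position of "Lucene" in Doc 1, and decode would recover the same float when scoring. Note the cost: 4 extra bytes for every position of every weighted term.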
Doing your own custom attribute would be a challenge: not only must you create and set this attribute during indexing, but you must also change the indexing process (custom chain, custom codec) to get the new attribute into the index, and then write a custom query that can pull this attribute at search time.

What are these term weights? Are you sure you can't compute these weights at search time with a custom similarity using the stats that are already stored (docFreq, totalTermFreq, maxDoc, etc.)?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Feb 13, 2014 at 2:40 AM, Rune Stilling <[email protected]> wrote:
> Hi list
>
> I'm trying to figure out how customizable scoring and weighting is in the
> Lucene API. I read about the APIs but still can't figure out if the
> following is possible.
>
> I would like to do normal document text indexing, but I would like to control
> the weight added to tokens myself. I would also like to control the
> weighting of query tokens and how things are added together.
>
> When indexing a word I would like to attach my own weights to the word, and use
> these weights when querying for documents. For example:
>
> Doc 1
> Lucene(0.7) is(0) a(0) powerful(0.9) indexing(0.62) and(0) search(0.99)
> API(0.3)
>
> Doc 2
> Lucene(0.5) is(0) used by(0) a(0) lot of(0) smart(0) people(0.1)
>
> The floats in parentheses are values I would like to add in the indexing
> process, not something derived by Lucene (tf-idf, for example).
>
> When querying I would like to repeat this, create the weights for each
> term "myself", and control how the final doc score is calculated.
>
> I have read that it's possible to attach your own custom attributes to
> tokens. Is this the way to go? I.e., should I add my custom weights as
> attributes to tokens, and then access these attributes when calculating
> the document score in the search process (described here
> https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/analysis/package-summary.html
> under "adding a custom attribute")?
>
> The reason why I'm asking is that I can't find any examples of this being
> done anywhere. But I found someone stating "With Lucene, it is impossible to
> increase or decrease the weight of individual terms in a document".
>
> With regards
> Rune
