On Sun, Nov 18, 2012 at 12:09 PM, wgggfiy <wuqiu....@qq.com> wrote:

> I'm now studying Lucene 4.0.
> 1, what are the startOffset and endOffset for? Is there a code example?
The analyzer sets these, using the OffsetAttribute, to the start and end character offsets of the token. The offsets are used for highlighting.

> 2, what is a payload? I know just a little about it, and that it can be used for
> things like font weight, or an enclosing XML tag.

It's an arbitrary per-token-position byte[] that you set during analysis (using the PayloadAttribute).

> 3, I have an item like (lucene, 350, 450, 33.2, 2), where 350,450 is the
> offset of the term 'lucene', 33.2 is a score, and 2 is some id. My
> question is how I can get it indexed?
> My first idea is to implement my own posting list format, but is it possible
> to do it with the startOffset, endOffset and payload?

You should probably encode them all into the payload, because Lucene requires that the offsets are "in order".

Mike McCandless
http://blog.mikemccandless.com
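
As a rough illustration of "encode them all into the payload", here is a minimal sketch that packs the four values from the question (350, 450, 33.2, 2) into a byte[] and reads them back. TermPayloadCodec, its Entry class and the fixed 16-byte layout are hypothetical names and choices made for this example; they are not part of Lucene. It uses only java.nio, so it runs without Lucene on the classpath. In a real analysis chain you would presumably hand the resulting bytes to the PayloadAttribute from a custom TokenFilter; the exact wrapper type for that depends on the Lucene version, so it is left out here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not a Lucene class): packs the per-term values from the
// question into one byte[] that could be used as a token payload.
final class TermPayloadCodec {

    // Assumed fixed layout: int start (4) + int end (4) + float score (4) + int id (4).
    static final int ENCODED_LENGTH = 16;

    /** Decoded view of one payload. */
    static final class Entry {
        final int startOffset;
        final int endOffset;
        final float score;
        final int id;

        Entry(int startOffset, int endOffset, float score, int id) {
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.score = score;
            this.id = id;
        }
    }

    /** Writes the four values in big-endian order (ByteBuffer's default). */
    static byte[] encode(int startOffset, int endOffset, float score, int id) {
        return ByteBuffer.allocate(ENCODED_LENGTH)
                .putInt(startOffset)
                .putInt(endOffset)
                .putFloat(score)
                .putInt(id)
                .array();
    }

    /**
     * Reads the values back from a slice of a larger array. The offset/length
     * form is used because payload bytes are often a view into a shared buffer.
     */
    static Entry decode(byte[] bytes, int offset, int length) {
        if (length != ENCODED_LENGTH) {
            throw new IllegalArgumentException("unexpected payload length: " + length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        int startOffset = buf.getInt();
        int endOffset = buf.getInt();
        float score = buf.getFloat();
        int id = buf.getInt();
        return new Entry(startOffset, endOffset, score, id);
    }

    public static void main(String[] args) {
        // The item from the question: (lucene, 350, 450, 33.2, 2).
        byte[] payload = encode(350, 450, 33.2f, 2);
        Entry e = decode(payload, 0, payload.length);
        System.out.println(e.startOffset + " " + e.endOffset + " " + e.score + " " + e.id);
    }
}
```

Keeping the user-supplied 350/450 inside the payload, rather than in the token's real offsets, sidesteps the ordering requirement mentioned above. A fixed-width layout is the simplest choice; a variable-length encoding would save space at the cost of a slightly more involved decoder.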