Hi Kendall, "Position" and "Offset" are often confused in Lucene ;)
Lucene uses offset to track what you referred to ("(character, not byte) offset into a text file", or into an indexed string). Lucene uses position to track the Nth token: position 0 is the first token, position 1 is the second token, etc. Since tokens are usually longer than one character, offsets grow faster than positions. Tokens need not form only a linear sequence: they can form a graph structure when multi-token synonyms are applied.

Lucene indexes both of these, and you can turn each one on or off individually if you want.

Finally, you might be interested in Lucene's highlighter module -- this contains tooling to do hit highlighting, to solve the "final inch" problem of showing your users precisely which words/excerpts matched inside each matched hit.

Here's an example <https://jirasearch.mikemccandless.com/search.py?chg=new&text=python&a1=&a2=&page=0&searcher=24390&sort=recentlyUpdated&format=list&id=jvmz29ec86du&dd=project%3ALucene&newText=python> (searching Lucene's issues for the word "python").

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 22, 2022 at 12:51 AM Mikhail Khludnev <m...@apache.org> wrote:

> Hello, Kendall.
>
> You can read about Token Position Increments at
> https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/analysis/package-summary.html#package.description
> Usually position is a number of word and offset is a number of symbol.
> Modeling entries via positions is boilerplate, I suppose. Nowadays we
> either denormalize by copying values across children into a single parent
> document. Also, here are more relational options
> https://lucene.apache.org/core/9_2_0/join/org/apache/lucene/search/join/package-summary.html
>
> On Fri, Jul 22, 2022 at 7:02 AM Kendall Shaw <ks...@kendallshaw.com> wrote:
>
> > Hi,
> >
> > I'm trying to figure out if I should be learning to use Lucene. I
> > imagine wanting to provide a user with a way to search for something and
> > present that found thing, in some way. If what is ultimately searched is
> > text files, then position would be an offset into the text file, I
> > think. But, that seems like a pretty unlikely scenario.
> >
> > If I have stored structured data into a database of some sort, does
> > Lucene provide some way to associate a position with an entry in a
> > database? Or is that left to the programmer to implement, outside of
> > Lucene?
> >
> > Kendall
>
> --
> Sincerely yours
> Mikhail Khludnev
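
For anyone who wants to see the position/offset distinction from the top of this message in concrete terms, here is a minimal plain-Java sketch. It does not use Lucene at all: the class PositionVsOffset, the Token holder, and the tokenize method are hypothetical names made up for illustration, and the whitespace splitting is an assumption standing in for a real analyzer. In Lucene itself this information comes from the analysis chain (for example via OffsetAttribute and PositionIncrementAttribute), which this sketch only imitates.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionVsOffset {

    /** Hypothetical token holder for this sketch; not a Lucene class. */
    static final class Token {
        final String text;
        final int position;     // Nth token in the text: 0, 1, 2, ...
        final int startOffset;  // char index where the token starts (inclusive)
        final int endOffset;    // char index where the token ends (exclusive)

        Token(String text, int position, int startOffset, int endOffset) {
            this.text = text;
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Splits on whitespace, recording each token's position and char offsets. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int i = 0;
        while (i < input.length()) {
            // Skip whitespace between tokens.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one run of non-whitespace characters.
            while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new Token(input.substring(start, i), position++, start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("search python issues")) {
            System.out.printf("%-8s position=%d offsets=[%d,%d)%n",
                    t.text, t.position, t.startOffset, t.endOffset);
        }
    }
}
```

For the input "search python issues", the token "python" has position 1 but offsets 7 to 13: positions count tokens, while offsets count characters, which is why offsets grow faster.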