Re: Unclear on what position means

2022-07-22 Thread Michael McCandless
Hi Kendall, "Position" and "Offset" are often confused in Lucene ;) Lucene uses offset to track what you referred to ("(character, not byte) offset into a text file", or into an indexed string). Lucene uses position to track the Nth token: position 0 is the first token, position 1 is the second toke
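
To make the distinction above concrete, here is a minimal sketch in plain Python. It does not use Lucene; the `tokenize` helper and the simple `\w+` word-splitting rule are assumptions made purely for illustration. It reports, for each token, its position (the Nth token) alongside its start and end character offsets in the original string.

```python
import re


def tokenize(text):
    """Yield (position, start_offset, end_offset, term) for each word in text.

    Hypothetical helper for illustration only: it splits on runs of word
    characters, which is far simpler than a real Lucene analyzer.
    """
    for position, match in enumerate(re.finditer(r"\w+", text)):
        # position: index of the token in the token stream (0, 1, 2, ...)
        # start/end: character (not byte) offsets into the original text
        yield position, match.start(), match.end(), match.group()


tokens = list(tokenize("the quick brown fox"))
for position, start, end, term in tokens:
    print(f"position={position} offsets=[{start},{end}) term={term}")
```

In this sketch "brown" is at position 2 but starts at character offset 10. A real analysis chain can also leave gaps in positions (for example when tokens are removed), which this simplified version does not model.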

Re: Unclear on what position means

2022-07-21 Thread Mikhail Khludnev
Hello, Kendall. You can read about Token Position Increments at <https://lucene.apache.org/core/9_2_0/core/org/apache/lucene/analysis/package-summary.html#package.description>. Usually, position is the index of a word and offset is the index of a character. Modeling entries via positions is boilerplate, I supp

Unclear on what position means

2022-07-21 Thread Kendall Shaw
Hi, I'm trying to figure out if I should be learning to use Lucene. I imagine wanting to provide a user with a way to search for something and present that found thing, in some way. If what is ultimately searched is text files, then position would be an offset into the text file, I think. But