On Thu, May 23, 2013 at 9:54 AM, Igor Shalyminov <ishalymi...@yandex-team.ru> wrote:
> But, just to clarify, is there a way to get, let's say, a vector of position
> increments directly from the index, without re-parsing document contents?

Term vectors (as Jack suggested) are one option, but they are very heavy: they slow down indexing, take lots of disk space, and are slow (seek-per-document) to load at search time.

You can enumerate all positions for each term × document pair in the postings, but you'd then need to collate by document to get the max position (last term) for that document. I guess an int[maxDoc] would do the trick; then walk that array, dividing each maxPosition by 1000.

Or index the sentence token :)

Mike McCandless

http://blog.mikemccandless.com
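
The collation step described above (enumerate positions per term and document, keep the largest one per document in an int[maxDoc], then divide each entry by 1000) could be sketched roughly as follows. This is only an illustration, not Lucene code: the `Posting` class and the in-memory map are hypothetical stand-ins for the real postings, and `maxPositions` and `divideEach` are names invented for this sketch. In a real index you would fill the same array while iterating the postings through Lucene's own enumeration APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MaxPositionSketch {

    /** Hypothetical stand-in for one postings entry: a doc ID and the term's positions in that doc. */
    static final class Posting {
        final int docId;
        final int[] positions;

        Posting(int docId, int... positions) {
            this.docId = docId;
            this.positions = positions;
        }
    }

    /** Collates positions by document, keeping the largest position seen for each doc ID. */
    static int[] maxPositions(Map<String, Posting[]> postingsByTerm, int maxDoc) {
        int[] maxPosition = new int[maxDoc]; // entries stay 0 for docs that never appear
        for (Posting[] postings : postingsByTerm.values()) {
            for (Posting posting : postings) {
                for (int position : posting.positions) {
                    maxPosition[posting.docId] = Math.max(maxPosition[posting.docId], position);
                }
            }
        }
        return maxPosition;
    }

    /** Walks the array, dividing each max position by the given divisor (1000 in the message). */
    static int[] divideEach(int[] maxPosition, int divisor) {
        int[] result = new int[maxPosition.length];
        for (int doc = 0; doc < maxPosition.length; doc++) {
            result[doc] = maxPosition[doc] / divisor;
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: term -> postings (doc ID followed by positions).
        Map<String, Posting[]> postingsByTerm = new LinkedHashMap<>();
        postingsByTerm.put("a", new Posting[] { new Posting(0, 0, 1002), new Posting(1, 3) });
        postingsByTerm.put("b", new Posting[] { new Posting(0, 1), new Posting(1, 2001), new Posting(2, 1000) });

        int[] maxPosition = maxPositions(postingsByTerm, 3);
        int[] quotients = divideEach(maxPosition, 1000);
        for (int doc = 0; doc < quotients.length; doc++) {
            System.out.println("doc " + doc + ": maxPosition=" + maxPosition[doc]
                    + ", quotient=" + quotients[doc]);
        }
    }
}
```

The array is indexed by doc ID, so memory use is one int per document regardless of how many terms are enumerated. The trade-off is that every posting with positions has to be visited once.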