+1

I guess we'd add a Fieldable.setOmitPositions, save that in FieldInfos, and fix the postings writing/reading to respect it? I.e., we can just change the index format.

Encoding as negative numbers isn't great because the termFreq is written as a vInt, which takes 5 bytes to encode any negative number.

Wanna cough up a patch? Probably this should wait until 3.1.
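To make the vInt point concrete, here is a rough standalone sketch (not the actual Lucene source; the class name VIntSize and the helper are made up for illustration) that uses the same 7-bits-per-byte scheme as a vInt and counts the bytes produced. It assumes the usual layout: low 7 bits first, high bit set on every byte except the last.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class, not part of Lucene.
final class VIntSize {

    // Encode an int the way a vInt is laid out: 7 data bits per byte,
    // low-order bits first, high bit set while more bytes follow.
    static byte[] writeVInt(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80); // continuation bit set
            i >>>= 7;                     // unsigned shift, so negatives terminate
        }
        out.write(i);                     // final byte, high bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] samples = {1, 3, 127, 128, -1, -3};
        for (int tf : samples) {
            System.out.println("termFreq " + tf + " -> "
                + writeVInt(tf).length + " byte(s)");
        }
    }
}
```

Since a negative int always has its top bit set, all 32 bits have to be written out, which is where the 5 bytes come from. Typical small term frequencies fit in a single byte, so flipping the sign would cost roughly 4 extra bytes per posting for those fields.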
Mike

On Sat, Nov 7, 2009 at 7:47 PM, Andrzej Bialecki <a...@getopt.org> wrote:
> Hi,
>
> During one of discussions at ApacheCon it occurred to me that it would be
> useful to have an option to discard positional information but still keep
> the term frequency. Even though position-dependent queries wouldn't work
> then, still any other queries would work fine and we would get the right
> scoring.
>
> I believe it should be possible to do this without changing the file format,
> if we used a negative term frequency for terms without postings - we would
> have to check for that condition in SegmentTermDocs, change the flags there
> and flip the sign of docFreq. And eventually we may want to add a separate
> flag for this and bump the format version.
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org