Is anyone doing anything interesting with the Token.setPositionIncrement during analysis?

Just for fun, I've written a simple stop filter that bumps the position increments to account for the stop words removed:

public final Token next() throws IOException {
int increment = 0;
for (Token token = input.next(); token != null; token = input.next()) {


if (table.get(token.termText()) == null) {
token.setPositionIncrement(token.getPositionIncrement() + increment);
return token;
}


      increment++;
    }

    return null;
  }


But its practically impossible to formulate a Query that can take advantage of this. A PhraseQuery, because Terms don't have positional info (only the transient tokens), only works using a slop factor which doesn't guarantee an exact match like I'm after. A PhrasePrefixQuery won't work any better as there is no way to add in a "blank" term to indicate a missing position.


I certainly see the benefit of putting tokens into zero-increment positions, but are increments of 2 or more at all useful? If so, how are folks using it?

Thanks,
        Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to