On Tuesday, October 21, 2003, at 03:36 AM, Pierrick Brihaye wrote:
The basic idea is to have several tokens at the same position (i.e. setPositionIncrement(0)) which are different possible stems for the same word.

Right. Like I said, I recognize the benefits of using a position increment of 0.


I certainly see the benefit of putting tokens into zero-increment positions, but are increments of 2 or more at all useful?

Who knows ? I may be interesting to keep track of the *presence* of "empty words", e.g. "[the] sky [is] blue", "[the] sky [is] [really] blue", "[the] sky [is] [that] [really] blue". The traditionnal reduction to "sky blue" is maybe over-simplistic for some cases...

But, how would you actually *use* an index that was indexed with the holes noted by > 1 position increments?


Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to