Benson Margulies created LUCENE-5386:
----------------------------------------
Summary: Make Tokenizers deliver their final offsets
Key: LUCENE-5386
URL: https://issues.apache.org/jira/browse/LUCENE-5386
Project: Lucene - Core
Issue Type: Improvement
Reporter: Benson Margulies
Tokenizers _must_ implement end() so that it sets the final offset.
Currently, nothing enforces this. Because end() already has a useful
implementation in TokenStream, simply making it abstract is not attractive.
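Proposal: add

    abstract int finalOffset();

to Tokenizer, and then implement end() there as

    @Override
    public void end() throws IOException {
      super.end();
      // The final offset has start == end: the position just past the input.
      int fo = finalOffset();
      offsetAttr.setOffset(fo, fo);
    }

or something to that effect.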
Other alternatives can be considered depending on how this looks.
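
For illustration only, here is a rough, self-contained sketch of how the
proposal might look. It does not use the real Lucene classes: OffsetSketch,
TokenStreamSketch, TokenizerSketch and WholeInputTokenizerSketch are
hypothetical stand-ins invented for this example, and the real Tokenizer
would also need to deal with IOException and offset correction.

```java
// Hypothetical stand-in for an offset attribute; only what the sketch needs.
final class OffsetSketch {
    private int start;
    private int end;

    void setOffset(int start, int end) {
        this.start = start;
        this.end = end;
    }

    int startOffset() { return start; }
    int endOffset() { return end; }
}

// Hypothetical stand-in for TokenStream: end() keeps a concrete default,
// which is why the proposal does not make end() itself abstract.
abstract class TokenStreamSketch {
    protected final OffsetSketch offsetAttr = new OffsetSketch();
    protected boolean ended = false;

    public void end() {
        ended = true;
    }
}

// Hypothetical stand-in for Tokenizer: the abstract finalOffset() forces
// every concrete tokenizer to say where its input ended.
abstract class TokenizerSketch extends TokenStreamSketch {
    abstract int finalOffset();

    @Override
    public void end() {
        super.end();
        int fo = finalOffset();
        // Start and end are both set to the final offset.
        offsetAttr.setOffset(fo, fo);
    }
}

// Hypothetical concrete tokenizer whose final offset is the input length.
final class WholeInputTokenizerSketch extends TokenizerSketch {
    private final String input;

    WholeInputTokenizerSketch(String input) {
        this.input = input;
    }

    @Override
    int finalOffset() {
        return input.length();
    }
}
```

With this shape, a subclass that forgets to supply a final offset fails to
compile, while the default end() behavior of the parent class is still
reused.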