[ https://issues.apache.org/jira/browse/LUCENE-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222913#comment-17222913 ]
Robert Muir commented on LUCENE-9588:
-------------------------------------

Sorry, the JapaneseTokenizer example doesn't hold up: that's comparing apples with oranges. JapaneseTokenizer doesn't subclass this class, so of course its incrementToken throws IOException: it has to read from the Reader, and its logic mixes that I/O with segmentation.

This class, on the other hand, exists precisely to separate those two things (that's the entire point of it!). If you want to mix I/O and segmentation (like JapaneseTokenizer, which does both in a streaming fashion), then this class is simply inappropriate and you should just subclass {{Tokenizer}} directly.

I agree that incrementSentence() should not throw IOException; that's a bug. It is an oversight, and it gives the wrong impression. We can remove the {{throws IOException}} there; doing so doesn't break any subclasses.

> Exceptions handling in methods of SegmentingTokenizerBase
> ---------------------------------------------------------
>
>                 Key: LUCENE-9588
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9588
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>   Affects Versions: 8.6.3
>            Reporter: Nguyen Minh Gia Huy
>            Priority: Minor
>
> The current signatures of the *setNextSentence* and *incrementWord* methods in
> *SegmentingTokenizerBase* do not declare any checked exceptions, which makes
> the class troublesome to subclass.
> For example, if we override incrementWord with logic that invokes
> incrementToken on another tokenizer, incrementToken can raise an
> IOException, but incrementWord is not declared to throw it.
> I think having setNextSentence and incrementWord declare IOException would
> make SegmentingTokenizerBase easier to use.