[ https://issues.apache.org/jira/browse/LUCENE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800939#action_12800939 ]
Uwe Schindler commented on LUCENE-2183:
---------------------------------------

I have not looked into it in detail yet, but it looks correct. I am not sure about the overhead of passing each char through the proxy class. My idea would be to declare CharFunction as a private interface and let CharTokenizer implement it (invisible to the outside, so it can be removed in later versions). The ctor then passes "this" as the CharFunction if the version is >= 3.1, and a new proxy instance of the interface for the deprecated case. That way, at least the new code path has no extra method calls.

The VirtualMethod stuff looks ok, thanks for using it as suggested here! :-)

> Supplementary Character Handling in CharTokenizer
> -------------------------------------------------
>
>                 Key: LUCENE-2183
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2183
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Simon Willnauer
>             Fix For: 3.1
>
>         Attachments: LUCENE-2183.patch, LUCENE-2183.patch
>
>
> CharTokenizer is an abstract base class for all Tokenizers operating on a
> character level. Yet, those tokenizers still use char primitives instead of
> int codepoints. CharTokenizer should operate on codepoints and preserve bw
> compatibility.
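
To make the suggestion in the comment concrete, here is a rough, self-contained sketch of the idea. It is not the patch and does not use any Lucene classes: SketchTokenizer, the codePointMode flag (standing in for the ">= 3.1" version check), and the two example subclasses are all hypothetical names invented for illustration. The tokenizer implements a private CharFunction interface, passes "this" on the new code path, and only wraps the deprecated char-based methods in a proxy for the legacy case.

```java
import java.util.ArrayList;
import java.util.List;

public class CharFunctionSketch {

    /** Hypothetical internal callback; private so it could be removed later. */
    private interface CharFunction {
        boolean isTokenChar(int codePoint);
        int normalize(int codePoint);
    }

    /** Simplified stand-in for a char-level tokenizer; NOT the Lucene class. */
    public abstract static class SketchTokenizer implements CharFunction {
        private final boolean codePointMode;
        private final CharFunction fn;

        /** codePointMode is an assumed stand-in for a ">= 3.1" version check. */
        protected SketchTokenizer(boolean codePointMode) {
            this.codePointMode = codePointMode;
            if (codePointMode) {
                // New path: the tokenizer is its own CharFunction, no proxy hop.
                this.fn = this;
            } else {
                // Deprecated path: a proxy forwards to the old char-based methods.
                this.fn = new CharFunction() {
                    @Override public boolean isTokenChar(int c) {
                        return SketchTokenizer.this.isTokenChar((char) c);
                    }
                    @Override public int normalize(int c) {
                        return SketchTokenizer.this.normalize((char) c);
                    }
                };
            }
        }

        // New code-point API; subclasses using the new path override these.
        @Override public boolean isTokenChar(int codePoint) {
            throw new UnsupportedOperationException("override isTokenChar(int)");
        }
        @Override public int normalize(int codePoint) { return codePoint; }

        // Old char API, kept only for backward compatibility.
        @Deprecated protected boolean isTokenChar(char c) { return isTokenChar((int) c); }
        @Deprecated protected char normalize(char c) { return (char) normalize((int) c); }

        public List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            int i = 0;
            while (i < text.length()) {
                // Legacy mode walks UTF-16 code units, new mode walks code points.
                int c = codePointMode ? text.codePointAt(i) : text.charAt(i);
                i += codePointMode ? Character.charCount(c) : 1;
                if (fn.isTokenChar(c)) {
                    current.appendCodePoint(fn.normalize(c));
                } else if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            }
            if (current.length() > 0) {
                tokens.add(current.toString());
            }
            return tokens;
        }
    }

    /** Hypothetical subclass written against the new code-point API. */
    public static class LetterTokenizer extends SketchTokenizer {
        public LetterTokenizer() { super(true); }
        @Override public boolean isTokenChar(int codePoint) {
            return Character.isLetter(codePoint);
        }
    }

    /** Hypothetical subclass that only overrides the deprecated char API. */
    public static class LegacyLetterTokenizer extends SketchTokenizer {
        public LegacyLetterTokenizer() { super(false); }
        @Deprecated @Override protected boolean isTokenChar(char c) {
            return Character.isLetter(c);
        }
    }

    public static void main(String[] args) {
        String text = "ab\uD801\uDC00cd"; // contains U+10400, a supplementary letter
        System.out.println(new LetterTokenizer().tokenize(text).size());
        System.out.println(new LegacyLetterTokenizer().tokenize(text).size());
    }
}
```

In this sketch the code-point variant keeps the supplementary letter inside one token, while the legacy variant sees two lone surrogates, treats them as non-letters, and splits the input there. The nesting inside an outer class is only there so that the interface can be private and still be implemented; how the real patch arranges visibility may differ.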