[ 
https://issues.apache.org/jira/browse/LUCENE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800939#action_12800939
 ] 

Uwe Schindler commented on LUCENE-2183:
---------------------------------------

I have not looked at it in detail yet, but it looks correct. I am not sure about 
the overhead of passing each char through the proxy class. My idea would be to 
declare CharFunction as a private interface and let CharTokenizer implement it 
(invisible to the outside, so it can be removed in later versions). The ctor 
then passes "this" as the CharFunction if the version is >= 3.1, and a new proxy 
instance of the interface for the deprecated case. That way, at least the new 
code path does not incur extra method calls.
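
A rough sketch of that idea, not the actual patch: only CharFunction and 
CharTokenizer are names from this issue. SketchTokenizer, the boolean ctor 
flag (standing in for the ">= 3.1" version check), firstToken() and the two 
letter tokenizers are hypothetical. The interface is package-private here 
because a class cannot implement its own private nested interface in Java.

```java
// Hypothetical sketch. Hidden from users, so it can be dropped in a later version.
interface CharFunction {
    boolean isTokenChar(int codePoint);
    int normalize(int codePoint);
}

abstract class SketchTokenizer implements CharFunction {
    private final CharFunction fn;

    // useCodePointApi stands in for the ">= 3.1" version check (assumption).
    protected SketchTokenizer(boolean useCodePointApi) {
        if (useCodePointApi) {
            // New code path: the tokenizer is its own CharFunction, no proxy hop.
            fn = this;
        } else {
            // Deprecated code path: a proxy forwards to the old char-based methods.
            // The cast truncates supplementary code points; a real legacy path
            // would iterate char by char instead.
            fn = new CharFunction() {
                public boolean isTokenChar(int cp) {
                    return SketchTokenizer.this.isTokenChar((char) cp);
                }
                public int normalize(int cp) {
                    return SketchTokenizer.this.normalize((char) cp);
                }
            };
        }
    }

    // New int-based API, overridden by up-to-date subclasses.
    public boolean isTokenChar(int codePoint) {
        throw new UnsupportedOperationException("override isTokenChar(int)");
    }

    public int normalize(int codePoint) {
        return codePoint;
    }

    // Old char-based API, overridden only by legacy subclasses.
    @Deprecated
    protected boolean isTokenChar(char c) {
        throw new UnsupportedOperationException("override isTokenChar(char)");
    }

    @Deprecated
    protected char normalize(char c) {
        return c;
    }

    // Returns the first run of token characters, walking the input by code point.
    final String firstToken(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (fn.isTokenChar(cp)) {
                sb.appendCodePoint(fn.normalize(cp));
            } else if (sb.length() > 0) {
                break; // the token has ended
            }
        }
        return sb.toString();
    }
}

// Subclass written against the new code point API.
class NewLetterTokenizer extends SketchTokenizer {
    NewLetterTokenizer() { super(true); }
    @Override public boolean isTokenChar(int cp) { return Character.isLetter(cp); }
}

// Legacy subclass that only knows the deprecated char API.
class OldLetterTokenizer extends SketchTokenizer {
    OldLetterTokenizer() { super(false); }
    @Override protected boolean isTokenChar(char c) { return Character.isLetter(c); }
}
```

In this sketch only legacy subclasses pay for the additional proxy call per 
character; subclasses on the new API are called directly through "this".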

The VirtualMethod stuff looks ok, thanks for using it as suggested here! :-)

> Supplementary Character Handling in CharTokenizer
> -------------------------------------------------
>
>                 Key: LUCENE-2183
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2183
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Simon Willnauer
>             Fix For: 3.1
>
>         Attachments: LUCENE-2183.patch, LUCENE-2183.patch
>
>
> CharTokenizer is an abstract base class for all Tokenizers operating on a 
> character level. Yet, those tokenizers still use char primitives instead of 
> int codepoints. CharTokenizer should operate on codepoints and preserve 
> backwards compatibility.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

