Replacement for TermAttribute+Impl with extended capabilities (byte[] support, 
CharSequence, Appendable)
--------------------------------------------------------------------------------------------------------

                 Key: LUCENE-2302
                 URL: https://issues.apache.org/jira/browse/LUCENE-2302
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Analysis
    Affects Versions: Flex Branch
            Reporter: Uwe Schindler
             Fix For: Flex Branch


For flexible indexing, terms can be plain byte[] arrays, while the current 
TermAttribute only supports char[]. This is fine for plain text, but e.g. 
NumericTokenStream should work directly on the byte[] array.
TermAttribute also lacks some interfaces that would make it simpler for 
users to work with: Appendable and CharSequence.

I propose to create a new interface "ExtendedTermAttribute extends 
TermAttribute". The corresponding -Impl class is always an implementation that 
implements ExtendedTermAttribute. So if somebody adds a TermAttribute to an 
AttributeSource, he will get an implementation class that can also be used as 
ExtendedTermAttribute. As both attributes create the same impl instance, both 
calls to addAttribute are equivalent. So a TokenFilter that adds 
ExtendedTermAttribute to the source will work with the same instance as the 
Tokenizer that requested the (deprecated) TermAttribute.
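
The shared-instance behaviour described above can be illustrated with a small 
sketch. This is not Lucene code: MiniAttributeSource, the term()/setTerm() 
methods and the single hard-wired impl class are simplified stand-ins assumed 
for illustration only. It just shows how requesting either interface can hand 
back the same implementation object.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedImplSketch {
    // Hypothetical stand-ins for the old and the proposed attribute interface.
    public interface TermAttribute {
        String term();
        void setTerm(String s);
    }

    public interface ExtendedTermAttribute extends TermAttribute, CharSequence { }

    // One impl class that satisfies both interfaces.
    public static final class TermAttributeImpl implements ExtendedTermAttribute {
        private String term = "";
        public String term() { return term; }
        public void setTerm(String s) { term = s; }
        public int length() { return term.length(); }
        public char charAt(int i) { return term.charAt(i); }
        public CharSequence subSequence(int start, int end) { return term.subSequence(start, end); }
        @Override public String toString() { return term; }
    }

    // Simplified attribute source (assumption: only term attributes exist).
    public static final class MiniAttributeSource {
        private final Map<Class<?>, Object> byInterface = new HashMap<>();
        private TermAttributeImpl termImpl; // created lazily, shared by both interfaces

        public <T> T addAttribute(Class<T> type) {
            if (!type.isAssignableFrom(TermAttributeImpl.class)) {
                throw new IllegalArgumentException("unsupported attribute: " + type);
            }
            if (termImpl == null) {
                termImpl = new TermAttributeImpl();
            }
            // Register the one shared instance under every interface requested.
            byInterface.putIfAbsent(type, termImpl);
            return type.cast(byInterface.get(type));
        }
    }

    public static void main(String[] args) {
        MiniAttributeSource source = new MiniAttributeSource();
        TermAttribute oldStyle = source.addAttribute(TermAttribute.class);
        ExtendedTermAttribute newStyle = source.addAttribute(ExtendedTermAttribute.class);
        oldStyle.setTerm("lucene");
        // Both references point to the same object, so the filter sees the tokenizer's term.
        System.out.println((oldStyle == newStyle) + " " + newStyle);
    }
}
```

Here the tokenizer side writes through the old interface and the filter side 
reads through the new one without any copying, which is the property the 
proposal relies on.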

To support both byte[] and char[], the internals will be implemented like Token 
in 2.9, which supports both String and char[]. Both buffers are available, but 
only one of them is in use at a time. As soon as you call getByteBuffer() while 
the char[] buffer is in use, the content is transformed. So the indexer will 
always call getBytes() and get the UTF-8 bytes. NumericTokenStream will modify 
the byte[] directly, and if no filter that uses char[] is plugged on top, the 
buffer is never transformed.
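
A minimal sketch of the lazy buffer switch described above follows. The class 
DualBufferTerm, its method names and the conversion counter are assumptions 
made for illustration and are not the proposed API. The sketch also assumes 
the byte[] content is valid UTF-8 whenever the char[] view is requested.

```java
import java.nio.charset.StandardCharsets;

public class DualBufferSketch {
    // Hypothetical term holder: exactly one of the two buffers is valid at a time.
    public static final class DualBufferTerm {
        private char[] chars = new char[0];
        private byte[] bytes;        // null while the char[] buffer is in use
        private int conversions;     // counts transformations, for demonstration only

        public void setChars(String s) {
            chars = s.toCharArray();
            bytes = null;
        }

        public void setBytes(byte[] b) {
            bytes = b.clone();
            chars = null;
        }

        // Returns the byte view, transforming from char[] only if needed.
        public byte[] getBytes() {
            if (bytes == null) {
                bytes = new String(chars).getBytes(StandardCharsets.UTF_8);
                chars = null;
                conversions++;
            }
            return bytes;
        }

        // Returns the char view, transforming from byte[] only if needed.
        public char[] getChars() {
            if (chars == null) {
                chars = new String(bytes, StandardCharsets.UTF_8).toCharArray();
                bytes = null;
                conversions++;
            }
            return chars;
        }

        public int conversions() { return conversions; }
    }

    public static void main(String[] args) {
        DualBufferTerm term = new DualBufferTerm();
        term.setBytes(new byte[] {1, 2, 3});   // e.g. a numeric stream writing bytes directly
        term.getBytes();                       // consumer reads bytes: no transformation
        System.out.println("conversions after byte-only use: " + term.conversions());
        term.getChars();                       // a char[]-based filter forces one transformation
        System.out.println("conversions after char access: " + term.conversions());
    }
}
```

As long as producer and consumer both stay on the byte[] side, no 
transformation happens; the cost is only paid when a char[]-based filter is 
plugged in between.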

This issue will also convert the rest of NRQ (NumericRangeQuery) to byte[] and 
deprecate all old methods in NumericUtils. NRQ will request ByteRef directly 
from splitRange, and so on.


