[jira] [Updated] (LUCENE-6653) Cleanup TermToBytesRefAttribute

Uwe Schindler (JIRA) Wed, 01 Jul 2015 16:16:29 -0700

     [ https://issues.apache.org/jira/browse/LUCENE-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Uwe Schindler updated LUCENE-6653:
----------------------------------
    Description: 
While working on LUCENE-6652, I found that many tests contain a wrongly 
implemented TermToBytesRefAttribute. In addition, the whole concept dating 
back to Lucene 4.0 is no longer correct:
- We don't return the hash code anymore; it is calculated by BytesRefHash
- The interface is horrible to use. It tends to reuse the BytesRef instance but 
the whole thing is not correct.

Instead we should remove the fillBytesRef() method from the interface and let 
getBytesRef() populate and return the BytesRef. It does not matter if the 
attribute reuses the BytesRef or returns a new one. It just gets consumed like a 
standard CharTermAttribute. You get a BytesRef and can use it until you call 
incrementToken().
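
To make the proposed contract concrete, here is a minimal, self-contained 
sketch. It does not use the Lucene classes: Bytes, TermBytesAttribute and 
ReusingTokenSource are simplified stand-ins invented for illustration, and 
only the method names getBytesRef() and incrementToken() are taken from the 
description above. The stand-in attribute happens to reuse one buffer, which 
is why the consumer copies the bytes out before advancing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class GetBytesRefSketch {

    /** Simplified stand-in for a byte slice (not Lucene's BytesRef). */
    static final class Bytes {
        byte[] bytes = new byte[16];
        int length;
    }

    /** Stand-in for the proposed single-method contract (not the Lucene interface). */
    interface TermBytesAttribute {
        /** Populates and returns the current term; valid until the next incrementToken(). */
        Bytes getBytesRef();
    }

    /** Hypothetical token source whose attribute reuses one Bytes instance. */
    static final class ReusingTokenSource implements TermBytesAttribute {
        private final Iterator<String> terms;
        private final Bytes shared = new Bytes();
        private String current;

        ReusingTokenSource(List<String> terms) {
            this.terms = terms.iterator();
        }

        /** Advances to the next term; returns false when the input is exhausted. */
        boolean incrementToken() {
            if (!terms.hasNext()) {
                return false;
            }
            current = terms.next();
            return true;
        }

        @Override
        public Bytes getBytesRef() {
            if (current == null) {
                throw new IllegalStateException("call incrementToken() first");
            }
            byte[] utf8 = current.getBytes(StandardCharsets.UTF_8);
            if (utf8.length > shared.bytes.length) {
                shared.bytes = new byte[utf8.length]; // grow the reused buffer
            }
            System.arraycopy(utf8, 0, shared.bytes, 0, utf8.length);
            shared.length = utf8.length;
            return shared; // same instance every time; a fresh one would be equally valid
        }
    }

    /** Consumer loop: one getBytesRef() call per token, no separate fill step. */
    static List<String> collectTerms(String... terms) {
        ReusingTokenSource source = new ReusingTokenSource(Arrays.asList(terms));
        List<String> out = new ArrayList<>();
        while (source.incrementToken()) {
            Bytes ref = source.getBytesRef();
            // Copy now: the returned instance may be overwritten by the next token.
            out.add(new String(ref.bytes, 0, ref.length, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("lucene", "bytes", "ref"));
    }
}
```

The consumer never needs to know whether the attribute reuses its buffer. It 
only has to avoid holding on to the returned instance across 
incrementToken() calls.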

As the TermToBytesRefAttribute is marked experimental, I see no reason why we 
should not change the semantics to be easier to understand and behave like 
all other attributes. I will add a note to the backwards incompatible changes 
in Lucene 5.3.


  was:
While working on LUCENE-6652, I found that many tests contain a wrongly 
implemented TermToBytesRefAttribute. In addition, the whole concept dating 
back to Lucene 4.0 is no longer correct:
- We don't return the hash code anymore; it is calculated by BytesRefHash
- The interface is horrible to use. It tends to reuse the BytesRef instance but 
the whole thing is not correct.

Instead we should remove the fillBytesRef() method from the interface and let 
getBytesRef() populate and return the BytesRef. It does not matter if the 
attribute reuses the BytesRef or returns a new one. It just gets consumed like a 
standard CharTermAttribute. You get a BytesRef and can use it until you call 
incrementToken().

As the TermToBytesRefAttribute is marked experimental, I see no reason to 
change the semantics to be easier to understand and behave like all other 
attributes. I will add a note to the backwards incompatible changes in Lucene 
5.3.



> Cleanup TermToBytesRefAttribute
> -------------------------------
>
>                 Key: LUCENE-6653
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6653
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.3, Trunk
>
>         Attachments: LUCENE-6653.patch
>
>
> While working on LUCENE-6652, I found that many tests contain a wrongly 
> implemented TermToBytesRefAttribute. In addition, the whole concept dating 
> back to Lucene 4.0 is no longer correct:
> - We don't return the hash code anymore; it is calculated by BytesRefHash
> - The interface is horrible to use. It tends to reuse the BytesRef instance 
> but the whole thing is not correct.
> Instead we should remove the fillBytesRef() method from the interface and let 
> getBytesRef() populate and return the BytesRef. It does not matter if the 
> attribute reuses the BytesRef or returns a new one. It just gets consumed like 
> a standard CharTermAttribute. You get a BytesRef and can use it until you 
> call incrementToken().
> As the TermToBytesRefAttribute is marked experimental, I see no reason why 
> we should not change the semantics to be easier to understand and behave 
> like all other attributes. I will add a note to the backwards incompatible 
> changes in Lucene 5.3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
