[ 
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987841#comment-13987841
 ] 

Uwe Schindler commented on LUCENE-5634:
---------------------------------------

Patch looks fine. I was afraid of complexity, but that looks quite good. I am 
not sure about backwards compatibility issues, but implementing your own 
IndexableField instance is still very expert. With Java 8 we could handle that 
with default interface methods (LOOOOOOL).
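
For illustration only, here is a minimal sketch of how a Java 8 default method 
lets an interface grow without breaking existing implementations. All names 
(SimpleField, LegacyField, ReusingField, render) are hypothetical stand-ins and 
not Lucene API; the actual IndexableField signature in the patch may differ.

```java
/** Hypothetical interface standing in for something like IndexableField. */
interface SimpleField {
    String name();
    String value();

    /**
     * Method added in a later version. The default ignores 'reuse' and
     * allocates a fresh result, so older implementations keep compiling.
     */
    default StringBuilder render(StringBuilder reuse) {
        return new StringBuilder(name()).append('=').append(value());
    }
}

/** Written against the old interface: knows nothing about render(). */
final class LegacyField implements SimpleField {
    public String name() { return "id"; }
    public String value() { return "42"; }
}

/** Updated implementation that opts in to reusing the passed-in buffer. */
final class ReusingField implements SimpleField {
    public String name() { return "id"; }
    public String value() { return "42"; }

    @Override
    public StringBuilder render(StringBuilder reuse) {
        StringBuilder sb = (reuse != null) ? reuse : new StringBuilder();
        sb.setLength(0); // clear whatever the previous user left behind
        return sb.append(name()).append('=').append(value());
    }
}
```

LegacyField compiles unchanged and simply never reuses anything, while 
ReusingField returns the same buffer instance it was handed.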

The current patch is fine for the 2 special cases, although it's a bit risky if 
we add new "settings" to NTS or change its API (we should have equals...). 
Maybe in LUCENE-5605 we can improve the check. If we pass FieldType directly to 
NTS and NRQ, we can handle the whole thing by comparing the field type and not 
rely on crazy internals like precStep.

It would be great if we could in the future remove the ThreadLocal from 
Analyzer, too - by using the same trick. Unfortunately, with the current 
contract on TokenStream it's hard to compare, unless we have a well-defined 
TokenStream#equals(). Ideally TokenStream#equals() should compare the 
"settings" of the stream and its inputs (for Filters), but that is too advanced 
for the simple 2 cases.

Another solution for this would be to have some "holder" around the TokenStream 
that's cached and provides hashCode/equals. That way a Field could better 
determine whether it's its own TokenStream (e.g. by putting a reference to its 
field type into the holder).
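
The holder idea could be sketched roughly as follows. This is an assumption-laden 
illustration, not the patch: DemoFieldType, DemoStream, StreamHolder and 
DemoField are hypothetical stand-ins for FieldType, TokenStream, the proposed 
holder and Field, and the per-thread cache is one possible way to keep the 
holder around.

```java
import java.util.Objects;

/** Hypothetical stand-in for a field type: the settings that decide reuse. */
final class DemoFieldType {
    final boolean numeric;
    final int precisionStep;

    DemoFieldType(boolean numeric, int precisionStep) {
        this.numeric = numeric;
        this.precisionStep = precisionStep;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DemoFieldType)) return false;
        DemoFieldType other = (DemoFieldType) o;
        return numeric == other.numeric && precisionStep == other.precisionStep;
    }

    @Override
    public int hashCode() {
        return Objects.hash(numeric, precisionStep);
    }
}

/** Hypothetical stand-in for a reusable token stream. */
final class DemoStream {
    String value;
    void setValue(String v) { value = v; }
}

/** Holder that tags a cached stream with the field type it was built for. */
final class StreamHolder {
    final DemoFieldType type;
    final DemoStream stream;

    StreamHolder(DemoFieldType type, DemoStream stream) {
        this.type = type;
        this.stream = stream;
    }

    // Two holders are interchangeable when they were built for equal settings.
    @Override
    public boolean equals(Object o) {
        return o instanceof StreamHolder && type.equals(((StreamHolder) o).type);
    }

    @Override
    public int hashCode() {
        return type.hashCode();
    }
}

/** Hypothetical field that reuses the cached stream only if the holder matches. */
final class DemoField {
    private static final ThreadLocal<StreamHolder> CACHE = new ThreadLocal<>();

    static DemoStream streamFor(DemoFieldType type, String value) {
        StreamHolder holder = CACHE.get();
        if (holder == null || !holder.type.equals(type)) {
            // Cached stream belongs to different settings: build a new one.
            holder = new StreamHolder(type, new DemoStream());
            CACHE.set(holder);
        }
        holder.stream.setValue(value);
        return holder.stream;
    }
}
```

With this shape the reuse decision depends only on the field type's equals(), 
not on stream internals such as precStep.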

> Reuse TokenStream instances in Field
> ------------------------------------
>
>                 Key: LUCENE-5634
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5634
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 4.9, 5.0
>
>         Attachments: LUCENE-5634.patch, LUCENE-5634.patch, LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...



