[
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987841#comment-13987841
]
Uwe Schindler commented on LUCENE-5634:
---------------------------------------
Patch looks fine. I was afraid of complexity, but that looks quite good. I am
not sure about backwards compatibility issues, but implementing your own
IndexableField instance is still very expert. With Java 8 we could handle that
with default interface methods (LOOOOOOL).
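
As a rough illustration of the default-method idea, here is a minimal sketch. All names in it (SketchField, reusableStream, LegacyField, ReusingField) are hypothetical stand-ins and not Lucene's API; it only shows how an interface could gain a method without breaking implementations written before that method existed.

```java
// Hypothetical stand-in for an interface such as IndexableField; not Lucene's API.
interface SketchField {
    String name();

    // Method added "later". Implementations that predate it still compile,
    // because the interface itself supplies a fallback body.
    default Object reusableStream(Object previous) {
        return null; // fallback: no reuse, the caller has to create a fresh stream
    }
}

// An "old" implementation written before reusableStream existed.
final class LegacyField implements SketchField {
    @Override
    public String name() {
        return "legacy";
    }
}

// A "new" implementation that opts in to reuse.
final class ReusingField implements SketchField {
    @Override
    public String name() {
        return "reusing";
    }

    @Override
    public Object reusableStream(Object previous) {
        // Hand back the previous instance when there is one.
        return previous != null ? previous : new Object();
    }
}

final class DefaultMethodSketch {
    public static void main(String[] args) {
        Object cached = new Object();
        System.out.println(new LegacyField().reusableStream(cached) == null);
        System.out.println(new ReusingField().reusableStream(cached) == cached);
    }
}
```
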
The current patch is fine for the 2 special cases, although it's a bit risky if
we add new "settings" to NTS or change its API (we should have equals...).
Maybe in LUCENE-5605 we can improve the check. If we pass FieldType directly to
NTS and NRQ, we can handle the whole thing by comparing the field type and not
rely on crazy internals like precStep.
It would be great if we could in the future remove the ThreadLocal from
Analyzer, too, by using the same trick. Unfortunately, with the current
contract on TokenStream it's hard to compare streams unless we have a
well-defined TokenStream#equals(). Ideally TokenStream#equals() should compare
the "settings" of the stream and its inputs (for Filters), but that is too
advanced for the simple 2 cases.
Another solution for this would be to have some "holder" around the TokenStream
that is cached and provides hashCode/equals. That way a Field could better
determine whether it is its own TokenStream (e.g. by putting a reference to its
field type into the holder).
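
To make the holder idea concrete, here is a minimal sketch under stated assumptions. SketchType, StreamHolder and reuseOrCreate are hypothetical names invented for illustration, and SketchType is a simplified stand-in rather than Lucene's FieldType. The holder remembers which type configured the cached stream, so the reuse check compares types instead of looking at stream internals such as precStep.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for a field type; not Lucene's FieldType.
final class SketchType {
    final boolean tokenized;
    final int precisionStep;

    SketchType(boolean tokenized, int precisionStep) {
        this.tokenized = tokenized;
        this.precisionStep = precisionStep;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchType)) return false;
        SketchType other = (SketchType) o;
        return tokenized == other.tokenized && precisionStep == other.precisionStep;
    }

    @Override
    public int hashCode() {
        return Objects.hash(tokenized, precisionStep);
    }
}

// Hypothetical holder: a cached stream plus the type that configured it.
final class StreamHolder<S> {
    final S stream;
    final SketchType ownerType;

    StreamHolder(S stream, SketchType ownerType) {
        this.stream = stream;
        this.ownerType = Objects.requireNonNull(ownerType);
    }

    // Two holders are equal when they were configured by equal types.
    @Override
    public boolean equals(Object o) {
        return o instanceof StreamHolder
            && ((StreamHolder<?>) o).ownerType.equals(ownerType);
    }

    @Override
    public int hashCode() {
        return ownerType.hashCode();
    }

    // Reuse the cached holder only if it was built for an equal type.
    static <S> StreamHolder<S> reuseOrCreate(
            StreamHolder<S> cached, SketchType type, Supplier<S> factory) {
        if (cached != null && cached.ownerType.equals(type)) {
            return cached;
        }
        return new StreamHolder<>(factory.get(), type);
    }
}
```

A field would call reuseOrCreate with whatever holder it was handed from the cache and its own type; a holder built for a different type is simply replaced.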
> Reuse TokenStream instances in Field
> ------------------------------------
>
> Key: LUCENE-5634
> URL: https://issues.apache.org/jira/browse/LUCENE-5634
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5634.patch, LUCENE-5634.patch, LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...
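
The reuse idea in the issue description above can be sketched roughly as follows. This is not the attached patch: ReusableStream and ThreadLocalReuseSketch are hypothetical names, and ReusableStream is only a stand-in for something like StringTokenStream. The sketch assumes one cached instance per thread that is reset with the new value on every call.

```java
// Hypothetical stand-in for a reusable stream such as StringTokenStream.
final class ReusableStream {
    private String value;

    void setValue(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }
}

final class ThreadLocalReuseSketch {
    // One cached instance per thread, created lazily on first access.
    private static final ThreadLocal<ReusableStream> CACHE =
        ThreadLocal.withInitial(ReusableStream::new);

    // Returns the calling thread's cached instance, reset to the new value,
    // instead of allocating a new object for every field.
    static ReusableStream streamFor(String value) {
        ReusableStream stream = CACHE.get();
        stream.setValue(value);
        return stream;
    }

    public static void main(String[] args) {
        ReusableStream first = streamFor("a");
        ReusableStream second = streamFor("b");
        System.out.println(first == second);
        System.out.println(second.value());
    }
}
```

Note the trade-off this implies: because the same instance is handed out again, a caller must finish consuming one value before asking for the next.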