[
https://issues.apache.org/jira/browse/LUCENE-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987965#comment-13987965
]
Uwe Schindler commented on LUCENE-5638:
---------------------------------------
bq. Changes like LUCENE-5634 make it clear that the default AttributeFactory
stuff has a very high cost: weakmaps/reflection/etc.
The problem is not the weak maps or the reflection. It is expensive because
all attribute instances have to be put into the 2 LinkedHashMaps when the
TokenStream is created. I repeat: it is not the reflection! We already had
this discussion 5 years ago with Michael Busch!
In addition, the AttributeFactory itself has little impact (this was already
tested while developing it in 2.9). This is why the weak maps are there: to
keep it fast. The *only* reflection that ever happens is Class#newInstance(),
which is cheap in recent Java versions; in micro benchmarks the speed
difference is small, nearly as fast as a native {{new}}.
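
To illustrate the point about the weak maps, here is a minimal, self-contained
sketch of a reflective factory that caches the interface-to-implementation
lookup and only pays for a reflective constructor call per instance. It is not
Lucene's actual code: DemoAttribute, DemoAttributeImpl, ReflectiveFactorySketch
and the lookups counter are hypothetical names, and the only thing borrowed
from Lucene is the "interface name + Impl" naming convention.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-ins for an attribute interface and its implementation.
interface DemoAttribute { }
class DemoAttributeImpl implements DemoAttribute { }

class ReflectiveFactorySketch {
    // interface -> implementation class; weak keys and weak values so that
    // the cache does not prevent classes from being unloaded.
    private static final Map<Class<?>, WeakReference<Class<?>>> CACHE = new WeakHashMap<>();

    // Counts name-based class lookups, for illustration only.
    static int lookups = 0;

    static synchronized Class<?> implClassFor(Class<?> iface) {
        WeakReference<Class<?>> ref = CACHE.get(iface);
        Class<?> impl = (ref == null) ? null : ref.get();
        if (impl == null) {
            lookups++;
            try {
                // Assumed convention: FooAttribute is implemented by FooAttributeImpl.
                impl = Class.forName(iface.getName() + "Impl", true, iface.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("No implementation for " + iface.getName(), e);
            }
            CACHE.put(iface, new WeakReference<>(impl));
        }
        return impl;
    }

    // The only reflection on the per-instance path is the constructor call.
    static <A> A create(Class<A> iface) {
        try {
            return iface.cast(implClassFor(iface).getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + iface.getName(), e);
        }
    }
}
```

Calling ReflectiveFactorySketch.create(DemoAttribute.class) repeatedly resolves
the implementation class by name only on the first call; later calls hit the
cache and go straight to the constructor.
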
So I disagree with removing the default AttributeFactory; we still need it for
non-default attributes. The simple workaround would be to use
TOKEN_ATTRIBUTE_FACTORY instead, which falls back to the default one for
unknown attributes.
I agree about clearAttributes(), but this should be solved with
TOKEN_ATTRIBUTE_FACTORY, too.
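
The fallback idea can be sketched as follows, as a rough model rather than
Lucene's implementation. All names here (TermAttr, OffsetAttr, CustomAttr,
CoreToken, CustomAttrImpl, FallbackFactorySketch, AttributeSourceSketch,
distinctInstances) are hypothetical. The assumption is that one combined
object backs all "core" attributes, so the source holds fewer instances to
register and to clear, while unknown attributes are delegated to another
factory.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical attribute interfaces; the names are illustrative only.
interface TermAttr { }
interface OffsetAttr { }
interface CustomAttr { }

// One object backing all "core" attributes.
class CoreToken implements TermAttr, OffsetAttr { }
// A non-core attribute that the fallback factory has to provide.
class CustomAttrImpl implements CustomAttr { }

class FallbackFactorySketch {
    private final Function<Class<?>, Object> fallback;

    FallbackFactorySketch(Function<Class<?>, Object> fallback) {
        this.fallback = fallback;
    }

    // Core interfaces get the combined object; anything else is delegated.
    Object createInstance(Class<?> iface) {
        return iface.isAssignableFrom(CoreToken.class) ? new CoreToken() : fallback.apply(iface);
    }
}

class AttributeSourceSketch {
    private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();
    private final FallbackFactorySketch factory;

    AttributeSourceSketch(FallbackFactorySketch factory) {
        this.factory = factory;
    }

    <A> A addAttribute(Class<A> iface) {
        Object impl = attributes.get(iface);
        if (impl == null) {
            impl = factory.createInstance(iface);
            // Register the new instance under every interface it implements,
            // so later requests for sibling attributes reuse it.
            for (Class<?> c : impl.getClass().getInterfaces()) {
                attributes.putIfAbsent(c, impl);
            }
        }
        return iface.cast(impl);
    }

    // Number of distinct objects a clearAttributes()-style pass would visit.
    int distinctInstances() {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.addAll(attributes.values());
        return seen.size();
    }
}
```

With this model, requesting TermAttr and OffsetAttr yields the same CoreToken
object, and only CustomAttr goes through the fallback, so a clearing pass has
two objects to visit instead of three.
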
> Default Attributes are expensive
> --------------------------------
>
> Key: LUCENE-5638
> URL: https://issues.apache.org/jira/browse/LUCENE-5638
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Reporter: Robert Muir
>
> Changes like LUCENE-5634 make it clear that the default AttributeFactory
> stuff has a very high cost: weakmaps/reflection/etc.
> Additionally I think clearAttributes() is more expensive than it should be:
> it has to traverse a linked-list, calling clear() per token.
> Operations like cloning (save/restoreState) have a high cost too.
> Maybe we can have a better Default? In other words, rename
> DEFAULT_ATTRIBUTE_FACTORY to REFLECTION_ATTRIBUTE_FACTORY, and instead have a
> faster default factory that just has one AttributeImpl with the "core ones"
> that 95% of users are dealing with (TOKEN_ATTRIBUTE_FACTORY?): anything
> outside of that falls back to reflection.