[ https://issues.apache.org/jira/browse/LUCENE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778184#action_12778184 ]
Simon Willnauer commented on LUCENE-2051:
-----------------------------------------

bq. should we expose the getDefaultStopSet() as public yet,

This is different. StopawareAnalyzer#getStopwords() is an instance method that returns the "current" stopword set of the instance, while the ones I introduced here are static and return the default set instead. We need to provide a replacement for the public static final String[] stuff that is being deprecated, and I think they have to be there. Thoughts?

bq. also, I'm not sure i like the copy() method in CharArraySet, i think it should return a real copy even if it is an EMPTY_SET, and if you give it a CharArraySet it should call .clone() ?

The deal with this copy method is that StopFilter converts the incoming set to a CharArraySet if it is not one already. I want all sets in analyzers to be unmodifiable and to be instances of CharArraySet. Further, they should be real copies, as otherwise they could be modified by the caller of the Analyzer ctor. That's why I introduced this helper, as such code was duplicated all over the place.

bq. nothing to do with your issue, but maybe while we are here cleaning up these ctors we should fix the fact that a lot of these never call super() ?

Java guarantees that the default super ctor is called implicitly. I would not add all this noise (calling it explicitly) just for the sake of typing super() 20 times. Thoughts?

> Contrib Analyzer Setters should be deprecated and replace with ctor arguments
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-2051
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2051
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.9.1
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: LUCENE-2051.patch
>
>
> Some analyzers in contrib provide setters for stopword / stem exclusion sets / hashtables etc.
> Those setters should be deprecated as they yield unexpected behaviour. The way they work is that they set the reusable token stream instance to null in a thread-local cache, which only affects the token stream in the current thread. Analyzers themselves should be immutable except for the thread-local cache.
>
> Will attach a patch soon.
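
To make the problem described in the issue concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: SetterVsCtorSketch, CachedStream, MutableAnalyzer and ImmutableAnalyzer are hypothetical names invented for illustration, and java.util sets stand in for CharArraySet. It assumes the setter behaves as the description says (it clears the cached stream of the calling thread only), and it shows the ctor-based alternative with an unmodifiable defensive copy.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only; none of these types exist in Lucene.
class SetterVsCtorSketch {

    /** Stand-in for a reusable token stream: it keeps the stopword set it was built with. */
    static final class CachedStream {
        final Set<String> stopwords;
        CachedStream(Set<String> stopwords) { this.stopwords = stopwords; }
    }

    /** Setter-based variant: the setter clears the cache of the calling thread only. */
    static final class MutableAnalyzer {
        private volatile Set<String> stopwords = Collections.emptySet();
        private final ThreadLocal<CachedStream> cache = new ThreadLocal<>();

        void setStopwords(Set<String> words) {
            stopwords = words;
            cache.remove(); // streams cached by other threads are left untouched
        }

        CachedStream reusableStream() {
            CachedStream stream = cache.get();
            if (stream == null) {
                stream = new CachedStream(stopwords);
                cache.set(stream);
            }
            return stream;
        }
    }

    /** Ctor-based variant: the set is copied once and can never change afterwards. */
    static final class ImmutableAnalyzer {
        private final Set<String> stopwords;

        ImmutableAnalyzer(Set<String> words) {
            // Real copy, so later changes by the caller do not leak into the analyzer.
            this.stopwords = Collections.unmodifiableSet(new HashSet<>(words));
        }

        Set<String> getStopwords() { return stopwords; }
    }

    /** Returns the stopword set a worker thread still sees after another thread called the setter. */
    static Set<String> staleSetSeenByWorker() {
        MutableAnalyzer analyzer = new MutableAnalyzer();
        Callable<Set<String>> readStopwords = () -> analyzer.reusableStream().stopwords;
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            worker.submit(readStopwords).get();                  // worker caches a stream built with the empty set
            analyzer.setStopwords(Collections.singleton("the")); // clears the caller's cache only
            return worker.submit(readStopwords).get();           // worker reuses its old stream
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker thread sees: " + staleSetSeenByWorker());
    }
}
```

In this sketch the worker thread keeps using the stream it cached before the setter was called, which is the kind of unexpected behaviour the issue describes. The ctor-based variant avoids it because there is nothing left to change after construction.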