java.lang.IllegalStateException: TokenStream contract violation: close() call missing

2017-02-07 Thread sumitj25
Hi, I am getting an IllegalStateException even though I am following the TokenStream guidelines: public List tokenize(String text){ List tokens = new ArrayList<>(); try (TokenStream ts = analyzer.tokenStream("",text)){ //note that this is implicit finally

RE: java.lang.IllegalStateException: TokenStream contract violation: close() call missing

2017-02-07 Thread Uwe Schindler
Hi, the issue is that your Analyzer's createComponents method returns the same instance on every call; this method has to _create_ a new instance on every call. Change your code to build the TokenStream *inside* createComponents, not outside. Uwe

RE: java.lang.IllegalStateException: TokenStream contract violation: close() call missing

2017-02-07 Thread sumitj25
Thanks Uwe, but I am using the same TokenStream so that I don't have to create the filter chain every time. This is my filter chain construction: Tokenizer tokenizer = tokenizerFactory.create(); TokenStream stream = new StandardFilter(tokenizer);

RE: java.lang.IllegalStateException: TokenStream contract violation: close() call missing

2017-02-07 Thread Uwe Schindler
Hi, you still have to create it. That is the contract and there is no way around it. Keep in mind that the Analyzer instance caches the components internally, so the factory is only called when needed. Don't try to optimize this further; it will fail! The TokenStreamComponents class is made for this
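
The contract described above can be illustrated with a toy model in plain Java. This is only a sketch and does not use Lucene's classes: `ReuseSketch`, `Components`, `CachingOwner`, and `demo` are hypothetical names, and the per-thread cache is an assumption made for the illustration. The point it tries to show is that the factory always builds a fresh object, while the owner decides when the factory actually runs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Toy model of "factory creates, owner caches". NOT Lucene's API; all names are hypothetical. */
public class ReuseSketch {

    /** Stand-in for a tokenizer plus filter chain. */
    static final class Components {
        final int id;
        Components(int id) { this.id = id; }
    }

    /** Keeps one Components per thread; the factory runs only on a cache miss. */
    static final class CachingOwner {
        private final Supplier<Components> factory;
        private final ThreadLocal<Components> cache = new ThreadLocal<>();

        CachingOwner(Supplier<Components> factory) { this.factory = factory; }

        Components get() {
            Components c = cache.get();
            if (c == null) {          // first use on this thread
                c = factory.get();    // the factory must build a NEW instance here
                cache.set(c);
            }
            return c;
        }
    }

    /** Returns {id on first call, id on second call, id seen by another thread, total created}. */
    static int[] demo() {
        AtomicInteger created = new AtomicInteger();
        CachingOwner owner = new CachingOwner(() -> new Components(created.incrementAndGet()));

        int first = owner.get().id;
        int second = owner.get().id;  // same thread: served from the cache

        int[] other = new int[1];
        Thread t = new Thread(() -> other[0] = owner.get().id);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { first, second, other[0], created.get() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " created=" + r[3]);
        // prints "1 1 2 created=2"
    }
}
```

In this model the factory runs twice for three requests, once per thread. If the factory handed back one shared pre-built instance instead, both threads would end up using the same object, which is the kind of sharing the reply warns against.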

SynonymFilterFactory deprecated since 6.4.0

2017-02-07 Thread Bernd Fehling
I just tried Solr 6.4.1 and noticed that SynonymFilterFactory is deprecated, as reported in the logs. I hope that this is just to note that there is also an alternative SynonymGraphFilterFactory now available. And _not_ that SynonymFilterFactory will disappear, because it runs my multi-word Synon

Re: SynonymFilterFactory deprecated since 6.4.0

2017-02-07 Thread Michael McCandless
That's great that multi-token synonyms are working for you; can you describe how you use them? This blog post describes some of the problems: http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html I'm working on another blog post to describe the recent changes ... should be out

Re: SynonymFilterFactory deprecated since 6.4.0

2017-02-07 Thread Bernd Fehling
Years ago (2007) I installed the Eurovoc Thesaurus to work with our search engine for multilingual search (terms and phrases in 22 languages). http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html The synonyms.txt file is 8.8MB in size and gets as FST over 300.000 mappin

Re: Maintain SortedSetDocValuesReaderState in Cache

2017-02-07 Thread Chitra R
Hi, How can I improve the performance of SortedSetDocValuesReaderState which is responsible for computing '$facets' global ordinals at search time? Is there any better way to warm up this state instead of putting it in cache? Any help is much appreciated.

Re: SynonymFilterFactory deprecated since 6.4.0

2017-02-07 Thread Michael McCandless
Thanks for sharing; it looks like a nice set of synonyms! It's good that you already apply them at search-time not index-time. In that case, you should not use the FlattenGraphFilter, because SynonymGraphFilter will produce a correct graph (unlike SynonymFilter) and the Lucene query parsers (not