Hi,
I am getting IllegalStateException even though I am following the
TokenStream guidelines:
public List tokenize(String text){
    List tokens = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream("",text)){ // note that this is implicit finally
Hi,
The issue is that your Analyzer's createComponents method returns the same
instance on every call. This method has to _create_ a new instance on every call.
Change your code to build the TokenStream *inside* createComponents, not outside.
Uwe
-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
h
Thanks Uwe, but I am using the same TokenStream so that I don't have to
create the filter chain every time. This is my filter chain construction:
Tokenizer tokenizer = tokenizerFactory.create();
TokenStream stream = new StandardFilter(tokenizer);
Hi,
You still have to create it. That is the contract, and there is no way around it.
Keep in mind, the Analyzer instance is internally caching the components, so
the factory is only called if needed. Don't try to further optimize this, it
will fail! The TokenStreamComponents class is made for this
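
To make the contract above concrete, here is a minimal sketch in plain Java. It is only a simplified model and not Lucene code: the names ComponentCachingSketch, Components, CachingAnalyzer, CountingAnalyzer and componentsFor are all hypothetical, and the per-thread cache is an assumption about how such caching could work. It shows why the factory method can safely build a new chain each time: the caller caches the result, so the factory runs only when nothing is cached yet.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of the contract; none of these are Lucene classes.
final class ComponentCachingSketch {

    /** Stand-in for a tokenizer plus its filter chain (stateful, so never shared). */
    static final class Components {
        final String fieldName;

        Components(String fieldName) {
            this.fieldName = fieldName;
        }
    }

    /** Mimics an analyzer that caches one Components instance per thread. */
    abstract static class CachingAnalyzer {
        // Assumption for this sketch: one cached chain per thread, field name ignored.
        private final ThreadLocal<Components> cache = new ThreadLocal<>();

        /** Contract: build and return a NEW chain on every call. */
        protected abstract Components createComponents(String fieldName);

        /** Calls the factory only when the current thread has nothing cached. */
        final Components componentsFor(String fieldName) {
            Components cached = cache.get();
            if (cached == null) {
                cached = createComponents(fieldName);
                cache.set(cached);
            }
            return cached;
        }
    }

    /** Follows the contract: the chain is constructed inside the factory method. */
    static final class CountingAnalyzer extends CachingAnalyzer {
        private final AtomicInteger created = new AtomicInteger();

        @Override
        protected Components createComponents(String fieldName) {
            created.incrementAndGet();
            return new Components(fieldName);
        }

        /** Number of times the factory has actually been invoked. */
        int createdCount() {
            return created.get();
        }
    }
}
```

In this model, calling componentsFor repeatedly on one thread invokes createComponents only once, so holding a pre-built chain in a field saves nothing and only risks sharing a stateful object.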
I just tried Solr 6.4.1 and noticed that SynonymFilterFactory is
deprecated, as reported in the logs.
I hope that this is just to note that there is also an alternative
SynonymGraphFilterFactory now available.
And _not_ that SynonymFilterFactory will disappear, because it runs my
multi-word Synon
It's great that multi-token synonyms are working for you; can you
describe how you use them?
This blog post describes some of the problems:
http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
I'm working on another blog post to describe the recent changes ...
should be out
Years ago (2007) I installed the Eurovoc Thesaurus to work with our
search engine for multilingual search (terms and phrases in 22 languages).
http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html
The synonyms.txt file is 8.8MB in size and gets as FST over 300.000 mappin
Hi,
How can I improve the performance of SortedSetDocValuesReaderState
which is responsible for computing '$facets' global ordinals at search time?
Is there any better way to warm up this state instead of putting it in
cache?
Any help is much appreciated.
Thanks & Regards,
Chitra
Thanks for sharing; it looks like a nice set of synonyms!
It's good that you already apply them at search time, not index time.
In that case, you should not use the FlattenGraphFilter, because
SynonymGraphFilter will produce a correct graph (unlike SynonymFilter)
and the Lucene query parsers (not