Re: Phrase query no hits when stopwords and FlattenGraphFilterFactory used

2020-11-11 Thread Edward Turner
Many thanks Walter, that's useful information. And yes, if we are able to keep stopwords, then we will. We have been exploring stopword removal because we've noticed it leads to a sizable drop in index size (5% in some of our tests), which then had the knock-on effect of better performance. (Also,

Re: Phrase query no hits when stopwords and FlattenGraphFilterFactory used

2020-11-10 Thread Walter Underwood
By far the simplest solution is to leave stopwords in the index. That also improves relevance, because it becomes possible to search for “vitamin a” or “to be or not to be”. Stopword removal was a performance and disk space hack from the 1960s. It is no longer needed. We were keeping stopwords

Re: Phrase query no hits when stopwords and FlattenGraphFilterFactory used

2020-11-10 Thread Edward Turner
Hi all, Okay, I've been doing more research about this problem and from what I understand, phrase queries + stopwords are known to have some difficulties working together in some circumstances. E.g., https://stackoverflow.com/questions/56802656/stopwords-and-phrase-queries-solr?rq=1
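
For anyone following along, here is a rough toy model of one way phrase queries and stopwords can clash. It is only a sketch in Python, not Lucene/Solr code, and it is not necessarily the cause of the behaviour reported in this thread. The stopword list, the sample document text, and the helper names (analyze, phrase_match) are all made up for illustration. The idea: if the index-time analysis leaves position gaps where stopwords were removed but the query-time analysis does not (or vice versa), the relative positions no longer line up and the phrase does not match.

```python
# Toy model only: hypothetical stopword list and helpers, not Solr's analysis chain.
STOPWORDS = {"and", "of", "the", "a", "to"}


def analyze(text, keep_gaps=True):
    """Lowercase, split on whitespace, drop stopwords.

    Returns a list of (term, position) pairs. With keep_gaps=True a removed
    stopword still consumes a position (a "hole"); with keep_gaps=False the
    remaining terms are numbered consecutively.
    """
    tokens = []
    pos = -1
    for tok in text.lower().split():
        if tok in STOPWORDS:
            if keep_gaps:
                pos += 1  # leave a hole where the stopword was
            continue
        pos += 1
        tokens.append((tok, pos))
    return tokens


def phrase_match(doc_tokens, query_tokens):
    """Exact phrase match: every query term must occur in the document at the
    same relative position as in the query."""
    if not query_tokens:
        return False
    positions = {}
    for term, pos in doc_tokens:
        positions.setdefault(term, set()).add(pos)
    first_term, first_pos = query_tokens[0]
    for start in positions.get(first_term, ()):
        offset = start - first_pos
        if all(pos + offset in positions.get(term, ()) for term, pos in query_tokens):
            return True
    return False


# Hypothetical document text, indexed with position gaps preserved.
doc_index = analyze("Molecular cloning and evolution of the genes", keep_gaps=True)
query = "Molecular cloning and evolution"

print(phrase_match(doc_index, analyze(query, keep_gaps=True)))   # True
print(phrase_match(doc_index, analyze(query, keep_gaps=False)))  # False
```

In this model the second query fails only because "evolution" sits at position 2 in the query but position 3 in the index. Keeping stopwords in both chains avoids the mismatch altogether, since no positions are ever skipped.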

Phrase query no hits when stopwords and FlattenGraphFilterFactory used

2020-11-06 Thread Edward Turner
Hi all, We are experiencing some unexpected behaviour for phrase queries which we believe might be related to the FlattenGraphFilterFactory and stopwords. Brief description: when performing a phrase query "Molecular cloning and evolution of the" => we get expected hits "Molecular cloning and