I need to skip some query analysis steps for certain queries or, more
precisely, make those steps switchable with a query parameter.
Use case:
<fieldType name="text_spec" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="0" splitOnNumerics="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="0" splitOnNumerics="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="false"
            outputUnigramsIfNoShingles="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            format="solr"
            tokenizerFactory="solr.KeywordTokenizerFactory"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
For some queries I want to skip SynonymFilterFactory, either on its own or
together with ShingleFilterFactory.
First I thought of a second field with a separate fieldType, but why stuff the
same content into the index twice?
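For comparison, that second-field approach would look roughly like this in
the schema. The field names "text" and "text_nosyn" and the type name
"text_spec_nosyn" are made up for illustration; text_spec_nosyn would be a
copy of text_spec without the synonym filter in its query analyzer:

<field name="text" type="text_spec" indexed="true" stored="true"/>
<!-- hypothetical second field: same content, indexed again only to get
     a different query analyzer -->
<field name="text_nosyn" type="text_spec_nosyn" indexed="true" stored="false"/>
<copyField source="text" dest="text_nosyn"/>

Both fields would use the same index analyzer, so the index would hold the
same terms twice.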
So I had the idea of making things switchable with a query parameter.
E.g. the SynonymFilterFactory class would get two optional attributes,
querycontrol=true/false (default=false)
queryparam=sff. (default=sff)
With query ...&sff=true&... it will use SynonymFilterFactory
with query ...&sff=false&... it will do nothing in SynonymFilterFactory.
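As a sketch, the synonym filter in the query analyzer would then be declared
like this. Note that querycontrol and queryparam are only the attributes
proposed here, they do not exist in Solr today:

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        format="solr"
        tokenizerFactory="solr.KeywordTokenizerFactory"
        ignoreCase="true" expand="true"
        querycontrol="true" queryparam="sff"/>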
That is easy to implement, but it only covers SynonymFilterFactory.
What if I want to switch off other filters with my query?
Should I patch all FilterFactories?
Next idea: how about modifying the analyzer instead?
<analyzer type="query">
  <charFilter...
  <tokenizer...
  <filter...
  <optional switch="foo">
    <filter...
    <filter...
  </optional>
</analyzer>
Now with query ...&foo=true&... it will use the filters enclosed by the
optional tag,
with query ...&foo=false&... they are skipped.
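Applied to the use case above, the query analyzer could look like this. The
<optional> element is only the proposal, not something Solr supports, and the
switch names "shingle" and "syn" are placeholders:

<analyzer type="query">
  <charFilter class="solr.MappingCharFilterFactory"
              mapping="mapping-FoldToASCII.txt"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
          enablePositionIncrements="true"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
          generateNumberParts="1" catenateWords="0" catenateNumbers="0"
          catenateAll="0" splitOnCaseChange="0" splitOnNumerics="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- proposed: skipped when the query carries shingle=false -->
  <optional switch="shingle">
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="false"
            outputUnigramsIfNoShingles="true"/>
  </optional>
  <!-- proposed: skipped when the query carries syn=false -->
  <optional switch="syn">
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            format="solr"
            tokenizerFactory="solr.KeywordTokenizerFactory"
            ignoreCase="true" expand="true"/>
  </optional>
</analyzer>

With ...&syn=false&... only the synonyms would be skipped, with
...&shingle=false&syn=false&... both shingles and synonyms.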
Advantages:
- more flexibility
- no need to index the content twice or more when only the query analysis
differs
Any opinions?
Regards,
Bernd