I need to skip some query analysis steps for some queries or,
more precisely, make them switchable with a query parameter.

Use case:
<fieldType name="text_spec" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="0" splitOnNumerics="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="0" splitOnNumerics="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="false"
            outputUnigramsIfNoShingles="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" format="solr"
            tokenizerFactory="solr.KeywordTokenizerFactory"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>

For some queries I want to skip SynonymFilterFactory, with or without
ShingleFilterFactory.
First I thought of a second field with a separate fieldType, but why store the
content in the index twice?
So I had the idea of making things switchable with a query parameter.
E.g. the SynonymFilterFactory class would get two optional attributes:
querycontrol=true/false (default=false)
queryparam=sff          (default=sff)

With the query ...&sff=true&... it will use SynonymFilterFactory;
with the query ...&sff=false&... SynonymFilterFactory will do nothing.
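
To make the first idea concrete, here is roughly how the synonym filter from the
query analyzer above might be declared. This is only a sketch: querycontrol and
queryparam are the attributes proposed in this mail and are not existing
SynonymFilterFactory options, so Solr would not understand them without a patch.

```xml
<!-- Sketch only: querycontrol and queryparam are the proposed (hypothetical)
     attributes; the remaining attributes are taken from the fieldType above. -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" format="solr"
        tokenizerFactory="solr.KeywordTokenizerFactory"
        ignoreCase="true" expand="true"
        querycontrol="true" queryparam="sff"/>
```

With querycontrol="false" (the proposed default) the filter would behave as it
does today, regardless of any sff parameter in the request.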

Easy to implement, but this only covers SynonymFilterFactory.
What if I want to switch off other filters with my query?
Should I patch all FilterFactories?

Next idea: how about modifying the analyzer instead?
<analyzer type="query">
  <charFilter...
  <tokenizer...
  <filter...
  <optional switch="foo">
    <filter...
    <filter...
  </optional>
</analyzer>

Now with the query ...&foo=true&... it will use the filters enclosed by the
optional tag;
with the query ...&foo=false&... they are skipped.
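
Applied to the text_spec query analyzer from the use case, this could look
roughly like the sketch below. The optional element and its switch attribute are
the hypothetical syntax proposed here and are not part of the Solr schema; the
switch names "shingle" and "syn" are made up for illustration. Two separate
switches are used so that synonyms can be skipped with or without shingles.

```xml
<analyzer type="query">
  <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
          enablePositionIncrements="true"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="0" catenateNumbers="0" catenateAll="0"
          splitOnCaseChange="0" splitOnNumerics="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Hypothetical: skipped when the request carries &shingle=false -->
  <optional switch="shingle">
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="false"
            outputUnigramsIfNoShingles="true"/>
  </optional>
  <!-- Hypothetical: skipped when the request carries &syn=false -->
  <optional switch="syn">
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" format="solr"
            tokenizerFactory="solr.KeywordTokenizerFactory"
            ignoreCase="true" expand="true"/>
  </optional>
</analyzer>
```

A query such as ...&shingle=true&syn=false&... would then run the shingle filter
but bypass the synonym filter, without touching the index-time analysis.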

Advantages:
- more flexibility
- no need to index the content twice or more when only the query analysis
  differs


Any opinions?

Regards,
Bernd


