Re: ReversedWildcardFilter - should it be applied only at the index time?

TK Solr Wed, 15 Apr 2020 11:40:59 -0700

Thank you.

Is there any harm if I use it on the query side too? In my case it seems to be working OK (even with withOriginal="false"), and it is even faster. I see that the query parser code looks at the index analyzer and applies ReversedWildcardFilter at query time. But I didn't quite understand what happens if the query analyzer also uses ReversedWildcardFilter.


On 4/15/20 1:51 AM, Colvin Cowie wrote:

You only need to apply it in the index analyzer:
https://lucene.apache.org/solr/8_4_0/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
If it appears in the index analyzer, the query part of it is automatically
applied at query time.

The ReversedWildcardFilter indexes *every* token in reverse, with a special
character at the start ('\u0001' I believe) to avoid false positive matches
when the query term isn't reversed (e.g. if the term being indexed is mar,
then the reversed token would be \u0001ram, so a search for 'ram' wouldn't
accidentally match that). If *withOriginal* is set to true then it will
index the normal token as well as the reversed token.
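The marker behaviour described above can be illustrated with a small Python sketch. This is only a toy model, not Solr code: the function names `index_tokens`, `rewrite_leading_wildcard` and `search` are hypothetical, the marker value is the '\u0001' mentioned above, and shell-style matching from the standard `fnmatch` module stands in for Lucene's wildcard matching.

```python
from fnmatch import fnmatchcase

MARKER = "\u0001"  # marker character as described in the thread (assumed)


def index_tokens(token, with_original=True):
    """Terms a simplified reversing filter would index for one token."""
    reversed_term = MARKER + token[::-1]
    return [token, reversed_term] if with_original else [reversed_term]


def rewrite_leading_wildcard(pattern):
    """Rewrite a leading-wildcard pattern (e.g. '*ar') so that it becomes a
    trailing-wildcard pattern over the marker-prefixed reversed terms."""
    return MARKER + pattern[::-1]


def search(terms, pattern):
    """Return the indexed terms matched by a wildcard pattern."""
    return [t for t in terms if fnmatchcase(t, pattern)]


terms = index_tokens("mar")                           # original plus reversed term
plain_hits = search(terms, "ram")                     # the marker blocks a false match
suffix_hits = search(terms, rewrite_leading_wildcard("*ar"))
```

In this model a plain search for 'ram' finds nothing, because the reversed term carries the marker prefix, while the suffix query '*ar' becomes a cheap prefix match against the reversed term.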


On Thu, 9 Apr 2020 at 02:27, TK Solr <tksol...@sonic.net> wrote:

I experimented with index-time-only use of ReversedWildcardFilter and with
using it at both index and query time.

My results show that using ReversedWildcardFilter at both times runs twice
as fast, but my dataset is not very large (on the order of 10k docs), so I'm
not sure I can draw a conclusion.

On 4/8/20 2:49 PM, TK Solr wrote:

In the usage example shown for ReversedWildcardFilter
<https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#reversed-wildcard-filter>
in the Solr Ref Guide, and in the only usage I can find in managed-schema
(the definition of text_general_rev), the filter is used only for indexing.

<fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="2"
            maxFractionAsterisk="0.33" maxPosAsterisk="3" withOriginal="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" expand="true"
            ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


Is it incorrect to use the same analyzer for queries as well, like this?

<fieldType name="lowercase_rev" class="solr.TextField" positionIncrementGap="100">
  <!-- Added to handle right-anchored substring match for email fields -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="0"
            maxFractionAsterisk="0" maxPosAsterisk="100" withOriginal="false"/>
  </analyzer>
</fieldType>

In the description of the filter, I see "Tokens without wildcards are not
reversed."

But the wildcard appears only in the query string. How can
ReversedWildcardFilter know whether a wildcard is being used
if the filter is applied only at index time?

TK
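As the replies earlier in this thread note, the reversal decision is made at query time, when the query parser consults the index analyzer. A rough Python model of such a decision is sketched below. It is an assumption-laden illustration, not the Solr implementation: `should_reverse` is a hypothetical name, the defaults are the values from the text_general_rev example above, and `min_trailing` is modelled on the factory's `minTrailing` attribute, which does not appear in this thread.

```python
def should_reverse(term, max_pos_question=2, max_pos_asterisk=3,
                   max_fraction_asterisk=0.33, min_trailing=2):
    """Simplified guess at whether a query term should be reversed."""
    pos_q = term.find("?")
    pos_a = term.find("*")
    if pos_q == -1 and pos_a == -1:
        return False  # no wildcard: query the original term as-is

    # Too few characters after the last wildcard: reversing would not help.
    last = max(term.rfind("?"), term.rfind("*"))
    if len(term) - last < min_trailing:
        return False

    if pos_q != -1 and pos_q < max_pos_question:
        return True   # '?' close enough to the start of the term
    if pos_a != -1 and pos_a < max_pos_asterisk:
        return True   # '*' close enough to the start of the term

    # Otherwise reverse only if the first wildcard sits in the leading
    # fraction of the term.
    first = min(p for p in (pos_q, pos_a) if p != -1)
    return max_fraction_asterisk > 0 and first < len(term) * max_fraction_asterisk
```

Under this model, a term without wildcards is never reversed, a leading-wildcard term such as `*ar` is, and a trailing-wildcard term such as `example*` is left alone because an ordinary prefix query already handles it well.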
