AW: fuzzy/case insensitive AnalyzingSuggester )

Clemens Wyss DEV Sat, 24 Jan 2015 05:52:19 -0800

I am back on this topic ;)

>Case- and diacritics insensitivity is supported out-of-the-box by the 
>analyzing suggesters, including the FuzzySuggester. 
>The logic is in the Analyzer.
So how do I force case-insensitivity?
I tried
...
                <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
                <str name="ignoreCase=">true</str>
...
or
...
                <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory</str>
                <str name="ignoreCase=">true</str>
...
to no avail


-----Original Message-----
From: Oliver Christ [mailto:[email protected]] 
Sent: Friday, June 20, 2014 15:52
To: [email protected]
Subject: RE: fuzzy/case insensitive AnalyzingSuggester )

Hi Clemens,

I haven't yet built a suggester which combines all three, and am not aware of 
one. I'd love to have one though ;-)

Case- and diacritics insensitivity is supported out-of-the-box by the analyzing 
suggesters, including the FuzzySuggester. The logic is in the Analyzer.
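
To illustrate what "the logic is in the Analyzer" means, here is a minimal sketch in plain Java, not the Lucene API. The class NormalizingPrefixSuggester and its normalize() method are hypothetical stand-ins: normalize() plays the role of the analyzer (lowercasing and stripping diacritics), and the point is only that the same normalization is applied when the entries are added and when the user input is looked up. In Lucene itself this would presumably be the job of the Analyzer handed to the suggester, for example a chain containing a lowercase filter and an ASCII folding filter.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy prefix suggester (hypothetical, not Lucene): one normalization for build and lookup. */
class NormalizingPrefixSuggester {
    // normalized key -> surface forms (original spelling) that share this key
    private final TreeMap<String, List<String>> index = new TreeMap<>();

    /** Stand-in for the analyzer: strip combining diacritical marks, then lowercase. */
    static String normalize(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    /** Build time: index the normalized key, keep the surface form for display. */
    void add(String surfaceForm) {
        index.computeIfAbsent(normalize(surfaceForm), k -> new ArrayList<>()).add(surfaceForm);
    }

    /** Lookup time: normalize the input the same way, then scan the matching key range. */
    List<String> lookup(String userInput) {
        String prefix = normalize(userInput);
        List<String> hits = new ArrayList<>();
        // Keys that start with the prefix are contiguous in the sorted map.
        for (Map.Entry<String, List<String>> e : index.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }
}
```

With entries "Zürich", "Zug" and "Bern", a lookup for "ZUR" would match "Zürich" because both sides are reduced to the same lowercase, accent-free key before comparison.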

I haven't yet tried out AnalyzingInfixSuggester, and haven't investigated 
whether it's possible to combine that with FuzzySuggester (which also is an 
analyzing suggester).

Due to memory constraints, we build infix suggesters by adding each relevant 
substring, but use WFST suggesters with payloads as the base, to reduce RAM 
load at runtime. We call the analyzer in the dictionary iterator. At search 
time, we look up the surface form (completion) in a secondary index using the 
payload as a key (and for deduping).
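
The approach described above could be sketched roughly as follows, again in plain Java rather than with the Lucene suggesters. Everything here is an assumption made for illustration: the class SuffixInfixSuggester is hypothetical, "relevant substring" is taken to mean every suffix that starts at a word boundary, a sorted map stands in for the automaton, an integer id stands in for the payload, and a plain HashMap stands in for the secondary index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy infix suggester (hypothetical): prefix lookup over word-boundary suffixes. */
class SuffixInfixSuggester {
    // Stand-in for the automaton: normalized suffix -> payloads (entry ids).
    private final TreeMap<String, Set<Integer>> suffixToIds = new TreeMap<>();
    // Stand-in for the secondary index: payload (entry id) -> surface form.
    private final Map<Integer, String> surfaceById = new HashMap<>();

    void add(int id, String surfaceForm) {
        surfaceById.put(id, surfaceForm);
        // "Analysis" happens while feeding the dictionary: lowercase and split on whitespace.
        String[] tokens = surfaceForm.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int start = 0; start < tokens.length; start++) {
            String suffix = String.join(" ", Arrays.copyOfRange(tokens, start, tokens.length));
            suffixToIds.computeIfAbsent(suffix, k -> new TreeSet<>()).add(id);
        }
    }

    List<String> lookup(String userInput) {
        String prefix = userInput.toLowerCase(Locale.ROOT).trim();
        // The id set dedupes entries that match through more than one suffix.
        Set<Integer> ids = new LinkedHashSet<>();
        for (Map.Entry<String, Set<Integer>> e : suffixToIds.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            ids.addAll(e.getValue());
        }
        // The completion shown to the user comes from the secondary index, not from the key.
        List<String> completions = new ArrayList<>();
        for (int id : ids) {
            completions.add(surfaceById.get(id));
        }
        return completions;
    }
}
```

The sketch ignores weights and ranking entirely; it only shows why the stored key never needs to be the surface form when the payload is resolved in a second step.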

If FuzzySuggester supports payloads (haven't checked), you could get an infix 
suggester using the same approach. That will lead to large automata, and as 
you'd have to look up the completion in a secondary index, you'd never use the 
surface form returned by the automaton itself, so it's a waste of space. WFSTs 
are more space-efficient but don't support payloads (if I remember correctly) 
and there's no fuzzy WFST suggester either :(

Generally, we found it beneficial to not combine all functionality in a single 
suggester, but use separate automata in a cascaded model. We first look up 
completions in the prefix non-fuzzy suggester. Based on several criteria, we 
may then consult the infix suggester, and if needed, the fuzzy suggester. The 
rationale is that we don't want high-ranking fuzzy or infix hits to fill up the 
completion list while there are good (but less popular) prefix hits. Having 
control over which suggester is used when, and how its specific suggestions are 
merged into the final result list, helps improve the user experience, at 
least with our use cases.
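
A cascade of this kind might look like the following plain-Java sketch. The class CascadedSuggester is hypothetical, each stage is modelled as a simple function from input to a ranked list, and the only criterion used for consulting the next stage is that the result list is not yet full. That criterion is an assumption for the example; the criteria mentioned above are not spelled out in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Toy cascade (hypothetical): stages are consulted in order until the list is full. */
class CascadedSuggester {
    // Ordered stages, e.g. prefix first, then infix, then fuzzy.
    private final List<Function<String, List<String>>> stages;

    CascadedSuggester(List<Function<String, List<String>>> stages) {
        this.stages = stages;
    }

    List<String> suggest(String input, int limit) {
        // Insertion order keeps earlier (more trusted) stages on top and drops duplicates.
        Set<String> merged = new LinkedHashSet<>();
        for (Function<String, List<String>> stage : stages) {
            if (merged.size() >= limit) {
                break; // enough hits already: later stages are not consulted
            }
            for (String completion : stage.apply(input)) {
                if (merged.size() >= limit) {
                    break;
                }
                merged.add(completion);
            }
        }
        return new ArrayList<>(merged);
    }
}
```

Because the stages are merged in order, a high-weight fuzzy or infix hit can never push a prefix hit out of the list; it can only fill slots the earlier stages left empty.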

Cheers, Oli

-----Original Message-----
From: Clemens Wyss DEV [mailto:[email protected]] 
Sent: Friday, June 20, 2014 6:47 AM
To: [email protected]
Subject: AW: fuzzy/case insensitive AnalyzingSuggester )

Sorry for re-asking. 
Has anyone implemented an AnalyzingSuggester which 
- is fuzzy
- is case insensitive (or must/should this be implemented by the analyzer?)
- does infix search
[- has a small memory footprint]

-----Original Message-----
From: Clemens Wyss DEV [mailto:[email protected]] 
Sent: Friday, June 13, 2014 14:53
To: [email protected]
Subject: fuzzy/case insensitive AnalyzingSuggester )

Looking for an AnalyzingSuggester which supports
- fuzziness
- case insensitivity
- small (in memory) footprint (*)

(*) Just tried to "hand" my big IndexReader (see other post "[lucene 4.6] NPE 
when calling IndexReader#openIfChanged") into JaspellLookup. Got an OOM.
Is there any (Jaspell)Lookup implementation that can handle really big indexes 
(by swapping out part of the "lookup-table")?


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

