I am back on this topic ;)
>Case- and diacritics insensitivity is supported out-of-the-box by the
>analyzing suggesters, including the FuzzySuggester.
>The logic is in the Analyzer.
So how do I force case-insensitivity?
I tried
...
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
<str name="ignoreCase=">true</str>
...
or
...
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory</str>
<str name="ignoreCase=">true</str>
...
to no avail
-----Original Message-----
From: Oliver Christ [mailto:[email protected]]
Sent: Friday, June 20, 2014 15:52
To: [email protected]
Subject: RE: fuzzy/case insensitive AnalyzingSuggester )
Hi Clemens,
I haven't yet built a suggester which combines all three, and am not aware of
one. I'd love to have one though ;-)
Case- and diacritics insensitivity is supported out-of-the-box by the analyzing
suggesters, including the FuzzySuggester. The logic is in the Analyzer.
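To make "the logic is in the Analyzer" concrete, here is a minimal sketch in plain Java. It does not use Lucene or Solr; the class and method names (KeyNormalizer, normalize) are made up for illustration. It only shows the idea: if the same normalization (lowercasing, stripping diacritics) is applied to the indexed keys and to the query, matching becomes case- and diacritics-insensitive without the suggester itself knowing about it.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: mimics what a lowercasing,
// accent-folding analysis chain does to a key before it is stored or looked up.
class KeyNormalizer {

    // Decompose accented characters, drop the combining marks, then lowercase.
    static String normalize(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Z\u00fcrich" is "Zürich" written with a Unicode escape.
        String indexedKey = normalize("Z\u00fcrich");
        String queryKey = normalize("ZURICH");
        System.out.println(indexedKey.equals(queryKey));
    }
}
```

In Solr the equivalent normalization would presumably be configured on the field type whose analyzer the suggester uses, rather than written by hand like this.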
I haven't yet tried out AnalyzingInfixSuggester, and haven't investigated
whether it's possible to combine that with FuzzySuggester (which also is an
analyzing suggester).
Due to memory constraints, we build infix suggesters by adding each relevant
substring, but use WFST suggesters with payloads as the base, to reduce RAM
load at runtime. We call the analyzer in the dictionary iterator. At search
time, we look up the surface form (completion) in a secondary index using the
payload as a key (and for deduping).
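The substring-plus-payload approach described above can be sketched roughly as follows. This is plain Java under several assumptions, not our actual code: a TreeMap stands in for the automaton, integer ids stand in for payloads, "relevant substrings" are taken to be suffixes that start at a word boundary, and lowercasing stands in for the analyzer. All names (InfixBySubstrings, add, lookup) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: infix suggestions on top of a prefix-only structure.
class InfixBySubstrings {
    // Stand-in for the automaton: analyzed substring -> payload ids.
    private final TreeMap<String, Set<Integer>> keys = new TreeMap<>();
    // Secondary index: payload id -> surface form (the completion shown to the user).
    private final Map<Integer, String> surfaceForms = new HashMap<>();

    void add(int id, String surface) {
        surfaceForms.put(id, surface);
        // "Analysis" happens while feeding the dictionary; here it is just lowercasing.
        String analyzed = surface.toLowerCase(Locale.ROOT);
        // Add every suffix that starts at a word boundary as its own key.
        for (int i = 0; i < analyzed.length(); i++) {
            if (i == 0 || analyzed.charAt(i - 1) == ' ') {
                keys.computeIfAbsent(analyzed.substring(i), k -> new LinkedHashSet<>()).add(id);
            }
        }
    }

    List<String> lookup(String query) {
        String prefix = query.toLowerCase(Locale.ROOT);
        // Collect payload ids in a set so each entry is returned once (deduping).
        Set<Integer> ids = new LinkedHashSet<>();
        for (Map.Entry<String, Set<Integer>> e : keys.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            ids.addAll(e.getValue());
        }
        // Resolve the surface form via the secondary index, using the payload as key.
        List<String> result = new ArrayList<>();
        for (int id : ids) {
            result.add(surfaceForms.get(id));
        }
        return result;
    }
}
```

Note that the key stored in the prefix structure is never shown to the user; only the surface form from the secondary index is.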
If FuzzySuggester supports payloads (haven't checked), you could get an infix
suggester using the same approach. That will lead to large automata, and as
you'd have to look up the completion in a secondary index, you'd never use the
surface form returned by the automaton itself, so it's a waste of space. WFSTs
are more space-efficient but don't support payloads (if I remember correctly)
and there's no fuzzy WFST suggester either :(
Generally, we found it beneficial to not combine all functionality in a single
suggester, but use separate automata in a cascaded model. We first look up
completions in the prefix non-fuzzy suggester. Based on several criteria, we
may then consult the infix suggester, and if needed, the fuzzy suggester. The
rationale is that we don't want high-ranking fuzzy or infix hits to fill up the
completion list while there are good (but less popular) prefix hits. Having
control over which suggester is used when, and how its specific suggestions are
merged into the final result list, helps improve the user experience, at
least for our use cases.
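A rough sketch of such a cascade, in plain Java and independent of Lucene. The only criterion used here for consulting the next suggester is "the list is not full yet", which is an assumption made to keep the example short; the real criteria are more involved. CascadedSuggester and the stage functions are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: consult suggesters in a fixed order (e.g. prefix, infix, fuzzy)
// and let earlier stages keep their positions in the final list.
class CascadedSuggester {
    private final List<Function<String, List<String>>> stages;
    private final int limit;

    CascadedSuggester(List<Function<String, List<String>>> stages, int limit) {
        this.stages = stages;
        this.limit = limit;
    }

    List<String> suggest(String query) {
        // LinkedHashSet keeps insertion order and drops duplicates across stages.
        Set<String> merged = new LinkedHashSet<>();
        for (Function<String, List<String>> stage : stages) {
            if (merged.size() >= limit) {
                break; // earlier stages already filled the list; skip the rest
            }
            for (String completion : stage.apply(query)) {
                if (merged.size() >= limit) {
                    break;
                }
                merged.add(completion);
            }
        }
        return new ArrayList<>(merged);
    }
}
```

Because later stages only fill the remaining slots, a popular fuzzy or infix hit cannot push a less popular prefix hit out of the list.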
Cheers, Oli
-----Original Message-----
From: Clemens Wyss DEV [mailto:[email protected]]
Sent: Friday, June 20, 2014 6:47 AM
To: [email protected]
Subject: AW: fuzzy/case insensitive AnalyzingSuggester )
Sorry for re-asking.
Has anyone implemented an AnalyzingSuggester which
- is fuzzy
- is case insensitive (or must/should this be implemented by the analyzer?)
- does infix search
[- has a small memory footprint]
-----Original Message-----
From: Clemens Wyss DEV [mailto:[email protected]]
Sent: Friday, June 13, 2014 14:53
To: [email protected]
Subject: fuzzy/case insensitive AnalyzingSuggester )
Looking for an AnalyzingSuggester which supports
- fuzziness
- case insensitivity
- small (in-memory) footprint (*)
(*) Just tried to "hand" my big IndexReader (see other post "[lucene 4.6] NPE
when calling IndexReader#openIfChanged") into JaspellLookup. Got an OOM.
Is there any (Jaspell)Lookup implementation that can handle really big indexes
(by swapping out part of the "lookup-table")?
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]