This is true of Solr v8.3.1 and can easily be reproduced by creating a
default_test collection from the _default configset.

I create multiple documents, all with unique ids, each containing fields
like the following:

{
  "id": "4",
  "title_t": "controle four",
  "content_t": "toegankelijkheid",
  "_text_": "controle four toegankelijkheid"
}

This also puts the content of title_t and content_t into the _text_ field,
which is the field used for spellchecking.
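
For anyone wanting to reproduce this, the test documents can be generated along these lines. This is only a sketch: the make_doc helper is a hypothetical name, and it assumes _text_ is simply the title and content joined by a space, as in the example document above.

```python
import json


def make_doc(doc_id, title, content):
    """Build one test document (hypothetical helper).

    Assumption: _text_ is the title and content joined by a single space,
    matching the example document in this message.
    """
    return {
        "id": doc_id,
        "title_t": title,
        "content_t": content,
        "_text_": f"{title} {content}",
    }


docs = [make_doc("4", "controle four", "toegankelijkheid")]
payload = json.dumps(docs, indent=2)
print(payload)
```

The resulting JSON array can then be sent to the collection's update handler with a JSON content type; that step is left out here so the sketch runs without a Solr instance.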

When I query the spellcheck handler for the term controld, it correctly
returns controle as a suggestion:

http://localhost:8983/solr/default_test/spell?q=_text_:controld&spellcheck=true&spellcheck.count=10&rows=100000000&spellcheck.build=true&wt=json&indent=true

"suggestions": [
    "controld",
    {
        "numFound": 1,
        "startOffset": 7,
        "endOffset": 15,
        "origFreq": 0,
        "suggestion": [
            {
                "word": "controle",
                "freq": 10
            }
        ]
    }
]

However, when I search for kontrole it does not find *any* suggestions,
even though it is likewise a single edit away from controle:

http://localhost:8983/solr/default_test/spell?q=_text_:kontrole&spellcheck=true&spellcheck.count=10&rows=100000000&spellcheck.build=true&wt=json&indent=true

"suggestions": []
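
The two requests differ only in the queried term, so they can be generated from one small helper. This is a sketch: spell_url is a hypothetical name, the parameter values are copied from the URLs in this message, and the extra argument is only there to make it easy to try additional request parameters such as spellcheck.accuracy.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs in this message.
BASE = "http://localhost:8983/solr/default_test/spell"


def spell_url(term, field="_text_", extra=None):
    """Build the spellcheck request URL for one term (hypothetical helper)."""
    params = {
        "q": f"{field}:{term}",
        "spellcheck": "true",
        "spellcheck.count": 10,
        "rows": 100000000,
        "spellcheck.build": "true",
        "wt": "json",
        "indent": "true",
    }
    # Optional overrides, e.g. {"spellcheck.accuracy": 0.5}
    params.update(extra or {})
    return f"{BASE}?{urlencode(params)}"


for term in ("controld", "kontrole"):
    print(spell_url(term))
```

Note that urlencode percent-encodes the colon in the q parameter, which is equivalent to the unencoded form used in the URLs above.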

I tried adding minPrefix=0 and also fiddling with accuracy=x, trying
different values for x, but could not find anything that would allow the
Dutch typo kontrole to have controle returned as a suggestion.
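
For context, Lucene's DirectSpellChecker documents minPrefix as the number of leading characters that must match exactly between the queried term and a suggestion, with a default of 1. The toy model below only illustrates what such a filter does to the two terms here. It is not Solr's code, shares_prefix is a hypothetical name, and it says nothing about whether the minPrefix=0 setting actually took effect in the attempts described above.

```python
def shares_prefix(query, candidate, min_prefix):
    """True if the first min_prefix characters of both words are identical.

    Toy model of a required-prefix filter, not Solr's implementation.
    """
    return query[:min_prefix] == candidate[:min_prefix]


for term in ("controld", "kontrole"):
    for min_prefix in (0, 1):
        ok = shares_prefix(term, "controle", min_prefix)
        print(f"{term} minPrefix={min_prefix} passes={ok}")
```

Under this model, controld passes with either value, while kontrole passes only when no prefix is required.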

I am not really familiar with the Levenshtein distance algorithm and how it
works under the hood, but I find it quite annoying that a simple initial
character typo, eich van wasily gappen, cannot be corrected.
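
To back up the claim that both typos are a single edit away from controle, here is a plain textbook Levenshtein implementation. It is a sketch for checking the distances only and is not the automaton-based approach Lucene uses internally.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance.

    Counts the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


print(levenshtein("controld", "controle"))  # 1
print(levenshtein("kontrole", "controle"))  # 1
```

Both terms come out at distance 1, so edit distance alone does not explain why only one of them gets a suggestion.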

Is there some configuration that would allow the DirectSolrSpellChecker to
suggest controle from the spell check field for the queried word kontrole?

-- 
Martin Graney

