I have been Googling for an hour with no success whatsoever about how to
configure Lucene (Elasticsearch, actually, but presumably the same deal) to
index edge n-grams for typeahead. I don't really understand how filters,
analyzers, and tokenizers fit together - the documentation isn't much help
on that count either - but I managed to cobble together the configuration
below, which I thought would work. It doesn't, though: when I index
documents into an index created with these settings, queries still only
match whole words instead of n-grams. What am I missing? *Should* this
work? And how do I debug this to see whether the analyzer is actually
being applied to the documents?
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "typeahead_analyzer": {
            "type": "custom",
            "tokenizer": "edgeNGram",
            "filter": ["typeahead_ngram"]
          }
        },
        "filter": {
          "typeahead_ngram": {
            "type": "edgeNGram",
            "min_gram": 1,
            "max_gram": 8,
            "side": "front"
          }
        }
      }
    }
  },
  "mappings": {
    "name": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "typeahead_analyzer"
        }
      }
    }
  }
}
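
In case it helps clarify what I mean by "debug": here's roughly how I
imagined checking the analyzer, a sketch assuming Elasticsearch's _analyze
endpoint accepts an analyzer parameter and that my index is called
"typeahead" (the index name is just a placeholder):

```shell
# Run a string through the named analyzer directly, so I can see the
# tokens it produces without indexing anything.
# "typeahead" is a placeholder index name; substitute the real one.
curl -XGET 'http://localhost:9200/typeahead/_analyze?analyzer=typeahead_analyzer&pretty' \
     -d 'quick'
```

If the edge n-grams were actually being generated, I'd expect tokens
like q, qu, qui, and so on for the input "quick"; if I just get "quick"
back, presumably the analyzer isn't attached to the field at all. Is
that a reasonable way to check, or is there something better?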
--
C. Benson Manica
[email protected]