Hi All,

Still fairly new to Elasticsearch, but very impressed so far.  Right now 
I'm working on a place finder service that will access a repository of 
place names.  I'm attempting to build in some autocomplete functionality, 
and while I've made significant progress, it's not perfect.  My current 
mapping on the given field for both index and search is based on the 
following analyzer:

"analyzer_shingle" : { "tokenizer" : "standard", "filter" : [ "standard", 
"lowercase", "filter_shingle"] }

where filter_shingle is defined as follows:

"filter_shingle" : { "type" : "shingle", "max_shingle_size" : 5, 
"min_shingle_size" : 2, "output_unigrams" : "true" }

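For context, here is a rough sketch of how those two fragments sit together, 
written as a Python dict and dumped to JSON.  The analyzer and filter 
definitions are the ones quoted above; the "settings" / "analysis" nesting 
around them is my assumption about the usual index settings layout.

```python
import json

# Sketch only: the analyzer and filter bodies come from the fragments above.
# The outer "settings" -> "analysis" nesting is an assumed layout.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "analyzer_shingle": {
                    "tokenizer": "standard",
                    "filter": ["standard", "lowercase", "filter_shingle"],
                }
            },
            "filter": {
                "filter_shingle": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 5,
                    "output_unigrams": True,
                }
            },
        }
    }
}

# Serialize to the JSON form that would be sent when creating the index.
settings_json = json.dumps(index_settings, indent=2)
print(settings_json)
```
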
I query the field that uses this analyzer with a matchPhrasePrefixQuery, 
with a fuzziness of 0.8 and a maxExpansions of 30.
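
In JSON terms, I believe the query is roughly equivalent to the body built 
below.  This is only a sketch: the field name "name" is a placeholder, not 
my real mapping, and I've left the fuzziness out because I'm not sure how it 
is expressed in the JSON form of this query.

```python
import json

# Hypothetical field name, used only for illustration.
FIELD = "name"


def phrase_prefix_query(text, field=FIELD, max_expansions=30):
    """Build a match_phrase_prefix search body for the given user input."""
    return {
        "query": {
            "match_phrase_prefix": {
                field: {
                    "query": text,
                    # Caps how many terms the trailing prefix may expand to.
                    "max_expansions": max_expansions,
                }
            }
        }
    }


body = phrase_prefix_query("Goat Isla")
print(json.dumps(body, indent=2))
```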

I also have a field using the keyword analyzer, which I query with a 
matchPhrasePrefixQuery as well and boost, so that place names starting 
with the entered value rank significantly higher.

For the most part, this works great!  I mean it really nails the search 
every time and it's blazing fast.  

So here's my issue: while this setup is working well, it fails if the 
query contains any additional words after the phrase that aren't found in 
the actual data.  For instance, if I search for "Goat", I get results like 
the following:

Goat
Goat Corral Flat
Goat Island
Goat Island Preserve Trail
Big Goat Road

Then if I search for "Goat Isla", I find a whole bunch of Goat Islands.

However, if I continue typing, say, "Goat Island United States", the search 
doesn't return any results.  Now that bums me out for two reasons.  On one 
hand, this doesn't seem to make sense with the shingle filter, but maybe 
I'm wrong.  In my understanding, the shingle filter will make something 
like the following tokens:

Goat
Goat Island
Goat Island United States
Island
Island United
United States

and so on and so forth...
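
To make that concrete, here is a small Python sketch of what I understand 
the shingle filter to do with my settings (min 2, max 5, unigrams on, after 
lowercasing).  It is a simplified imitation that splits on whitespace, not 
the actual Lucene filter, so positions and separators are ignored.

```python
def shingles(text, min_size=2, max_size=5, output_unigrams=True):
    """Imitate a shingle filter: emit word n-grams from min_size to max_size."""
    # Stand-in for the standard tokenizer plus lowercase filter.
    words = text.lower().split()
    tokens = []
    for start in range(len(words)):
        if output_unigrams:
            tokens.append(words[start])
        for size in range(min_size, max_size + 1):
            # Stop once the shingle would run past the end of the input.
            if start + size > len(words):
                break
            tokens.append(" ".join(words[start:start + size]))
    return tokens


tokens = shingles("Goat Island United States")
for token in tokens:
    print(token)
```

Under those assumptions the query text yields ten tokens, including 
"goat island" and "island" on their own, which is why I expected at least 
some matches.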

Since all these tokens are passed into the search, and the search runs 
against shingle-tokenized data, there should definitely be matches, 
correct?  "Goat Island" should still match some Goat Islands, and "Island" 
should match a whole bunch of other things.  Shouldn't I be finding data 
here?  Any thoughts on what I might be doing wrong?  On the other hand, I 
would like to use the "United States" part of the search in an additional 
query on another field.

Thanks in advance for any help or direction!
