Take a look at suggesters: they are meant for this kind of lookup, and they are more performant! http://www.elasticsearch.org/blog/you-complete-me/
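For the normalization described in the quoted message below (dropping articles, prepositions and connective words before comparing against a few pre-generated forms), here is a minimal client-side sketch in Python. It is not an Elasticsearch analyzer and not necessarily how you would want to do it in the index; the word lists and the `normalize_place` helper are hypothetical, built only from the examples in the quoted message, and whether place types should be dropped is left as a flag.

```python
import re
import unicodedata

# Hypothetical word lists, taken only from the examples in the quoted message.
STOPWORDS = {"el", "la", "los", "las", "de", "del", "en", "jurisdicción", "ubicado"}
PLACE_TYPES = {"corregimiento", "municipio", "departamento"}


def normalize_place(text, drop_place_types=False):
    """Return the lowercase word tokens of `text` minus the ignored words."""
    # NFC keeps accented letters as single code points so \w matches them.
    text = unicodedata.normalize("NFC", text).lower()
    tokens = re.findall(r"\w+", text)
    ignored = (STOPWORDS | PLACE_TYPES) if drop_place_types else STOPWORDS
    return [t for t in tokens if t not in ignored]


long_form = normalize_place(
    "El corregimiento de Mulaló, ubicado en Yumbo, Valle del Cauca",
    drop_place_types=True,
)
short_form = normalize_place("Mulaló, Yumbo, Valle del Cauca", drop_place_types=True)
print(long_form == short_form)
```

With `drop_place_types=False` the place types ("corregimiento", "municipio", ...) stay in the token list, which may matter if they carry the hierarchy level.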
--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>

On Thu, Aug 28, 2014 at 10:22 PM, Germán Carrillo <[email protected]> wrote:

> The use case I'm addressing right now is searching place hierarchies (that
> could include place types as well). In my country, you can specify place
> hierarchy in several ways. For instance:
>
> "El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle
> del Cauca)"
> "El corregimiento de Mulaló, en jurisdicción del municipio de Yumbo del
> Valle del Cauca"
> "El corregimiento de Mulaló, ubicado en Yumbo, Valle del Cauca"
> "El corregimiento de Mulaló, en Yumbo, Valle del Cauca"
> "El corregimiento de Mulaló, en el municipio de Yumbo (Valle del Cauca)"
> "El corregimiento de Mulaló - Yumbo, Valle del Cauca"
> "Mulaló, Yumbo, Valle del Cauca"
> "Mulaló, Municipio de Yumbo, en el Valle del Cauca"
> "Corregimiento de Mulaló, Municipio de Yumbo, Departamento del Valle del
> Cauca"
> "Corregimiento de Mulaló, Municipio de Yumbo, Departamento de Valle del
> Cauca"
> "Corregimiento de Mulaló, Municipio de Yumbo, en el Valle del Cauca"
> "Corregimiento de Mulaló, Municipio de Yumbo, en el Valle del Cauca"
> ...
>
> All of those are equivalent.
>
> I want to get rid of articles ("el", "la", "los", "las"), prepositions
> ("de", "del"), and other synonyms (e.g. "en" and "jurisdicción", "ubicado
> en") so that I can compare analyzed queries with some pre-generated (few)
> cases I can handle from my original JSON docs.
>
> Thanks for the link, the only caveat I see is (of course) to figure out
> the cutoff_frequency. Additionally, There are other very common words in
> my index I wouldn't like to overlook. For instance, a place type such as
> "municipio" (municipality) is the second level in the place hierarchy, so
> it could appear in any other place from the third level down the hierarchy.
> The sample data I mentioned above is a third level place.
>
> 2014-08-28 13:55 GMT-05:00 Itamar Syn-Hershko <[email protected]>:
>
>> http://www.elasticsearch.org/blog/stop-stopping-stop-words-a-look-at-common-terms-query/

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zt6M_Q%3DBbqPvzBNA6Zy6m%2Bx6SDgvstK5avHW_Kr2oYMzg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
