Out of curiosity, what kind of performance do you get when you run the search only on the '.raw' fields and not on the regular fields (the ones with edgeNGram)? Obviously the results will not be the same as before, since the whole word has to match once the edge n-grams are out of the picture. In the past I have seen some pretty odd results where, under specific circumstances and with a huge volume of data, prefix queries performed better than edgeNGram.
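
To make that test concrete, here is a minimal sketch of how the field list could be narrowed to the '.raw' sub-fields before sending the query. The helper name `raw_only` is hypothetical, and "en" simply stands in for the `${lang}` placeholder; only a few of the fields from the query quoted below are shown.

```python
def raw_only(fields):
    """Keep only the boosted field specs that target a '.raw' sub-field."""
    kept = []
    for spec in fields:
        # Split off an optional boost suffix such as "^18".
        name = spec.split("^", 1)[0]
        if name.endswith(".raw"):
            kept.append(spec)
    return kept


# A few field specs copied from the quoted query; "en" is an assumed language.
fields = [
    "name.default.raw^18", "name.default^2.5",
    "name.en.raw^18", "name.en^2.5",
    "housenumber.raw^6", "housenumber",
    "postcode^5",
]

print(raw_only(fields))
```

Note that "postcode" has no '.raw' variant in the query, so this filter drops it; you may want to keep it by hand for the comparison.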

This reminds me of a project I worked on, indexing data from GeoNames. One thing we did with alternate names and support for multiple languages was to remove the field for the default language. After all, the default language is just one of the languages that already exist ('en', 'fr', etc.). This makes your index smaller and lets you run the query on fewer fields (15 instead of 25).
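
The field count can be checked with a small sketch that rebuilds the field names of the quoted query with and without the 'default' variant. Boosts are left out for brevity, and `build_fields` and "en" are names I made up for the illustration.

```python
# Field families that have per-language variants in the quoted query.
LOCALIZED = ["name", "city", "street", "country", "context"]


def build_fields(languages):
    """Return the field names searched for the given language variants."""
    fields = []
    for base in LOCALIZED:
        for lang in languages:
            fields += [f"{base}.{lang}.raw", f"{base}.{lang}"]
    # Fields that are not localized in the quoted query.
    fields += [
        "name.alternatives.raw", "name.alternatives",
        "housenumber.raw", "housenumber",
        "postcode",
    ]
    return fields


with_default = build_fields(["default", "en"])
without_default = build_fields(["en"])
print(len(with_default), len(without_default))  # 25 15
```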

Also, I noticed that there is no edgeNGram on the postcode. Any reason for that? It might be useful to allow partial matches there as well.
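
To illustrate what a partial match on the postcode would rely on, here is a plain-Python sketch of the prefixes an edge n-gram filter would produce for one token. It is not Elasticsearch code: the function name, the `min_gram`/`max_gram` defaults, and the sample postcode "75011" are all assumptions for the example.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the leading substrings of token, from min_gram to max_gram chars."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# Each prefix would be indexed, so typing "750" would already match "75011".
print(edge_ngrams("75011"))  # ['7', '75', '750', '7501', '75011']
```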

Stéphane

On 06/25/2014 05:33 PM, Christoph Lingg wrote:

    Could you also share the query you are running? Do you run the
    cross_fields query against the default field or the 'raw' field?


It looks like this:

    {
      "function_score": {
        "functions": [
          {"script_score": {"script": "1. + 50. * doc['importance'].value"}}
        ],
        "boost_mode": "sum",
        "score_mode": "sum",
        "query": {
          "multi_match": {
            "analyzer": "search",
            "type": "cross_fields",
            "fields": [
              "name.default.raw^18", "name.default^2.5",
              "name.${lang}.raw^18", "name.${lang}^2.5",
              "name.alternatives.raw^14", "name.alternatives^1.5",
              "city.default.raw^8", "city.default^2",
              "city.${lang}.raw^8", "city.${lang}^2",
              "street.default.raw^8", "street.default^2",
              "street.${lang}.raw^8", "street.${lang}^2",
              "housenumber.raw^6", "housenumber",
              "postcode^5",
              "country.default.raw^3", "country.default",
              "country.${lang}.raw^3", "country.${lang}",
              "context.default.raw^3", "context.default",
              "context.${lang}.raw^3", "context.${lang}"
            ],
            "minimum_should_match": ${should_match},
            "query": "${query}"
          }
        }
      }
    }

--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group. To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/bjl2PJEhYsg/unsubscribe. To unsubscribe from this group and all its topics, send an email to [email protected] <mailto:[email protected]>. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/46157fe0-5397-4413-923d-8991ccbbeb02%40googlegroups.com <https://groups.google.com/d/msgid/elasticsearch/46157fe0-5397-4413-923d-8991ccbbeb02%40googlegroups.com?utm_medium=email&utm_source=footer>.
For more options, visit https://groups.google.com/d/optout.
