Aggregation: trouble bucketing over a text field value

marko Tue, 07 Apr 2015 01:21:06 -0700

Hi,

I'm trying to bucket docs stored in Elasticsearch 1.4 based on nested 
document field values.
To use an illustrative example, I've an index with documents representing 
companies and their agents like this:


{
  "companyName": "Acme ltd.",
  "country": "us",
  "agents": [ { "name": "McKenzie, Brackman, Chaney and Kuzak" } ]
}


I'd like to count the number of companies that each agent represents and 
order agents by this count.
In SQL I would do something like this:

SELECT COUNT(*) cnt FROM agents WHERE country = 'us' AND GROUP BY name 
ORDER BY cnt DESC;

The following ES aggregation query comes close to solving this:

{
  "query": {
    "term": { "country": "us" }
  },
  "aggs": {
    "agents": {
      "nested": {
        "path": "agents"
      },
      "aggs": {
        "agent_stats": {
          "terms": {
            "field": "agents.name",
            "size": 99999
          }
        }
      }
    }
  }
}

However, there's a problem that in the above case e.g. separate buckets get 
created for McKenzie and Brackman etc. instead of just one bucket. This is 
probably caused by agents.name field being currently mapped as analyzed.

One solution I've thought of is to copy agents.name field and index it also 
as not_analyzed.

Is there another way to get this aggregation query to work without index 
mapping changes?
I'd prefer to not use ES-side scripting for security reasons.

marko

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c113a1fd-1797-4ef9-b281-3f71c9d0245c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Aggregation: trouble bucketing over a text field value

Reply via email to