I finished reindexing the same dataset into an index with only one shard.

$ curl 'http://localhost:9200/52b1e8c1f8b9d73130000004/_search?pretty=true' \
  -d '{
   "size": 0,
   "facets": {
      "participants": {
         "terms": {
            "field": "actor.displayName",
            "size": 10
         }
      }
   },
   "aggs": {
      "participants": {
         "terms": {
            "field": "actor.displayName",
            "size": 10
         }
      }
   }
}'
{
  "took" : 1377,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1060387,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "facets" : {
    "participants" : {
      "_type" : "terms",
      "missing" : 0,
      "total" : 1129848,
      "other" : 1111270,
      "terms" : [ {
        "term" : "totaltrafficbos",
        "count" : 3599
      }, {
        "term" : "mai93thm",
        "count" : 2517
      }, {
        "term" : "mai95thm",
        "count" : 2207
      }, {
        "term" : "mai90thm",
        "count" : 2207
      }, {
        "term" : "totaltrafficnyc",
        "count" : 1660
      }, {
        "term" : "confessions",
        "count" : 1534
      }, {
        "term" : "incidentreports",
        "count" : 1468
      }, {
        "term" : "nji80thm",
        "count" : 1180
      }, {
        "term" : "pai76thm",
        "count" : 1142
      }, {
        "term" : "txi35thm",
        "count" : 1064
      } ]
    }
  },
  "aggregations" : {
    "participants" : {
      "buckets" : [ {
        "key" : "totaltrafficbos",
        "doc_count" : 3599
      }, {
        "key" : "mai93thm",
        "doc_count" : 2517
      }, {
        "key" : "mai90thm",
        "doc_count" : 2207
      }, {
        "key" : "mai95thm",
        "doc_count" : 2207
      }, {
        "key" : "totaltrafficnyc",
        "doc_count" : 1660
      }, {
        "key" : "confessions",
        "doc_count" : 1534
      }, {
        "key" : "incidentreports",
        "doc_count" : 1468
      }, {
        "key" : "nji80thm",
        "doc_count" : 1180
      }, {
        "key" : "pai76thm",
        "doc_count" : 1142
      }, {
        "key" : "txi35thm",
        "doc_count" : 1064
      } ]
    }
  }
}


Now the counts are the same as with faceting and, more importantly, 
consistent.

It seems the problem lies in aggregations across multiple shards. How should 
I proceed from here?
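
For anyone who wants to repeat the comparison without eyeballing the JSON, 
here is a minimal sketch that diffs the facet counts against the aggregation 
counts in a single search response. It assumes the response shape shown 
above; the function name `count_mismatches` and the trimmed `sample` 
response are hypothetical and only there for illustration.

```python
def count_mismatches(response, name="participants"):
    """Return {term: (facet_count, agg_count)} for terms whose counts differ.

    `response` is a parsed search response (a dict) containing both a terms
    facet and a terms aggregation registered under the same `name`.
    A count of None means the term is missing from that side's top list.
    """
    facet_terms = response["facets"][name]["terms"]
    agg_buckets = response["aggregations"][name]["buckets"]

    # Index both result lists by term so that ordering differences
    # (e.g. ties such as mai90thm / mai95thm) do not matter.
    facet_counts = {t["term"]: t["count"] for t in facet_terms}
    agg_counts = {b["key"]: b["doc_count"] for b in agg_buckets}

    mismatches = {}
    for term in sorted(set(facet_counts) | set(agg_counts)):
        facet_count = facet_counts.get(term)
        agg_count = agg_counts.get(term)
        if facet_count != agg_count:
            mismatches[term] = (facet_count, agg_count)
    return mismatches


# Hypothetical, trimmed-down response using two terms from the output above.
sample = {
    "facets": {"participants": {"terms": [
        {"term": "totaltrafficbos", "count": 3599},
        {"term": "mai93thm", "count": 2517},
    ]}},
    "aggregations": {"participants": {"buckets": [
        {"key": "totaltrafficbos", "doc_count": 3599},
        {"key": "mai93thm", "doc_count": 2517},
    ]}},
}

print(count_mismatches(sample))  # prints {}
```

Running this on the responses of several consecutive searches should show 
an empty result every time on the single-shard index, and presumably 
non-empty, changing results on the 10-shard index.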

-- Nils

On Friday, January 31, 2014 4:30:55 PM UTC+1, Nils Dijk wrote:
>
> Hi,
>
> I have been tinkering with elasticsearch 1.0.0RC1 for a bit, especially 
> the aggregations. Looking more closely at the aggregation responses, I 
> noticed that the numbers fluctuate all the time.
>
> I have an index:
>   shards: 10
>   replicas: 0
>   documents: ~1M
>
> Currently I'm not ingesting data anymore.
>
> To recreate the terms facet with aggregations, I came up with the 
> following:
>
> {
>    "size": 0,
>    "facets": {
>       "participants": {
>          "terms": {
>             "field": "actor.displayName",
>             "size": 10
>          }
>       }
>    },
>    "aggs": {
>       "participants": {
>          "terms": {
>             "field": "actor.displayName",
>             "size": 10
>          }
>       }
>    }
> }
>
>
> This should give me roughly the top 10 
> (*<https://github.com/elasticsearch/elasticsearch/issues/1305>) 
> most frequent terms in the 'actor.displayName' field. The terms facet 
> gives the same counts over and over again, as expected. However, the 
> aggregation returns different counts every time I invoke it. Results of 
> 3 consecutive runs: 
> https://gist.github.com/thanodnl/8733837.
>
> Currently I'm reindexing all the documents into an index with only one 
> shard to see if that makes a difference.
> This would only solve the problem in the short term, since our production 
> load is too big to fit in one shard.
>
> -- Nils
>
