1. The histogram aggregation (and facet) works on the indexed values, not on
the current time or "now". So if the timestamp of the last indexed document
is 3/15/14T16:15, you will not get empty buckets between 3/15/14T16:15 and the
current time. It would be interesting to be able to set a "from" and a "to"
on histogram-based aggregations, so that buckets are generated for every
interval in the defined range.
2. I believe this comes from the way the keys are pulled from the fielddata,
which is index-level data. So if you are using the "_all" index, you are going
to get keys from all indices. Not sure if this is a bug or not. You can try
applying a filter aggregation:
POST _all/summary_phys/_search
{
  "aggs": {
    "summary_phys_events": {
      "filter": {
        "type": {"value": "summary_phys"}
      },
      "aggs": {
        "events_by_date": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "300s",
            "min_doc_count": 0
          },
          "aggs": {
            "events_by_host": {
              "terms": {
                "field": "host.raw",
                "min_doc_count": 0
              },
              "aggs": {
                "avg_used": {
                  "avg": {
                    "field": "used"
                  }
                },
                "max_used": {
                  "max": {
                    "field": "used"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
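
On point 1: until the histogram accepts an explicit range, the missing leading
and trailing buckets can be filled in on the client. Below is a minimal Python
sketch of that idea, not anything Elasticsearch provides. The function name
pad_buckets and its parameters are made up for illustration. It assumes each
bucket is a dict with an epoch-millisecond "key" and a "doc_count", and that
keys fall on multiples of the interval (no time zone or offset adjustments).

```python
def pad_buckets(buckets, start_ms, end_ms, interval_ms):
    """Return buckets covering start_ms..end_ms, adding zero-count entries.

    Hypothetical helper: `buckets` is the "buckets" list of a date_histogram
    response, assumed to hold dicts with an epoch-millisecond "key".
    """
    # Index the buckets that came back by their key for quick lookup.
    by_key = {b["key"]: b for b in buckets}

    # Round the start down to an interval boundary (assumed key alignment).
    first_key = start_ms - (start_ms % interval_ms)

    padded = []
    for key in range(first_key, end_ms + 1, interval_ms):
        # Keep the real bucket if there is one, otherwise emit an empty one.
        padded.append(by_key.get(key, {"key": key, "doc_count": 0}))
    return padded


# Example with 300s (300000 ms) intervals and a single populated bucket.
sample = [{"key": 600000, "doc_count": 2}]
for bucket in pad_buckets(sample, 300000, 900000, 300000):
    print(bucket["key"], bucket["doc_count"])
```

The padded entries carry only "key" and "doc_count", so any sub-aggregations
(events_by_host here) would simply be absent from them.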
On Tue, Mar 18, 2014 at 12:39 PM, John Stanford <[email protected]> wrote:
> Hi,
>
> I'm trying to get a better understanding of aggregations, so here are a
> couple of questions that came up recently.
>
> Question 1:
>
> I have some time-based data that I am using aggregations to chart. The
> data may be sparsely populated, so I've been setting min_doc_count to 0 so
> that I get empty buckets back anyway. I've noticed that it fills in empty
> buckets, except those before the first or after the last record in the range.
>
> For example, if I use a query similar to the one below, and there are no
> records after 3/15/14T16:15, the last aggregation record will be for
> 3/15/14T16:15. On the other hand, if there is a gap in between the start
> time and 3/15/14T16:15, I will get a bucket with a 0 doc count (as
> expected).
>
> POST _all/summary_phys/_search
>
> {
>   "aggs": {
>     "events_by_date": {
>       "date_histogram": {
>         "field": "@timestamp",
>         "interval": "300s",
>         "min_doc_count": 0
>       },
>       "aggs": {
>         "events_by_host": {
>           "terms": {
>             "field": "host.raw"
>           },
>           "aggs": {
>             "avg_used": {
>               "avg": {
>                 "field": "used"
>               }
>             },
>             "max_used": {
>               "max": {
>                 "field": "used"
>               }
>             }
>           }
>         }
>       }
>     }
>   }
> }
>
> Not getting the 0 doc count buckets back at the front and back of the
> range seems contrary to the documented purpose of min_doc_count. Am I
> doing something wrong?
>
> Question 2:
>
>
> If I add a min_doc_count = 0 to the inner aggregation, but limit the
> search to a specific doc type like:
>
>             doc type
>                v
> POST _all/summary_phys/_search
> {
>   "aggs": {
>     "events_by_date": {
>       "date_histogram": {
>         "field": "@timestamp",
>         "interval": "300s",
>         "min_doc_count": 0
>       },
>       "aggs": {
>         "events_by_host": {
>           "terms": {
>             "field": "host.raw",
>             "min_doc_count": 0
>           },
>           "aggs": {
>             "avg_used": {
>               "avg": {
>                 "field": "used"
>               }
>             },
>             "max_used": {
>               "max": {
>                 "field": "used"
>               }
>             }
>           }
>         }
>       }
>     }
>   }
> }
>
> I get buckets with entries matching hosts that do not show up in this doc
> type. For example, I have only 3 values for host in this doc type
> [compute-4, compute-2, compute-3], but I will get buckets back with hosts
> from other doc types like:
>
> "events_by_host": {
>   "buckets": [
>     {
>       "key": "compute-4",
>       "doc_count": 11,
>       "max_used": {
>         "value": 4608
>       },
>       "avg_used": {
>         "value": 3677.090909090909
>       }
>     },
>     {
>       "key": "compute-2",
>       "doc_count": 8,
>       "max_used": {
>         "value": 4608
>       },
>       "avg_used": {
>         "value": 2304
>       }
>     },
>     {
>       "key": "compute-3",
>       "doc_count": 2,
>       "max_used": {
>         "value": 4608
>       },
>       "avg_used": {
>         "value": 4608
>       }
>     },
>     {
>       "key": "10.10.11.22:49509",
>       "doc_count": 0,
>       "max_used": {
>         "value": null
>       },
>       "avg_used": {
>         "value": null
>       }
>     },
>     {
>       "key": "controller",
>       "doc_count": 0,
>       "max_used": {
>         "value": null
>       },
>       "avg_used": {
>         "value": null
>       }
>     },
>     {
>       "key": "object-1",
>       "doc_count": 0,
>       "max_used": {
>         "value": null
>       },
>       "avg_used": {
>         "value": null
>       }
>     }
>   ]
> }
>
> Is there a way to ensure that the inner aggregation also only buckets
> things matching the search doc type?
>
> Thanks in advance...
>
> John
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/856133dc-c4ae-4cfc-adab-39453671d76d%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>