As the open issue you quote suggests, ES currently has no support for an equivalent to to SQL’s HAVING clause. Here's another reference which supports that: https://groups.google.com/forum/#!msg/elasticsearch/UsrCG2Abj-A/IDO9DX_PoQwJ
What I did as a workaround is get all the results in an intermediate layer and then loop through them to leaving out the ones not meeting my boolean criteria (COUNT(*) = x). But that is not really a solution to your problem of too many results. I had 200,000 results which worked fine but if I had 200M that would not work so well. And it won't work for any of the aggregation functions (sum, min, max, avg) other than count as far as I can tell. Have you considered the '"min_doc_count": 50' feature? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_minimum_document_count I used it to filter out all the groups that have less than x documents and then manually removed the groups with more than x. In your case it looks like you want to filter on something like HAVING SUM(impression) > 49 and I don't think there is even a workaround for that because the functions, and even script filters, are applied to documents, not to the aggregations. At least as far as I can tell. It would be great if someone showed me otherwise. Zennet On Wednesday, February 5, 2014 7:50:01 AM UTC-8, [email protected] wrote: > > Hi guys, > I’m using elasticsearch 1.0.0RC2 and wondering if there is an > equivalent to SQL’s “having-clause” for the aggregation framework there. > Below is an example query and a link to a ticket that describes the issue > well. The part of the query that's highlighted doesn't work, and is there > purely to give an idea of what I'm after. This query (omitting the > highlighted portion) gives impression counts for every > placement-referer-device-date combo. This is fine but the output is HUGE! I > was wondering if there was a way (like a having clause or filter) to reduce > the amount of results based off some logic (in this case, impressions > counts greater than 50). Thanks all! > > - Trev > > https://github.com/elasticsearch/elasticsearch/issues/4404 > > > curl -XPOST "XXXXXXXX/YYYYYYYY/_search?pretty=true" -d ' > > { > > "size":0, > > "query": { > > "filtered": { > > "query": { > > "range": { > > "date_time": { > > "from": "ZZZZZZZ", > > "to": "QQQQQQQQ", > > "include_lower": true, > > "include_upper": true > > } > > } > > } > > } > > }, > > "aggs": { > > "placement": { > > "terms": { > > "field": "placement" > > }, > > "aggs": { > > "device": { > > "terms": { > > "field": "device" > > }, > > "aggs": { > > "referer": { > > "terms": { > > "field": "referer" > > }, > > "aggs": { > > "totals": { > > "date_histogram": { > > "field": "date_time", > > "interval": "day" > > }, > > "aggs": { > > "impression": { > > "sum": { > > "field": "impression" > > } > > ,"having" : { "from" : 50 } > > } > > } > > } > > } > > } > > } > > } > > } > > } > > } > > } > > ' > -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/99cd80a9-8f5d-4378-b4a5-09b8421f8c4e%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
