As the open issue you quote suggests, ES currently has no support for an 
equivalent to to SQL’s HAVING clause.
Here's another reference which supports that:
https://groups.google.com/forum/#!msg/elasticsearch/UsrCG2Abj-A/IDO9DX_PoQwJ

What I did as a workaround is get all the results in an intermediate layer 
and then loop through them to leaving out the ones not meeting my boolean 
criteria (COUNT(*) = x). But that is not really a solution to your problem 
of too many results. I had 200,000 results which worked fine but if I had 
200M that would not work so well. And it won't work for any of the 
aggregation functions (sum, min, max, avg) other than count as far as I can 
tell.

Have you considered the '"min_doc_count": 50' feature?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_minimum_document_count
I used it to filter out all the groups that have less than x documents and 
then manually removed the groups with more than x.
In your case it looks like you want to filter on something like HAVING 
SUM(impression) > 49 and I don't think there is even a workaround for that 
because the functions, and even script filters, are applied to documents, 
not to the aggregations. At least as far as I can tell. It would be great 
if someone showed me otherwise.

Zennet 


On Wednesday, February 5, 2014 7:50:01 AM UTC-8, [email protected] wrote:
>
> Hi guys,
>     I’m using elasticsearch 1.0.0RC2 and wondering if there is an 
> equivalent to SQL’s “having-clause” for the aggregation framework there. 
> Below is an example query and a link to a ticket that describes the issue 
> well. The part of the query that's highlighted doesn't work, and is there 
> purely to give an idea of what I'm after. This query (omitting the 
> highlighted portion) gives impression counts for every 
> placement-referer-device-date combo. This is fine but the output is HUGE! I 
> was wondering if there was a way (like a having clause or filter) to reduce 
> the amount of results based off some logic (in this case, impressions 
> counts greater than 50). Thanks all!
>
> - Trev
>
> https://github.com/elasticsearch/elasticsearch/issues/4404
>
>
> curl -XPOST "XXXXXXXX/YYYYYYYY/_search?pretty=true" -d '
>
> {
>
> "size":0,
>
> "query": {
>
> "filtered": {
>
> "query": {
>
> "range": {
>
> "date_time": {
>
> "from": "ZZZZZZZ",
>
> "to": "QQQQQQQQ",
>
> "include_lower": true,
>
> "include_upper": true
>
> }
>
> }
>
> }
>
> }
>
> },
>
> "aggs": {
>
> "placement": {
>
> "terms": {
>
> "field": "placement"
>
> },
>
> "aggs": {
>
> "device": {
>
> "terms": {
>
> "field": "device"
>
> },
>
> "aggs": {
>
> "referer": {
>
> "terms": {
>
> "field": "referer"
>
> },
>
> "aggs": {
>
> "totals": {
>
> "date_histogram": {
>
> "field": "date_time",
>
> "interval": "day"
>
> },
>
> "aggs": {
>
> "impression": {
>
> "sum": {
>
> "field": "impression"
>
> }
>
> ,"having" : { "from" : 50 }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> }
>
> '
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/99cd80a9-8f5d-4378-b4a5-09b8421f8c4e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to