Greetings , We have discovered an issue with ES aggregations when grouping on non-textual fields (date, long). Currently the following sql fails because for missing value <https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_missing_value_13> we inject __MISSING__ sentinel which is not date / number parseable (it can’t be null either) :
select max(amount), date from orders group by date -- special ES type The solution is to make sentinel type specific : 1. Integer.MIN_VALUE for integers 2. 9999-12-31 for dates etc. For low cardinality types like boolean, byte, short I’m not sure what to do since there is high probability of collision between missing field and actual value (eg. what value to choose for missing boolean?) : select max(amount), isActive from orders group by isActive -- boolean type Let me know if you solved this problem differently before. Composite aggregations <https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html> (available since 6.1) should help in future. PR: https://github.com/apache/calcite/pull/946 JIRA: https://issues.apache.org/jira/browse/CALCITE-2689 Many Thanks in Advance, Andrei.
