[jira] [Created] (CALCITE-5124) LIMIT won't work when GROUP BY two or more columns in Elasticsearch Adapter

ZheHu (Jira) Thu, 28 Apr 2022 04:48:05 -0700

ZheHu created CALCITE-5124:
------------------------------

             Summary: LIMIT won't work when GROUP BY two or more columns in 
Elasticsearch Adapter
                 Key: CALCITE-5124
                 URL: https://issues.apache.org/jira/browse/CALCITE-5124
             Project: Calcite
          Issue Type: Bug
          Components: elasticsearch-adapter
    Affects Versions: 1.30.0
            Reporter: ZheHu



Add one document (doc4 below) to AggregationTest:
{code:java}
String doc4 = "{val1:1, cat4:'2018-01-02'}";
{code}

Then run the following test case:
{code:java}
@Test void dateCat2() {
    CalciteAssert.that()
        .with(AggregationTest::createConnection)
        .query("select val1, cat4 from view group by val1, cat4 limit 2")
        .returnsUnordered("val1=1; cat4=1514764800000",
            "val1=1; cat4=1514851200000",
            "val1=null; cat4=1576108800000");
}
{code}

The query returns three rows, so *+limit 2+* in the SQL does not take effect.
The generated Elasticsearch query is:
{code:java}
{
  "_source": false,
  "size": 0,
  "stored_fields": "_none_",
  "aggregations": {
    "g_val1": {
      "terms": {
        "field": "val1",
        "missing": -9223372036854775808,
        "size": 2
      },
      "aggregations": {
        "g_cat4": {
          "terms": {
            "field": "cat4",
            "missing": 253402214400000,
            "size": 2
          }
        }
      }
    }
  }
}
{code}

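The query contains two nested terms (bucket) aggregations, each with size 2.
However, size only limits the number of buckets returned at its own level.
Because the aggregations are nested, each of the (at most two) val1 buckets
can contain up to two cat4 buckets, so the flattened result can hold up to
four rows and the LIMIT is not enforced on the total.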
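
To illustrate the reasoning above, here is a minimal sketch that simulates a two-level terms aggregation in plain Java. It is not Calcite or Elasticsearch code: the class NestedTermsSketch and the method nestedTerms are hypothetical names, and buckets are kept in encounter order for simplicity (real terms aggregations order by document count by default). The sample rows reuse the values from the expected output above, with the null val1 represented by the "missing" value from the generated query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical simulation of nested terms aggregations; an assumption-laden sketch. */
public class NestedTermsSketch {

  /**
   * Groups docs by (val1, cat4) and keeps at most {@code size} buckets
   * at each nesting level, then flattens the buckets into rows.
   */
  static List<long[]> nestedTerms(List<long[]> docs, int size) {
    // Outer key (val1) -> distinct inner keys (cat4), in encounter order.
    Map<Long, Set<Long>> buckets = new LinkedHashMap<>();
    for (long[] doc : docs) {
      buckets.computeIfAbsent(doc[0], k -> new LinkedHashSet<>()).add(doc[1]);
    }
    List<long[]> rows = new ArrayList<>();
    // "size" is applied per level, not to the flattened row count.
    buckets.entrySet().stream().limit(size).forEach(outer ->
        outer.getValue().stream().limit(size).forEach(inner ->
            rows.add(new long[] {outer.getKey(), inner})));
    return rows;
  }

  public static void main(String[] args) {
    List<long[]> docs = Arrays.asList(
        new long[] {1L, 1514764800000L},
        new long[] {1L, 1514851200000L},
        new long[] {Long.MIN_VALUE, 1576108800000L}); // null val1 -> "missing"
    List<long[]> rows = nestedTerms(docs, 2);
    System.out.println(rows.size()); // prints 3, although size (LIMIT) is 2
  }
}
```

In this sketch the flattened result can reach size * size rows, which suggests the limit has to be applied to the flattened rows rather than to each aggregation level.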



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
