This is an automated email from the ASF dual-hosted git repository.

mblow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/asterixdb.git

commit 71cd417ee6c279b20968a9f1067765a7e00f78f2
Author: Simon Dew <[email protected]>
AuthorDate: Thu Aug 12 12:06:02 2021 +0100

    [NO ISSUE][DOC] Documentation for ROLLUP and CUBE
    
      - Add documentation for ROLLUP subclause
      - Add documentation for CUBE subclause
    
    Change-Id: I67a63f17ca459e313321bce569056a7f07f3a17f
    Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/12783
    Integration-Tests: Jenkins <[email protected]>
    Tested-by: Jenkins <[email protected]>
    Reviewed-by: Dmitry Lychagin <[email protected]>
---
 .../asterix-doc/src/main/markdown/sqlpp/3_query.md | 323 ++++++++++++++++++++-
 1 file changed, 322 insertions(+), 1 deletion(-)

diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md 
b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
index 3e70922..c54467f 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
@@ -742,7 +742,7 @@ Of course, a grouping expression need not be a simple 
field-name. In Q3.18, orde
     WHERE get_year(date(o.order_date)) = 2020
     GROUP BY get_month(date(o.order_date)) AS month
     SELECT month, COUNT(*) AS order_count
-    ORDER BY order_count desc
+    ORDER BY order_count DESC, month DESC
     LIMIT 3;
 
 Result:
@@ -794,6 +794,327 @@ Result:
         }
     ]
 
+#### <a id="Rollup">ROLLUP</a>
+
+The `ROLLUP` subclause is an aggregation feature that extends the 
functionality of the `GROUP BY` clause.
+It returns extra _super-aggregate_ items in the query results, giving 
subtotals and a grand total for the aggregate
+functions in the query.
+To illustrate, first consider the following query.
+
+##### Example
+
+(Q3.R1) List the number of orders, grouped by customer region and city.
+
+    SELECT customer_region AS Region,
+           customer_city AS City,
+           COUNT(o.orderno) AS `Order Count`
+    FROM customers AS c LEFT OUTER JOIN orders AS o ON c.custid = o.custid
+    LET address_line = SPLIT(c.address.city, ","),
+        customer_city = TRIM(address_line[0]),
+        customer_region = TRIM(address_line[1])
+    GROUP BY customer_region, customer_city
+    ORDER BY customer_region ASC, customer_city ASC, `Order Count` DESC;
+
+Result:
+
+    [
+      {
+        "Region": "Italy",
+        "City": "Rome",
+        "Order Count": 0
+      },
+      {
+        "Region": "MA",
+        "City": "Boston",
+        "Order Count": 2
+      },
+      {
+        "Region": "MA",
+        "City": "Hanover",
+        "Order Count": 0
+      },
+      {
+        "Region": "MO",
+        "City": "St. Louis",
+        "Order Count": 7
+      }
+    ]
+
+This query uses string functions to split each customer's address into city 
and region.
+The query then counts the total number of orders placed by each customer, and 
groups the results first by customer
+region, then by customer city.
+The aggregate results (labeled `Order Count`) are only shown by city, and 
there are no subtotals or grand total.
+We can add these using the `ROLLUP` subclause, as in the following example.
+
+##### Example
+
+(Q3.R2) List the number of orders by customer region and city, including 
subtotals and a grand total.
+
+    SELECT customer_region AS Region,
+           customer_city AS City,
+           COUNT(o.orderno) AS `Order Count`
+    FROM customers AS c LEFT OUTER JOIN orders AS o ON c.custid = o.custid
+    LET address_line = SPLIT(c.address.city, ","),
+        customer_city = TRIM(address_line[0]),
+        customer_region = TRIM(address_line[1])
+    GROUP BY ROLLUP(customer_region, customer_city)
+    ORDER BY customer_region ASC, customer_city ASC, `Order Count` DESC;
+
+Result:
+
+    [
+      {
+        "Region": null,
+        "City": null,
+        "Order Count": 9
+      },
+      {
+        "Region": "Italy",
+        "City": null,
+        "Order Count": 0
+      },
+      {
+        "Region": "Italy",
+        "City": "Rome",
+        "Order Count": 0
+      },
+      {
+        "Region": "MA",
+        "City": null,
+        "Order Count": 2
+      },
+      {
+        "Region": "MA",
+        "City": "Boston",
+        "Order Count": 2
+      },
+      {
+        "Region": "MA",
+        "City": "Hanover",
+        "Order Count": 0
+      },
+      {
+        "Region": "MO",
+        "City": null,
+        "Order Count": 7
+      },
+      {
+        "Region": "MO",
+        "City": "St. Louis",
+        "Order Count": 7
+      }
+    ]
+
+With the addition of the `ROLLUP` subclause, the results now include an extra 
item at the start of each region,
+giving the subtotal for that region.
+There is also another extra item at the very start of the results, giving the 
grand total for all regions.
+
+The order of the fields specified by the `ROLLUP` subclause determines the 
hierarchy of the super-aggregate items.
+The customer region is specified first, followed by the customer city; so the 
results are aggregated by region first,
+and then by city within each region.
+
+The grand total returns `null` as a value for the city and the region, and the 
subtotals return `null` as the
+value for the city, which may make the results hard to understand at first 
glance.
+A workaround for this is given in the next example.
+
+##### Example
+
+(Q3.R3) List the number of orders by customer region and city, with meaningful 
subtotals and grand total.
+
+    SELECT IFNULL(customer_region, "All regions") AS Region,
+           IFNULL(customer_city, "All cities") AS City,
+           COUNT(o.orderno) AS `Order Count`
+    FROM customers AS c LEFT OUTER JOIN orders AS o ON c.custid = o.custid
+    LET address_line = SPLIT(c.address.city, ","),
+        customer_city = TRIM(address_line[0]),
+        customer_region = TRIM(address_line[1])
+    GROUP BY ROLLUP(customer_region, customer_city)
+    ORDER BY customer_region ASC, customer_city ASC, `Order Count` DESC;
+
+Result:
+
+    [
+      {
+        "Region": "All regions",
+        "City": "All cities",
+        "Order Count": 9
+      },
+      {
+        "Region": "Italy",
+        "City": "All cities",
+        "Order Count": 0
+      },
+      {
+        "Region": "Italy",
+        "City": "Rome",
+        "Order Count": 0
+      },
+      {
+        "Region": "MA",
+        "City": "All cities",
+        "Order Count": 2
+      },
+      {
+        "Region": "MA",
+        "City": "Boston",
+        "Order Count": 2
+      },
+      {
+        "Region": "MA",
+        "City": "Hanover",
+        "Order Count": 0
+      },
+      {
+        "Region": "MO",
+        "City": "All cities",
+        "Order Count": 7
+      },
+      {
+        "Region": "MO",
+        "City": "St. Louis",
+        "Order Count": 7
+      }
+    ]
+
+This query uses the `IFNULL` function to populate the region and city fields 
with meaningful values for the
+super-aggregate items.
+This makes the results clearer and more readable.
+
+#### <a id="Cube">CUBE</a>
+
+The `CUBE` subclause is similar to the `ROLLUP` subclause, in that it returns 
extra super-aggregate items in the query
+results, giving subtotals and a grand total for the aggregate functions.
+Whereas `ROLLUP` returns a grand total and a hierarchy of subtotals based on 
the specified fields,
+the `CUBE` subclause returns a grand total and subtotals for every possible 
combination of the specified fields.
+
+The following example is a modification of Q3.R3 which illustrates the `CUBE` 
subclause.
+
+##### Example
+
+(Q3.C) List the number of orders by customer region and order date, with all 
possible subtotals and a grand total.
+
+    SELECT IFNULL(customer_region, "All regions") AS Region,
+           IFNULL(order_month, "All months") AS Month,
+           COUNT(o.orderno) AS `Order Count`
+    FROM customers AS c INNER JOIN orders AS o ON c.custid = o.custid
+    LET address_line = SPLIT(c.address.city, ","),
+        customer_region = TRIM(address_line[1]),
+        order_month = get_month(date(o.order_date))
+    GROUP BY CUBE(customer_region, order_month)
+    ORDER BY customer_region ASC, order_month ASC;
+
+Result:
+
+    [
+      {
+        "Region": "All regions",
+        "Order Count": 9,
+        "Month": "All months"
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 1,
+        "Month": 4
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 1,
+        "Month": 5
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 1,
+        "Month": 6
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 1,
+        "Month": 7
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 1,
+        "Month": 8
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 2,
+        "Month": 9
+      },
+      {
+        "Region": "All regions",
+        "Order Count": 2,
+        "Month": 10
+      },
+      {
+        "Region": "MA",
+        "Order Count": 2,
+        "Month": "All months"
+      },
+      {
+        "Region": "MA",
+        "Order Count": 1,
+        "Month": 7
+      },
+      {
+        "Region": "MA",
+        "Order Count": 1,
+        "Month": 8
+      },
+      {
+        "Region": "MO",
+        "Order Count": 7,
+        "Month": "All months"
+      },
+      {
+        "Region": "MO",
+        "Order Count": 1,
+        "Month": 4
+      },
+      {
+        "Region": "MO",
+        "Order Count": 1,
+        "Month": 5
+      },
+      {
+        "Region": "MO",
+        "Order Count": 1,
+        "Month": 6
+      },
+      {
+        "Region": "MO",
+        "Order Count": 2,
+        "Month": 9
+      },
+      {
+        "Region": "MO",
+        "Order Count": 2,
+        "Month": 10
+      }
+    ]
+
+To simplify the results, this query uses an inner join, so that customers who 
have not placed an order are not included
+in the totals.
+The query uses string functions to extract the region from each customer's 
address,
+and a temporal function to extract the year from the order date.
+
+The query uses the `CUBE` subclause with customer region and order month.
+This means that there are four possible aggregates to calculate:
+
+* All regions, all months
+* All regions, each month
+* Each region, all months
+* Each region, each month
+
+The results start with the grand total, showing the total number of orders 
across all regions for all months.
+This is followed by date subtotals, showing the number of orders across all 
regions for each month.
+Following that are the regional subtotals, showing the total number of orders 
for all months in each region;
+and the result items, giving the number of orders for each month in each 
region.
+
+The query also uses the `IFNULL` function to populate the region and date 
fields with meaningful values for the
+super-aggregate items.
+This makes the results clearer and more readable.
+
 ### <a id="Having_clauses">HAVING Clause</a>
 
 ##### HavingClause

Reply via email to