Re: Return data rows with summary rows in Drill?

Andries Engelbrecht Thu, 29 Oct 2015 12:03:13 -0700

OK. Will be good if you file a JIRA for the enhancement.

BTW made mistake, in this example the right way is to use partition by and not 
order by in the windows functions.


—Andries



> On Oct 29, 2015, at 8:27 AM, Minnow Noir <[email protected]> wrote:
> 
> Thanks Andries.
> 
> For this task, we need summary rows, akin to this:
> https://stackoverflow.com/questions/14786990/subtotals-and-total-keeping-the-details/14787017#14787017
> .
> 
> Drill doesn't appear to support WITH ROLLUP yet, so I was looking for
> workarounds.  Right now, we're doing this convoluted workaround and thought
> I would see if there's a smarter way.
> 
> 
> 
> 
> 
> On Thu, Oct 29, 2015 at 11:19 AM, Andries Engelbrecht <
> [email protected]> wrote:
> 
>> You may not be able to display as separate rows in the result set, but
>> using window functions you can make it separate columns. Don’t have your
>> data set, but here is an example using the Drill Sandbox data for orders
>> table.
>> 
>> 0: jdbc:drill:> describe orders;
>> +--------------+--------------------+--------------+
>> | COLUMN_NAME  |     DATA_TYPE      | IS_NULLABLE  |
>> +--------------+--------------------+--------------+
>> | order_id     | BIGINT             | YES          |
>> | month        | CHARACTER VARYING  | YES          |
>> | purchdate    | TIMESTAMP          | YES          |
>> | cust_id      | BIGINT             | YES          |
>> | state        | CHARACTER VARYING  | YES          |
>> | prod_id      | BIGINT             | YES          |
>> | order_total  | INTEGER            | YES          |
>> +--------------+--------------------+--------------+
>> 7 rows selected (0.182 seconds)
>> 
>> Query with Window functions
>> 0: jdbc:drill:> select state, order_total, count(state) over(order by
>> state) as `count`, sum(order_total) over(order by state) as total from
>> orders order by state limit 20;
>> — Limited to 20 rows to show result
>> 
>> 
>> Result set
>> +--------+--------------+--------+--------+
>> | state  | order_total  | count  | total  |
>> +--------+--------------+--------+--------+
>> | ak     | 35           | 325    | 16703  |
>> | ak     | 21           | 325    | 16703  |
>> | ak     | 13           | 325    | 16703  |
>> | ak     | 11           | 325    | 16703  |
>> | ak     | 11           | 325    | 16703  |
>> | ak     | 29           | 325    | 16703  |
>> | ak     | 22           | 325    | 16703  |
>> | ak     | 24           | 325    | 16703  |
>> | ak     | 25           | 325    | 16703  |
>> | ak     | 57           | 325    | 16703  |
>> | ak     | 109          | 325    | 16703  |
>> | ak     | 107          | 325    | 16703  |
>> | ak     | 10           | 325    | 16703  |
>> | ak     | 73           | 325    | 16703  |
>> | ak     | 118          | 325    | 16703  |
>> | ak     | 108          | 325    | 16703  |
>> | ak     | 107          | 325    | 16703  |
>> | ak     | 89           | 325    | 16703  |
>> | ak     | 81           | 325    | 16703  |
>> | ak     | 20           | 325    | 16703  |
>> +--------+--------------+--------+--------+
>> 20 rows selected (0.75 seconds)
>> 
>> Verification of results
>> 0: jdbc:drill:> select sum(order_total) from orders where state = 'ak';
>> +---------+
>> | EXPR$0  |
>> +---------+
>> | 16703   |
>> +————+
>> 
>> 0: jdbc:drill:> select Count(*) from orders where state = 'ak';
>> +---------+
>> | EXPR$0  |
>> +---------+
>> | 325     |
>> +---------+
>> 1 row selected (0.373 seconds)
>> 
>> 
>> —Andries
>> 
>> 
>> 
>> 
>> 
>>> On Oct 28, 2015, at 7:06 PM, Minnow Noir <[email protected]> wrote:
>>> 
>>> I'm trying to query a table in Drill and return a result set that
>> combines
>>> grouping with the granular rows such that the rows for each group has a
>>> summary row.  Is there anything in Drill that supports returning a
>> summary
>>> row for each group of returned rows?
>>> 
>>> For example, imagine a sales table with employee, sale_amount columns.
>>> 
>>> /* get all sales */
>>> select employee, sale_amount from sales;
>>> 
>>> /* summarize sales by employee */
>>> select employee, count(*) `count`, sum(sale_amount) total from sales
>> group
>>> by employee;
>>> 
>>> /* results with a summary row for each data row */
>>> employee, count, total
>>> Frank, 29, 1000000      // summary row for employee 1
>>> Frank, , 100000            // first data row for emp 1
>>> Frank, , 100000            // second data row for emp 2
>>> Mark, 10, 500000         // summary row for employee 2
>>> Mark, , 100000             // first data row for emp 2
>>> ...
>>> 
>>> Thanks,
>>> Minnow
>> 
>>

Re: Return data rows *with* summary rows in Drill?

Reply via email to

Re: Return data rows with summary rows in Drill?