Re: Return data rows with summary rows in Drill?

Andries Engelbrecht Thu, 29 Oct 2015 14:42:08 -0700

OK.

I think you may be interested in this JIRA
https://issues.apache.org/jira/browse/DRILL-3802 
<https://issues.apache.org/jira/browse/DRILL-3802>


Which should cover your use case.

—Andries
 
> On Oct 29, 2015, at 2:32 PM, Minnow Noir <[email protected]> wrote:
> 
> Got it.
> 
> I think there's already a JIRA:
> https://issues.apache.org/jira/browse/DRILL-3962.
> 
> On Thu, Oct 29, 2015 at 3:01 PM, Andries Engelbrecht <
> [email protected]> wrote:
> 
>> OK. Will be good if you file a JIRA for the enhancement.
>> 
>> BTW made mistake, in this example the right way is to use partition by and
>> not order by in the windows functions.
>> 
>> —Andries
>> 
>> 
>> 
>>> On Oct 29, 2015, at 8:27 AM, Minnow Noir <[email protected]> wrote:
>>> 
>>> Thanks Andries.
>>> 
>>> For this task, we need summary rows, akin to this:
>>> 
>> https://stackoverflow.com/questions/14786990/subtotals-and-total-keeping-the-details/14787017#14787017
>>> .
>>> 
>>> Drill doesn't appear to support WITH ROLLUP yet, so I was looking for
>>> workarounds.  Right now, we're doing this convoluted workaround and
>> thought
>>> I would see if there's a smarter way.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Oct 29, 2015 at 11:19 AM, Andries Engelbrecht <
>>> [email protected]> wrote:
>>> 
>>>> You may not be able to display as separate rows in the result set, but
>>>> using window functions you can make it separate columns. Don’t have your
>>>> data set, but here is an example using the Drill Sandbox data for orders
>>>> table.
>>>> 
>>>> 0: jdbc:drill:> describe orders;
>>>> +--------------+--------------------+--------------+
>>>> | COLUMN_NAME  |     DATA_TYPE      | IS_NULLABLE  |
>>>> +--------------+--------------------+--------------+
>>>> | order_id     | BIGINT             | YES          |
>>>> | month        | CHARACTER VARYING  | YES          |
>>>> | purchdate    | TIMESTAMP          | YES          |
>>>> | cust_id      | BIGINT             | YES          |
>>>> | state        | CHARACTER VARYING  | YES          |
>>>> | prod_id      | BIGINT             | YES          |
>>>> | order_total  | INTEGER            | YES          |
>>>> +--------------+--------------------+--------------+
>>>> 7 rows selected (0.182 seconds)
>>>> 
>>>> Query with Window functions
>>>> 0: jdbc:drill:> select state, order_total, count(state) over(order by
>>>> state) as `count`, sum(order_total) over(order by state) as total from
>>>> orders order by state limit 20;
>>>> — Limited to 20 rows to show result
>>>> 
>>>> 
>>>> Result set
>>>> +--------+--------------+--------+--------+
>>>> | state  | order_total  | count  | total  |
>>>> +--------+--------------+--------+--------+
>>>> | ak     | 35           | 325    | 16703  |
>>>> | ak     | 21           | 325    | 16703  |
>>>> | ak     | 13           | 325    | 16703  |
>>>> | ak     | 11           | 325    | 16703  |
>>>> | ak     | 11           | 325    | 16703  |
>>>> | ak     | 29           | 325    | 16703  |
>>>> | ak     | 22           | 325    | 16703  |
>>>> | ak     | 24           | 325    | 16703  |
>>>> | ak     | 25           | 325    | 16703  |
>>>> | ak     | 57           | 325    | 16703  |
>>>> | ak     | 109          | 325    | 16703  |
>>>> | ak     | 107          | 325    | 16703  |
>>>> | ak     | 10           | 325    | 16703  |
>>>> | ak     | 73           | 325    | 16703  |
>>>> | ak     | 118          | 325    | 16703  |
>>>> | ak     | 108          | 325    | 16703  |
>>>> | ak     | 107          | 325    | 16703  |
>>>> | ak     | 89           | 325    | 16703  |
>>>> | ak     | 81           | 325    | 16703  |
>>>> | ak     | 20           | 325    | 16703  |
>>>> +--------+--------------+--------+--------+
>>>> 20 rows selected (0.75 seconds)
>>>> 
>>>> Verification of results
>>>> 0: jdbc:drill:> select sum(order_total) from orders where state = 'ak';
>>>> +---------+
>>>> | EXPR$0  |
>>>> +---------+
>>>> | 16703   |
>>>> +————+
>>>> 
>>>> 0: jdbc:drill:> select Count(*) from orders where state = 'ak';
>>>> +---------+
>>>> | EXPR$0  |
>>>> +---------+
>>>> | 325     |
>>>> +---------+
>>>> 1 row selected (0.373 seconds)
>>>> 
>>>> 
>>>> —Andries
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Oct 28, 2015, at 7:06 PM, Minnow Noir <[email protected]> wrote:
>>>>> 
>>>>> I'm trying to query a table in Drill and return a result set that
>>>> combines
>>>>> grouping with the granular rows such that the rows for each group has a
>>>>> summary row.  Is there anything in Drill that supports returning a
>>>> summary
>>>>> row for each group of returned rows?
>>>>> 
>>>>> For example, imagine a sales table with employee, sale_amount columns.
>>>>> 
>>>>> /* get all sales */
>>>>> select employee, sale_amount from sales;
>>>>> 
>>>>> /* summarize sales by employee */
>>>>> select employee, count(*) `count`, sum(sale_amount) total from sales
>>>> group
>>>>> by employee;
>>>>> 
>>>>> /* results with a summary row for each data row */
>>>>> employee, count, total
>>>>> Frank, 29, 1000000      // summary row for employee 1
>>>>> Frank, , 100000            // first data row for emp 1
>>>>> Frank, , 100000            // second data row for emp 2
>>>>> Mark, 10, 500000         // summary row for employee 2
>>>>> Mark, , 100000             // first data row for emp 2
>>>>> ...
>>>>> 
>>>>> Thanks,
>>>>> Minnow
>>>> 
>>>> 
>> 
>>

Re: Return data rows *with* summary rows in Drill?

Reply via email to

Re: Return data rows with summary rows in Drill?