I tested on MapR Drill 1.3 and it seems to work correctly

select prod_id as prod, date_add(to_date(purchdate), 1) as date_purch from 
hive.orders where prod_id > 100 group by prod_id, date_add(to_date(purchdate), 
1) limit 2;
+-------+-------------+
| prod  | date_purch  |
+-------+-------------+
| 226   | 2014-03-01  |
| 167   | 2014-02-11  |
+-------+-------------+


Here is the version information.

Version     : 1.3.0.201511202117
Release     : 1
Size        : 288 M
Repo        : installed
From repo   : /mapr-drill-1.3.0.201511202117-1.noarch


Above was with Hive as data source, so did a quick test on parquet files 
(similar data). Result was the same.

select prod_id as prod, date_add(to_date(purchdate), 1) as date_purch from 
dfs.par.`/orders` where prod_id > 100 group by prod_id, 
date_add(to_date(purchdate), 1) limit 2;
+-------+-------------+
| prod  | date_purch  |
+-------+-------------+
| 226   | 2014-03-01  |
| 167   | 2014-02-11  |
+-------+-------------+

Then tried using the same column, similar to your query.

select purchdate as purch_date, date_add(to_date(purchdate), 1) as date_purch 
from dfs.par.`/orders` where purchdate > '2014-02-28 12:53:06.0' group by 
purchdate, date_add(to_date(purchdate), 1) limit 2;
+------------------------+-------------+
|       purch_date       | date_purch  |
+------------------------+-------------+
| 2014-02-28 18:51:28.0  | 2014-03-01  |
| 2014-02-28 22:11:13.0  | 2014-03-01  |
+------------------------+-------------+


--Andries


> On Dec 21, 2015, at 11:08 PM, Petri Lehtonen <[email protected]> 
> wrote:
> 
> Tableau TDC has the lines present.
> 
> The Drill version we have is 1.3.0. MapR developer preview.
> 
> I ran the query using sqlline and it gives me the same results:
> 
> SELECT DISTINCT
>  valid_from as valid_from_raw,
>  date_add(to_date(valid_from),1) as valid_from_to_date
> FROM
>  dw.`/directory/parquet-file`
> WHERE
>  valid_from = '1356991200000'
> GROUP BY
>  valid_from,
>  date_add(to_date(valid_from),1)
> 
> +-----------------+-------------+
> | valid_from_raw  |     $f1     |
> +-----------------+-------------+
> | 1356991200000   | 2013-01-01  |
> +-----------------+-------------+
> 
> And I get the same results regardless of the tool I'm using (sqlline, Drill 
> explorer, dbVisualizer, Tableau).
> 
>> -----Original Message-----
>> From: Andries Engelbrecht [mailto:[email protected]]
>> Sent: 21. joulukuuta 2015 18:53
>> To: [email protected]
>> Subject: Re: Column aliases lost when using Dates + GROUP BY in SQL
>> 
>> I think the issue is with the group by and using the column name and alias
>> clause 'group by valid_from as valid_from_raw'.
>> 
>> Removing the alias clause in the group by statement works as Aman shows.
>> 
>> Can you check your Tableau TDC file to see if these lines are present.
>>  <customization name='CAP_QUERY_GROUP_BY_ALIAS' value='no' />
>>  <customization name='CAP_QUERY_GROUP_BY_DEGREE' value='no' />
>> 
>> --Andries
>> 
>>> On Dec 21, 2015, at 8:25 AM, Aman Sinha <[email protected]> wrote:
>>> 
>>> The aliases work for me in the following query similar to yours.
>>> However, I am using latest master branch and running directly through
>>> sqlline command, not through Tableau.  Can you confirm what Drill
>>> version you are using and check if you can repro the behavior through
>>> sqlline ?  If so, you should file a JIRA.
>>> 
>>> select o_custkey as x, date_add(to_date(o_orderdate), 1) as y from
>>> cp.`tpch/orders.parquet` where o_orderkey < 10 group by o_custkey,
>>> date_add(to_date(o_orderdate), 1) limit 2;
>>> +------+-------------+
>>> |  x   |      y      |
>>> +------+-------------+
>>> | 370  | 1996-01-03  |
>>> | 781  | 1996-12-02  |
>>> +------+-------------+
>>> 
>>> On Mon, Dec 21, 2015 at 2:18 AM, Petri Lehtonen
>>> <[email protected]>
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I'm using Tableau with MapR Drill ODBC drivers.  I have successfully
>>>> set up the drivers and am able to query the data from our source
>>>> which is basically parquet files. All seems well as long as I'm using
>>>> all rows from the source but when I try to limit the size of the
>>>> results by grouping and rolling up the data I came across something that
>> feels like a bug:
>>>> 
>>>> When I use any functions that return dates and add a group by to my
>>>> query, the column alias in the underlying sql won't show up in the
>> resultset.
>>>> 
>>>> Plain select from the source works:
>>>> 
>>>> SELECT
>>>> valid_from as valid_from_raw,
>>>> date_add(to_date(valid_from),1)as valid_from_to_date FROM
>>>> dw.`/directory/parquet-file` WHERE  valid_from = '1356991200000'
>>>> 
>>>> ---------------+---------------------
>>>> valid_from_raw | valid_from_to_date
>>>> ---------------+---------------------
>>>> 1356991200000  | 2013-01-01
>>>> ---------------+---------------------
>>>> 
>>>> When you add GROUP BY to the SQL, Drill looses the original column
>>>> alias and replaces it with it's internal variable or something similar 
>>>> ($f1..n):
>>>> 
>>>> SELECT
>>>> valid_from as valid_from_raw,
>>>> date_add(to_date(valid_from),1) as valid_from_to_date FROM
>>>> dw.`/directory/parquet-file` WHERE  valid_from = '1356991200000'
>>>> GROUP BY
>>>> valid_from as valid_from_raw,
>>>> date_add(to_date(valid_from),1)
>>>> 
>>>> ---------------+-------------
>>>> valid_from_raw | $f1
>>>> ---------------+-------------
>>>> 1356991200000  | 2013-01-01
>>>> ---------------+-------------
>>>> 
>>>> And naturally this creates a problem with Tableau which is mapping
>>>> the column aliases from the initial SQL and fails to find the
>>>> corresponding aliases from the resultset. Only way to get this
>>>> working is to use $f0..n column aliases in the original query and rename
>> them in Tableau...
>>>> Something that one definitely shouldn't do. I'd appreciate if anyone
>>>> has a solution or a more solid workaround.
>>>> 
>>>> Thanks,
>>>> Petri
>>>> 
>>>> 
> 

Reply via email to