[ 
https://issues.apache.org/jira/browse/DRILL-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286232#comment-15286232
 ] 

Khurram Faraaz commented on DRILL-4678:
---------------------------------------

Upon reducing the size the query returns results, however if we add one or two 
more values the time taken to execute almost doubles.

This query took 16 seconds.
{noormat}
0: jdbc:drill:schema=dfs.tmp> SELECT DISTINCT dt FROM (
. . . . . . . . . . . . . . > VALUES(CAST('1964-03-07' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('2002-03-04' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1966-09-04' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1993-08-18' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1959-10-23' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1992-01-14' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1994-07-24' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1979-11-25' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1945-01-14' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1982-07-25' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1966-09-06' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1989-05-01' AS DATE)),
. . . . . . . . . . . . . . > (CAST('1951-05-16' AS DATE)))
. . . . . . . . . . . . . . > tbl(dt);
+-------------+
|     dt      |
+-------------+
| 1964-03-07  |
| 2002-03-04  |
| 1966-09-04  |
| 1993-08-18  |
| 1970-06-11  |
| 1959-10-23  |
| 1992-01-14  |
| 1994-07-24  |
| 1979-11-25  |
| 1945-01-14  |
| 1982-07-25  |
| 1966-09-06  |
| 1989-05-01  |
| 1951-05-16  |
+-------------+
14 rows selected (16.718 seconds)
{noformat}

I added few more rows enclosed in parentheses and it took 32 seconds (almost 
twice the time)
{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT DISTINCT dt FROM (
. . . . . . . . . . . . . . > VALUES(CAST('1964-03-07' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('2002-03-04' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1966-09-04' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1993-08-18' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1970-06-11' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1959-10-23' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1992-01-14' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1994-07-24' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1979-11-25' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1945-01-14' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1982-07-25' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1966-09-06' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1989-05-01' AS DATE)),
. . . . . . . . . . . . . . >       (CAST('1996-03-08' AS DATE)),
. . . . . . . . . . . . . . > (CAST('1951-05-16' AS DATE)))
. . . . . . . . . . . . . . > tbl(dt);
+-------------+
|     dt      |
+-------------+
| 1964-03-07  |
| 2002-03-04  |
| 1966-09-04  |
| 1993-08-18  |
| 1970-06-11  |
| 1959-10-23  |
| 1992-01-14  |
| 1994-07-24  |
| 1979-11-25  |
| 1945-01-14  |
| 1982-07-25  |
| 1966-09-06  |
| 1989-05-01  |
| 1996-03-08  |
| 1951-05-16  |
+-------------+
15 rows selected (32.182 seconds)
{noformat}

Adding any more rows (even adding just one) would keep the query in STARTING 
state and the query would never complete.
explain plan over the original query reported in this JIRA would not return any 
results. So it looks like it has to do something with the total number of 
values enclosed in parentheses inside the VALUES clause.

> Query HANG - SELECT DISTINCT over date data
> -------------------------------------------
>
>                 Key: DRILL-4678
>                 URL: https://issues.apache.org/jira/browse/DRILL-4678
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.7.0
>         Environment: 4 node cluster CentOS
>            Reporter: Khurram Faraaz
>            Priority: Critical
>         Attachments: hung_Date_Query.log
>
>
> Below query hangs
> {noformat}
> 2016-05-16 10:33:57,506 [28c65de9-9f67-dadb-5e4e-e1a12f8dda49:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 28c65de9-9f67-dadb-5e4e-e1a12f8dda49: SELECT DISTINCT dt FROM (
> VALUES(CAST('1964-03-07' AS DATE)),
>       (CAST('2002-03-04' AS DATE)),
>       (CAST('1966-09-04' AS DATE)),
>       (CAST('1993-08-18' AS DATE)),
>       (CAST('1970-06-11' AS DATE)),
>       (CAST('1970-06-11' AS DATE)),
>       (CAST('1970-06-11' AS DATE)),
>       (CAST('1970-06-11' AS DATE)),
>       (CAST('1970-06-11' AS DATE)),
>       (CAST('1959-10-23' AS DATE)),
>       (CAST('1992-01-14' AS DATE)),
>       (CAST('1994-07-24' AS DATE)),
>       (CAST('1979-11-25' AS DATE)),
>       (CAST('1945-01-14' AS DATE)),
>       (CAST('1982-07-25' AS DATE)),
>       (CAST('1966-09-06' AS DATE)),
>       (CAST('1989-05-01' AS DATE)),
>       (CAST('1996-03-08' AS DATE)),
>       (CAST('1998-08-19' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
>       (CAST('2013-08-13' AS DATE)),
> (CAST('1999-07-20' AS DATE)),
>     (CAST('1962-07-03' AS DATE)),
>       (CAST('2011-08-17' AS DATE)),
>       (CAST('2011-05-16' AS DATE)),
>       (CAST('1946-05-08' AS DATE)),
>       (CAST('1994-02-13' AS DATE)),
>       (CAST('1978-08-09' AS DATE)),
>       (CAST('1978-08-09' AS DATE)),
>       (CAST('1978-08-09' AS DATE)),
>       (CAST('1978-08-09' AS DATE)),
>       (CAST('1958-02-06' AS DATE)),
>       (CAST('2012-06-11' AS DATE)),
>       (CAST('2012-06-11' AS DATE)),
>       (CAST('2012-06-11' AS DATE)),
>       (CAST('2012-06-11' AS DATE)),
>       (CAST('1998-03-26' AS DATE)),
>       (CAST('1996-11-04' AS DATE)),
>       (CAST('1953-09-25' AS DATE)),
>       (CAST('2003-06-17' AS DATE)),
>       (CAST('2003-06-17' AS DATE)),
>       (CAST('2003-06-17' AS DATE)),
>       (CAST('2003-06-17' AS DATE)),
>       (CAST('2003-06-17' AS DATE)),
>       (CAST('1980-07-05' AS DATE)),
>       (CAST('1982-06-15' AS DATE)),
>       (CAST('1951-05-16' AS DATE)))
> tbl(dt)
> {noformat}
> Details from Web UI Profile tab, please note that the query is still in 
> STARTING state
> {noformat}
> Running Queries
> Time  User    Query   State   Foreman
> 05/16/2016 10:33:57   
> mapr
>  SELECT DISTINCT dt FROM ( VALUES(CAST('1964-03-07' AS DATE)), 
> (CAST('2002-03-04' AS DATE)), (CAST('1966-09-04' AS DATE)), (CAST('199
> STARTING
> centos-01.qa.lab
> {noformat}
> There is no other useful information in drillbit.log. jstack output is 
> attached here for your reference.
> The same query works fine on Postgres 9.3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to