[jira] [Updated] (DRILL-3482) LIMIT not terminating early enough when concurrent queries are running

Aman Sinha (JIRA) Thu, 09 Jul 2015 16:16:44 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Aman Sinha updated DRILL-3482:
------------------------------
    Description: 
This behavior was observed on a 30-node cluster with width.max_per_node = 23.
Run a long-running CTAS query that performs joins and aggregations and takes
about 5 minutes.  While this query is running, submit a query of the following
type:
{code}
SELECT * FROM table LIMIT 10;
{code}
The table consists of 2500 Parquet files and contains about 4 billion rows in
total.

The elapsed time of this query was more than 2 minutes.  There were 690 minor
fragments doing the Scan; the majority of them finished in a few seconds, while
a small number took 2 minutes.

When the same query was run on an idle system, it took about 6 seconds to
complete.  It appears that when concurrent queries are running, the early
termination for Limit is not processed quickly enough on all the minor
fragments.
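
The scan width above is consistent with the cluster settings: 30 nodes x 23
fragments per node = 690 minor fragments.  As a minimal sketch for confirming
the setting on a given cluster, assuming that "width.max_per_node" above is
shorthand for Drill's planner.width.max_per_node option (the full option name
is an assumption, not stated in this report):
{code}
-- Assumption: the option meant above is planner.width.max_per_node.
-- Shows the value currently in effect for this option.
SELECT * FROM sys.options WHERE name = 'planner.width.max_per_node';
{code}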

  was:
This behavior was observed on a 30-node cluster with width.max_per_node = 23.
Run a long-running CTAS query that performs joins and aggregations and takes
about 5 minutes.  While this query is running, submit a query of the following
type:
{code}
SELECT * FROM table LIMIT 10;
{code}
The table consists of Parquet files and its size is on the order of a few
hundred GB.

The elapsed time of this query was more than 2 minutes.  There were 690 minor
fragments doing the Scan; the majority of them finished in a few seconds, while
a small number took 2 minutes.

When the same query was run on an idle system, it took about 6 seconds to
complete.  It appears that when concurrent queries are running, the early
termination for Limit is not processed quickly enough on all the minor
fragments.


> LIMIT not terminating early enough when concurrent queries are running
> ----------------------------------------------------------------------
>
>                 Key: DRILL-3482
>                 URL: https://issues.apache.org/jira/browse/DRILL-3482
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.1.0
>            Reporter: Aman Sinha
>            Assignee: Chris Westin
>
> This behavior was observed on a 30-node cluster with width.max_per_node = 23.
> Run a long-running CTAS query that performs joins and aggregations and takes
> about 5 minutes.  While this query is running, submit a query of the
> following type:
> {code}
> SELECT * FROM table LIMIT 10;
> {code}
> The table consists of 2500 Parquet files and contains about 4 billion rows in
> total.
> The elapsed time of this query was more than 2 minutes.  There were 690 minor
> fragments doing the Scan; the majority of them finished in a few seconds,
> while a small number took 2 minutes.
> When the same query was run on an idle system, it took about 6 seconds to
> complete.  It appears that when concurrent queries are running, the early
> termination for Limit is not processed quickly enough on all the minor
> fragments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
