[ 
https://issues.apache.org/jira/browse/DRILL-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyong Liu updated DRILL-510:
------------------------------

    Description: 
In our Drill automation system, a batch of over 200 tests was submitted over a 
JDBC connection, and a large number of them failed.  I ran the following 
experiments to find the root cause:

1. I put a sleep of 15 minutes in a single test case execution, and the 
zookeeper connection was not closed until the test terminated after sleep time 
ran out.
2. I then ran a couple of tests and reduced the sleep time to 5 minutes.  Two 
zookeeper connections were established (the second one after the second test 
case was running) and were not closed until both tests terminated after sleep 
time ran out.
3. I ran a batch of four tests and further reduced the sleep time to 2 minutes. 
 Four zookeeper connections were established (one after another, until the last 
test case started to run) and again, none of them was closed until all tests 
terminated after the sleep time ran out.  It then took about 10-15 seconds for 
all connections to close.

From these experiments, it appears to me that zookeeper connections are 
eventually closed, but not before all tests executed in one batch have 
completed.  This would explain why, with the default maximum number of 
zookeeper connections allowed, a batch of over 200 queries starts failing 
shortly into the execution, once all zookeeper connections are used up.
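
The reasoning above can be illustrated with a small toy model.  This is only a 
sketch, not Drill or ZooKeeper code: the class ConnectionLeakModel, its Server 
stand-in, and the limit of 60 connections are all hypothetical, chosen for 
illustration (the actual limit in this setup is not stated here).  It also 
simplifies by treating every test in the batch as holding its connection until 
the batch ends, which matches the behavior observed in the experiments.

```java
public class ConnectionLeakModel {

    /** Toy stand-in for a server that accepts at most `limit` open connections. */
    static final class Server {
        private final int limit;
        private int open = 0;

        Server(int limit) {
            this.limit = limit;
        }

        /** Returns false when the connection limit is already reached. */
        boolean connect() {
            if (open >= limit) {
                return false;
            }
            open++;
            return true;
        }

        void close() {
            open--;
        }
    }

    /**
     * Counts how many tests in a batch fail to obtain a connection.
     * closePerTest = false models the observed behavior: connections are
     * released only after the whole batch has finished.
     */
    static int failures(int tests, int limit, boolean closePerTest) {
        Server server = new Server(limit);
        int failed = 0;
        int held = 0;
        for (int i = 0; i < tests; i++) {
            if (!server.connect()) {
                failed++;          // no connection left for this test
                continue;
            }
            if (closePerTest) {
                server.close();    // released as soon as the test is done
            } else {
                held++;            // kept open until the batch ends
            }
        }
        for (int i = 0; i < held; i++) {
            server.close();        // batch finished: release everything
        }
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 200 tests, limit of 60 connections.
        System.out.println("released at batch end: " + failures(200, 60, false));
        System.out.println("released per test:     " + failures(200, 60, true));
    }
}
```

Under these assumed numbers, the model reports 140 failures when connections 
are held until the batch ends and none when each test releases its connection, 
which is the pattern the experiments suggest.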

  was:
In our drill automation system, a batch of over 200 tests were submitted via 
the JDBC connection.  The following investigating analyses were performed:

1. I put a sleep of 15 minutes in a single test case execution, and the 
zookeeper connection was not closed until the test terminated after sleep time 
ran out.
2. I then ran a couple of tests and reduced the sleep time to 5 minutes.  Two 
zookeeper connections were established (the second one after the second test 
case was running) and were not closed until both tests terminated after sleep 
time ran out.
3. I ran a batch of four tests and further reduced the sleep time to 2 minutes. 
 Four zookeeper connections were established (one after another till the last 
test case started to run) and again, none of them was closed until all tests 
terminated after sleep time out.  Then it took about 10-15 seconds for all 
connections to close.

From these experiments, it appears to me that zookeeper connections are 
eventually closed, but not before all tests executed in one batch are 
completed.  This explains that, with the default maximum number of zookeeper 
connections allowed, over 200 queries submitted in a batch are destined to 
fail shortly into the execution when all zookeeper connections are used up.


> zookeeper connections don't get released until all queries are completed in a 
> batch of tests
> --------------------------------------------------------------------------------------------
>
>                 Key: DRILL-510
>                 URL: https://issues.apache.org/jira/browse/DRILL-510
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Zhiyong Liu



--
This message was sent by Atlassian JIRA
(v6.2#6252)
