[ 
https://issues.apache.org/jira/browse/PHOENIX-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser resolved PHOENIX-4503.
---------------------------------
    Resolution: Duplicate

Marking as a duplicate of PHOENIX-4489

> Phoenix-Spark plugin doesn't release zookeeper connections
> ----------------------------------------------------------
>
>                 Key: PHOENIX-4503
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4503
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>         Environment: HBase 1.2 on Linux (Ubuntu, CentOS)
>            Reporter: Suhas Nalapure
>            Priority: Major
>
> *1. Phoenix-Spark plugin doesn't release zookeeper connections*
> Example: 
>               
> {code:java}
> for (int i = 0; i < 50; i++) {
>     Dataset<Row> df = sqlContext.read().format("org.apache.phoenix.spark")
>             .option("table", "\"Sales\"")
>             .option("zkUrl", "localhost:2181")
>             .load();
>     df.show(2);
> }
> Thread.sleep(1000 * 60);
> {code}
>    
> When the above snippet is executed, the number of connections to port 2181 
> keeps increasing, and the connections are not released until the main thread 
> wakes up from sleep and the program ends, as shown below (14 is the number of 
> connections before the program starts to run):
> netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 16:52:05
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 22
> 16:52:15
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 38
> 16:52:18
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 68
> 16:52:23
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 100
> 16:52:27
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 116
> 16:52:32
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 116
> 16:52:38
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 116
> 16:52:52
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 116
> 16:53:00
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 116
> 16:53:24
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 16:53:32
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 16:53:34
> root@user1 ~ $
> *2. If the "jdbc" format is used instead to create the Spark DataFrame, the 
> connection count doesn't shoot up*
> Example:
>               
> {code:java}
> for (int i = 0; i < 50; i++) {
>     Dataset<Row> df = sqlContext.read().format("jdbc")
>             .option("url", "jdbc:phoenix:localhost:2181")
>             .option("dbtable", "\"Sales\"")
>             .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
>             .load();
>     df.show(2);
> }
> Thread.sleep(1000 * 60);
> {code}
>               
> Connection counts during program execution (14 being the count before 
> execution starts):
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 17:00:42
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 17:00:43
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:00:46
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:00:50
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:00:55
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:12
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:18
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:28
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:34
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:37
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 16
> 17:01:39
> root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
> 14
> 17:02:07
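
The manual `netstat | grep | wc -l` sampling above could also be scripted. The following is only a rough sketch, not part of the original report: the class `ZkConnectionCount` and both helper methods are hypothetical names, and the two sample netstat rows in `main` are made up for illustration. It assumes Linux-style `netstat -an` rows (proto, recv-q, send-q, local address, foreign address, state). Unlike `grep 2181`, it matches only addresses whose port is exactly 2181. The sample counts fed to `peakOverBaseline` are the ones listed in the report.

```java
import java.util.Arrays;
import java.util.List;

public class ZkConnectionCount {

    // Counts ESTABLISHED tcp rows whose local or foreign address ends in ":<port>".
    // Assumed column layout: proto, recv-q, send-q, local, foreign, state, ...
    static long countEstablished(List<String> netstatLines, int port) {
        String suffix = ":" + port;
        return netstatLines.stream()
                .map(line -> line.trim().split("\\s+"))
                .filter(cols -> cols.length >= 6 && cols[0].startsWith("tcp"))
                .filter(cols -> cols[5].equals("ESTABLISHED"))
                .filter(cols -> cols[3].endsWith(suffix) || cols[4].endsWith(suffix))
                .count();
    }

    // Highest sampled count minus the baseline seen before the program started.
    static int peakOverBaseline(int baseline, int[] samples) {
        return Arrays.stream(samples).max().orElse(baseline) - baseline;
    }

    public static void main(String[] args) {
        // Hypothetical netstat rows, for illustration only.
        List<String> rows = Arrays.asList(
                "tcp 0 0 127.0.0.1:40112 127.0.0.1:2181 ESTABLISHED 4242/java",
                "tcp 0 0 127.0.0.1:40113 127.0.0.1:2181 TIME_WAIT -");
        System.out.println("established: " + countEstablished(rows, 2181));

        // Counts taken from the two netstat logs in the report.
        int[] phoenixSpark = {14, 22, 38, 68, 100, 116, 116, 116, 116, 116, 14, 14};
        int[] jdbc = {14, 14, 16, 16, 16, 16, 16, 16, 16, 16, 16, 14};
        System.out.println("phoenix-spark: " + peakOverBaseline(14, phoenixSpark)); // 102
        System.out.println("jdbc: " + peakOverBaseline(14, jdbc));                 // 2
    }
}
```

Read this way, the phoenix-spark run peaks at 102 connections above the baseline of 14, while the jdbc run peaks at 2 above it.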



