Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84094 --- ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java https://reviews.apache.org/r/34293/#comment135202 SparkClientFactory.initialize would be invoked only once, which means RpcServer would be initialized once inside either, so while we update spark client rpc related paramters, RpcServer does not really updated. This should be another issue, i just list here as it's found while read the code. - chengxiang li On 五月 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated 五月 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84096 --- ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java https://reviews.apache.org/r/34293/#comment135204 Just curious, it looks to me that AtomaticBoolean works here either, is that possible 2 threads executed this block both? - chengxiang li On 五月 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated 五月 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
On May 18, 2015, 2:37 a.m., chengxiang li wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java, line 88 https://reviews.apache.org/r/34293/diff/1/?file=961679#file961679line88 Just curious, it looks to me that AtomaticBoolean works here either, is that possible 2 threads executed this block both? If several sessions connect to the same HS2, they might execute this block concurrently. One issue with AtomaticBoolean instead of synchonized here is that we have to make sure the SparkClientFactory is properly initialized. Sometimes, we see it throws an exception, in which case, we may need to initialize it again. - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84096 --- On May 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated May 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
On May 18, 2015, 2:26 a.m., chengxiang li wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java, line 96 https://reviews.apache.org/r/34293/diff/1/?file=961679#file961679line96 SparkClientFactory.initialize would be invoked only once, which means RpcServer would be initialized once inside either, so while we update spark client rpc related paramters, RpcServer does not really updated. This should be another issue, i just list here as it's found while read the code. Are you saying the RpcServer should be restated too, because some configuration used by RpcServer could be changed? We may need to track those related properties separately. This could complicate the code however. Of course, I agree with you this is indeed an issue. - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84094 --- On May 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated May 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
On 五月 18, 2015, 2:26 a.m., chengxiang li wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java, line 96 https://reviews.apache.org/r/34293/diff/1/?file=961679#file961679line96 SparkClientFactory.initialize would be invoked only once, which means RpcServer would be initialized once inside either, so while we update spark client rpc related paramters, RpcServer does not really updated. This should be another issue, i just list here as it's found while read the code. Jimmy Xiang wrote: Are you saying the RpcServer should be restated too, because some configuration used by RpcServer could be changed? We may need to track those related properties separately. This could complicate the code however. Of course, I agree with you this is indeed an issue. Yes, if we want to make these rpc configurations dynamically effectual, RpcServer should be restarted as well. - chengxiang --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84094 --- On 五月 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated 五月 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
On 五月 18, 2015, 2:37 a.m., chengxiang li wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java, line 88 https://reviews.apache.org/r/34293/diff/1/?file=961679#file961679line88 Just curious, it looks to me that AtomaticBoolean works here either, is that possible 2 threads executed this block both? Jimmy Xiang wrote: If several sessions connect to the same HS2, they might execute this block concurrently. One issue with AtomaticBoolean instead of synchonized here is that we have to make sure the SparkClientFactory is properly initialized. Sometimes, we see it throws an exception, in which case, we may need to initialize it again. Ok, I see, thanks for explaination. - chengxiang --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review84096 --- On 五月 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated 五月 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review83990 --- ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java https://reviews.apache.org/r/34293/#comment135091 Any concurrency issue for this, or it doesn't matter since it's in shutdown()? - Xuefu Zhang On May 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated May 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang
Re: Review Request 34293: HIVE-10721 SparkSessionManagerImpl leaks SparkSessions [Spark Branch]
On May 15, 2015, 10:19 p.m., Xuefu Zhang wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java, line 169 https://reviews.apache.org/r/34293/diff/1/?file=961679#file961679line169 Any concurrency issue for this, or it doesn't matter since it's in shutdown()? This is fine. Two reasons: 1. inited is volatile, 2, it is in shutdown(). Probably we can remove this line since it is in shutdown(). - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/#review83990 --- On May 15, 2015, 9:53 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34293/ --- (Updated May 15, 2015, 9:53 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-10721 https://issues.apache.org/jira/browse/HIVE-10721 Repository: hive-git Description --- Add a SparkSession to createdSessions only after the session is opened properly if doOpen is specified. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java 7e33a3f ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java bae30f3 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionImpl.java 603f1ca ql/src/java/org/apache/hadoop/hive/ql/exec/spark/session/SparkSessionManagerImpl.java ad012b6 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 1bcd221 Diff: https://reviews.apache.org/r/34293/diff/ Testing --- Thanks, Jimmy Xiang