QQhuxuhui opened a new pull request, #15094:
URL: https://github.com/apache/dolphinscheduler/pull/15094

   Fix the YARN resource leakage issue caused by the HikariCP connection pool
   
   fix: [#14187](https://github.com/apache/dolphinscheduler/issues/14187)
   fix: [#6092](https://github.com/apache/dolphinscheduler/issues/6092)
   
   
   
   <!--Thanks very much for contributing to Apache DolphinScheduler, we are 
happy that you want to help us improve DolphinScheduler! -->
   
   ## Purpose of the pull request
   
   
   This pull request addresses a resource leak caused by the database 
connection pool. Since version 2.x, third-party data sources have used the 
HikariCP connection pool. The problem is as follows:
   
   1. In Hive on Spark mode, a Hive JDBC connection is much heavier than a 
regular RDBMS connection. When the connection needs to execute SQL, HS2 
(HiveServer2) requests container resources from YARN and starts a distributed 
Spark on YARN cluster to run the compiled SQL. After the SQL finishes, the 
Spark on YARN resources are not released immediately; they are kept for a 
period of time so that new SQL submitted by the client over the same JDBC 
connection can reuse them. The resources are fully released only when the JDBC 
connection is closed, or when the configured timeout elapses without any new 
SQL from the client.
   2. With a database connection pool, closing a JDBC connection only returns 
it to the pool; the underlying physical connection stays open. As a result, the 
Spark on YARN resources tied to that connection are not released promptly, 
which amounts to a resource leak. Other jobs that request resources from YARN 
then have to wait in the queue, which delays their execution.
   3. The project uses HikariCP as the database connection pool and does not 
configure the idle timeout (`idleTimeout`) for its connections, so HikariCP's 
built-in default of 10 minutes applies. Therefore, after the SQL job on a 
connection completes, it takes about 10 minutes before the associated Spark on 
YARN resources are actually released, and other jobs queue up waiting for YARN 
resources in the meantime.
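
The second point above can be illustrated with a toy model. This is only a sketch written for illustration: `ToyPool`, `PhysicalConnection`, and `PooledHandle` are hypothetical names and are not HikariCP or DolphinScheduler classes. It shows that closing a pooled handle leaves the physical session open until the pool evicts it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only (hypothetical names): why close() on a pooled handle keeps the session alive. */
final class ToyPool {
    /** Stands in for a physical HS2 session that holds Spark on YARN resources while open. */
    static final class PhysicalConnection {
        private boolean open = true;
        boolean isOpen() { return open; }
        void reallyClose() { open = false; }
    }

    /** The handle the application sees; close() hands the session back instead of ending it. */
    final class PooledHandle implements AutoCloseable {
        private final PhysicalConnection physical;
        PooledHandle(PhysicalConnection physical) { this.physical = physical; }
        PhysicalConnection physical() { return physical; }
        @Override public void close() { idle.push(physical); }
    }

    private final Deque<PhysicalConnection> idle = new ArrayDeque<>();

    /** Reuses an idle session if one exists, otherwise opens a new one. */
    PooledHandle borrow() {
        PhysicalConnection c = idle.isEmpty() ? new PhysicalConnection() : idle.pop();
        return new PooledHandle(c);
    }

    /** Models idle eviction: physically closes every session sitting in the pool. */
    int evictIdle() {
        int closed = 0;
        while (!idle.isEmpty()) {
            idle.pop().reallyClose();
            closed++;
        }
        return closed;
    }

    int idleCount() { return idle.size(); }
}
```

In this model the session stays open after `close()` and only ends when `evictIdle()` runs, which is the role the idle timeout plays in a real pool.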
   
   
   The approach in this pull request is to tune the connection pool 
parameters, in particular the minimum idle connection count and the idle 
timeout, so that idle database connections are closed sooner. For example, 
with `idleTimeout` set to 30 seconds and `minimumIdle` set to 0, all 
connections are closed about 30 seconds after the SQL job finishes, which 
releases all the Spark on YARN resources and resolves the resource leak.
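
As a rough sketch of the settings described above (not the actual change in this pull request), the two values could be expressed as HikariCP properties. `HivePoolSettings` and `build` are hypothetical names chosen for this example; the property keys `jdbcUrl`, `minimumIdle`, and `idleTimeout` are HikariCP's own configuration names.

```java
import java.util.Properties;

/** Sketch only: HivePoolSettings is a hypothetical helper, not DolphinScheduler code. */
final class HivePoolSettings {
    private HivePoolSettings() {}

    /** Builds HikariCP-style properties that let idle Hive connections be closed quickly. */
    static Properties build(String jdbcUrl) {
        Properties props = new Properties();
        props.setProperty("jdbcUrl", jdbcUrl);
        // Allow the pool to shrink to zero connections when no SQL is running.
        props.setProperty("minimumIdle", "0");
        // Retire a connection once it has been idle for 30 seconds (value in milliseconds).
        props.setProperty("idleTimeout", "30000");
        return props;
    }
}
```

With HikariCP on the classpath, these properties could be passed to `new HikariConfig(props)` and then to `new HikariDataSource(config)`. HikariCP applies `idleTimeout` only when `minimumIdle` is lower than `maximumPoolSize`, which is why the two settings are changed together.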
   
   <!--(For example: This pull request adds checkstyle plugin).-->
   
   ## Brief change log
   
   <!--*(for example:)*
   - *Add maven-checkstyle-plugin to root pom.xml*
   -->
   
   ## Verify this pull request
   
   <!--*(Please pick either of the following options)*-->
   
   This pull request is code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please 
describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   <!--*(example:)*
   - *Added dolphinscheduler-dao tests for end-to-end.*
   - *Added CronUtilsTest to verify the change.*
   - *Manually verified the change by testing locally.* -->
   
   (or)
   
   If your pull request contains incompatible changes, you should also add 
them to `docs/docs/en/guide/upgrade/incompatible.md`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
