GitHub user zsxwing opened a pull request:

    https://github.com/apache/spark/pull/22106

    [SPARK-25116][Tests]Fix the kafka cluster leak and clean up cached producers

    ## What changes were proposed in this pull request?
    
    KafkaContinuousSinkSuite leaks a Kafka cluster because both KafkaSourceTest 
and KafkaContinuousSinkSuite create a Kafka cluster but `afterAll` only shuts 
down one cluster. This leaks a Kafka cluster and causes that some Kafka thread 
crash and kill JVM when SBT is trying to clean up tests.
    
    This PR fixes the leak and also adds a shut down hook to detect Kafka 
cluster leak.
    
    In additions, it also fixes `AdminClient` leak and cleans up cached 
producers to eliminate the following annoying logs:
    ```
    8/13 15:34:42.568 kafka-admin-client-thread | adminclient-4 WARN 
NetworkClient: [AdminClient clientId=adminclient-4] Connection to node 0 could 
not be established. Broker may not be available.
    18/08/13 15:34:42.570 kafka-admin-client-thread | adminclient-6 WARN 
NetworkClient: [AdminClient clientId=adminclient-6] Connection to node 0 could 
not be established. Broker may not be available.
    18/08/13 15:34:42.606 kafka-admin-client-thread | adminclient-8 WARN 
NetworkClient: [AdminClient clientId=adminclient-8] Connection to node -1 could 
not be established. Broker may not be available.
    18/08/13 15:34:42.729 kafka-producer-network-thread | producer-797 WARN 
NetworkClient: [Producer clientId=producer-797] Connection to node -1 could not 
be established. Broker may not be available.
    18/08/13 15:34:42.906 kafka-producer-network-thread | producer-1598 WARN 
NetworkClient: [Producer clientId=producer-1598] Connection to node 0 could not 
be established. Broker may not be available.
    ```
    
    ## How was this patch tested?
    
    Jenkins


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-25116

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22106
    
----
commit 6a0ec9c53e2949de724c3596171b59aae94523c1
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-14T17:59:48Z

    fix kafka cluster leak

commit 59d10e2dc82bdbec1b0da19aaf3eff060db82804
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-14T18:00:35Z

    Revert "don't kill JVM during termination"
    
    This reverts commit b5eb54244ed573c8046f5abf7bf087f5f08dba58.

commit e13b21a3d16c22ba069ef67f9a263af18f238691
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-14T18:00:42Z

    Merge branch 'master' into SPARK-25116

commit b561528196a4a6d3d9e0bb951358fc5f288fdb3d
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-14T18:03:07Z

    update

commit 574f5fa3fc6e7313774b1408eceb4543d4422ba0
Author: Shixiong Zhu <zsxwing@...>
Date:   2018-08-14T18:11:34Z

    update

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to