[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232778#comment-14232778 ]

Apache Spark commented on SPARK-4694:

User 'SaintBacchus' has created a pull request for this issue:
https://github.com/apache/spark/pull/3576

Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

Key: SPARK-4694
URL: https://issues.apache.org/jira/browse/SPARK-4694
Project: Spark
Issue Type: Bug
Components: YARN
Reporter: SaintBacchus

Recently, while testing HiveThriftServer2 in YARN HA mode, I found that the driver cannot exit by itself.
To reproduce it, do as follows:
1. Use YARN HA mode and set am.maxAttemp = 1 for convenience.
2. Kill the active resource manager in the cluster.
The expected result is that the application simply fails, because maxAttemp was 1.
The actual result is that all executors ended but the driver was still there and never exited.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232724#comment-14232724 ]

SaintBacchus commented on SPARK-4694:

Thanks for the reply. [~vanzin], the cause is clear: the scheduler backend noticed that the AM had exited, so it called sc.stop to shut down the driver process, but a user thread was still alive and kept the JVM from exiting.
To fix this, use System.exit(-1) instead of sc.stop, so that the JVM does not wait for all the user threads to finish and exits cleanly.
Can I use System.exit() in Spark code?
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233321#comment-14233321 ]

Marcelo Vanzin commented on SPARK-4694:

To answer your question, you can call System.exit() if you want. It is just recommended that you do so after properly shutting down the SparkContext; otherwise YARN won't report your app status correctly.
But it seems your patch doesn't use System.exit(), so this is kind of moot.
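
For illustration only, the ordering recommended above (stop the context first, then exit) could be sketched in plain Java as follows. This is not the patch in the pull request and it does not use the Spark API: the AutoCloseable stands in for the SparkContext, and the class name StopThenExit, the method shutdown and the injected exit callback are all hypothetical names chosen for this sketch. Calling exit even when the stop step fails is a design choice of the sketch, not something stated in the thread.

```java
import java.util.function.IntConsumer;

// Hypothetical helper, not part of Spark: run the stop step, then exit.
class StopThenExit {

    // 'context' stands in for the SparkContext (close() plays the role of sc.stop()).
    // 'exit' is injected so the ordering can be checked without killing the JVM.
    static void shutdown(AutoCloseable context, int status, IntConsumer exit) {
        try {
            context.close();
        } catch (Exception e) {
            // Assumption of this sketch: a failed stop should not block the exit.
            System.err.println("stop failed: " + e);
        } finally {
            exit.accept(status);
        }
    }

    public static void main(String[] args) {
        // A lingering non-daemon thread, similar in spirit to a long-running server thread.
        Thread lingering = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // interrupted: let the thread end
            }
        }, "lingering-user-thread");
        lingering.start();

        AutoCloseable fakeContext = () -> System.out.println("context stopped");
        // System.exit terminates the JVM even though 'lingering' is still alive.
        shutdown(fakeContext, 0, System::exit);
    }
}
```

Without the final System.exit call, the main method above would return but the process would stay alive because of the lingering thread, which is the behaviour reported in this issue.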
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231440#comment-14231440 ]

SaintBacchus commented on SPARK-4694:

The reason is that YARN had reported the status to the RM, and YarnClientSchedulerBackend detects that status in the function 'asyncMonitorApplication' and stops sc.
But HiveThriftServer2 is a long-running user thread, and the JVM never exits until all the non-daemon threads have ended or System.exit() is called. That causes this problem.
The easiest way to resolve it is to use System.exit(0) instead of sc.stop in the function 'asyncMonitorApplication'.
But System.exit is not recommended in https://issues.apache.org/jira/browse/SPARK-4584
Do you have any ideas about this problem? [~vanzin]
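
The JVM rule described above (the process stays alive while any non-daemon thread is running) can be observed with a small diagnostic. The following is a minimal Java sketch, not Spark code: the class name NonDaemonThreads, its methods and the thread names are hypothetical, and the sleeping thread is only a stand-in for a long-running server thread such as HiveThriftServer2. It lists the live non-daemon threads other than the caller, which are the threads that would keep a driver process from exiting after sc.stop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic, not part of Spark.
class NonDaemonThreads {

    // Names of live non-daemon threads other than the calling thread.
    // These are the threads that keep the JVM alive after main returns.
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    // Starts a thread that sleeps until interrupted; a stand-in for a
    // long-running user thread.
    static Thread startSleeper(String name, boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // interrupted: let the thread end
            }
        }, name);
        t.setDaemon(daemon); // must be set before start()
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread user = startSleeper("fake-thrift-server", false);
        startSleeper("fake-background-daemon", true);

        // The non-daemon sleeper is listed; the daemon one is not.
        System.out.println(liveNonDaemonThreadNames());

        // Without this interrupt the process would never exit on its own.
        user.interrupt();
        user.join();
    }
}
```

The same check can be done on a running driver with a thread dump, where daemon threads are marked as such; the sketch only makes the rule concrete.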