[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent proxy server from hanging
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511652#comment-17511652 ] Akira Ajisaka commented on YARN-10720: -- Opened https://github.com/apache/hadoop/pull/4103 for branch-2.10. I'm seeing this issue in a prod cluster, so I want to backport the fix to all the release branches. > YARN WebAppProxyServlet should support connection timeout to prevent proxy > server from hanging > -- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0, 3.3.3 > > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, > YARN-10720.003.patch, YARN-10720.004.patch, YARN-10720.005.patch, > YARN-10720.006.patch, image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > Time Spent: 20m > Remaining Estimate: 0h > > Following is proxy server show, {color:#de350b}too many connections from one > client{color}, this caused the proxy server hang, and the yarn web can't jump > to web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > Following is the AM which is abnormal, but proxy server don't know it is > abnormal already, so the connections can't be closed, we should add time out > support in proxy server to prevent this. And one abnormal AM may cause > hundreds even thousands of connections, it is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After i kill the abnormal AM, the proxy server become healthy. This case > happened many times in our production clusters, our clusters are huge, and > the abnormal AM will be existed in a regular case. > > I will add timeout supported in web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent proxy server from hanging
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511610#comment-17511610 ] Akira Ajisaka commented on YARN-10720: -- When cherry-picking to branch-3.2, I had to fix some conflicts. Opened https://github.com/apache/hadoop/pull/4102 for testing. > YARN WebAppProxyServlet should support connection timeout to prevent proxy > server from hanging > -- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0, 3.3.3 > > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, > YARN-10720.003.patch, YARN-10720.004.patch, YARN-10720.005.patch, > YARN-10720.006.patch, image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > Time Spent: 10m > Remaining Estimate: 0h > > Following is proxy server show, {color:#de350b}too many connections from one > client{color}, this caused the proxy server hang, and the yarn web can't jump > to web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > Following is the AM which is abnormal, but proxy server don't know it is > abnormal already, so the connections can't be closed, we should add time out > support in proxy server to prevent this. And one abnormal AM may cause > hundreds even thousands of connections, it is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After i kill the abnormal AM, the proxy server become healthy. This case > happened many times in our production clusters, our clusters are huge, and > the abnormal AM will be existed in a regular case. > > I will add timeout supported in web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10720) YARN WebAppProxyServlet should support connection timeout to prevent proxy server from hanging
[ https://issues.apache.org/jira/browse/YARN-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312945#comment-17312945 ] Peter Bacsko commented on YARN-10720: - +1 thanks [~zhuqi] for the patch, committed to trunk. > YARN WebAppProxyServlet should support connection timeout to prevent proxy > server from hanging > -- > > Key: YARN-10720 > URL: https://issues.apache.org/jira/browse/YARN-10720 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10720.001.patch, YARN-10720.002.patch, > YARN-10720.003.patch, YARN-10720.004.patch, YARN-10720.005.patch, > YARN-10720.006.patch, image-2021-03-29-14-04-33-776.png, > image-2021-03-29-14-05-32-708.png > > > Following is proxy server show, {color:#de350b}too many connections from one > client{color}, this caused the proxy server hang, and the yarn web can't jump > to web proxy. > !image-2021-03-29-14-04-33-776.png|width=632,height=57! > Following is the AM which is abnormal, but proxy server don't know it is > abnormal already, so the connections can't be closed, we should add time out > support in proxy server to prevent this. And one abnormal AM may cause > hundreds even thousands of connections, it is very heavy. > !image-2021-03-29-14-05-32-708.png|width=669,height=101! > > After i kill the abnormal AM, the proxy server become healthy. This case > happened many times in our production clusters, our clusters are huge, and > the abnormal AM will be existed in a regular case. > > I will add timeout supported in web proxy server in this jira. > > cc [~pbacsko] [~ebadger] [~Jim_Brennan] [~ztang] [~epayne] [~gandras] > [~bteke] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org