[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10725:
-----------------------------
    Attachment: YARN-10725-branch-3.3.v2.patch

> Backport YARN-10120 to branch-3.3
> ---------------------------------
>
>                 Key: YARN-10725
>                 URL: https://issues.apache.org/jira/browse/YARN-10725
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10120-branch-3.3.patch, YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, image-2021-04-05-16-48-57-034.png, image-2021-04-05-16-50-55-238.png
>
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10725:
-----------------------------
    Attachment: YARN-10725-branch-3.3.patch

> Backport YARN-10120 to branch-3.3
> ---------------------------------
>
>                 Key: YARN-10725
>                 URL: https://issues.apache.org/jira/browse/YARN-10725
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10120-branch-3.3.patch, YARN-10725-branch-3.3.patch
>
[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314662#comment-17314662 ]

Bilwa S T commented on YARN-10725:
----------------------------------

Hi [~brahmareddy], as discussed I have attached a patch to backport this to branch-3.3. Please take a look. Thanks.

> Backport YARN-10120 to branch-3.3
> ---------------------------------
>
>                 Key: YARN-10725
>                 URL: https://issues.apache.org/jira/browse/YARN-10725
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10120-branch-3.3.patch
>
[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3
[ https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10725:
-----------------------------
    Attachment: YARN-10120-branch-3.3.patch

> Backport YARN-10120 to branch-3.3
> ---------------------------------
>
>                 Key: YARN-10725
>                 URL: https://issues.apache.org/jira/browse/YARN-10725
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10120-branch-3.3.patch
>
[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312088#comment-17312088 ]

Bilwa S T commented on YARN-10120:
----------------------------------

[~brahmareddy] I have raised YARN-10725 to backport this to branch-3.3.

> In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-10120
>                 URL: https://issues.apache.org/jira/browse/YARN-10120
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation
>            Reporter: Sushanta Sen
>            Assignee: Bilwa S T
>            Priority: Critical
>             Fix For: 3.4.0
>
>         Attachments: YARN-10120-YARN-7402.patch, YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, YARN-10120.001.patch, YARN-10120.002.patch
>
> In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled.
> yarn.router.webapp.https.address = <router ip>:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
[jira] [Created] (YARN-10725) Backport YARN-10120 to branch-3.3
Bilwa S T created YARN-10725:
---------------------------------

             Summary: Backport YARN-10120 to branch-3.3
                 Key: YARN-10725
                 URL: https://issues.apache.org/jira/browse/YARN-10725
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bilwa S T
            Assignee: Bilwa S T
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309378#comment-17309378 ]

Bilwa S T commented on YARN-9606:
---------------------------------

[~brahmareddy] This can be backported once YARN-10120 is merged to branch-3.3.

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9606
>                 URL: https://issues.apache.org/jira/browse/YARN-9606
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-9606-001.patch, YARN-9606-002.patch, YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
> Yarn logs fails for running containers
>
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}
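[Editor's note] For readers following this backport thread, the gist of the YARN-9606 change is to hand the log CLI's Jersey client an HttpURLConnection factory that goes through AuthenticatedURL with an SSL-aware ConnectionConfigurator. Below is a minimal sketch under that assumption, using Hadoop's SSLFactory and Jersey 1.x's URLConnectionClientHandler; class and helper names are illustrative, not the literal patch.

{code:java}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

import com.sun.jersey.api.client.Client;
import com.sun.jersey.client.urlconnection.HttpURLConnectionFactory;
import com.sun.jersey.client.urlconnection.URLConnectionClientHandler;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;
import org.apache.hadoop.security.authentication.client.AuthenticationException;
import org.apache.hadoop.security.authentication.client.ConnectionConfigurator;
import org.apache.hadoop.security.authentication.client.KerberosAuthenticator;
import org.apache.hadoop.security.ssl.SSLFactory;

public final class SslAwareWebClient {  // illustrative class name

  /** Build a Jersey client whose connections carry the cluster's SSL settings. */
  public static Client create(Configuration conf) throws Exception {
    final SSLFactory sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();

    // Configure every HTTPS connection with the truststore-backed socket
    // factory and hostname verifier that SSLFactory derives from the conf.
    final ConnectionConfigurator connConfigurator = conn -> {
      if (conn instanceof HttpsURLConnection) {
        HttpsURLConnection httpsConn = (HttpsURLConnection) conn;
        try {
          httpsConn.setSSLSocketFactory(sslFactory.createSSLSocketFactory());
        } catch (Exception e) {
          throw new IOException("Cannot create SSL socket factory", e);
        }
        httpsConn.setHostnameVerifier(sslFactory.getHostnameVerifier());
      }
      return conn;
    };

    // Open connections through AuthenticatedURL so SPNEGO authentication and
    // the SSL configuration are applied together on each request.
    HttpURLConnectionFactory urlFactory = new HttpURLConnectionFactory() {
      @Override
      public HttpURLConnection getHttpURLConnection(URL url) throws IOException {
        try {
          return new AuthenticatedURL(new KerberosAuthenticator(),
              connConfigurator).openConnection(url, new AuthenticatedURL.Token());
        } catch (AuthenticationException e) {
          throw new IOException(e);
        }
      }
    };
    return new Client(new URLConnectionClientHandler(urlFactory));
  }
}
{code}

Without an SSL-aware configurator the client falls back to the JVM default truststore, which is what produces the SSLHandshakeException quoted in the issue.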
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306824#comment-17306824 ]

Bilwa S T commented on YARN-10697:
----------------------------------

[~Jim_Brennan] I have changed the method name. Please check the updated patch. Thanks.

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, YARN-10697.002.patch, YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: YARN-10697.003.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, YARN-10697.002.patch, YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305380#comment-17305380 ]

Bilwa S T commented on YARN-10697:
----------------------------------

Hi [~Jim_Brennan], I have attached the .002 patch with the latest changes. Please review. Thanks.

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, YARN-10697.002.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: YARN-10697.002.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, YARN-10697.002.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: (was: YARN-10697.002.patch)

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: YARN-10697.002.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, YARN-10697.002.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304629#comment-17304629 ]

Bilwa S T commented on YARN-10697:
----------------------------------

Thanks [~Jim_Brennan] [~jhung] for your comments. I had added the change in Resource#toString so that it is easier for the user to read. I agree it is not correct to add it there, as toString is called from many other places. So can we introduce a new method in Resource.java which can print it in MB|GB|TB?

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
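[Editor's note] A minimal sketch of what such a helper could look like; the method name, holder class, and thresholds are hypothetical, not part of any committed patch:

{code:java}
public final class ResourceFormatSketch {  // illustrative holder class

  /**
   * Hypothetical helper for YARN-10697: render a memory value (in MB, the
   * unit Resource stores) using the largest unit that keeps it readable.
   */
  public static String getFormattedMemory(long memoryMB) {
    final long mbPerGB = 1024L;
    final long mbPerTB = 1024L * 1024L;
    if (memoryMB >= mbPerTB) {
      return String.format("%.2f TB", (double) memoryMB / mbPerTB);
    }
    if (memoryMB >= mbPerGB) {
      return String.format("%.2f GB", (double) memoryMB / mbPerGB);
    }
    return memoryMB + " MB";
  }
}
{code}

For example, getFormattedMemory(5120) would render as "5.00 GB" instead of a raw byte or MB count in the UI.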
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303123#comment-17303123 ]

Bilwa S T commented on YARN-10697:
----------------------------------

[~epayne] [~jbrennan] can you please take a look at this?

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: YARN-10697.001.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303120#comment-17303120 ]

Bilwa S T commented on YARN-10697:
----------------------------------

In YARN-10251 the multiplication by BYTES_IN_MB was removed in the if branch, whereas in the else branch it was missed.

!image-2021-03-17-11-30-57-216.png!

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
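[Editor's note] To make the mismatch concrete, here is a hedged before/after sketch of the kind of fix the comment above describes; the variable names and wrapper class are illustrative, and only the discussion above confirms that the else branch still multiplied by BYTES_IN_MB:

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;

public final class MetricsOverviewFixSketch {  // illustrative only
  private static final long BYTES_IN_MB = 1024L * 1024L;

  static Resource usedResource(long usedMemoryMB, int usedVCores) {
    // Before (as described above): the else branch still scaled the MB value
    // into bytes, so Resource.newInstance - which expects MB - received bytes:
    //   Resource.newInstance(usedMemoryMB * BYTES_IN_MB, usedVCores);
    // After: pass MB straight through, matching the if branch already fixed
    // in YARN-10251 and Resource.newInstance's contract.
    return Resource.newInstance(usedMemoryMB, usedVCores);
  }
}
{code}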
[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10697:
-----------------------------
    Attachment: image-2021-03-17-11-30-57-216.png

> Resources are displayed in bytes in UI for schedulers other than capacity
> --------------------------------------------------------------------------
>
>                 Key: YARN-10697
>                 URL: https://issues.apache.org/jira/browse/YARN-10697
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: image-2021-03-17-11-30-57-216.png
>
> Resources.newInstance expects MB as memory whereas MetricsOverviewTable passes resources in bytes. Also we should display memory in GB for better readability for user.
[jira] [Created] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity
Bilwa S T created YARN-10697:
---------------------------------

             Summary: Resources are displayed in bytes in UI for schedulers other than capacity
                 Key: YARN-10697
                 URL: https://issues.apache.org/jira/browse/YARN-10697
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bilwa S T
            Assignee: Bilwa S T


Resources.newInstance expects memory in MB, whereas MetricsOverviewTable passes resources in bytes. Also, we should display memory in GB for better readability for the user.
[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10691:
-----------------------------
    Summary: DominantResourceCalculator isInvalidDivisor should consider only countable resource types  (was: DominantResourceCalculator divide and ratio methods should consider only countable resource types)

> DominantResourceCalculator isInvalidDivisor should consider only countable resource types
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-10691
>                 URL: https://issues.apache.org/jira/browse/YARN-10691
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301434#comment-17301434 ]

Bilwa S T commented on YARN-10588:
----------------------------------

Thanks [~Jim_Brennan] and [~epayne] for the review comments. I have raised YARN-10691 to handle the above issue. I think this one can be merged.

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch, YARN-10588.004.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Created] (YARN-10691) DominantResourceCalculator divide and ratio methods should consider only countable resource types
Bilwa S T created YARN-10691:
---------------------------------

             Summary: DominantResourceCalculator divide and ratio methods should consider only countable resource types
                 Key: YARN-10691
                 URL: https://issues.apache.org/jira/browse/YARN-10691
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bilwa S T
            Assignee: Bilwa S T
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298009#comment-17298009 ]

Bilwa S T commented on YARN-10588:
----------------------------------

Hi [~Jim_Brennan], can you please take a look at this Jira when you get time? Thanks.

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch, YARN-10588.004.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297905#comment-17297905 ]

Bilwa S T commented on YARN-10120:
----------------------------------

Hi [~brahmareddy], it looks like this didn't get merged to branch-3.3. Can you please backport it? Thanks.

> In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-10120
>                 URL: https://issues.apache.org/jira/browse/YARN-10120
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation
>            Reporter: Sushanta Sen
>            Assignee: Bilwa S T
>            Priority: Critical
>             Fix For: 3.3.0, 3.4.0
>
>         Attachments: YARN-10120-YARN-7402.patch, YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, YARN-10120.001.patch, YARN-10120.002.patch
>
> In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled.
> yarn.router.webapp.https.address = <router ip>:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
[jira] [Assigned] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room fo
[ https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T reassigned YARN-10670:
--------------------------------
    Assignee: Bilwa S T

> YARN: Opportunistic Container: In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for guaranteed containers, it is not correct to fail the application
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10670
>                 URL: https://issues.apache.org/jira/browse/YARN-10670
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: distributed-shell
>    Affects Versions: 3.1.1
>            Reporter: Sushanta Sen
>            Assignee: Bilwa S T
>            Priority: Major
>
> Preconditions:
> # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
> # Set the below parameter in RM:
> <property>
>   <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
>   <value>true</value>
> </property>
> # Set this in NM[s]:
> <property>
>   <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
>   <value>30</value>
> </property>
>
> Test Steps:
> Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
> Expected Result: Distributed Shell Yarn Job should not fail.
[jira] [Created] (YARN-10668) [DS] Disable distributed scheduling when client doesn't configure scheduler address as amrmproxy address
Bilwa S T created YARN-10668:
---------------------------------

             Summary: [DS] Disable distributed scheduling when client doesn't configure scheduler address as amrmproxy address
                 Key: YARN-10668
                 URL: https://issues.apache.org/jira/browse/YARN-10668
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bilwa S T
            Assignee: Bilwa S T


In a distributed scheduling setup, if a client submits an application with a normal client configuration, i.e. the scheduler address is not the same as the AMRMProxy address, then the application fails with an Invalid AMRMToken. So I think distributed scheduling should be disabled in that case and the job should be executed with opportunistic containers enabled.
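[Editor's note] For context, in a distributed-scheduling setup the client-side scheduler address is expected to point at the NodeManager's AMRMProxy rather than at the ResourceManager. A hedged sketch of that client configuration follows; the host is a placeholder and 8049 is assumed to be the AMRMProxy's default port:

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public final class DistributedSchedulingClientConf {  // illustrative only
  public static YarnConfiguration create() {
    YarnConfiguration conf = new YarnConfiguration();
    // Route the AM-RM protocol through the local AMRMProxy. If this stays at
    // the RM's real scheduler address while distributed scheduling is on, the
    // AM presents a token the RM cannot validate ("Invalid AMRMToken"), which
    // is the failure mode this issue describes.
    conf.set(YarnConfiguration.RM_SCHEDULER_ADDRESS, "nm-host:8049");  // placeholder host
    return conf;
  }
}
{code}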
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294998#comment-17294998 ]

Bilwa S T commented on YARN-10588:
----------------------------------

[~epayne] I have updated the patch. Please take a look at it. Thanks.

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch, YARN-10588.004.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-10588:
-----------------------------
    Attachment: YARN-10588.004.patch

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch, YARN-10588.004.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Assigned] (YARN-10667) The current logic only sets the subdirectory of nm-aux-services to 700, but does not set nm-aux-services dir.
[ https://issues.apache.org/jira/browse/YARN-10667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T reassigned YARN-10667:
--------------------------------
    Assignee: Bilwa S T

> The current logic only sets the subdirectory of nm-aux-services to 700, but does not set nm-aux-services dir.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10667
>                 URL: https://issues.apache.org/jira/browse/YARN-10667
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Sushanta Sen
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: Permission 755.PNG
>
> The current code logic only sets the subdirectories of nm-aux-services to 700, but does not set the nm-aux-services dir itself.
> As a result, the permissions of some files and directories on the YARN deployment node are 755.
> !Permission 755.PNG!
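[Editor's note] The described gap is that only the per-service subdirectories get the restrictive mode. A hedged sketch of also tightening the parent directory; the wrapper class, method, and the use of FileContext here are illustrative assumptions, not the committed fix:

{code:java}
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

final class AuxDirPermissionSketch {  // illustrative only
  static void tighten(FileContext lfs, Path nmAuxServicesDir) throws Exception {
    // Apply 700 to the nm-aux-services directory itself, not only to the
    // per-service subdirectories created beneath it.
    lfs.setPermission(nmAuxServicesDir, new FsPermission((short) 0700));
  }
}
{code}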
[jira] [Commented] (YARN-9017) PlacementRule order is not maintained in CS
[ https://issues.apache.org/jira/browse/YARN-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286510#comment-17286510 ]

Bilwa S T commented on YARN-9017:
---------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> PlacementRule order is not maintained in CS
> --------------------------------------------
>
>                 Key: YARN-9017
>                 URL: https://issues.apache.org/jira/browse/YARN-9017
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.4.0
>            Reporter: Bibin Chundatt
>            Assignee: Bilwa S T
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-9017.001.patch, YARN-9017.002.patch, YARN-9017.003.patch
>
> {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity Scheduler
> {quote}
> * **Queue Mapping Interface based on Default or User Defined Placement Rules** - This feature allows users to map a job to a specific queue based on some default placement rule. For instance based on user & group, or application name. User can also define their own placement rule.
> {quote}
> As per the current code, UserGroupMapping is always added in placementRule ({{CapacityScheduler#updatePlacementRules}}):
> {code}
> // Initialize placement rules
> Collection<String> placementRuleStrs = conf.getStringCollection(
>     YarnConfiguration.QUEUE_PLACEMENT_RULES);
> List<PlacementRule> placementRules = new ArrayList<>();
> ...
> // add UserGroupMappingPlacementRule if absent
> distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
> {code}
> PlacementRule configuration order is not maintained
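[Editor's note] The description boils down to order preservation: the configured rule order must survive, with UserGroupMappingPlacementRule appended only when absent. A hedged sketch of that idea using a LinkedHashSet; names loosely follow the quoted snippet and this is not the committed patch:

{code:java}
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PlacementRuleOrderSketch {  // illustrative only
  static List<String> resolveRuleNames(Collection<String> configuredRules,
      String userGroupRuleName) {
    // LinkedHashSet preserves the order the rules were configured in, which
    // an unordered set (the behavior described above) would not.
    Set<String> ordered = new LinkedHashSet<>(configuredRules);
    ordered.add(userGroupRuleName);  // no-op if the rule is already present
    return new ArrayList<>(ordered);
  }
}
{code}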
[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
[ https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286507#comment-17286507 ]

Bilwa S T commented on YARN-9606:
---------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9606
>                 URL: https://issues.apache.org/jira/browse/YARN-9606
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-9606-001.patch, YARN-9606-002.patch, YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
> Yarn logs fails for running containers
>
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>   at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}
[jira] [Commented] (YARN-9301) Too many InvalidStateTransitionException with SLS
[ https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286509#comment-17286509 ]

Bilwa S T commented on YARN-9301:
---------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> Too many InvalidStateTransitionException with SLS
> --------------------------------------------------
>
>                 Key: YARN-9301
>                 URL: https://issues.apache.org/jira/browse/YARN-9301
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin Chundatt
>            Assignee: Bilwa S T
>            Priority: Major
>              Labels: simulator
>             Fix For: 3.4.0
>
>         Attachments: YARN-9301-001.patch, YARN-9301.002.patch
>
> Too many InvalidStateTransitionException occurrences
> {noformat}
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: LAUNCHED at RUNNING
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483)
>   at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752)
>   at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
>   at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
>   at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>   at java.lang.Thread.run(Thread.java:745)
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED on container container_1550059705491_0067_01_01
> {noformat}
[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value
[ https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286506#comment-17286506 ]

Bilwa S T commented on YARN-8942:
---------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-8942
>                 URL: https://issues.apache.org/jira/browse/YARN-8942
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.3.0
>            Reporter: Akshay Agarwal
>            Assignee: Bilwa S T
>            Priority: Minor
>             Fix For: 3.4.0
>
>         Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
> In *PriorityBasedRouterPolicy* if all sub-cluster weights are *set to negative values* it is throwing exception while running a job.
> Ideally it should handle the negative priority as well according to the home sub cluster selection process of the policy.
> *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable to insert the ApplicationId application_1540356760422_0015 into the FederationStateStore
>   at org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
>   at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
>   at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
>   at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
>   at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException: Missing SubCluster Id information. Please try again by specifying Subcluster Id information.
>   at org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
>   at org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
>   at org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
>   at org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
>   at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
>   at org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
>   at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
>   ... 11 more
> {code}
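[Editor's note] For readers skimming the backport list, the gist of the YARN-8942 fix is to make the highest-priority selection tolerate all-negative weights. A minimal sketch of such a selection loop under assumed names (the weights map and the federation SubClusterIdInfo/SubClusterId types); this is not the literal patch:

{code:java}
import java.util.Map;

import org.apache.hadoop.yarn.server.federation.policies.dao.SubClusterIdInfo;
import org.apache.hadoop.yarn.server.federation.store.records.SubClusterId;

public final class PrioritySelectionSketch {  // illustrative only

  /**
   * Pick the sub-cluster with the highest priority weight. Seeding the
   * running maximum with -Infinity (instead of 0) keeps a home sub-cluster
   * selectable even when every configured weight is negative.
   */
  static SubClusterId selectHomeSubCluster(Map<SubClusterIdInfo, Float> weights) {
    SubClusterId chosen = null;
    float best = Float.NEGATIVE_INFINITY;
    for (Map.Entry<SubClusterIdInfo, Float> entry : weights.entrySet()) {
      if (entry.getValue() != null && entry.getValue() > best) {
        best = entry.getValue();
        chosen = entry.getKey().toId();
      }
    }
    return chosen;  // null only if the weights map is empty
  }
}
{code}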
[jira] [Commented] (YARN-10359) Log container report only if list is not empty
[ https://issues.apache.org/jira/browse/YARN-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286504#comment-17286504 ]

Bilwa S T commented on YARN-10359:
----------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> Log container report only if list is not empty
> -----------------------------------------------
>
>                 Key: YARN-10359
>                 URL: https://issues.apache.org/jira/browse/YARN-10359
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Minor
>             Fix For: 3.4.0
>
>         Attachments: YARN-10359.001.patch, YARN-10359.002.patch
>
> In NodeStatusUpdaterImpl print log only if containerReports list is not empty
> {code:java}
> if (containerReports != null) {
>   LOG.info("Registering with RM using containers :" + containerReports);
> }
> {code}
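[Editor's note] The change amounts to adding an emptiness guard to the snippet quoted above. A sketch of the guarded logging, wrapped in an illustrative class (the SLF4J logger is an assumption about the surrounding code):

{code:java}
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class ContainerReportLogging {  // illustrative wrapper class
  private static final Logger LOG =
      LoggerFactory.getLogger(ContainerReportLogging.class);

  static void logContainerReports(List<?> containerReports) {
    // Log only when there is something to report; the unguarded version
    // printed a noisy "Registering with RM using containers :[]" line.
    if (containerReports != null && !containerReports.isEmpty()) {
      LOG.info("Registering with RM using containers: {}", containerReports);
    }
  }
}
{code}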
[jira] [Commented] (YARN-10364) Absolute Resource [memory=0] is considered as Percentage config type
[ https://issues.apache.org/jira/browse/YARN-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286503#comment-17286503 ]

Bilwa S T commented on YARN-10364:
----------------------------------

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks.

> Absolute Resource [memory=0] is considered as Percentage config type
> ---------------------------------------------------------------------
>
>                 Key: YARN-10364
>                 URL: https://issues.apache.org/jira/browse/YARN-10364
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.4.0
>            Reporter: Prabhu Joseph
>            Assignee: Bilwa S T
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: YARN-10364.001.patch, YARN-10364.002.patch, YARN-10364.003.patch
>
> Absolute Resource [memory=0] is considered as Percentage config type. This causes failure while converting queues from Percentage to Absolute Resources automatically.
> *Repro:*
> 1. Queue A = 100% and child queues Queue A.B = 0%, A.C=100%
> 2. While converting above to absolute resource automatically, capacity of queue A = [memory=], A.B = [memory=0]
> This fails with below as A is considered as Absolute Resource whereas B is considered as Percentage config type.
> {code}
> 2020-07-23 09:36:40,499 WARN org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: CapacityScheduler configuration validation failed: java.io.IOException: Failed to re-init queues : Parent queue 'root.A' and child queue 'root.A.B' should use either percentage based capacity configuration or absolute resource together for label:
> {code}
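[Editor's note] To illustrate the repro, a hedged sketch of the two capacity styles side by side on a plain Configuration; the queue names follow the description, the memory/vcores values are placeholders:

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class MixedCapacityRepro {  // illustrative only
  public static Configuration build() {
    Configuration conf = new Configuration();
    // Absolute-resource style on the parent queue...
    conf.set("yarn.scheduler.capacity.root.A.capacity", "[memory=10240,vcores=10]");
    // ...but, as described above, "[memory=0]" on the child was mis-detected
    // as the *percentage* style before the fix, triggering "should use either
    // percentage based capacity configuration or absolute resource together".
    conf.set("yarn.scheduler.capacity.root.A.B.capacity", "[memory=0]");
    return conf;
  }
}
{code}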
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286446#comment-17286446 ]

Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM:
-------------------------------------------------------------

[~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think?
{quote}
Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero.
{quote}

was (Author: bilwast):
[~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think?
{quote}
Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero.
{quote}

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286446#comment-17286446 ]

Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM:
-------------------------------------------------------------

[~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do you think?
{quote}
Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero.
{quote}

was (Author: bilwast):
[~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to ** *DominantResourceCalculator#isAllInvalidDivisor* would ** solve problem. What do you think?**
{quote}
Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero.
{quote}

> Percentage of queue and cluster is zero in WebUI
> -------------------------------------------------
>
>                 Key: YARN-10588
>                 URL: https://issues.apache.org/jira/browse/YARN-10588
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: YARN-10588.001.patch, YARN-10588.002.patch, YARN-10588.003.patch
>
> Steps to reproduce:
> Configure below property in resource-types.xml
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job
> In UI you can see % Of Queue and % Of Cluster is zero for the submitted application
>
> This is because SchedulerApplicationAttempt has below check for calculating queueUsagePerc and clusterUsagePerc
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true as gpu resource is 0
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286446#comment-17286446 ] Bilwa S T commented on YARN-10588: -- [~Jim_Brennan] [~epayne] Changing *DominantResourceCalculator#isInvalidDivisor* to *DominantResourceCalculator#isAllInvalidDivisor* would solve the problem. What do you think? {quote} Currently it returns true if any resource is zero, while {{divide}} is only going to return zero if all of the countable ones are zero. {quote} > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
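For context, the distinction being discussed above, as a simplified sketch (illustrative only, not the exact DominantResourceCalculator source):

{code:java}
// isInvalidDivisor returns true if ANY resource type is zero -- too strict
// once an optional type like yarn.io/gpu is configured but has no capacity.
public boolean isInvalidDivisor(Resource r) {
  for (ResourceInformation res : r.getResources()) {
    if (res.getValue() == 0L) {
      return true;
    }
  }
  return false;
}

// isAllInvalidDivisor returns true only if ALL resource types are zero,
// which matches what divide() actually needs to guard against.
public boolean isAllInvalidDivisor(Resource r) {
  for (ResourceInformation res : r.getResources()) {
    if (res.getValue() != 0L) {
      return false;
    }
  }
  return true;
}
{code}

With the stricter predicate, a cluster that has memory and vcores but zero GPUs is treated as an invalid divisor, which is exactly why the percentages render as zero.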
[jira] [Assigned] (YARN-10634) The config parameter "mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting Opportunistic containers in YARN job
[ https://issues.apache.org/jira/browse/YARN-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10634: Assignee: Bilwa S T > The config parameter "mapreduce.job.num-opportunistic-maps-percent" is > confusing when requesting Opportunistic containers in YARN job > - > > Key: YARN-10634 > URL: https://issues.apache.org/jira/browse/YARN-10634 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Minor > > Execute the below job, passing the config > -Dmapreduce.job.num-opportunistic-maps-percent, which actually represents the > number of containers to be launched as Opportunistic, not a % of the total > mappers requested. I think this configuration name should be modified > accordingly, and also {color:#de350b}the same gets printed in the AM logs{color} > Job Command: hadoop jar > HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar > pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" > 20 99 > In the AM logs this message is displayed. It should be {color:#de350b}20, not > 20%{color}? > “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the > mappers{color} will be scheduled using OPPORTUNISTIC containers | > RMContainerAllocator.java:257” > Job Command: hadoop jar > HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar > pi > {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 20 > 99 > In the AM logs this message is displayed. It should be {color:#de350b}100, not > 100%{color}? > 2021-02-10 20:28:16,016 | INFO | main | {color:#de350b}100% of the > mappers{color} will be scheduled using OPPORTUNISTIC containers | > RMContainerAllocator.java:257 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8047) RMWebApp make external class pluggable
[ https://issues.apache.org/jira/browse/YARN-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286357#comment-17286357 ] Bilwa S T commented on YARN-8047: - Hi [~brahma] can you please cherry-pick this Jira to 3.3.1? Thanks > RMWebApp make external class pluggable > -- > > Key: YARN-8047 > URL: https://issues.apache.org/jira/browse/YARN-8047 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin Chundatt >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.4.0 > > Attachments: YARN-8047-001.patch, YARN-8047-002.patch, > YARN-8047-003.patch, YARN-8047.004.patch, YARN-8047.005.patch, > YARN-8047.006.patch > > > This Jira should make sure we are able to plug in the web services and web pages > of the scheduler in the ResourceManager: > * RMWebApp should allow binding external classes > * RMController should allow plugging in scheduler classes -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285697#comment-17285697 ] Bilwa S T commented on YARN-10258: -- Thank you [~gb.ana...@gmail.com] for your contribution. The patch LGTM; there are a few checkstyle issues, please fix. Resubmitting the patch to trigger the build again > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch, YARN-10258-002.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
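A rough sketch of the kind of change under review here, assuming the usual NodeManagerMetrics gauge pattern (this is an illustration, not the attached patch):

{code:java}
// Gauge in NodeManagerMetrics, incremented when an application starts
// on this NodeManager and decremented when it completes.
@Metric("# of running applications") MutableGaugeInt applicationsRunning;

public void runningApplication() {
  applicationsRunning.incr();
}

public void endRunningApplication() {
  applicationsRunning.decr();
}
{code}

The container lifecycle already drives equivalent gauges (e.g. containers running), so the application-level gauge would be driven from the application start/finish transitions in the same way.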
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Attachment: YARN-10258-002.patch > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch, YARN-10258-002.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Target Version/s: (was: 3.1.3) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Fix Version/s: (was: 3.1.3) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10258: - Comment: was deleted (was: Thank you [~gb.ana...@gmail.com] for working on this. Looks like there are some checkstyle issues; other than that, the patch LGTM) > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Fix For: 3.1.3 > > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285691#comment-17285691 ] Bilwa S T commented on YARN-10258: -- Thank you [~gb.ana...@gmail.com] for working on this. Looks like there are some checkstyle issues; other than that, the patch LGTM > Add metrics for 'ApplicationsRunning' in NodeManager > > > Key: YARN-10258 > URL: https://issues.apache.org/jira/browse/YARN-10258 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.1.3 >Reporter: ANANDA G B >Assignee: ANANDA G B >Priority: Minor > Fix For: 3.1.3 > > Attachments: YARN-10258-001.patch > > > Add metrics for 'ApplicationsRunning' in NodeManagers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282988#comment-17282988 ] Bilwa S T edited comment on YARN-10588 at 2/12/21, 8:23 AM: [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of *DominantResourceCalculator#divide* amounts to returning true only if all resource values are *0*. We already have a method called *DominantResourceCalculator#isAllInvalidDivisor* which returns true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor. Correct me if I am wrong was (Author: bilwast): [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of *DominantResourceCalculator#divide* amounts to returning true only if all resource values are *0*. We already have a method called *DominantResourceCalculator#isAllInvalidDivisor* which returns true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor. > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282988#comment-17282988 ] Bilwa S T commented on YARN-10588: -- [~epayne] Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of *DominantResourceCalculator#divide* amounts to returning true only if all resource values are *0*. We already have a method called *DominantResourceCalculator#isAllInvalidDivisor* which returns true only if all resources are *zero*. I think we can just change isInvalidDivisor to isAllInvalidDivisor. > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-9927: --- Assignee: Bilwa S T > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Assignee: Bilwa S T >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we have observed serious event blocking in the RM event dispatcher > queue. After analysis of RM event monitoring data and RM event processing > logic, we found that > 1) environment: a cluster with thousands of nodes > 2) RMNodeStatusEvent dominates 90% of the time consumption of the RM event scheduler > 3) Meanwhile, RM event processing is in single-thread mode, and it results > in low headroom of the RM event scheduler and thus poor RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
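A minimal sketch of the sharding idea behind such a mechanism (an assumption for illustration; the attached design doc and patch define the real approach). Routing by node id keeps per-node event ordering while removing the single-threaded bottleneck; the handle() call stands in for the existing single-threaded handler logic:

{code:java}
// N worker threads, each draining its own queue; events for the same
// node always hash to the same shard, so per-node ordering is preserved.
List<BlockingQueue<RMNodeEvent>> shards = new ArrayList<>();
for (int i = 0; i < numThreads; i++) {
  BlockingQueue<RMNodeEvent> queue = new LinkedBlockingQueue<>();
  shards.add(queue);
  Thread worker = new Thread(() -> {
    try {
      while (true) {
        handle(queue.take()); // existing event handler, unchanged
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  });
  worker.setDaemon(true);
  worker.start();
}

// Producer side: route each event by its node id.
int shard = (event.getNodeId().hashCode() & Integer.MAX_VALUE) % numThreads;
shards.get(shard).offer(event); // unbounded queue, offer never fails
{code}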
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282328#comment-17282328 ] Bilwa S T commented on YARN-10588: -- Hi [~epayne] I added the change in FiCaSchedulerApp.java as the same issue can occur there, i.e. the cluster and queue usage will not be calculated if one of the resources is zero. I added the instanceof check because that method is applicable only for the CapacityScheduler. Many test cases were failing once I removed the DominantResourceCalculator.isInvalidDivisor() check, as the test cases had configured the FifoScheduler. > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282328#comment-17282328 ] Bilwa S T edited comment on YARN-10588 at 2/10/21, 9:19 AM: Thanks [~epayne] [~Jim_Brennan] for taking a look at this issue. I added the change in FiCaSchedulerApp.java as the same issue can occur there, i.e. the cluster and queue usage will not be calculated if one of the resources is zero. I added the instanceof check because that method is applicable only for the CapacityScheduler. Many test cases were failing once I removed the DominantResourceCalculator.isInvalidDivisor() check, as the test cases had configured the FifoScheduler. was (Author: bilwast): Hi [~epayne] I added the change in FiCaSchedulerApp.java as the same issue can occur there, i.e. the cluster and queue usage will not be calculated if one of the resources is zero. I added the instanceof check because that method is applicable only for the CapacityScheduler. Many test cases were failing once I removed the DominantResourceCalculator.isInvalidDivisor() check, as the test cases had configured the FifoScheduler. > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278786#comment-17278786 ] Bilwa S T commented on YARN-10588: -- cc [~epayne] [~Jim_Brennan] > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277110#comment-17277110 ] Bilwa S T commented on YARN-10588: -- Fixed all UTs in the .003 patch > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10588: - Attachment: YARN-10588.003.patch > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch, > YARN-10588.003.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10588: - Attachment: YARN-10588.002.patch > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch, YARN-10588.002.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10588: - Attachment: YARN-10588.001.patch > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10588.001.patch > > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI
[ https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17269264#comment-17269264 ] Bilwa S T commented on YARN-10588: -- I think the calc.isInvalidDivisor(cluster) check can be removed, as YARN-9785 handles the divide function when a resource is zero > Percentage of queue and cluster is zero in WebUI > - > > Key: YARN-10588 > URL: https://issues.apache.org/jira/browse/YARN-10588 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > Steps to reproduce: > Configure the below property in resource-types.xml > {code:xml} > <property> > <name>yarn.resource-types</name> > <value>yarn.io/gpu</value> > </property> > {code} > Submit a job > In the UI you can see % Of Queue and % Of Cluster is zero for the submitted > application > > This is because SchedulerApplicationAttempt has the below check for > calculating queueUsagePerc and clusterUsagePerc > {code:java} > if (!calc.isInvalidDivisor(cluster)) { > float queueCapacityPerc = queue.getQueueInfo(false, false) > .getCapacity(); > queueUsagePerc = calc.divide(cluster, usedResourceClone, > Resources.multiply(cluster, queueCapacityPerc)) * 100; > if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { > queueUsagePerc = 0.0f; > } > clusterUsagePerc = > calc.divide(cluster, usedResourceClone, cluster) * 100; > } > {code} > calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
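What "YARN-9785 handles the divide function when a resource is zero" means here, as an illustrative sketch (not the exact YARN-9785 patch): the dominant-share computation can simply skip resource types with zero capacity, which makes the up-front guard unnecessary.

{code:java}
// Compute the dominant share, ignoring resource types with no capacity
// (e.g. yarn.io/gpu configured but no GPUs present in the cluster).
float share = 0.0f;
for (int i = 0; i < clusterRes.getResources().length; i++) {
  long capacity = clusterRes.getResourceInformation(i).getValue();
  if (capacity == 0L) {
    continue; // skip instead of producing NaN/Infinity
  }
  long used = usedRes.getResourceInformation(i).getValue();
  share = Math.max(share, (float) used / capacity);
}
{code}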
[jira] [Created] (YARN-10588) Percentage of queue and cluster is zero in WebUI
Bilwa S T created YARN-10588: Summary: Percentage of queue and cluster is zero in WebUI Key: YARN-10588 URL: https://issues.apache.org/jira/browse/YARN-10588 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T Steps to reproduce: Configure the below property in resource-types.xml {code:xml} <property> <name>yarn.resource-types</name> <value>yarn.io/gpu</value> </property> {code} Submit a job In the UI you can see % Of Queue and % Of Cluster is zero for the submitted application This is because SchedulerApplicationAttempt has the below check for calculating queueUsagePerc and clusterUsagePerc {code:java} if (!calc.isInvalidDivisor(cluster)) { float queueCapacityPerc = queue.getQueueInfo(false, false) .getCapacity(); queueUsagePerc = calc.divide(cluster, usedResourceClone, Resources.multiply(cluster, queueCapacityPerc)) * 100; if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) { queueUsagePerc = 0.0f; } clusterUsagePerc = calc.divide(cluster, usedResourceClone, cluster) * 100; } {code} calc.isInvalidDivisor(cluster) always returns true as the gpu resource is 0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10449) Flexing doesn't consider containers which were stopped
[ https://issues.apache.org/jira/browse/YARN-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267824#comment-17267824 ] Bilwa S T commented on YARN-10449: -- cc [~brahma] [~leftnoteasy] [~eyang] > Flexing doesn't consider containers which were stopped > -- > > Key: YARN-10449 > URL: https://issues.apache.org/jira/browse/YARN-10449 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > We have a use case where, if a worker is idle for some period of time, the user > wants to shut down the worker to release resources, and to request more workers when the > load is higher. > With the ON_FAILURE retry policy, if the user gracefully shuts down a worker, its > exit status will be 0, so the container won't be relaunched. In this case, > if the user then tries to flex up, the stopped containers are currently not considered, > which is not correct. > I can think of two possible solutions: > 1. Deduct succeeded containers from the number of containers, and then > clear the succeeded component once flex up/down is done. > 2. Update the number of containers when a container is stopped. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10531) Be able to disable user limit factor for CapacityScheduler Leaf Queue
[ https://issues.apache.org/jira/browse/YARN-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267763#comment-17267763 ] Bilwa S T commented on YARN-10531: -- Hi [~zhuqi] Thanks for the patch. The patch looks good to me. +1 (Non-binding). Please check the findbugs and checkstyle issues > Be able to disable user limit factor for CapacityScheduler Leaf Queue > - > > Key: YARN-10531 > URL: https://issues.apache.org/jira/browse/YARN-10531 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: zhuqi >Priority: Major > Attachments: YARN-10531.001.patch, YARN-10531.002.patch, > YARN-10531.003.patch > > > User limit factor is used to define the max cap of how much resource can be > consumed by a single user. > In the Auto Queue Creation context, it doesn't make much sense to set the user > limit factor, because initially every queue will set its weight to 1.0, and we want > a user to be able to consume more resource if possible. It is hard to pre-determine how > to set up the user limit factor. So it makes more sense to add a new value (like > -1) to indicate we will disable the user limit factor. > The logic that needs to change is below: > (Inside LeafQueue.java) > {code} > Resource maxUserLimit = Resources.none(); > if (schedulingMode == SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY) { > maxUserLimit = Resources.multiplyAndRoundDown(queueCapacity, > getUserLimitFactor()); > } else if (schedulingMode == SchedulingMode.IGNORE_PARTITION_EXCLUSIVITY) > { > maxUserLimit = partitionResource; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
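A minimal sketch of the disable logic being proposed, assuming -1 is the sentinel value (illustrative; the attached patches define the real change):

{code:java}
// Treat a negative user-limit-factor as "disabled": cap a single user at
// the full partition resource instead of queueCapacity * factor.
Resource maxUserLimit = Resources.none();
if (schedulingMode == SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY) {
  if (getUserLimitFactor() == -1.0f) {
    maxUserLimit = partitionResource; // no per-user cap
  } else {
    maxUserLimit = Resources.multiplyAndRoundDown(queueCapacity,
        getUserLimitFactor());
  }
} else if (schedulingMode == SchedulingMode.IGNORE_PARTITION_EXCLUSIVITY) {
  maxUserLimit = partitionResource;
}
{code}

This keeps the existing behaviour for all non-negative values while letting auto-created queues opt out of the per-user cap entirely.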
[jira] [Updated] (YARN-10554) NPE in ResourceCommiterService when async scheduling enabled
[ https://issues.apache.org/jira/browse/YARN-10554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10554: - Affects Version/s: 3.1.1 > NPE in ResourceCommiterService when async scheduling enabled > > > Key: YARN-10554 > URL: https://issues.apache.org/jira/browse/YARN-10554 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > {code:java} > 2020-12-22 04:58:30,600 | ERROR | Thread-62 | Thread Thread[Thread-62,5,main] > threw an Exception. | YarnUncaughtExceptionHandler.java:68 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.decResourceRequest(LocalityAppPlacementAllocator.java:318) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocateNodeLocal(LocalityAppPlacementAllocator.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocate(LocalityAppPlacementAllocator.java:416) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:556) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2948) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:644) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10554) NPE in ResourceCommiterService when async scheduling enabled
[ https://issues.apache.org/jira/browse/YARN-10554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255880#comment-17255880 ] Bilwa S T commented on YARN-10554: -- [~zhuqi] It happened in the 3.1 version > NPE in ResourceCommiterService when async scheduling enabled > > > Key: YARN-10554 > URL: https://issues.apache.org/jira/browse/YARN-10554 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > {code:java} > 2020-12-22 04:58:30,600 | ERROR | Thread-62 | Thread Thread[Thread-62,5,main] > threw an Exception. | YarnUncaughtExceptionHandler.java:68 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.decResourceRequest(LocalityAppPlacementAllocator.java:318) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocateNodeLocal(LocalityAppPlacementAllocator.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocate(LocalityAppPlacementAllocator.java:416) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:556) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2948) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:644) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10554) NPE in ResourceCommiterService when async scheduling enabled
[ https://issues.apache.org/jira/browse/YARN-10554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10554: - Summary: NPE in ResourceCommiterService when async scheduling enabled (was: NPE in ResourceCommiterService#tryCommit when async scheduling enabled) > NPE in ResourceCommiterService when async scheduling enabled > > > Key: YARN-10554 > URL: https://issues.apache.org/jira/browse/YARN-10554 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > {code:java} > 2020-12-22 04:58:30,600 | ERROR | Thread-62 | Thread Thread[Thread-62,5,main] > threw an Exception. | YarnUncaughtExceptionHandler.java:68 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.decResourceRequest(LocalityAppPlacementAllocator.java:318) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocateNodeLocal(LocalityAppPlacementAllocator.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocate(LocalityAppPlacementAllocator.java:416) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:556) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2948) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:644) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10554) NPE in ResourceCommiterService#tryCommit when async scheduling enabled
Bilwa S T created YARN-10554: Summary: NPE in ResourceCommiterService#tryCommit when async scheduling enabled Key: YARN-10554 URL: https://issues.apache.org/jira/browse/YARN-10554 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T Assignee: Bilwa S T {code:java} 2020-12-22 04:58:30,600 | ERROR | Thread-62 | Thread Thread[Thread-62,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.decResourceRequest(LocalityAppPlacementAllocator.java:318) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocateNodeLocal(LocalityAppPlacementAllocator.java:300) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocate(LocalityAppPlacementAllocator.java:416) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:556) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:597) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2948) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:644) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
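A hedged sketch of the kind of guard that avoids this race (the map name below is illustrative, not the actual LocalityAppPlacementAllocator field): under async scheduling, a commit can arrive after the pending ResourceRequest it references was already removed by a competing allocation, so the decrement path has to tolerate a missing entry.

{code:java}
// Inside the allocator's decrement path: look up the pending request and
// bail out if it is already gone instead of dereferencing null.
ResourceRequest pending = requestsByResourceName.get(resourceName);
if (pending == null) {
  // already cancelled or fulfilled by a racing proposal; nothing to do
  return;
}
pending.setNumContainers(pending.getNumContainers() - 1);
{code}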
[jira] [Comment Edited] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238238#comment-17238238 ] Bilwa S T edited comment on YARN-10463 at 11/24/20, 4:17 PM: - Hi [~brahmareddy] [~jbrennan] can you please help in committing this patch? was (Author: bilwast): Hi [~brahmareddy] can you please help in committing this patch? > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch, YARN-10463.002.patch, > YARN-10463.003.patch, YARN-10463.004.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238238#comment-17238238 ] Bilwa S T commented on YARN-10463: -- Hi [~brahmareddy] can you please help in committing this patch? > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch, YARN-10463.002.patch, > YARN-10463.003.patch, YARN-10463.004.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7235) RMWebServices SSL renegotiate denied
[ https://issues.apache.org/jira/browse/YARN-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-7235: --- Assignee: (was: Bilwa S T) > RMWebServices SSL renegotiate denied > > > Key: YARN-7235 > URL: https://issues.apache.org/jira/browse/YARN-7235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Priority: Major > > We see a lot of SSL renegotiate denied WARN messages in the RM logs > {code} > 2017-08-29 08:14:15,821 WARN mortbay.log (Slf4jLog.java:warn(76)) - SSL > renegotiate denied: java.nio.channels.SocketChannel[connected > local=/10.136.19.134:8078 remote=/10.136.19.103:59994] > {code} > Looks like we need a fix similar to YARN-6797 for RMWebServices. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9459) ConfiguredYarnAuthorizer should not know any details about queues
[ https://issues.apache.org/jira/browse/YARN-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-9459: --- Assignee: (was: Bilwa S T) > ConfiguredYarnAuthorizer should not know any details about queues > > > Key: YARN-9459 > URL: https://issues.apache.org/jira/browse/YARN-9459 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Priority: Major > > ConfiguredYarnAuthorizer should not know anything about queues, as this is not > its responsibility, see > [here|https://github.com/apache/hadoop/blob/39b4a37e02e929a698fcf9e32f1f71bb6b977635/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/ConfiguredYarnAuthorizer.java#L70] > Code like this could be part of the QueueACLsManager. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17228697#comment-17228697 ] Bilwa S T commented on YARN-10463: -- [~zhuqi] Thanks for the patch. +1 (Non-binding) > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch, YARN-10463.002.patch, > YARN-10463.003.patch, YARN-10463.004.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract'
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226928#comment-17226928 ] Bilwa S T commented on YARN-10178: -- Hi [~bteke] Is there any update on this jira? We faced the same issue in a production cluster recently when async scheduling was enabled. > Global Scheduler async thread crash caused by 'Comparison method violates its > general contract' > - > > Key: YARN-10178 > URL: https://issues.apache.org/jira/browse/YARN-10178 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.2.1 >Reporter: tuyu >Assignee: Benjamin Teke >Priority: Major > > Global Scheduler Async Thread crash stack > {code:java} > ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > Thread-6066574, that exited unexpectedly: java.lang.IllegalArgumentException: > Comparison method violates its general contract! > at > java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1462) > at java.util.Collections.sort(Collections.java:177) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:777) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:791) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1635) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1629) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1732) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1481) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:569) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:616) > {code} > Java 8's Arrays.sort uses the TimSort algorithm by default, and TimSort requires the comparator to satisfy the general contract: > {code:java} > 1. sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) > 2. x > y && y > z --> x > z > 3. x == y --> sgn(x.compareTo(z)) == sgn(y.compareTo(z)) > {code} > If the comparator does not satisfy these requirements, TimSort will throw > 'java.lang.IllegalArgumentException'. > Looking at the PriorityUtilizationQueueOrderingPolicy.compare function, we can see that the > Capacity Scheduler uses these queue resource usage values to compare: > {code:java} > AbsoluteUsedCapacity > UsedCapacity > ConfiguredMinResource > AbsoluteCapacity > {code} > In the Capacity Scheduler, the Global Scheduler AsyncThread uses > PriorityUtilizationQueueOrderingPolicy to choose a queue to assign > containers to, and constructs a CSAssignment struct,
and uses the > submitResourceCommitRequest function to add the CSAssignment to the backlog. > ResourceCommitterService will tryCommit this CSAssignment; looking at the tryCommit > function, it updates the queue resource usage > {code:java} > public boolean tryCommit(Resource cluster, ResourceCommitRequest r, > boolean updatePending) { > long commitStart = System.nanoTime(); > ResourceCommitRequest request = > (ResourceCommitRequest) r; > > ... > boolean isSuccess = false; > if (attemptId != null) { > FiCaSchedulerApp app = getApplicationAttempt(attemptId); > // Required sanity check for attemptId - when async-scheduling enabled, > // proposal might be outdated if AM failover just finished > // and proposal queue was not be consumed in time > if (app != null && attemptId.equals(app.getApplicationAttemptId())) { > if (app.accept(cluster, request, updatePending) > && app.apply(cluster, request, updatePending)) { // apply this > resource
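A self-contained illustration of the failure mode described above (an assumption about the root cause, not YARN code): if another thread mutates the fields a comparator reads while List.sort() is running, TimSort can observe an inconsistent ordering and throw exactly this exception. A common remedy is to snapshot the sort keys once before sorting.

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class MutableKeySortDemo {
  static class QueueStats {
    volatile double usedCapacity; // mutated concurrently, like queue usage
    QueueStats(double u) { usedCapacity = u; }
  }

  public static void main(String[] args) {
    List<QueueStats> queues = new ArrayList<>();
    Random random = new Random();
    for (int i = 0; i < 10000; i++) {
      queues.add(new QueueStats(random.nextDouble()));
    }

    // Plays the role of ResourceCommitterService updating queue usage.
    Thread mutator = new Thread(() -> {
      Random r = new Random();
      while (!Thread.currentThread().isInterrupted()) {
        queues.get(r.nextInt(queues.size())).usedCapacity = r.nextDouble();
      }
    });
    mutator.setDaemon(true);
    mutator.start();

    // Sorting on a key that changes mid-sort will sooner or later throw
    // "Comparison method violates its general contract!".
    for (int i = 0; i < 1000; i++) {
      queues.sort(Comparator.comparingDouble(q -> q.usedCapacity));
    }
    mutator.interrupt();
  }
}
{code}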
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226760#comment-17226760 ] Bilwa S T commented on YARN-10463: -- Hi [~zhuqi] This is because the app gets rejected, as you are submitting with queue name 'q1', which doesn't exist. Instead, use *"root.default"* as the queue name. {quote}I have fixed your comments in the new patch, but how can we fill the appAttemptId to the RMApp in RMClientService. The test will failed in RMClientService: {quote} Also, the attempt number should be 1, not 0, i.e. {code:java} ApplicationAttemptId appAttemptId = ApplicationAttemptId.newInstance(appId, 1); {code} > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch, YARN-10463.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216580#comment-17216580 ] Bilwa S T commented on YARN-10463: -- Hi [~zhuqi] Thanks for the patch. I have a few comments: 1. In the below code you need to call routerMetrics.incrAppAttemptsFailedRetrieved() instead of routerMetrics.incrAppsFailedRetrieved(); {code:java} try { response = clientRMProxy.getApplicationAttemptReport(request); } catch (Exception e) { routerMetrics.incrAppsFailedRetrieved(); LOG.error("Unable to get the applicationAttempt report for " + request.getApplicationAttemptId() + "to SubCluster " + subClusterId.getId(), e); throw e; } {code} 2. Add a null check for the application id, i.e. request.getApplicationAttemptId().getApplicationId() 3. In TestFederationClientInterceptor.java, change "Test FederationClientInterceptor: Get Application Report" to "Test FederationClientInterceptor: Get Application Attempt Report" 4. In the test case, instead of Assert.fail() and catch, use LambdaTestUtils#intercept(). > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
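For point 4, a small sketch of the suggested pattern (the exception type and message fragment below are placeholders, not the actual test's values): org.apache.hadoop.test.LambdaTestUtils#intercept replaces the try/Assert.fail()/catch idiom.

{code:java}
// Asserts that the callable throws YarnException whose message contains
// the given fragment; fails the test automatically if nothing is thrown.
LambdaTestUtils.intercept(YarnException.class,
    "ApplicationAttempt Not Found", // placeholder message fragment
    () -> interceptor.getApplicationAttemptReport(request));
{code}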
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216484#comment-17216484 ] Bilwa S T commented on YARN-10463: -- YARN-10111 has been raised for the getQueueInfo() API > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215315#comment-17215315 ] Bilwa S T commented on YARN-10463: -- Hi [~zhuqi] 1. We cannot change the return type of the API for this: {quote}we will return a map with subcluster ID as a key and queue info as values? {quote} 2. We can implement this API only if all subclusters have the same queue hierarchy; otherwise it doesn't make sense. So this is up for discussion. > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10463) For Federation, we should support getApplicationAttemptReport.
[ https://issues.apache.org/jira/browse/YARN-10463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215251#comment-17215251 ] Bilwa S T commented on YARN-10463: -- Hi [~zhuqi] It depends on the API. For example, for the getClusterNodes/getContainers/getApps APIs you need to send the request to all subclusters and return the combined list of all nodes/containers/apps, whereas for APIs like getApplicationReport you can send the request to the home subcluster only. I would like to review your patches. Thanks > For Federation, we should support getApplicationAttemptReport. > -- > > Key: YARN-10463 > URL: https://issues.apache.org/jira/browse/YARN-10463 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: YARN-10463.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
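A rough illustration of the fan-out pattern described above, assuming a FederationClientInterceptor-style method with one client RM proxy per subcluster; the proxy-lookup helper and the incoming request variable are simplified placeholders, not the exact interceptor code:
{code:java}
// Fan out to every active subcluster and merge the per-cluster node lists.
List<NodeReport> allNodes = new ArrayList<>();
for (SubClusterId subClusterId : federationFacade.getSubClusters(true).keySet()) {
  ApplicationClientProtocol proxy = getClientRMProxyForSubCluster(subClusterId);
  GetClusterNodesResponse response = proxy.getClusterNodes(request);
  allNodes.addAll(response.getNodeReports());
}
// A single-cluster API such as getApplicationReport instead routes to the
// home subcluster that owns the application, with no merge step.
{code}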
[jira] [Commented] (YARN-10425) Replace the legacy placement engine in CS with the new one
[ https://issues.apache.org/jira/browse/YARN-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210927#comment-17210927 ] Bilwa S T commented on YARN-10425: -- Thanks [~shuzirra] for the patch. I have a minor nit: 1. Why is the null check needed in the code below? Null can be returned only in the case of LeafQueue.getChildQueues(), but here it's a parent queue, and ParentQueue.getChildQueues() never returns null. {code:java} if (parentQueue.getChildQueues() != null) { for (CSQueue queue : parentQueue.getChildQueues()) { if (queue instanceof LeafQueue) { // if a non-managed parent queue has at least one leaf queue, this mapping can be valid, we cannot do any more checks return true; } } } {code} > Replace the legacy placement engine in CS with the new one > -- > > Key: YARN-10425 > URL: https://issues.apache.org/jira/browse/YARN-10425 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Attachments: YARN-10425.001.patch > > > Remove the UserGroupMapping and ApplicationName mapping classes, and use the > new CSMappingPlacementRule instead. Also clean up the orphan classes which are > used by these classes only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10439) Yarn Service AM listens on all IP's on the machine
[ https://issues.apache.org/jira/browse/YARN-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205262#comment-17205262 ] Bilwa S T commented on YARN-10439: -- Thanks [~dmmkr] for updating. The changes look good to me. +1 (non-binding) > Yarn Service AM listens on all IP's on the machine > -- > > Key: YARN-10439 > URL: https://issues.apache.org/jira/browse/YARN-10439 > Project: Hadoop YARN > Issue Type: Bug > Components: security, yarn-native-services >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Minor > Attachments: YARN-10439.001.patch, YARN-10439.002.patch > > > In ClientAMService.java, the rpc server is created without passing a hostname, due > to which the client listens on 0.0.0.0, which is a bad practice. > > {{InetSocketAddress address = new InetSocketAddress(0);}} > {{server = rpc.getServer(ClientAMProtocol.class, this, address, conf, context.secretManager, 1);}} > > Also, a new configuration must be added, similar to > "yarn.app.mapreduce.am.job.client.port-range", so that the client can configure > the port range for the yarn service AM to bind. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
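For context, a sketch of the shape of the fix under review: bind the AM client RPC server to a concrete hostname instead of the wildcard address. The actual patch may resolve the hostname from the NM-provided environment (e.g. via ServiceUtils#mandatoryEnvVariable, as suggested in the review below) rather than from InetAddress as shown here:
{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Bind to the local hostname rather than 0.0.0.0 (new InetSocketAddress(0)).
String hostname = InetAddress.getLocalHost().getHostName();
InetSocketAddress address = new InetSocketAddress(hostname, 0);
server = rpc.getServer(ClientAMProtocol.class, this, address, conf,
    context.secretManager, 1);
{code}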
[jira] [Assigned] (YARN-9312) NPE while rendering SLS simulate page
[ https://issues.apache.org/jira/browse/YARN-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-9312: --- Assignee: (was: Bilwa S T) > NPE while rendering SLS simulate page > - > > Key: YARN-9312 > URL: https://issues.apache.org/jira/browse/YARN-9312 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin Chundatt >Priority: Minor > > http://localhost:10001/simulate > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.sls.web.SLSWebApp.printPageSimulate(SLSWebApp.java:240) > at > org.apache.hadoop.yarn.sls.web.SLSWebApp.access$100(SLSWebApp.java:55) > at > org.apache.hadoop.yarn.sls.web.SLSWebApp$1.handle(SLSWebApp.java:152) > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > at org.eclipse.jetty.server.Server.handle(Server.java:539) > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333) > at > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) > at > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) > at > org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) > at > org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) > at > org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) > at > org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) > at > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) > at > org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10449) Flexing doesn't consider containers which were stopped
[ https://issues.apache.org/jira/browse/YARN-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10449: Assignee: Bilwa S T > Flexing doesn't consider containers which were stopped > -- > > Key: YARN-10449 > URL: https://issues.apache.org/jira/browse/YARN-10449 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > We have a use case where, if a worker is idle for some period of time, the user > wants to shut the worker down to release resources, and request more > workers when the load increases. > With the ON_FAILURE retry policy, if the user gracefully shuts down a worker, its > exit status will be 0, so the container won't be relaunched. In this case, > if the user tries to flex up, flexing currently doesn't take the stopped containers > into account, which is not correct. > I can think of two possible solutions: > 1. Deduct succeeded containers from the number of containers, and then > clear the succeeded instances of the component once a flex up/down is done. > 2. Update the number of containers when a container is stopped > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10449) Flexing doesn't consider containers which were stopped
Bilwa S T created YARN-10449: Summary: Flexing doesn't consider containers which were stopped Key: YARN-10449 URL: https://issues.apache.org/jira/browse/YARN-10449 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T We have a use case where, if a worker is idle for some period of time, the user wants to shut the worker down to release resources, and request more workers when the load increases. With the ON_FAILURE retry policy, if the user gracefully shuts down a worker, its exit status will be 0, so the container won't be relaunched. In this case, if the user tries to flex up, flexing currently doesn't take the stopped containers into account, which is not correct. I can think of two possible solutions: 1. Deduct succeeded containers from the number of containers, and then clear the succeeded instances of the component once a flex up/down is done. 2. Update the number of containers when a container is stopped -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
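A minimal sketch of solution 1 above: deduct already-succeeded instances when computing how many containers a flex should request. The getSucceededInstances()/clearSucceededInstances() helpers are hypothetical names used for illustration, not existing yarn-native-services methods:
{code:java}
// Desired count from the flex request, adjusted for instances that exited
// successfully (exit status 0) and therefore will not be relaunched.
long requested = component.getComponentSpec().getNumberOfContainers();
long succeeded = component.getSucceededInstances().size(); // hypothetical helper
long toAllocate = Math.max(0, requested - succeeded);

// Once the flex is applied, reset the succeeded bookkeeping so the next
// flex starts from a clean slate.
component.clearSucceededInstances(); // hypothetical helper
{code}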
[jira] [Comment Edited] (YARN-10439) Yarn Service AM listens on all IP's on the machine
[ https://issues.apache.org/jira/browse/YARN-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197392#comment-17197392 ] Bilwa S T edited comment on YARN-10439 at 9/17/20, 7:55 AM: Thanks [~dmmkr] for the patch. I have a few comments: 1. Can you reuse YARN_SERVICE_PREFIX in YarnServiceConf.java? 2. You can reuse ServiceUtils#mandatoryEnvVariable instead of calling System.getenv directly. 3. Looks like the UT failures are related. Please check. Thanks was (Author: bilwast): Thanks [~dmmkr] for the patch. I have a few comments: 1. Can you reuse YARN_SERVICE_PREFIX in YarnServiceConf.java? 2. Looks like the UT failures are related. Please check. Thanks > Yarn Service AM listens on all IP's on the machine > -- > > Key: YARN-10439 > URL: https://issues.apache.org/jira/browse/YARN-10439 > Project: Hadoop YARN > Issue Type: Bug > Components: security, yarn-native-services >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Minor > Attachments: YARN-10439.001.patch > > > In ClientAMService.java, the rpc server is created without passing a hostname, due > to which the client listens on 0.0.0.0, which is a bad practice. > > {{InetSocketAddress address = new InetSocketAddress(0);}} > {{server = rpc.getServer(ClientAMProtocol.class, this, address, conf, context.secretManager, 1);}} > > Also, a new configuration must be added, similar to > "yarn.app.mapreduce.am.job.client.port-range", so that the client can configure > the port range for the yarn service AM to bind. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10439) Yarn Service AM listens on all IP's on the machine
[ https://issues.apache.org/jira/browse/YARN-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197392#comment-17197392 ] Bilwa S T commented on YARN-10439: -- Thanks [~dmmkr] for the patch. I have a few comments: 1. Can you reuse YARN_SERVICE_PREFIX in YarnServiceConf.java? 2. Looks like the UT failures are related. Please check. Thanks > Yarn Service AM listens on all IP's on the machine > -- > > Key: YARN-10439 > URL: https://issues.apache.org/jira/browse/YARN-10439 > Project: Hadoop YARN > Issue Type: Bug > Components: security, yarn-native-services >Reporter: D M Murali Krishna Reddy >Assignee: D M Murali Krishna Reddy >Priority: Minor > Attachments: YARN-10439.001.patch > > > In ClientAMService.java, the rpc server is created without passing a hostname, due > to which the client listens on 0.0.0.0, which is a bad practice. > > {{InetSocketAddress address = new InetSocketAddress(0);}} > {{server = rpc.getServer(ClientAMProtocol.class, this, address, conf, context.secretManager, 1);}} > > Also, a new configuration must be added, similar to > "yarn.app.mapreduce.am.job.client.port-range", so that the client can configure > the port range for the yarn service AM to bind. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10430) Log improvements in NodeStatusUpdaterImpl
[ https://issues.apache.org/jira/browse/YARN-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196747#comment-17196747 ] Bilwa S T commented on YARN-10430: -- Thank you [~Jim_Brennan] > Log improvements in NodeStatusUpdaterImpl > - > > Key: YARN-10430 > URL: https://issues.apache.org/jira/browse/YARN-10430 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.3.1, 3.1.5, 3.4.1, 3.2.3 > > Attachments: YARN-10430.001.patch > > > I think in the places below, the log should be printed only if the list size is not zero. > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("The cache log aggregation status size:" > + logAggregationReports.size()); > } > {code} > {code:java} > LOG.info("Sending out " + containerStatuses.size() > + " NM container statuses: " + containerStatuses); > {code} > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("Sending out " + containerStatuses.size() > + " container statuses: " + containerStatuses); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
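The improvement proposed in the description amounts to guarding each statement on the list size. A sketch of the guarded form of the first two snippets, keeping the same fields shown in the description:
{code:java}
// Only log when there is something to report.
if (LOG.isDebugEnabled() && !logAggregationReports.isEmpty()) {
  LOG.debug("The cache log aggregation status size:"
      + logAggregationReports.size());
}

if (!containerStatuses.isEmpty()) {
  LOG.info("Sending out " + containerStatuses.size()
      + " NM container statuses: " + containerStatuses);
}
{code}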
[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194943#comment-17194943 ] Bilwa S T commented on YARN-4783: - Hi [~gandras] I took a look at your code again. Yes, you can iterate over all HDFS tokens instead of just the first one. I think that should solve the multiple-nameservice scenario. > Log aggregation failure for application when Nodemanager is restarted > -- > > Key: YARN-4783 > URL: https://issues.apache.org/jira/browse/YARN-4783 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-4783.001.patch, YARN-4783.002.patch, > YARN-4783.003.patch > > > Scenario : > = > 1. Start NM with user dsperf:hadoop > 2. Configure the linux-execute user as dsperf > 3. Submit an application with the yarn user > 4. Once a few containers are allocated to NM 1 > 5. Nodemanager 1 is stopped (wait for expiry) > 6. Start the node manager after the application is completed > 7. Check that log aggregation is happening for the container logs in the NM local > directory > Expected Output : > === > Log aggregation should be successful > Actual Output : > === > Log aggregation not successful -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
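A sketch of iterating over every HDFS delegation token in the credentials rather than stopping at the first match, as suggested above; the credentials variable and the handleHdfsToken callback are illustrative, not code from the patch:
{code:java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;

Text hdfsTokenKind = new Text("HDFS_DELEGATION_TOKEN");

// Visit all HDFS delegation tokens, one per nameservice, not just the first.
for (Token<?> token : credentials.getAllTokens()) {
  if (hdfsTokenKind.equals(token.getKind())) {
    handleHdfsToken(token); // renew/replace this token
  }
}
{code}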
[jira] [Updated] (YARN-10430) Log improvements in NodeStatusUpdaterImpl
[ https://issues.apache.org/jira/browse/YARN-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10430: - Attachment: YARN-10430.001.patch > Log improvements in NodeStatusUpdaterImpl > - > > Key: YARN-10430 > URL: https://issues.apache.org/jira/browse/YARN-10430 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-10430.001.patch > > > I think in the places below, the log should be printed only if the list size is not zero. > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("The cache log aggregation status size:" > + logAggregationReports.size()); > } > {code} > {code:java} > LOG.info("Sending out " + containerStatuses.size() > + " NM container statuses: " + containerStatuses); > {code} > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("Sending out " + containerStatuses.size() > + " container statuses: " + containerStatuses); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192941#comment-17192941 ] Bilwa S T commented on YARN-10397: -- Thanks [~brahmareddy] > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.4.0, 3.3.1 > > Attachments: YARN-10397.001.patch, YARN-10397.002.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10430) Log improvements in NodeStatusUpdaterImpl
[ https://issues.apache.org/jira/browse/YARN-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10430: Assignee: Bilwa S T > Log improvements in NodeStatusUpdaterImpl > - > > Key: YARN-10430 > URL: https://issues.apache.org/jira/browse/YARN-10430 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > > I think in the places below, the log should be printed only if the list size is not zero. > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("The cache log aggregation status size:" > + logAggregationReports.size()); > } > {code} > {code:java} > LOG.info("Sending out " + containerStatuses.size() > + " NM container statuses: " + containerStatuses); > {code} > {code:java} > if (LOG.isDebugEnabled()) { > LOG.debug("Sending out " + containerStatuses.size() > + " container statuses: " + containerStatuses); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10430) Log improvements in NodeStatusUpdaterImpl
Bilwa S T created YARN-10430: Summary: Log improvements in NodeStatusUpdaterImpl Key: YARN-10430 URL: https://issues.apache.org/jira/browse/YARN-10430 Project: Hadoop YARN Issue Type: Bug Reporter: Bilwa S T I think in the places below, the log should be printed only if the list size is not zero. {code:java} if (LOG.isDebugEnabled()) { LOG.debug("The cache log aggregation status size:" + logAggregationReports.size()); } {code} {code:java} LOG.info("Sending out " + containerStatuses.size() + " NM container statuses: " + containerStatuses); {code} {code:java} if (LOG.isDebugEnabled()) { LOG.debug("Sending out " + containerStatuses.size() + " container statuses: " + containerStatuses); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192060#comment-17192060 ] Bilwa S T commented on YARN-4783: - Hi [~gandras] I think we can renew/request the token for the nameservice configured in "yarn.nodemanager.remote-app-log-dir". I mean, if the configured value is not the defaultFs, then we renew whichever nameservice is configured there, as the application would use the same nameservice for log aggregation. Thanks > Log aggregation failure for application when Nodemanager is restarted > -- > > Key: YARN-4783 > URL: https://issues.apache.org/jira/browse/YARN-4783 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-4783.001.patch, YARN-4783.002.patch, > YARN-4783.003.patch > > > Scenario : > = > 1. Start NM with user dsperf:hadoop > 2. Configure the linux-execute user as dsperf > 3. Submit an application with the yarn user > 4. Once a few containers are allocated to NM 1 > 5. Nodemanager 1 is stopped (wait for expiry) > 6. Start the node manager after the application is completed > 7. Check that log aggregation is happening for the container logs in the NM local > directory > Expected Output : > === > Log aggregation should be successful > Actual Output : > === > Log aggregation not successful -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
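One way to realize the suggestion above: resolve the FileSystem backing the configured remote app-log directory and request its delegation tokens, so the correct nameservice is covered even when it is not the defaultFS. The renewer variable is assumed to be available from the surrounding code:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

Configuration conf = new YarnConfiguration();
Path remoteLogDir = new Path(conf.get(
    YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
    YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));

// Resolve the filesystem of the log-aggregation target, which may be a
// non-default nameservice, and fetch its delegation tokens.
Credentials credentials = new Credentials();
FileSystem remoteFs = remoteLogDir.getFileSystem(conf);
remoteFs.addDelegationTokens(renewer, credentials);
{code}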
[jira] [Commented] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180774#comment-17180774 ] Bilwa S T commented on YARN-10397: -- Thanks [~elgoiri] for reviewing this. I have updated the javadoc. Yes, this is covered by the UT TestCapacitySchedulerSchedulingRequestUpdate, which checks whether CapacityScheduler supports placement constraints. bq. BTW, I'm guessing you are using your own scheduler that supports this? Yes, we have our own scheduler which supports placement constraints. > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-10397.001.patch, YARN-10397.002.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
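For reference, the shape of the fix this issue is driving at: replace the hard-coded instanceof CapacityScheduler test from the description with a capability check on the scheduler itself. The placementConstraintEnabled() method name is illustrative; the patch defines where this capability actually lives:
{code:java}
if (request.getSchedulingRequests() != null
    && !request.getSchedulingRequests().isEmpty()) {
  // Ask the scheduler for its capability instead of hard-coding the class.
  if (!scheduler.placementConstraintEnabled()) { // illustrative capability check
    String message = "Found non empty SchedulingRequest of AllocateRequest "
        + "for application=" + appAttemptId + ", however the configured "
        + "scheduler=" + scheduler.getClass().getCanonicalName()
        + " cannot handle placement constraints, rejecting this allocate "
        + "operation";
    LOG.warn(message);
    throw new YarnException(message);
  }
}
{code}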
[jira] [Updated] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10397: - Attachment: YARN-10397.002.patch > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-10397.001.patch, YARN-10397.002.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10397: - Priority: Minor (was: Major) > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-10397.001.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146 ] Bilwa S T edited comment on YARN-4783 at 8/17/20, 5:45 PM: --- Thanks [~gandras] for the patch. This may not work in a multiple-nameservice setup if you are requesting a new token. Currently the client handles getting delegation tokens from all nameservices, whereas in your patch I see that you are just trying to get the current nameservice's token; correct me if I am wrong. I am talking about a case where yarn.nodemanager.remote-app-log-dir is set to a namespace that is not the default-fs was (Author: bilwast): Thanks [~gandras] for the patch. This may not work in a multiple-nameservice setup if you are requesting a new token. Currently the client handles getting delegation tokens from all nameservices, whereas in your patch I see that you are just trying to get the current nameservice's token; correct me if I am wrong. > Log aggregation failure for application when Nodemanager is restarted > -- > > Key: YARN-4783 > URL: https://issues.apache.org/jira/browse/YARN-4783 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-4783.001.patch, YARN-4783.002.patch, > YARN-4783.003.patch > > > Scenario : > = > 1. Start NM with user dsperf:hadoop > 2. Configure the linux-execute user as dsperf > 3. Submit an application with the yarn user > 4. Once a few containers are allocated to NM 1 > 5. Nodemanager 1 is stopped (wait for expiry) > 6. Start the node manager after the application is completed > 7. Check that log aggregation is happening for the container logs in the NM local > directory > Expected Output : > === > Log aggregation should be successful > Actual Output : > === > Log aggregation not successful -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted
[ https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179146#comment-17179146 ] Bilwa S T commented on YARN-4783: - Thanks [~gandras] for the patch. This may not work in a multiple-nameservice setup if you are requesting a new token. Currently the client handles getting delegation tokens from all nameservices, whereas in your patch I see that you are just trying to get the current nameservice's token; correct me if I am wrong. > Log aggregation failure for application when Nodemanager is restarted > -- > > Key: YARN-4783 > URL: https://issues.apache.org/jira/browse/YARN-4783 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-4783.001.patch, YARN-4783.002.patch, > YARN-4783.003.patch > > > Scenario : > = > 1. Start NM with user dsperf:hadoop > 2. Configure the linux-execute user as dsperf > 3. Submit an application with the yarn user > 4. Once a few containers are allocated to NM 1 > 5. Nodemanager 1 is stopped (wait for expiry) > 6. Start the node manager after the application is completed > 7. Check that log aggregation is happening for the container logs in the NM local > directory > Expected Output : > === > Log aggregation should be successful > Actual Output : > === > Log aggregation not successful -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10122) In Federation,executing yarn container signal command throws an exception
[ https://issues.apache.org/jira/browse/YARN-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10122: Assignee: (was: Bilwa S T) > In Federation,executing yarn container signal command throws an exception > - > > Key: YARN-10122 > URL: https://issues.apache.org/jira/browse/YARN-10122 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, yarn >Reporter: Sushanta Sen >Priority: Major > > Executing yarn container signal command failed, prompting an error > “org.apache.commons.lang.NotImplementedException: Code is not implemented”. > {noformat} > ./yarn container -signal container_e79_1581316978887_0001_01_10 > Signalling container container_e79_1581316978887_0001_01_10 > 2020-02-10 14:51:18,045 INFO impl.YarnClientImpl: Signalling container > container_e79_1581316978887_0001_01_10 with command OUTPUT_THREAD_DUMP > Exception in thread "main" org.apache.commons.lang.NotImplementedException: > Code is not implemented > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.signalToContainer(FederationClientInterceptor.java:993) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.signalToContainer(RouterClientRMService.java:403) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.signalToContainer(ApplicationClientProtocolPBServiceImpl.java:629) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:629) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.signalToContainer(ApplicationClientProtocolPBClientImpl.java:620) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy8.signalToContainer(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.signalToContainer(YarnClientImpl.java:949) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.signalToContainer(ApplicationCLI.java:717) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:478) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:119) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Assigned] (YARN-10132) For Federation,yarn applicationattempt fail command throws an exception
[ https://issues.apache.org/jira/browse/YARN-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned YARN-10132: Assignee: (was: Bilwa S T) > For Federation,yarn applicationattempt fail command throws an exception > --- > > Key: YARN-10132 > URL: https://issues.apache.org/jira/browse/YARN-10132 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sushanta Sen >Priority: Major > > yarn applicationattempt fail command is failing with exception > “org.apache.commons.lang.NotImplementedException: Code is not implemented”. > {noformat} > ./yarn applicationattempt -fail appattempt_1581497870689_0001_01 > Failing attempt appattempt_1581497870689_0001_01 of application > application_1581497870689_0001 > 2020-02-12 20:48:48,530 INFO impl.YarnClientImpl: Failing application attempt > appattempt_1581497870689_0001_01 > Exception in thread "main" org.apache.commons.lang.NotImplementedException: > Code is not implemented > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.failApplicationAttempt(FederationClientInterceptor.java:980) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.failApplicationAttempt(RouterClientRMService.java:388) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.failApplicationAttempt(ApplicationClientProtocolPBServiceImpl.java:210) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:581) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.failApplicationAttempt(ApplicationClientProtocolPBClientImpl.java:223) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy8.failApplicationAttempt(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.failApplicationAttempt(YarnClientImpl.java:447) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.failApplicationAttempt(ApplicationCLI.java:985) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:455) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:119) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Commented] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177834#comment-17177834 ] Bilwa S T commented on YARN-10397: -- cc [~leftnoteasy] [~prabhujoseph] > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10397.001.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10397: - Attachment: YARN-10397.001.patch > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-10397.001.patch > > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10397) SchedulerRequest should be forwarded to scheduler if custom scheduler supports placement constraints
[ https://issues.apache.org/jira/browse/YARN-10397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10397: - Description: Currently only CapacityScheduler supports placement constraints so request gets forwarded only for capacityScheduler. Below exception will be thrown if custom scheduler supports placement constraint {code:java} if (request.getSchedulingRequests() != null && !request.getSchedulingRequests().isEmpty()) { if (!(scheduler instanceof CapacityScheduler)) { String message = "Found non empty SchedulingRequest of " + "AllocateRequest for application=" + appAttemptId.toString() + ", however the configured scheduler=" + scheduler.getClass().getCanonicalName() + " cannot handle placement constraints, rejecting this " + "allocate operation"; LOG.warn(message); throw new YarnException(message); } } {code} was: Currently only CapacityScheduler supports placement constraints so request gets forwarded only for capacityScheduler. Below exception will be thrown if custom scheduler supports placement constraint {code:java} if (request.getSchedulingRequests() != null && !request.getSchedulingRequests().isEmpty()) { if (!(scheduler instanceof CapacityScheduler)) { String message = "Found non empty SchedulingRequest of " + "AllocateRequest for application=" + appAttemptId.toString() + ", however the configured scheduler=" + scheduler.getClass().getCanonicalName() + " cannot handle placement constraints, rejecting this " + "allocate operation"; LOG.warn(message); throw new YarnException(message); } } {code} I think we should make this configurable > SchedulerRequest should be forwarded to scheduler if custom scheduler > supports placement constraints > > > Key: YARN-10397 > URL: https://issues.apache.org/jira/browse/YARN-10397 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > > Currently only CapacityScheduler supports placement constraints so request > gets forwarded only for capacityScheduler. Below exception will be thrown if > custom scheduler supports placement constraint > {code:java} > if (request.getSchedulingRequests() != null > && !request.getSchedulingRequests().isEmpty()) { > if (!(scheduler instanceof CapacityScheduler)) { > String message = "Found non empty SchedulingRequest of " > + "AllocateRequest for application=" + appAttemptId.toString() > + ", however the configured scheduler=" > + scheduler.getClass().getCanonicalName() > + " cannot handle placement constraints, rejecting this " > + "allocate operation"; > LOG.warn(message); > throw new YarnException(message); > } > } > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10342) [UI1] Provide a way to hide Tools section in Web UIv1
[ https://issues.apache.org/jira/browse/YARN-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177786#comment-17177786 ] Bilwa S T commented on YARN-10342: -- Thanks [~gandras] for updating the patch. +1 (non-binding) > [UI1] Provide a way to hide Tools section in Web UIv1 > - > > Key: YARN-10342 > URL: https://issues.apache.org/jira/browse/YARN-10342 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Minor > Attachments: Screenshot 2020-07-09 at 14.13.19.png, > YARN-10342.001.patch, YARN-10342.002.patch, YARN-10342.003.patch, > YARN-10342.004.patch > > > The Tools section in web UI1 might contain sensitive information, which > should ideally be hidden from end users. We should provide a configurable > value to hide it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org