[jira] [Updated] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1449: - Description: As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes was: As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes 4) Added changeContainersResources implementation in ContainerManagerImpl 5) Added changes in ContainersMonitorImpl to support change resource limit of containers Protocol changes and implementations in NM side to support change container resource Key: YARN-1449 URL: https://issues.apache.org/jira/browse/YARN-1449 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Wangda Tan (No longer used) Assignee: Wangda Tan Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, yarn-1449.5.patch As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1449: - Attachment: yarn-1449.4.patch [~sandyr], I've updated my code according to your comments (thanks again!). But there're some places I haven't changed. {quote} {code} +// first add all decreased containers to failed, we will support it in the +// future {code} What is it that we will support in the future? {quote} Originally, I planned not support container decrease, so I put all decrease requests directly to failed list. This patch included decrease requests support and such decreased containers will be added to next node heartbeat sent to RM. {quote} {code} +public class ContainerChangeMonitoringEvent extends ContainersMonitorEvent { + private final long vmemLimit; + private final long pmemLimit; {code} No need to calculate / include the vmem here. The containers monitor should know about it. {quote} I think we needed them. Originally this is created by ContainerImpl.LaunchTransition, and ContainersMonitoring will not care about this, this is passed in to CsM by ContainerStartMonitoringEvent. I just follow the same way of what ContainerStartMonitoringEvent does. {quote} Unnecessary change {code} - private static class ProcessTreeInfo { + static class ProcessTreeInfo { {code} {quote} This is used for testing, the easiest way to test ContainersMonitoring is change pmem/vmem by setting fields in ResourceCalculatorProcessTree. We can get ProcessTreeInfo in trackingContainers of ContainersMonitor, but we cannot use this class because its private. I don't if I should propose another jira for this, but I think this can make our easier to write tests for ContainersMonitoring. {quote} {code} + public void testResourceChange() throws Exception { {code} It looks like this test relies on sleeps to wait for events to be handled. This both makes the test run longer and can cause flaky failures. I think there is a way to actually wait for the events to be completed. I don't know it off the top of my head but can look it up for you if that would be helpful. {quote} I've thinked about this but not come up with a better way to do it. This will be failed only in some extremely bad cases (like very low resource available). I've set YarnConfiguration.NM_CONTAINER_MON_INTERVAL_MS to 20ms in test, and 200ms seems enough for ContainersMonitor react to bad containers. Please let me know if you have any other ideas! And any thoughts about this patch? Protocol changes and implementations in NM side to support change container resource Key: YARN-1449 URL: https://issues.apache.org/jira/browse/YARN-1449 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.2.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes 4) Added changeContainersResources implementation in ContainerManagerImpl 5) Added changes in ContainersMonitorImpl to support change resource limit of containers -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1449: - Attachment: yarn-1449.5.patch Updated patch to handle findbugs warning Protocol changes and implementations in NM side to support change container resource Key: YARN-1449 URL: https://issues.apache.org/jira/browse/YARN-1449 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.2.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, yarn-1449.5.patch As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes 4) Added changeContainersResources implementation in ContainerManagerImpl 5) Added changes in ContainersMonitorImpl to support change resource limit of containers -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1449: - Summary: Protocol changes and implementations in NM side to support change container resource (was: Protocol changes in NM side to support change container resource) Protocol changes and implementations in NM side to support change container resource Key: YARN-1449 URL: https://issues.apache.org/jira/browse/YARN-1449 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.2.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: yarn-1449.1.patch, yarn-1449.3.patch As described in YARN-1197, we need add API in NM to support 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1449: - Description: As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes 4) Added changeContainersResources implementation in ContainerManagerImpl 5) Added changes in ContainersMonitorImpl to support change resource limit of containers was: As described in YARN-1197, we need add API in NM to support 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes Protocol changes and implementations in NM side to support change container resource Key: YARN-1449 URL: https://issues.apache.org/jira/browse/YARN-1449 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.2.0 Reporter: Wangda Tan Assignee: Wangda Tan Attachments: yarn-1449.1.patch, yarn-1449.3.patch As described in YARN-1197, we need add API/implementation changes, 1) Add a changeContainersResources method in ContainerManagementProtocol 2) Can get succeed/failed increased/decreased containers in response of changeContainersResources 3) Add a new decreased containers field in NodeStatus which can help NM notify RM such changes 4) Added changeContainersResources implementation in ContainerManagerImpl 5) Added changes in ContainersMonitorImpl to support change resource limit of containers -- This message was sent by Atlassian JIRA (v6.1#6144)