[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667462#comment-15667462 ] Arun Suresh commented on YARN-4597: --- The testcase failures are not related to the patch. The checkstyle unused import warning refers to a class that has been renamed. This is probably since the github patch records that I had created the file. Not sure why I get this findBugs warning since it refers to findBugs files that don't exist. Committing this. Thanks for all the reviews [~jianhe], [~kasha], [~kkaranasos] and [~vvasudev]. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, > YARN-4597.012.patch, YARN-4597.013.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667294#comment-15667294 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 56s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 44s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 59s{color} | {color:orange} root: The patch generated 10 new + 1070 unchanged - 17 fixed = 1080 total (was 1087) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 26s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-mapreduce-client-jobclient in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666401#comment-15666401 ] Konstantinos Karanasos commented on YARN-4597: -- Thanks for the updated patch, [~asuresh]. Looks good to me. Some final comments below... All are minor, so up to you to address (I would only "insist" about the last one). - In the {{ContainerScheduler}}: -- In the comment for the runningContainers, let's mention that these are the running containers, including the containers that are in the process of transitioning from the SCHEDULED to the RUNNING state. I think the rest are details that might be confusing. -- In the {{updateQueuingLimit}}, you can do an extra check of the form {{if (this.queuingLimit.getMaxQueueLength() < queuedOpportunisticContainers.size())}} to avoid calling the shedding if the queue is not long enough. This might often be the case if the NM has imposed a small queue size. -- I was thinking that, although less likely than before, the fields of the {{OpportunisticContainersStatus()}} can still be updated during the {{getOpportunisticContainersStatus()}}. To avoid synchronization, we could set the fields using an event, and then in the {{getOpportunisticContainersStatus()}} we would just return the object. But if you think it is too much, we can leave it as is. -- In the {{onContainerCompleted}}, a container can belong either to queued guaranteed, to queued opportunistic or to running. So, you could avoid doing the remove from all lists once found in one of them. - In the {{YarnConfiguration}}, let's include in a comment that the max queue length coming from the RM is the globally max queue length. - In the {{SchedulerNode}}, I still suggest to put the {{++numContainers}} and the {{--numContainers}} inside the if statements. If I remember well, these fields are used for the web UI, so there will be a disconnect between the resources used (referring only to guaranteed containers) and the number of containers (referring to both guaranteed and opportunistic at the moment). The stats for the opportunistic containers are carried by the opportunisticContainersStatus, so we are good with reporting them too. Again, all comments are minor. +1 for the patch and thanks for all the work! > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, > YARN-4597.012.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665022#comment-15665022 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 50s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 50s{color} | {color:orange} root: The patch generated 10 new + 1070 unchanged - 17 fixed = 1080 total (was 1087) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} hadoop-mapreduce-client-jobclient in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664991#comment-15664991 ] Karthik Kambatla commented on YARN-4597: The latest patch looks good to me. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, > YARN-4597.012.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664347#comment-15664347 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87836296 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,411 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.AsyncDispatcher; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + + +import org.apache.hadoop.yarn.server.nodemanager.metrics.NodeManagerMetrics; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all its launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersToKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap runningContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Tracker that decides how utilization of the cluster + // increases / decreases based on container start / finish + private ResourceUtilizationTracker utilizationTracker;
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664346#comment-15664346 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87836252 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationTracker.java --- @@ -0,0 +1,161 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationTracker { --- End diff -- Done fixed in latest commit > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, > YARN-4597.012.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664241#comment-15664241 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87826671 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationTracker.java --- @@ -0,0 +1,161 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationTracker { --- End diff -- Ah yes... missed this... will fix > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664209#comment-15664209 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87823436 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,411 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.AsyncDispatcher; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + + +import org.apache.hadoop.yarn.server.nodemanager.metrics.NodeManagerMetrics; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all its launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersToKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap runningContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Tracker that decides how utilization of the cluster + // increases / decreases based on container start / finish + private ResourceUtilizationTracker utilizationTracker;
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664208#comment-15664208 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87822206 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -295,29 +306,29 @@ private void startAllocatedContainer(Container container) { // Go over the running opportunistic containers. // Use a descending iterator to kill more recently started containers. Iterator reverseContainerIterator = --- End diff -- Nit: Should we call this lifoIterator? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664210#comment-15664210 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87823047 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationTracker.java --- @@ -0,0 +1,161 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationTracker { --- End diff -- Didn't we think this should be an interface with the default implementation being the one that considers allocation - a.k.a AllocationBased{ResourceUtilization}Tracker > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664213#comment-15664213 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87823624 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap--- End diff -- Sounds reasonable. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662488#comment-15662488 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 12s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 52s{color} | {color:orange} root: The patch generated 10 new + 1070 unchanged - 17 fixed = 1080 total (was 1087) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 25s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} hadoop-mapreduce-client-jobclient in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15661942#comment-15661942 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 40s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 30s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 56s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 17s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 17s{color} | {color:red} root generated 1 new + 690 unchanged - 1 fixed = 691 total (was 691) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 51s{color} | {color:orange} root: The patch generated 15 new + 1070 unchanged - 17 fixed = 1085 total (was 1087) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 25s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 28s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 2 new + 235 unchanged - 1 fixed = 237 total (was 236) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 27s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 39s{color} | {color:red} hadoop-yarn-server-tests in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 18s{color} | {color:green} hadoop-yarn-client in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658402#comment-15658402 ] Konstantinos Karanasos commented on YARN-4597: -- Here are some comments on the {{ContainerScheduler}}: - {{queuedOpportunisticContainers}} will have concurrency issues. We are updating it when containers arrive but also in the {{shedQueuedOpportunisticContainers}}. - {{queuedGuaranteedContainers}} and {{queuedOpportunisticContainers}}: I think we should use queues. I don't think we retrieve the container by the key anywhere either ways. - {{oppContainersMarkedForKill}}: could be a Set, right? - {{scheduledToRunContainers}} are containers that are either already running or are going to run very soon (transitioning from SCHEDULED to RUNNING state). Name is a bit misleading, because it sounds like they are only the ones belonging to the second category. I would rather say {{runningContainers}} and specify in a comment that they might not be running at this very moment but will be running very soon. - In the {{onContainerCompleted()}}, the {{scheduledToRunContainers.remove(container.getContainerId())}} and the {{startPendingContainers()}} can go inside the if statement above. If the container was not running and no resources were freed up, we don't need to call the {{startPendingContainers()}}. - fields of the {{opportunisticContainersStatus}} are set in different places. Due to that, when we call {{getOpportunisticContainersStatus()}} we may see an inconsistent object. Let's set the fields only in the {{getOpportunisticContainersStatus()}}. - line 252, indeed we can now do extraOpportContainersToKill -> opportContainersToKill, as Karthik mentioned at a comment. - line 87: increase -> increases - {{shedQueuedOpportunisticContainers}}: -- {{numAllowed}} is the number of allowed containers in the queue. Instead, we are killing numAllowed containers. In other words, we should not kill numAllowed, but {{queuedOpportunisticContainers.size() - numAllowed}}. -- "Container Killed to make room for Guaranteed Container." -> "Container killed to meet NM queuing limits". Instead of kill, you can also say de-queued. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658380#comment-15658380 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87670013 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationManager { + + private static final Logger LOG = + LoggerFactory.getLogger(ResourceUtilizationManager.class); + + private ResourceUtilization containersAllocation; + private ContainerScheduler scheduler; + + ResourceUtilizationManager(ContainerScheduler scheduler) { +this.containersAllocation = ResourceUtilization.newInstance(0, 0, 0.0f); +this.scheduler = scheduler; + } + + /** + * Get the current accumulated utilization. Currently it is the accumulation + * of totally allocated resources to a container. + * @return ResourceUtilization Resource Utilization. + */ + public ResourceUtilization getCurrentUtilization() { +return this.containersAllocation; + } + + /** + * Add Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void addContainerResources(Container container) { +increaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Subtract Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void subtractContainerResource(Container container) { +decreaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Check if NM has resources available currently to run the container. + * @param container Container. + * @return True, if NM has resources available currently to run the container. + */ + public boolean hasResourcesAvailable(Container container) { +long pMemBytes = container.getResource().getMemorySize() * 1024 * 1024L; +return hasResourcesAvailable(pMemBytes, +(long) (getContainersMonitor().getVmemRatio()* pMemBytes), +container.getResource().getVirtualCores()); + } + + private boolean hasResourcesAvailable(long pMemBytes, long vMemBytes, + int cpuVcores) { +// Check physical memory. +if (LOG.isDebugEnabled()) { + LOG.debug("pMemCheck [current={} + asked={} > allowed={}]", +
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658359#comment-15658359 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87669163 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; --- End diff -- will do.. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658357#comment-15658357 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87669098 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java --- @@ -743,6 +739,8 @@ private void changeContainerResource( LOG.warn("Container " + containerId.toString() + "does not exist"); return; } +// TODO: Route this through the ContainerScheduler to --- End diff -- will do: YARN-5860 > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658353#comment-15658353 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668870 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManager.java --- @@ -26,6 +26,8 @@ import org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEvent; import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor .ContainersMonitor; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler +.ContainerScheduler; --- End diff -- as mentioned earlier.. can we leave it as it is, since it is ignored by checkstyle ? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658350#comment-15658350 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668767 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml --- @@ -1000,10 +1000,10 @@ -Enable Queuing of OPPORTUNISTIC containers on the +Max numbed of OPPORTUNISTIC containers to queue at the --- End diff -- done.. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658346#comment-15658346 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668681 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerSchedulerEvent.java --- @@ -0,0 +1,51 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.event.AbstractEvent; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container +.Container; --- End diff -- My IDE is very obstinate about it :) Can we leave it as it is.. especially since the checkstyle anyway ignores it ? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658343#comment-15658343 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668561 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658338#comment-15658338 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668393 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658334#comment-15658334 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87668067 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658320#comment-15658320 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87667155 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658308#comment-15658308 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87666512 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658300#comment-15658300 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87666090 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658291#comment-15658291 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87665464 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658290#comment-15658290 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87665434 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658283#comment-15658283 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87665076 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = --- End diff -- I agree with @kkaranasos, lets keep it as it is... Regarding the LIFO order of killing, I am using a descending iterator so candidates are examined in LIFO order. It was earlier 'runningContainers'. But based of @jian-he's feedback, we felt this is better, since the containers need not be immediately running at the point it is added to this. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 >
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658273#comment-15658273 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87664536 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap --- End diff -- Ditto :) > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658276#comment-15658276 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87664657 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = --- End diff -- Will do > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658269#comment-15658269 ] ASF GitHub Bot commented on YARN-4597: -- Github user xslogic commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87664283 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap--- End diff -- I chose a LinkedHashMap, since it is essentially an indexed Queue. We might need to check if a container exists in the queue or remove a specific container from the queue (for eg. when AM asks to kill a queued container). Also, like you noted, we would also need to implement hashCode/equals for Container... > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657977#comment-15657977 ] ASF GitHub Bot commented on YARN-4597: -- Github user kkaranasos commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87647259 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657936#comment-15657936 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639946 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657950#comment-15657950 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87645581 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationManager { + + private static final Logger LOG = + LoggerFactory.getLogger(ResourceUtilizationManager.class); + + private ResourceUtilization containersAllocation; + private ContainerScheduler scheduler; + + ResourceUtilizationManager(ContainerScheduler scheduler) { +this.containersAllocation = ResourceUtilization.newInstance(0, 0, 0.0f); +this.scheduler = scheduler; + } + + /** + * Get the current accumulated utilization. Currently it is the accumulation + * of totally allocated resources to a container. + * @return ResourceUtilization Resource Utilization. + */ + public ResourceUtilization getCurrentUtilization() { +return this.containersAllocation; + } + + /** + * Add Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void addContainerResources(Container container) { +increaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Subtract Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void subtractContainerResource(Container container) { +decreaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Check if NM has resources available currently to run the container. + * @param container Container. + * @return True, if NM has resources available currently to run the container. + */ + public boolean hasResourcesAvailable(Container container) { +long pMemBytes = container.getResource().getMemorySize() * 1024 * 1024L; +return hasResourcesAvailable(pMemBytes, +(long) (getContainersMonitor().getVmemRatio()* pMemBytes), +container.getResource().getVirtualCores()); + } + + private boolean hasResourcesAvailable(long pMemBytes, long vMemBytes, + int cpuVcores) { +// Check physical memory. +if (LOG.isDebugEnabled()) { + LOG.debug("pMemCheck [current={} + asked={} > allowed={}]", +
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657935#comment-15657935 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639036 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657932#comment-15657932 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87629189 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = --- End diff -- Since only opportunistic containers will be killed to make room, can we go with a simpler name containersToKill? May be, oppContainersToKill? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657934#comment-15657934 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639161 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657945#comment-15657945 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87643346 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java --- @@ -311,6 +318,10 @@ protected void createAMRMProxyService(Configuration conf) { } } + protected ContainerScheduler createContainerScheduler(Context cntxt) { --- End diff -- Is this method exposed for testing? If yes, should it be marked @VisibleForTesting? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657949#comment-15657949 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87642982 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerSchedulerEvent.java --- @@ -0,0 +1,51 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.event.AbstractEvent; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container +.Container; --- End diff -- One line for imports? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657953#comment-15657953 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87645560 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationManager { + + private static final Logger LOG = + LoggerFactory.getLogger(ResourceUtilizationManager.class); + + private ResourceUtilization containersAllocation; + private ContainerScheduler scheduler; + + ResourceUtilizationManager(ContainerScheduler scheduler) { +this.containersAllocation = ResourceUtilization.newInstance(0, 0, 0.0f); +this.scheduler = scheduler; + } + + /** + * Get the current accumulated utilization. Currently it is the accumulation + * of totally allocated resources to a container. + * @return ResourceUtilization Resource Utilization. + */ + public ResourceUtilization getCurrentUtilization() { +return this.containersAllocation; + } + + /** + * Add Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void addContainerResources(Container container) { +increaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Subtract Container's resources to the accumulated Utilization. + * @param container Container. + */ + public void subtractContainerResource(Container container) { +decreaseResourceUtilization( +getContainersMonitor(), this.containersAllocation, +container.getResource()); + } + + /** + * Check if NM has resources available currently to run the container. + * @param container Container. + * @return True, if NM has resources available currently to run the container. + */ + public boolean hasResourcesAvailable(Container container) { +long pMemBytes = container.getResource().getMemorySize() * 1024 * 1024L; +return hasResourcesAvailable(pMemBytes, +(long) (getContainersMonitor().getVmemRatio()* pMemBytes), +container.getResource().getVirtualCores()); + } + + private boolean hasResourcesAvailable(long pMemBytes, long vMemBytes, + int cpuVcores) { +// Check physical memory. +if (LOG.isDebugEnabled()) { + LOG.debug("pMemCheck [current={} + asked={} > allowed={}]", +
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657930#comment-15657930 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639403 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657943#comment-15657943 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87641557 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657942#comment-15657942 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87640015 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657933#comment-15657933 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639089 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657955#comment-15657955 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87643141 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml --- @@ -1000,10 +1000,10 @@ -Enable Queuing of OPPORTUNISTIC containers on the +Max numbed of OPPORTUNISTIC containers to queue at the --- End diff -- s/numbed/number > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657947#comment-15657947 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87642663 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657929#comment-15657929 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87628936 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are --- End diff -- s/it/its > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657939#comment-15657939 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639223 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657931#comment-15657931 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639061 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657951#comment-15657951 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87644035 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java --- @@ -743,6 +739,8 @@ private void changeContainerResource( LOG.warn("Container " + containerId.toString() + "does not exist"); return; } +// TODO: Route this through the ContainerScheduler to --- End diff -- Annotate TODO with JIRA? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657941#comment-15657941 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87640580 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap--- End diff -- Why do these need to be a map, why not a queue? Is it because there could be multiple Container objects corresponding to one ContainerId? If yes, would it make sense to implement equals and hashCode on Container to ensure we check only ContainerId? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657954#comment-15657954 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87645682 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; --- End diff -- Some unused imports. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657940#comment-15657940 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87640635 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap --- End diff -- Same comment about it being a queue here.. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657946#comment-15657946 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87645111 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ResourceUtilizationManager.java --- @@ -0,0 +1,163 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.Resource; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * This class abstracts out how a container contributes to Resource Utilization. + * It is used by the {@link ContainerScheduler} to determine which + * OPPORTUNISTIC containers to be killed to make room for a GUARANTEED + * container. + * It currently equates resource utilization with the total resource allocated + * to the container. Another implementation might choose to use the actual + * resource utilization. + */ + +public class ResourceUtilizationManager { --- End diff -- Should this be called ResourceUtilizationTracker? Also, would it make more sense for this to be an interface with an allocation-based implementation? Later, one could add a usage-based implementation and may be integrate tightly with ContainersMonitor? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657956#comment-15657956 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87640967 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657928#comment-15657928 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87638283 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = --- End diff -- Are these running containers? If yes, should we call them so? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch,
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657948#comment-15657948 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87643223 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManager.java --- @@ -26,6 +26,8 @@ import org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEvent; import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor .ContainersMonitor; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler +.ContainerScheduler; --- End diff -- imports should be on a single line? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657937#comment-15657937 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639879 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = --- End diff -- Also, should this be a stack? So, containers can be killed in a LIFO manner? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, >
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657938#comment-15657938 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87639757 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657952#comment-15657952 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87641199 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657944#comment-15657944 ] ASF GitHub Bot commented on YARN-4597: -- Github user kambatla commented on a diff in the pull request: https://github.com/apache/hadoop/pull/143#discussion_r87641844 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java --- @@ -0,0 +1,393 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.hadoop.service.AbstractService; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; +import org.apache.hadoop.yarn.api.records.ContainerId; +import org.apache.hadoop.yarn.api.records.ExecutionType; +import org.apache.hadoop.yarn.api.records.ResourceUtilization; +import org.apache.hadoop.yarn.conf.YarnConfiguration; +import org.apache.hadoop.yarn.event.EventHandler; +import org.apache.hadoop.yarn.server.api.records.ContainerQueuingLimit; +import org.apache.hadoop.yarn.server.api.records.OpportunisticContainersStatus; +import org.apache.hadoop.yarn.server.nodemanager.Context; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; +import org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitor; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; + +/** + * The ContainerScheduler manages a collection of runnable containers. It + * ensures that a container is launched only if all it launch criteria are + * met. It also ensures that OPPORTUNISTIC containers are killed to make + * room for GUARANTEED containers. + */ +public class ContainerScheduler extends AbstractService implements +EventHandler { + + private static final Logger LOG = + LoggerFactory.getLogger(ContainerScheduler.class); + + private final Context context; + private final int maxOppQueueLength; + + // Queue of Guaranteed Containers waiting for resources to run + private final LinkedHashMap+ queuedGuaranteedContainers = new LinkedHashMap<>(); + // Queue of Opportunistic Containers waiting for resources to run + private final LinkedHashMap + queuedOpportunisticContainers = new LinkedHashMap<>(); + + // Used to keep track of containers that have been marked to be killed + // to make room for a guaranteed container. + private final Map oppContainersMarkedForKill = + new HashMap<>(); + + // Containers launched by the Scheduler will take a while to actually + // move to the RUNNING state, but should still be fair game for killing + // by the scheduler to make room for guaranteed containers. + private final LinkedHashMap scheduledToRunContainers = + new LinkedHashMap<>(); + + private final ContainerQueuingLimit queuingLimit = + ContainerQueuingLimit.newInstance(); + + private final OpportunisticContainersStatus opportunisticContainersStatus; + + // Resource Utilization Manager that decides how utilization of the cluster + // increase / decreases based on container start / finish + private ResourceUtilizationManager utilizationManager; + + /** + * Instantiate a Container Scheduler. + * @param context NodeManager Context. + */ + public
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656417#comment-15656417 ] Arun Suresh commented on YARN-4597: --- Appreciate the review [~kkaranasos], 1. bq. The Container has two new methods (sendLaunchEvent and sendKillEvent), which are public and are not following.. sendKillEvent is used by the Scheduler (which is in another package) to kill a container. Since this patch introduces an external entity that launches and kills a container, viz. the Scheduler, I feel it is apt to keep both as public methods. I prefer it to 'dispatcher.getEventHandler().handle..'. 2. The Container needs to be added to the {{nodeUpdateQueue}} if the container is to be move from ACQUIRED to RUNNING state (this is a state transition all containers should go thru). Regarding the {{launchedContainers}}, Lets have both Opportunistic and Guaranteed containers flow through a common code-path... and introduce specific behaviors if required in subsequent patches as and when required. 3. bq. In the OpportunisticContainerAllocatorAMService we are now calling the SchedulerNode::allocate, and then we do not update the used resources but we do update some other counters, which leads to inconsistencies. Hmmm... I do see that the numContainers are not decremented correctly when release. Thanks... but it looks like it would more likely just impact reporting / UI, nothing functional (Will update the patch). Can you specify which other counters specifically ? Like I mentioned in the previous patch.. lets run all containers thru as much of the common code path before we add new counters etc. 4. bq. Maybe as part of a different JIRA, we should at some point extend the container.metrics in the ContainerImpl to keep track of the scheduled/queued containers. Yup.. +1 to that. The rest of your comments make sense... will update patch. bq. let's stress-test the code in a cluster before committing to make sure everything is good It has been tested on a 3 node cluster and MR Pi jobs (with opportunistic containers) and I didn't hit any major issues. We can always open follow-up JIRAs for specific performance related issues as and when we find it. Besides, stess-testing is not really a precondition to committing a patch. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655975#comment-15655975 ] Konstantinos Karanasos commented on YARN-4597: -- Thanks for working on this, [~asuresh]! I am sending some first comments. I have not yet looked at the {{ContainerScheduler}} -- I will do that tomorrow. - The {{Container}} has two new methods ({{sendLaunchEvent}} and {{sendKillEvent}}), which are public and are not following the design of the rest of the code that keeps such methods private and calls them through transitions in the {{ContainerImpl}}. Let's try to use the existing design if possible. - In {{RMNodeImpl}}: -- Instead of using the {{launchedContainers}} for both the launched and the queued, we might want to split it in two: one for the launched and one for the queued containers. -- I think we should not add opportunistic containers to the {{launchContainers}}. If we do, they will be added to the {{newlyLaunchedContainers}}, then to the {{nodeUpdateQueue}}, and, if I am not wrong, they will be propagated to the schedulers for the guaranteed containers, which will create problems. I have to look at it a bit more, but my hunch is that we should avoid doing it. Even if it does not affect the resource accounting, I don't see any advantage to adding them. - In the {{OpportunisticContainerAllocatorAMService}} we are now calling the {{SchedulerNode::allocate}}, and then we do not update the used resources, but we do update some other counters, which leads to inconsistencies. For example, when releasing a container, I think at the moment we are not calling the release of the {{SchedulerNode}}, which means that the container count will become inconsistent. -- Instead, I suggest to add some counters for opportunistic containers at the {{SchedulerNode}}, both for the number of containers and for the resources used. In this case, we need to make sure that those resources are released too. - Maybe as part of a different JIRA, we should at some point extend the {{container.metrics}} in the {{ContainerImpl}} to keep track of the scheduled/queued containers. h6. Nits: - There seem to be two redundant parameters at {{YarnConfiguration}} at the moment: {{NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH}} and {{NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH}}. If I am not missing something, we should keep one of the two. - {{yarn-default.xml}}: numbed -> number (in a comment) - {{TestNodeManagerResync}}: I think it is better to use one of the existing methods for waiting to get to the RUNNING state. - In {{Container}}/{{ContainerImpl}} and all the associated classes, I would suggest to rename {{isMarkedToKill}} to {{isMarkedForKilling}}. I know it is minor, but it is more self-explanatory. I will send more comments once I check the {{ContainerScheduler}}. Also, let's stress-test the code in a cluster before committing to make sure everything is good. I can help with that. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653359#comment-15653359 ] Karthik Kambatla commented on YARN-4597: [~asuresh] - could you hold off until Friday evening? I ll try and get to it by then. Thanks. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652246#comment-15652246 ] Konstantinos Karanasos commented on YARN-4597: -- Hi [~jianhe]. Yes, I will check the patch today. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652237#comment-15652237 ] Jian He commented on YARN-4597: --- the patch looks good to me, is [~kkaranasos] going to take a look ? > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651938#comment-15651938 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 55s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 29s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 24s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 2s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 58s{color} | {color:orange} root: The patch generated 15 new + 1053 unchanged - 16 fixed = 1068 total (was 1069) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 30s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-mapreduce-client-jobclient in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15651248#comment-15651248 ] Arun Suresh commented on YARN-4597: --- The new test failures are caused by an issue with YARN-5833. Kicking it off again.. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650971#comment-15650971 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 29s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 29s{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 56s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 13m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 4s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 25s{color} | {color:orange} root: The patch generated 15 new + 1052 unchanged - 16 fixed = 1067 total (was 1068) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 29s{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} hadoop-mapreduce-client-jobclient in
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650189#comment-15650189 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} YARN-4597 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-4597 | | GITHUB PR | https://github.com/apache/hadoop/pull/143 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13841/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650178#comment-15650178 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-4597 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-4597 | | GITHUB PR | https://github.com/apache/hadoop/pull/143 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13839/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649313#comment-15649313 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} YARN-4597 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-4597 | | GITHUB PR | https://github.com/apache/hadoop/pull/143 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13834/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649229#comment-15649229 ] Jian He commented on YARN-4597: --- I see, sounds reasonable. bq. I feel a better name for 'shouldLaunchContainer' should have been 'containerAlreadyLaunched'. btw. agree with this, in case you would like to change it. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649168#comment-15649168 ] Arun Suresh commented on YARN-4597: --- Hmmm.. I agree it looks equivalent. But what if the 'isMarkedForKill' flag is set on the container object by the scheduler in between the time the ContainerLaunch thread is running the 'ContainerLaunch::launchContainer()' method and the 'ContainerLaunch::handleContainerExitCode()'. In that case, the CONTAINER_KILLED_ON_REQUEST event will not be sent. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649082#comment-15649082 ] Jian He commented on YARN-4597: --- I see.. thanks for looking into it.. I guess the killedBeforeStart flag could be avoided, by calling "container.isMarkedToKill()" directly in the handleContainerExitCode method like below ? {code} if (!container.isMarkedToKill()) { dispatcher.getEventHandler().handle( new ContainerExitEvent(containerId, ContainerEventType.CONTAINER_KILLED_ON_REQUEST, exitCode, diagnosticInfo.toString())); } {code} > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648992#comment-15648992 ] Arun Suresh commented on YARN-4597: --- [~jianhe], tried another round at refactoring out the 'killedBeforeStart' flag from the ContainerLaunch. I now feel, it should not be clubbed with the 'shouldLaunchContainer' flag. They signify very different states: the former is used to ensure the container is never started while the latter is used to capture the event that the container has already been launch, and thereby 'should not be launched again'. I feel a better name for 'shouldLaunchContainer' should have been 'containerAlreadyLaunched'. I have raised YARN-5860 and YARN-5861 to track the remaining tasks need to get this truly feature complete. If [~jianhe] is fine with the current state of this patch, I would like to check this in. It would be nice though if [~kkaranasos] also takes a look at this to verify if I havn't missed anything in the {{ContainerScheduler}} functionality vis-a-vis the {{QueuingContainerManagerImpl}} > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648821#comment-15648821 ] Jian He commented on YARN-4597: --- bq. I tried to reuse shouldLaunchContainer for the killedBeforeStart flag, but I see too many test failures, since a lot of the code expects the CONTAINER_LAUNCH event to also be raised. If its ok with you, I will give it another look once we converge on everything else. Sure, other things look good to me. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15647435#comment-15647435 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 3m 47s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 50s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 22s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 17s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 55s{color} | {color:orange} root: The patch generated 12 new + 1049 unchanged - 14 fixed = 1061 total (was 1063) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 40s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 31s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646647#comment-15646647 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 30s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 35s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 56s{color} | {color:orange} root: The patch generated 11 new + 1050 unchanged - 16 fixed = 1061 total (was 1066) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 235 unchanged - 1 fixed = 235 total (was 236) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645994#comment-15645994 ] Arun Suresh commented on YARN-4597: --- bq. I also think that should be changed for existing code. The code for sending container_launched event should be inside the 'else' block when really launching the container. Aah.. I was worried I would break something else If I moved it inside the else. If it is fine to modify the existing path, I'll use the 'shouldLaunchContainer' flag.. Thanks. bq. It will call 'storeContainerQueued' unconditionally, even though the container is not queued when it reached the queue-len limit. shouldn't we not call storeContainerQueued in that case. Got it.. Yup, will fix it. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15645979#comment-15645979 ] Jian He commented on YARN-4597: --- bq. scheduledToRunContainers sounds good bq. Using shouldLaunchContainer also causes the CONTAINER_LAUNCH event to be fired which I do not want. I think the shouldLaunchContainer is also used for in case container killed has been called, skip launching the container. (see the code comment in below code: "// Check if the container is signalled to be killed." ). In terms of the additional event, I also think that should be changed for existing code. The code for sending container_launched event should be inside the 'else' block when really launching the container. {code} // LaunchContainer is a blocking call. We are here almost means the // container is launched, so send out the event. dispatcher.getEventHandler().handle(new ContainerEvent( containerId, ContainerEventType.CONTAINER_LAUNCHED)); context.getNMStateStore().storeContainerLaunched(containerId); // Check if the container is signalled to be killed. if (!shouldLaunchContainer.compareAndSet(false, true)) { LOG.info("Container " + containerId + " not launched as " + "cleanup already called"); return ExitCode.TERMINATED.getExitCode(); {code} bq. Once queue limit is reached, no new opportunistic containers should also be queued. The AM is free to request it again. The MRAppMaster, for eg. re-requests the same task as a GUARANTEED container. sorry for unclear, I meant below code. It will call 'storeContainerQueued' unconditionally, even though the container is not queued when it reached the queue-len limit. shouldn't we not call storeContainerQueued in that case ? {code} { try { this.context.getNMStateStore().storeContainerQueued( container.getContainerId()); } catch (IOException e) { LOG.warn("Could not store container state into store..", e); } LOG.info("No available resources for container {} to start its execution " + "immediately.", container.getContainerId()); if (container.getContainerTokenIdentifier().getExecutionType() == ExecutionType.GUARANTEED) { queuedGuaranteedContainers.put(container.getContainerId(), container); // Kill running opportunistic containers to make space for // guaranteed container. killOpportunisticContainers(container); } else { if (queuedOpportunisticContainers.size() <= maxOppQueueLength) { LOG.info("Opportunistic container {} will be queued at the NM.", container.getContainerId()); queuedOpportunisticContainers.put( container.getContainerId(), container); } else { LOG.info("Opportunistic container [{}] will not be queued at the NM" + "since max queue length [{}] has been reached", container.getContainerId(), maxOppQueueLength); container.sendKillEvent( ContainerExitStatus.KILLED_BY_CONTAINER_SCHEDULER, "Opportunistic container queue is full."); } } } {code} > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623825#comment-15623825 ] Jian He commented on YARN-4597: --- Few more comments - maybe rename ContainerScheduler#runningContainers to scheduledContainers - The ContainerLaunch#killedBeforeStart flag, looks like the exising flag 'shouldLaunchContainer' serves the same purpose, can we reuse that ? if so, the container#isMarkedToKill is also not needed. - NodeManager#containerScheduler variable not used, remove - I think this comment is not addressed ? "In case we exceed the max-queue length, we are killing the container directly instead of queueing the container, in this case, we should not store the container as queued?" > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602841#comment-15602841 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 17s {color} | {color:red} root generated 2 new + 701 unchanged - 2 fixed = 703 total (was 703) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 40s {color} | {color:red} root: The patch generated 7 new + 1018 unchanged - 15 fixed = 1025 total (was 1033) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 236 unchanged - 1 fixed = 236 total (was 237) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | |
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599872#comment-15599872 ] Arun Suresh commented on YARN-4597: --- The test failures are unrelated to the patch. Will fix the javadoc and checkstyle shortly. Please note: Container Increase/Decrease does not currently go thru the Scheduler. Think it is better working on that in a subsequent patch, once we are in agreement over this patch. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599619#comment-15599619 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 39s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 30s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 8m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 8m 19s {color} | {color:red} root generated 2 new + 700 unchanged - 2 fixed = 702 total (was 702) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 54s {color} | {color:red} root: The patch generated 36 new + 1017 unchanged - 15 fixed = 1053 total (was 1032) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 236 unchanged - 1 fixed = 236 total (was 237) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 36s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | |
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596564#comment-15596564 ] Jian He commented on YARN-4597: --- bq. Will rerun and see if they are required… but in anycase, having them there should be harmless right? If the events should not happen in theory, I think we should not add it. otherwise, it could possibly hide bugs..we should fix the source to not send the events, instead of ignoring the events at receiver. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596462#comment-15596462 ] Jian He commented on YARN-4597: --- bq. the current default one only takes into account actual resource allocated to containers..We assumed that resources requested by a container is constant, oh, I thought it's checking for the dynamically changing vmem, it's actually only checking for the max-limit. then we are fine I think. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593945#comment-15593945 ] Arun Suresh commented on YARN-4597: --- [~jianhe], thanks again for taking a look. bq. I think there might be some behavior change or bug for scheduling guaranteed containers when the oppotunistic-queue is enabled. Previously, when launching container, NM will not check for current vmem usage, and cpu usage. It assumes what RM allocated can be launched. Now, NM will check these limits and won't launch the container if hits the limit. Yup, we do a *hasResources* check only at the start of a container and when a container is killed. We assumed that resources requested by a container is constant, essentially we considered only actual *allocated* resources which we assume will not varying during the lifetime of the container... which implies, there is no point in checking this at any other time other than start and kill of containers. But like you stated, if we consider container resource *utilization*, based on the work [~kasha] is doing in YARN-1011, then yes, we should have a timer thread that periodically checks the vmem and cpu usage and starts (and kills) containers based on that. bq. the ResourceUtilizationManager looks like only incorporated some utility methods, not sure how we will make this pluggable later. Following on my point above, the idea was to have a {{ResourceUtilizationManager}} that can provide a different value of {{getCurrentUtilization}}, {{addResource}} and {{subtractResource}} which is used by the ContainerScheduler to calculate the resources to free up. For instance, the current default one only takes into account actual resource *allocated* to containers... for YARN-1011, we might replace that with the resource *utilized* by running containers, and provide a different value for {{getCurrentUtilization}}. The timer thread I mentioned in the previous point, which can be apart of this new ResourceUtilizationManager, can send events to the scheduler to re-process queued containers when utilization has changed. bq. The logic to select opportunisitic container: we may kill more opportunistic containers than required. e.g... Good catch, in the {{resourcesToFreeUp}}, I needed to decrement any already-marked-for-kill opportunistic container. It was there earlier, Had removed it when I was testing something, but forgot to put it back :) bq. we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized Yup, It isnt required. Varun did point out the same.. I thought I had fixed it, think I might have missed 'git add'ing the change w.r.t Adding the new transitions, I was seeing some error messages in some testcases. Will rerun and see if they are required… but in anycase, having them there should be harmless right? The rest of your comments makes sense.. will address them shortly. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593320#comment-15593320 ] Jian He commented on YARN-4597: --- [~asuresh], some more questions and comments on the patch: - why are these two transitions added? {code} .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP) .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_LAUNCHED) .addTransition(ContainerState.DONE, ContainerState.DONE, {code} - the storeContainerKilled will be called in ContainerLaunch#cleanupContainer later, we don't need to call it here? {code} public void sendKillEvent(int exitStatus, String description) { try { context.getNMStateStore().storeContainerKilled(containerId); } catch (IOException ioe) { LOG.error("Could not log container state change to state store..", ioe); } {code} - remove unused imports in ContainersMonitor.java - remove ContainersMonitorImpl#allocatedCpuUsage unused method - why do you need to add the additional check for SCHEDULED state ? {code} // Process running containers if (remoteContainer.getState() == ContainerState.RUNNING || remoteContainer.getState() == ContainerState.SCHEDULED) { {code} - why this test needs to be changed? {code} testGetContainerStatus(container, i, EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED), "", {code} - similarly here in TestNodeManagerShutdown, we still need to change the test to make sure the container reaches to running state? {code} Assert.assertTrue( EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED) .contains(containerStatus.getState())); {code} - why do we need to change the test to run for 10 min ? {code} @Test(timeout = 60) public void testAMRMClient() throws YarnException, IOException { {code} - unreleated to this patch: should ResourceUtilization#pmem,vmem be changed to long type ? we had specifically changed it for Resource object - we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized {code} synchronized (currentUtilization) { {code} - In case we exceed the max-queue length, we are killing the container directly instead of queueing the container, in this case, we should not store the container as queued? {code} try { this.context.getNMStateStore().storeContainerQueued( container.getContainerId()); } catch (IOException e) { LOG.warn("Could not store container state into store..", e); } {code} - The ResourceUtilizationManager looks like only incorporated some utility methods, not sure how we will make this pluggable later.. - I think there might be some behavior change or bug for scheduling guaranteed containers when the oppotunistic-queue is enabled. -- Previously, when launching container, NM will not check for current vmem usage, and cpu usage. It assumes what RM allocated can be launched. -- Now, NM will check these limits and won't launch the container if hits the limit. -- Suppose the guarateed container hits the limit, it will be queued into queuedGuaranteedContainers. And this container will never be launched until one other container finishes which triggers the code path, even if it's not hitting these limits any more. This is a problem especially when other containers are long running container and never finishes. - The logic to select opportunisitic container: we may kill more opportunistic containers than required. e.g. -- one guarateed container comes, we select one opportunistic container -- before the selected opportunistic container is killed, another guarateed comes, then we will select two opportunistic containers to kill -- The process repeats, we may end up killing more opportunistic containers than required. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584462#comment-15584462 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 19 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 29s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 29s {color} | {color:red} root generated 2 new + 700 unchanged - 2 fixed = 702 total (was 702) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 45s {color} | {color:red} root: The patch generated 37 new + 958 unchanged - 14 fixed = 995 total (was 972) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 236 unchanged - 1 fixed = 236 total (was 237) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} |
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584119#comment-15584119 ] ASF GitHub Bot commented on YARN-4597: -- GitHub user xslogic opened a pull request: https://github.com/apache/hadoop/pull/143 YARN-4597: initial commit You can merge this pull request into a Git repository by running: $ git pull https://github.com/xslogic/hadoop YARN-4597 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/143.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #143 commit 44f42805c78a2a605889e5ff757a9996525b99e5 Author: Arun SureshDate: 2016-10-18T01:49:53Z YARN-4597: initial commit > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571974#comment-15571974 ] Varun Vasudev commented on YARN-4597: - Thanks for the patch [~asuresh]. 1) I agree with [~kasha] on {quote}The methods for killing containers as needed all seem to be hardcoded to only consider allocated resources. Can we abstract it out further to allow for passing either allocation or utilization based on whether oversubscription is enabled.{quote} At some point, people will want to be able to plug in policies to decide which containers to kill. However, I wouldn't hold up the patch for it. 2) The changes to BaseContainerManagerTest.java seem unnecessary. 3) Can you explain why we need the synchronized block here - {code} + synchronized (this.containersAllocation) {code} > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569508#comment-15569508 ] Karthik Kambatla commented on YARN-4597: Thanks for working on this, [~asuresh]. The approach looks reasonable to me. The patch is pretty big; it might be easier to use Github PR or RB for a thorough review - especially for minor comments. High-level comments on the patch, keeping YARN-1011 work in mind: # Really like that we are getting rid of QueuingContainer* classes, letting both guaranteed/queued containers go through the same code path # In ContainerScheduler, I see there are two code paths leading to starting a container for when enough resources are available or not. Did you consider a single path where we queue containers directly and let another thread launch them? This thread could be triggered immediately on queuing a container, on completion of a container, and periodically for cases where we do resource oversubscription. # The methods for killing containers as needed all seem to be hardcoded to only consider allocated resources. Can we abstract it out further to allow for passing either allocation or utilization based on whether oversubscription is enabled. # Relatively minor: resourcesToFreeUp is initialized to container allocation on the node. Shouldn't it be initialized to zero? May be I am missing something. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569200#comment-15569200 ] Arun Suresh commented on YARN-4597: --- What i meant was, the current patch does not need synchronized collections ( unlike the {{QueuingContainerManager}} ), since it runs on the same thread as the ContainerManager's AsyncDispatcher. I will update the patch with using a linkedblockingqueue shortly. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569147#comment-15569147 ] Jian He commented on YARN-4597: --- bq. This will preserve the serial nature of operation (and thereby keep the code simple by not needing synchronized collections) I don't actually see a synchronized collection in the ContainerScheduler. I agree with the point that this could avoid holding up the main containerManager thread. we can have ContainerScheduler create its own dispatcher. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568792#comment-15568792 ] Arun Suresh commented on YARN-4597: --- Thanks for taking a look [~jianhe], bq. Wondering why KillWhileExitingTransition is added.. I had put it in there for debugging something... Left it there since it thought its harmless... but, yeah looks like it does over-ride the exitcode. Will remove it. Good catch. * w.r.t {{ContainerState#SCHEDULED}} : Actually, I think we should expose this. We currently club NEW, LOCALIZING, LOCALIZED etc. into RUNNING, but the container is actually not running, and is thus misleading. SCHEDULED implies that some of the containers dependencies (resources for localization + some internal queuing/scheduling policy) have not yet been met. Prior to this, YARN-2877 had introduced the QUEUED return state. This would be visible to applications, if Queuing was enabled. This patch technically just renames QUEUED to SCHEDULED. Also, all containers will go thru the SCHEDULED state, not just the opportunistic ones (although, for guaranteed containers this will just be a pass-thru state) Another thing I was hoping for some input was, currently, the {{ContainerScheduler}} runs in the same thread as the ContainerManager's AsyncDispatcher started by the ContainerManager. Also, the Scheduler is triggered only by events. I was wondering if there is any merit pushing these events into a blocking queue as they arrive and have a separate thread take care of them. This will preserve the serial nature of operation (and thereby keep the code simple by not needing synchronized collections) and will not hold up the dispatcher from delivering other events while the scheduler is scheduling. A minor disadvantage, is that the NM will probably consume a thread that for the most part will be blocked on the queue. This thread could be used by one of the containers. > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567053#comment-15567053 ] Jian He commented on YARN-4597: --- Glanced through the patch, few comments: - Wondering why KillWhileExitingTransition is added: It will override the exitcode of the container. I feel in this case we should preserve the previous exitcode, because the container failed/succeeded first before the kill event. Or maybe we don't even need this transition, as the behavior is that the kill is ignored. - NodeManager#containerScheduler, NMContext#containerScheduler variable is not used, we can remove. - I think it's fine to not expose the ContainerState#SCHEDULED state to the user, this state is mostly internal-facing and transient. User may not care which intermediate state the container is at. {code} case SCHEDULED: return org.apache.hadoop.yarn.api.records.ContainerState.SCHEDULED; {code} > Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560751#comment-15560751 ] Hadoop QA commented on YARN-4597: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 15 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 1s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 1s {color} | {color:red} root generated 3 new + 705 unchanged - 3 fixed = 708 total (was 708) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 4s {color} | {color:red} root: The patch generated 35 new + 782 unchanged - 12 fixed = 817 total (was 794) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 58s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 239 unchanged - 1 fixed = 239 total (was 240) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 34s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 38m 40s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit