[jira] [Created] (YARN-7212) [Atsv2] TimelineSchemaCreator fails to create flowrun table causes RegionServer down!
Rohith Sharma K S created YARN-7212:
---
Summary: [Atsv2] TimelineSchemaCreator fails to create flowrun table causes RegionServer down!
Key: YARN-7212
URL: https://issues.apache.org/jira/browse/YARN-7212
Project: Hadoop YARN
Issue Type: Bug
Reporter: Rohith Sharma K S

*HBase-2.0* officially supports *hadoop-alpha* compilations, so I was trying to build and test with HBase-2.0. However, table schema creation fails and causes the RegionServer to shut down with the following error:
{noformat}
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.Tag.asList([BII)Ljava/util/List;
	at org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.getCurrentAggOp(FlowScanner.java:250)
	at org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.nextInternal(FlowScanner.java:226)
	at org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.next(FlowScanner.java:145)
	at org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:132)
	at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
	at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:973)
	at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2252)
	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2672)
{noformat}
Since the HBase-2.0 community is getting ready to release Hadoop-3.x-compatible versions, ATSv2 also needs to support HBase-2.0. For this, we need to take up the task of testing and validating HBase-2.0 issues!

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
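The NoSuchMethodError above is a binary-compatibility failure: FlowScanner was compiled against the HBase 1.x-era static method Tag.asList(byte[], int, int), which no longer exists in HBase-2.0. A minimal, hedged probe (assuming only the class and method names visible in the stack trace; this is not ATSv2 code) that detects the missing method up front via reflection, instead of failing later inside a region flush:

```java
// Hedged sketch: check at runtime whether the old static
// Tag.asList(byte[], int, int) is present on the classpath, rather than
// letting a coprocessor die with NoSuchMethodError mid-flush.
public class TagApiProbe {

    /** Returns true only if the HBase 1.x Tag.asList(byte[], int, int) exists. */
    public static boolean hasOldTagAsList() {
        try {
            Class<?> tagClass = Class.forName("org.apache.hadoop.hbase.Tag");
            tagClass.getMethod("asList", byte[].class, int.class, int.class);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Either HBase is not on the classpath, or it is HBase >= 2.0
            // where this static method was removed from the Tag API.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("old Tag.asList available: " + hasOldTagAsList());
    }
}
```

In a JVM without the old HBase classes this returns false, which is exactly the situation the RegionServer hits at FlowScanner.getCurrentAggOp.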
[jira] [Created] (YARN-7213) [Atsv2] Test and validate HBase-2.0 with Atsv2
Rohith Sharma K S created YARN-7213:
---
Summary: [Atsv2] Test and validate HBase-2.0 with Atsv2
Key: YARN-7213
URL: https://issues.apache.org/jira/browse/YARN-7213
Project: Hadoop YARN
Issue Type: Task
Reporter: Rohith Sharma K S

HBase-2.0 officially supports hadoop-alpha compilations, so this JIRA is to keep track of HBase-2.0 integration issues.
[jira] [Updated] (YARN-7213) [Atsv2] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith Sharma K S updated YARN-7213:
Summary: [Atsv2] Test and validate HBase-2.0.x with Atsv2 (was: [Atsv2] Test and validate HBase-2.0 with Atsv2)
[jira] [Updated] (YARN-7213) [Atsv2] Test and validate HBase-2.0.x with Atsv2
[ https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith Sharma K S updated YARN-7213:
Description: HBase-2.0.x officially supports hadoop-alpha compilations. They are also getting ready for the Hadoop-beta release, so that HBase can release versions compatible with Hadoop-beta. This JIRA is to keep track of HBase-2.0 integration issues. (was: Hbase-2.0 officially support hadoop-alpha compilations. So, this JIRA is to keep track of HBase-2.0 integration issues.)
[jira] [Created] (YARN-7214) duplicated container completed To AM
zhangshilong created YARN-7214:
--
Summary: duplicated container completed To AM
Key: YARN-7214
URL: https://issues.apache.org/jira/browse/YARN-7214
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 3.0.0-alpha3, 2.7.1
Environment: hadoop 2.7.1, RM recovery and NM recovery enabled
Reporter: zhangshilong

env: hadoop 2.7.1 with RM recovery and NM recovery enabled
case: a Spark app (app1) is running at least one container (named c1) on NM1.
1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
2. The RM removes all containers on NM1 (RMNodeImpl), and app1 receives a c1-completed message. But the RM cannot send c1 (to be removed) to NM1, because NM1 is lost.
3. NM1 restarts and registers with the RM (c1 is in the register request), but the RM sees NM1 as lost and does not handle containers from NM1.
4. NM1 does not include c1 in its heartbeats, so c1 is never removed from NM1's context.
5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and recovered, and the RM sends a c1-completed message to app1's AM again. So app1 receives a duplicated c1.
Once the Spark AM receives a container-completed event from the RM, it allocates one new container.
[jira] [Commented] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171270#comment-16171270 ]

zhangshilong commented on YARN-7214:

3.
{code:java}
public static class AddNodeTransition implements
    SingleArcTransition<RMNodeImpl, RMNodeEvent> {

  @Override
  public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
    // Inform the scheduler
    RMNodeStartedEvent startEvent = (RMNodeStartedEvent) event;
    List<NMContainerStatus> containers = null;

    NodeId nodeId = rmNode.nodeId;
    RMNode previousRMNode =
        rmNode.context.getInactiveRMNodes().remove(nodeId);
    if (previousRMNode != null) {
      rmNode.updateMetricsForRejoinedNode(previousRMNode.getState());
    } else {
      NodeId unknownNodeId =
          NodesListManager.createUnknownNodeId(nodeId.getHost());
      previousRMNode =
          rmNode.context.getInactiveRMNodes().remove(unknownNodeId);
      if (previousRMNode != null) {
        ClusterMetrics.getMetrics().decrDecommisionedNMs();
      }
      // Increment activeNodes explicitly because this is a new node.
      ClusterMetrics.getMetrics().incrNumActiveNodes();
      containers = startEvent.getNMContainerStatuses();
      if (containers != null && !containers.isEmpty()) {
        for (NMContainerStatus container : containers) {
          if (container.getContainerState() == ContainerState.RUNNING
              || container.getContainerState() == ContainerState.SCHEDULED) {
            rmNode.launchedContainers.add(container.getContainerId());
          }
        }
      }
    }

    if (null != startEvent.getRunningApplications()) {
      for (ApplicationId appId : startEvent.getRunningApplications()) {
        handleRunningAppOnNode(rmNode, rmNode.context, appId, rmNode.nodeId);
      }
    }

    rmNode.context.getDispatcher().getEventHandler()
        .handle(new NodeAddedSchedulerEvent(rmNode, containers));
    rmNode.context.getDispatcher().getEventHandler().handle(
        new NodesListManagerEvent(
            NodesListManagerEventType.NODE_USABLE, rmNode));
  }
}
{code}

4. In NodeStatusUpdaterImpl.java, getNMContainerStatuses is called before register, so the completed container is put into recentlyStoppedContainers, and the completed containers are sent to the RM in the register request.
{code:java}
public void addCompletedContainer(ContainerId containerId) {
  synchronized (recentlyStoppedContainers) {
    removeVeryOldStoppedContainersFromCache();
    if (!recentlyStoppedContainers.containsKey(containerId)) {
      recentlyStoppedContainers.put(containerId,
          System.currentTimeMillis() + durationToTrackStoppedContainers);
    }
  }
}
{code}
In a normal heartbeat, getContainerStatuses is called, so a completed container is not put into containerStatuses, because it is in recentlyStoppedContainers. So the completed container is not sent to the RM.
{code:java}
protected List<ContainerStatus> getContainerStatuses() throws IOException {
  List<ContainerStatus> containerStatuses = new ArrayList<ContainerStatus>();
  for (Container container : this.context.getContainers().values()) {
    ContainerId containerId = container.getContainerId();
    ApplicationId applicationId = containerId.getApplicationAttemptId()
        .getApplicationId();
    org.apache.hadoop.yarn.api.records.ContainerStatus containerStatus =
        container.cloneAndGetContainerStatus();
    if (containerStatus.getState() == ContainerState.COMPLETE) {
      if (isApplicationStopped(applicationId)) {
        if (LOG.isDebugEnabled()) {
          LOG.debug(applicationId + " is completing, " + " remove "
              + containerId + " from NM context.");
        }
        context.getContainers().remove(containerId);
        pendingCompletedContainers.put(containerId, containerStatus);
      } else {
        if (!isContainerRecentlyStopped(containerId)) {
          pendingCompletedContainers.put(containerId, containerStatus);
        }
      }
      // Adding to finished containers cache. Cache will keep it around at
      // least for #durationToTrackStoppedContainers duration. In the
      // subsequent call to stop container it will get removed from cache.
      addCompletedContainer(containerId);
    } else {
      containerStatuses.add(containerStatus);
    }
  }
  containerStatuses.addAll(pendingCompletedContainers.values());
  if (LOG.isDebugEnabled()) {
    LOG.debug("Sending out " + containerStatuses.size()
        + " container statuses: " + containerStatuses);
  }
  return containerStatuses;
}
{code}
[jira] [Commented] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171322#comment-16171322 ]

zhangshilong commented on YARN-7214:

In my opinion, containers in recentlyStoppedContainers can be removed from the NM context once the NM heartbeats normally with the RM.
[jira] [Updated] (YARN-6916) Moving logging APIs over to slf4j in hadoop-yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-6916:
---
Attachment: YARN-6916.005.patch

+1 LGTM. Attaching rebased patch again

> Moving logging APIs over to slf4j in hadoop-yarn-server-common
> --
>
> Key: YARN-6916
> URL: https://issues.apache.org/jira/browse/YARN-6916
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Attachments: YARN-6712.01.patch, YARN-6916.002.patch,
> YARN-6916.003.patch, YARN-6916.004.patch, YARN-6916.005.patch
>
[jira] [Commented] (YARN-6878) TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the args to assertEqual() in the wrong order
[ https://issues.apache.org/jira/browse/YARN-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171360#comment-16171360 ]

Sen Zhao commented on YARN-6878:

The failed tests are unrelated. [~templedf], can you give me some advice about the latest patch?

> TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the
> args to assertEqual() in the wrong order
> --
>
> Key: YARN-6878
> URL: https://issues.apache.org/jira/browse/YARN-6878
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, test
> Affects Versions: 3.0.0-alpha4
> Reporter: Daniel Templeton
> Assignee: Sen Zhao
> Priority: Trivial
> Labels: newbie
> Attachments: YARN-6878.001.patch, YARN-6878.002.patch,
> YARN-6878.003.patch
>
>
> The expected value should come before the actual value. It would be nice to
> add some assert messages as well.
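For context, the convention the issue title refers to is that JUnit's assertEquals takes the expected value first and the actual value second; swapping them produces a misleading failure message. A toy illustration of why the order matters (a homemade message formatter, not the JUnit implementation):

```java
// Minimal model of the JUnit-style failure message: the first argument is
// reported as "expected" and the second as "actual". If a test passes them
// in the wrong order, a failure blames the wrong value.
public class AssertOrder {

    /** Build the failure message assertEquals-style tools print. */
    public static String failureMessage(Object expected, Object actual) {
        return "expected:<" + expected + "> but was:<" + actual + ">";
    }

    public static void main(String[] args) {
        // Correct order: configured value is expected, observed value is actual.
        System.out.println(failureMessage("labelA", "labelB"));
        // Swapped order reports the observed value as "expected",
        // which is exactly the confusion YARN-6878 fixes.
        System.out.println(failureMessage("labelB", "labelA"));
    }
}
```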
[jira] [Updated] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangshilong updated YARN-7214:
---
Attachment: screenshot-1.png
[jira] [Issue Comment Deleted] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rangjiaheng updated YARN-7214:
--
Comment: was deleted

(was: aa )
[jira] [Commented] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171391#comment-16171391 ]

rangjiaheng commented on YARN-7214:
---

aa
[jira] [Commented] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171396#comment-16171396 ]

zhangshilong commented on YARN-7214:

!screenshot-1.png!
Generally:
1. The NM completes a container (c) and reports it to the RM.
2. The RM sends c to the AM, telling the AM that c is completed.
3. The RM sends c to the NM, telling the NM that c can be removed from the NM.
If the RM restarts before step 3, c stays in the NM context forever. If the RM restarts again, c is sent to the AM as a duplicated completed container.
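The three-step flow in the comment above can be sketched as a toy model (plain Java collections, not actual YARN classes) showing how skipping the ack in step 3 yields a duplicate completion event at the AM after an RM restart:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the completion-ack protocol described in the comment:
// NM reports a completed container, RM notifies the AM, RM acks the NM so the
// NM can drop the container from its context. If the RM restarts before the
// ack, the container survives in the NM context and is re-reported on
// re-registration, so the AM sees the same completion twice.
public class CompletionAckModel {

    static List<String> runScenario(boolean rmRestartsBeforeAck) {
        Set<String> nmContext = new HashSet<>();   // completed containers the NM still tracks
        List<String> amEvents = new ArrayList<>(); // completion events delivered to the AM

        // Step 1: NM finishes c1 and reports it. Step 2: RM notifies the AM.
        nmContext.add("c1");
        amEvents.add("c1 completed");

        if (!rmRestartsBeforeAck) {
            nmContext.remove("c1"); // Step 3: RM acks, NM drops c1 from its context
        }

        // RM restart: the NM re-registers with everything left in its context,
        // and the recovered RM notifies the AM for each recovered container.
        for (String c : nmContext) {
            amEvents.add(c + " completed");
        }
        return amEvents;
    }

    public static void main(String[] args) {
        System.out.println(runScenario(false)); // [c1 completed]
        System.out.println(runScenario(true));  // [c1 completed, c1 completed]
    }
}
```

With the ack delivered, the AM sees one completion; with the RM restarting before step 3, it sees two, which is what makes the Spark AM request an extra container.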
[jira] [Commented] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171402#comment-16171402 ]

rangjiaheng commented on YARN-7214:
---

We found this problem in a Spark streaming application, a long-running application with a fixed number of containers; after the NM was lost, the NM restarted and the RM restarted, one more container was allocated.
[jira] [Comment Edited] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171402#comment-16171402 ]

rangjiaheng edited comment on YARN-7214 at 9/19/17 9:32 AM:

We found this problem in a Spark streaming application, a long-running application with a fixed number of containers; after the NM was lost, the NM restarted and the RM restarted, a duplicated container was allocated.

was (Author: neomatrix):
We found this problem in Spark streaming application, a long-running application, which has fixed number of containers; after NM lost, NM restarted and RM restarted, a more container were allocated.
[jira] [Comment Edited] (YARN-7214) duplicated container completed To AM
[ https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171396#comment-16171396 ]

zhangshilong edited comment on YARN-7214 at 9/19/17 9:38 AM:
-

!screenshot-1.png!
Generally:
1. The NM completes a container (c) and reports it to the RM.
2. The RM sends c to the AM, telling the AM that c is completed.
3. The RM sends c to the NM, telling the NM that c can be removed from the NM.
If the RM restarts before step 3, c is sent to the AM as a duplicated completed container.

was (Author: zsl2007):
!screenshot-1.png! generally, 1. NM complete one container(c) and send to RM 2. RM sent c to AM, tell AM c is completed. 3. RM sent c to NM, tell NM c can be removed from NM. If RM restart before step 3, c will be in context of NM for ever. If RM restart again, c will be duplicated container completed to AM.
[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted
[ https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171520#comment-16171520 ]

Sen Zhao commented on YARN-7001:

Hi, [~miklos.szeg...@cloudera.com]. I would like to try this issue, and I will submit a patch.

> If shared cache upload is terminated in the middle, the temp file will never
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Miklos Szegedi
>
> There is a missing deleteTempFile(tempPath);
> {code}
> tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
> if (!uploadFile(actualPath, tempPath)) {
>   LOG.warn("Could not copy the file to the shared cache at " + tempPath);
>   return false;
> }
> {code}
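The fix the quoted snippet implies can be sketched as follows. This is a hedged sketch, not the actual shared-cache uploader code: uploadFile and the paths are stand-ins, and the point is only that the failure path deletes the temporary file instead of leaking it.

```java
import java.io.File;
import java.io.IOException;

// Sketch of "delete the temp file when the copy fails (or throws)".
// uploadFile here is a stand-in that always fails, simulating an upload
// terminated in the middle.
public class TempFileCleanup {

    static boolean uploadFile(File src, File tmp) {
        return false; // simulate a failed/interrupted upload
    }

    /** Upload with cleanup: the temp file never survives a failed upload. */
    public static boolean uploadWithCleanup(File actualPath, File tempPath) {
        try {
            if (!uploadFile(actualPath, tempPath)) {
                tempPath.delete(); // the missing deleteTempFile(tempPath)
                return false;
            }
            return true;
        } catch (RuntimeException e) {
            tempPath.delete(); // also clean up when the upload dies mid-way
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("shared-cache-", ".tmp");
        boolean ok = uploadWithCleanup(new File("ignored"), tmp);
        System.out.println("uploaded=" + ok + " tempLeftBehind=" + tmp.exists());
    }
}
```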
[jira] [Updated] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted
[ https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sen Zhao updated YARN-7001:
---
Attachment: YARN-7001.001.patch
[jira] [Assigned] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted
[ https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sen Zhao reassigned YARN-7001:
--
Assignee: Sen Zhao
[jira] [Commented] (YARN-6878) TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the args to assertEqual() in the wrong order
[ https://issues.apache.org/jira/browse/YARN-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171671#comment-16171671 ]

Daniel Templeton commented on YARN-6878:

LGTM +1
[jira] [Commented] (YARN-6991) "Kill application" button does not show error if other user tries to kill the application for secure cluster
[ https://issues.apache.org/jira/browse/YARN-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171804#comment-16171804 ]

Sunil G commented on YARN-6991:
---

Patch looks fine to me.

> "Kill application" button does not show error if other user tries to kill the
> application for secure cluster
> 
>
> Key: YARN-6991
> URL: https://issues.apache.org/jira/browse/YARN-6991
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Sumana Sathish
> Assignee: Suma Shivaprasad
> Attachments: YARN-6991.001.patch, YARN-6991.002.patch,
> YARN-6991.003.patch
>
>
> 1. Submit an application as user 1.
> 2. Log into the RM UI as user 2.
> 3. Kill the application submitted by user 1.
> 4. Even though the application does not get killed, no error/info dialog box
> is shown to tell the user that they do not have permission to kill another
> user's application.
[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171878#comment-16171878 ] Daniel Templeton commented on YARN-7135: Looks like there were some whitespace issues. Try running _git diff --check_. Otherwise looks good. > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
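The two patterns quoted in the issue description can be made concrete with a minimal, runnable Java sketch (class and method names here are illustrative, not from the YARN code): the lock is acquired before the try block, so the finally clause only ever runs while the lock is actually held.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter = 0;

    // Correct order: acquire the lock *before* entering the try block.
    // If lock() failed with an unchecked exception, the finally clause
    // would never run, so unlock() would not be called on a lock we do
    // not actually hold.
    int increment() {
        lock.lock();
        try {
            return ++counter;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockOrderDemo demo = new LockOrderDemo();
        System.out.println(demo.increment()); // prints 1
    }
}
```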
[jira] [Updated] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-4266: -- Attachment: YARN-4266.004.patch [~jlowe], thanks for the review! bq. These comments don't match the code: Fixed the code to match the comments. bq. Should we handle ExitCodeException or other types of exceptions that might happen (e.g.: "no such user" type of errors) explicitly when running the id command so we can provide a better debug experience, or is the exception message enough info to debug issues? ContainerExecutionException doesn't have a constructor with both a string and a throwable, so I just removed the string part. That way it will correctly parse the information in the throwable that comes from the failed command. bq. Also I found it odd that getUserIdInfo and getGroupIdInfo take a parameter for the id command but these methods are highly dependent upon the "right" parameter being passed in order to function properly. They are each only called in one place, and IMHO there's no reason to make this parameterized given the parsing code needs the corresponding parameter to be correct. We should just remove the parameter and have it passed directly. Yep, good call. Removed the parameter and hardcoded "-u" and "-G" into the respective methods.
> Allow users to enter containers as UID:GID pair instead of by username > -- > > Key: YARN-4266 > URL: https://issues.apache.org/jira/browse/YARN-4266 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: luhuichun > Attachments: YARN-4266.001.patch, YARN-4266.001.patch, > YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, > YARN-4266-branch-2.8.001.patch > > > Docker provides a mechanism (the --user switch) that enables us to specify > the user the container processes should run as. We use this mechanism today > when launching docker containers . In non-secure mode, we run the docker > container based on > `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in > secure mode, as the submitting user. However, this mechanism breaks down with > a large number of 'pre-created' images which don't necessarily have the users > available within the image. Examples of such images include shared images > that need to be used by multiple users. We need a way in which we can allow a > pre-defined set of users to run containers based on existing images, without > using the --user switch. There are some implications of disabling this user > squashing that we'll need to work through : log aggregation, artifact > deletion etc., -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
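As background for the `id`-based approach discussed above, a minimal parsing sketch: `id -u <user>` prints a single numeric uid and `id -G <user>` prints the numeric gids separated by spaces, which together supply the `uid:gid` pair for Docker's `--user` switch. The helper names below are hypothetical; the actual methods in the patch are `getUserIdInfo`/`getGroupIdInfo`.

```java
public class IdOutputParser {
    // Hypothetical sketch of parsing `id` command output; the real YARN
    // helpers differ. `id -u` prints one numeric uid on a line.
    static String parseUid(String idDashUOutput) {
        return idDashUOutput.trim();
    }

    // `id -G` prints the numeric gids separated by spaces.
    static String[] parseGids(String idDashGOutput) {
        return idDashGOutput.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // The uid plus the primary gid is what a `--user uid:gid` flag needs.
        String uid = parseUid("1000\n");
        String[] gids = parseGids("1000 4 24\n");
        System.out.println(uid + ":" + gids[0]); // 1000:1000
    }
}
```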
[jira] [Updated] (YARN-7201) Add more sophisticated example YARN service
[ https://issues.apache.org/jira/browse/YARN-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7201: Attachment: YARN-7201.yarn-native-services.006.patch Correction to artifact image name. > Add more sophisticated example YARN service > --- > > Key: YARN-7201 > URL: https://issues.apache.org/jira/browse/YARN-7201 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He > Attachments: YARN-7201.yarn-native-services.001.patch, > YARN-7201.yarn-native-services.002.patch, > YARN-7201.yarn-native-services.003.patch, > YARN-7201.yarn-native-services.004.patch, > YARN-7201.yarn-native-services.005.patch, > YARN-7201.yarn-native-services.006.patch > > > We can show case the following capabilities in the YARN service examples: > # Description of the service > # Component dependencies > # How to mount HDFS volume via NFS Gateway > # Enable privileged container > # Quick link to web application > # Queue to submit > # Use private docker registry -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weiyuan updated YARN-7135: -- Attachment: YARN-7135.002.patch > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171994#comment-16171994 ] weiyuan commented on YARN-7135: --- [~templedf], thanks for your suggestion, I updated the patch again. > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171993#comment-16171993 ] Hadoop QA commented on YARN-7135: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-7135 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7135 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887928/YARN-7135.002.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17515/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weiyuan updated YARN-7135: -- Comment: was deleted (was: [~templedf], thanks for your suggestion, I updated the patch again. ) > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] weiyuan updated YARN-7135: -- Attachment: YARN-7135.003.patch > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch, > YARN-7135.003.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172005#comment-16172005 ] Hadoop QA commented on YARN-7135: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-7135 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7135 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887932/YARN-7135.003.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17516/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch, > YARN-7135.003.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172006#comment-16172006 ] Hadoop QA commented on YARN-4266: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 2s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 12s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 10s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 227 unchanged - 0 fixed = 232 total (was 227) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 27s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | | | hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-4266 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887908/YARN-4266.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 5620da87c15a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3a20deb | | Default Java | 1.8.0_144 | | findbugs | v3.1.0-RC1 | | findbugs | https://bu
[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted
[ https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172026#comment-16172026 ] Miklos Szegedi commented on YARN-7001: -- Thank you, [~Sen Zhao] for the patch. Could you add a unit test? > If shared cache upload is terminated in the middle, the temp file will never > be deleted > --- > > Key: YARN-7001 > URL: https://issues.apache.org/jira/browse/YARN-7001 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Sen Zhao > Attachments: YARN-7001.001.patch > > > There is a missing deleteTempFile(tempPath); > {code} > tempPath = new Path(directoryPath, getTemporaryFileName(actualPath)); > if (!uploadFile(actualPath, tempPath)) { > LOG.warn("Could not copy the file to the shared cache at " + > tempPath); > return false; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
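A self-contained sketch of the fix the report implies — deleting the temporary file on the failure path before returning. This uses plain `java.nio.file` stand-ins rather than the actual YARN uploader API, so names and structure here are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempFileCleanupDemo {
    // Simplified stand-in for the shared-cache uploader: when the copy to
    // the temporary path fails or is interrupted, the temp file must be
    // deleted before returning -- the deleteTempFile(tempPath) call the
    // report says is missing.
    static boolean uploadWithCleanup(Path tempPath, boolean uploadSucceeds)
            throws IOException {
        Files.createFile(tempPath);          // stand-in for the partial upload
        if (!uploadSucceeds) {
            Files.deleteIfExists(tempPath);  // cleanup on the failure path
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempDirectory("scm").resolve("upload.tmp");
        boolean ok = uploadWithCleanup(temp, false);
        // The failed upload leaves no temp file behind.
        System.out.println(ok + " " + Files.exists(temp)); // false false
    }
}
```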
[jira] [Commented] (YARN-6916) Moving logging APIs over to slf4j in hadoop-yarn-server-common
[ https://issues.apache.org/jira/browse/YARN-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172032#comment-16172032 ] Hadoop QA commented on YARN-6916: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red}499m 37s{color} | {color:red} Docker failed to build yetus/hadoop:tp-16710. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6916 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887852/YARN-6916.005.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17512/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Moving logging APIs over to slf4j in hadoop-yarn-server-common > -- > > Key: YARN-6916 > URL: https://issues.apache.org/jira/browse/YARN-6916 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: YARN-6712.01.patch, YARN-6916.002.patch, > YARN-6916.003.patch, YARN-6916.004.patch, YARN-6916.005.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6943) Update Yarn to YARN in documentation
[ https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172042#comment-16172042 ] Haibo Chen commented on YARN-6943: -- [~chetna] I have added you as a contributor to YARN. You should have the permission to submit a patch now. Let me know if you still cannot. > Update Yarn to YARN in documentation > > > Key: YARN-6943 > URL: https://issues.apache.org/jira/browse/YARN-6943 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Priority: Minor > Labels: newbie > > Based on the discussion with [~templedf] in YARN-6757 the official case of > YARN is YARN, not Yarn, so we should update all the md files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-6962: --- Attachment: YARN-6962.v2.patch v2 patch unit test added. > Add support for updateContainers when allocating using FederationInterceptor > > > Key: YARN-6962 > URL: https://issues.apache.org/jira/browse/YARN-6962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch > > > Container update is introduced in YARN-5221. Federation Interceptor needs to > support it when splitting (merging) the allocate request (response). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor
[ https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172060#comment-16172060 ] Hadoop QA commented on YARN-7034: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 24s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 38m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-7034 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887386/YARN-7034.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d560dae92c19 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 595d478 | | Default Java | 1.8.0_144 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/17517/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/17517/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/17517/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17517/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code
[ https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172065#comment-16172065 ] Wangda Tan commented on YARN-7135: -- [~v123582] / [~templedf], Apologies for my late response; I just checked some resources. From \[1\], {code} Assuming that lock is a ReentrantLock, then it makes no real difference, since lock() does not throw any checked exceptions. The Java documentation, however, leaves lock() outside the try block in the ReentrantLock example. The reason for this is that an unchecked exception in lock() should not lead to unlock() incorrectly being called. Whether correctness is a concern in the presence of an unchecked exception in lock() of all things, that is another discussion altogether. It is a good coding practice in general to keep things like try blocks as fine-grained as possible. {code} And you can also check that Java's ReentrantLock.lock() doesn't throw any exception; see \[2\]. I think updating all ReentrantLocks in the YARN RM package might be overkill and will potentially cause lots of conflicts when we want to do backports (unless we backport this patch to all active branches). Instead of doing this, I suggest keeping all ReentrantLocks as-is and only updating the lock pattern where necessary. \[1\] https://stackoverflow.com/questions/10868423/lock-lock-before-try \[2\] http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReentrantLock.html > Clean up lock-try order in common scheduler code > > > Key: YARN-7135 > URL: https://issues.apache.org/jira/browse/YARN-7135 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: weiyuan > Labels: newbie > Attachments: YARN-7135.001.patch, YARN-7135.002.patch, > YARN-7135.003.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... 
> } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
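The rationale quoted from \[1\] above can be demonstrated directly: with the try-before-lock ordering, a failure that happens before the lock is actually held still reaches the finally clause, and `unlock()` then throws `IllegalMonitorStateException`, masking the original error. A small runnable sketch (illustrative names, not YARN code):

```java
import java.util.concurrent.locks.ReentrantLock;

public class WrongOrderDemo {
    // Simulates the problematic try { lock.lock(); ... } finally pattern:
    // an unchecked exception thrown before the lock is held means the
    // finally clause calls unlock() on a lock this thread never acquired.
    static String failBeforeLock() {
        ReentrantLock lock = new ReentrantLock();
        String caught = "none";
        try {
            try {
                // stand-in for an unchecked exception around lock()
                throw new RuntimeException("failed before acquiring the lock");
            } finally {
                lock.unlock(); // lock was never held by this thread
            }
        } catch (IllegalMonitorStateException e) {
            // unlock()'s exception replaces the original RuntimeException
            caught = e.getClass().getSimpleName();
        }
        return caught;
    }

    public static void main(String[] args) {
        System.out.println(failBeforeLock()); // IllegalMonitorStateException
    }
}
```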
[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor
[ https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172086#comment-16172086 ] Miklos Szegedi commented on YARN-7034: -- The unit test is a flaky one. See YARN-7145 Identify potential flaky unit tests. > DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client > environment variables to container-executor > - > > Key: YARN-7034 > URL: https://issues.apache.org/jira/browse/YARN-7034 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Critical > Attachments: YARN-7034.000.patch, YARN-7034.001.patch, > YARN-7034.002.patch, YARN-7034.003.patch, YARN-7034.004.patch, > YARN-7034.005.patch, YARN-7034.006.patch, YARN-7034.branch-2.000.patch, > YARN-7034.branch-2.004.patch, YARN-7034.branch-2.005.patch, > YARN-7034.branch-2.006.patch, YARN-7034.branch-2.8.000.patch, > YARN-7034.branch-2.8.004.patch, YARN-7034.branch-2.8.005.patch, > YARN-7034.branch-2.8.006.patch > > > This behavior is unnecessary since there is nothing that is used from the > environment right now. One option is to whitelist these variables before > passing them. Are there any known use cases for this to justify? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor
[ https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172087#comment-16172087 ] Miklos Szegedi commented on YARN-7034: -- [~shaneku...@gmail.com], do you have any other comments? > DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client > environment variables to container-executor > - > > Key: YARN-7034 > URL: https://issues.apache.org/jira/browse/YARN-7034 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Critical > Attachments: YARN-7034.000.patch, YARN-7034.001.patch, > YARN-7034.002.patch, YARN-7034.003.patch, YARN-7034.004.patch, > YARN-7034.005.patch, YARN-7034.006.patch, YARN-7034.branch-2.000.patch, > YARN-7034.branch-2.004.patch, YARN-7034.branch-2.005.patch, > YARN-7034.branch-2.006.patch, YARN-7034.branch-2.8.000.patch, > YARN-7034.branch-2.8.004.patch, YARN-7034.branch-2.8.005.patch, > YARN-7034.branch-2.8.006.patch > > > This behavior is unnecessary since there is nothing that is used from the > environment right now. One option is to whitelist these variables before > passing them. Are there any known use cases for this to justify? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
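The whitelisting option mentioned in the description could look roughly like the following hypothetical sketch: pass only an administrator-approved subset of the client environment to container-executor instead of forwarding everything. This is an illustration of the idea, not the actual YARN-7034 patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class EnvWhitelist {
    // Keep only whitelisted variables from the client-supplied environment.
    // Insertion order is preserved so the forwarded environment is stable.
    static Map<String, String> filterEnv(Map<String, String> clientEnv,
                                         Set<String> whitelist) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : clientEnv.entrySet()) {
            if (whitelist.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Dropping unlisted variables by default (rather than forwarding them) means a client cannot smuggle variables like `LD_PRELOAD` into the privileged executor.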
[jira] [Commented] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172097#comment-16172097 ] Hadoop QA commented on YARN-6962: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 16s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 35m 19s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-6962 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887937/YARN-6962.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 77a2eab87357 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 31b5840 | | Default Java | 1.8.0_144 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/17518/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/17518/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/17518/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17518/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org |
[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172102#comment-16172102 ] Miklos Szegedi commented on YARN-5534: -- Thank you, [~eyang], for sharing your thoughts. Sorry, I am confused. Are you suggesting we make the whitelist visible to more users, or to fewer? > Allow whitelisted volume mounts > > > Key: YARN-5534 > URL: https://issues.apache.org/jira/browse/YARN-5534 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: luhuichun >Assignee: Shane Kumpf > Attachments: YARN-5534.001.patch, YARN-5534.002.patch, > YARN-5534.003.patch > > > 1. Introduction > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. > We could allow the user to set a list of mounts in the environment of > ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). > These would be mounted read-only to the specified target locations. This has > been resolved in YARN-4595. > 2. Problem Definition > But mounting arbitrary volumes into a Docker container can be a security risk. > 3. Possible Solutions > One approach to providing safe mounts is to allow the cluster administrator to > configure a set of parent directories as whitelisted mount directories. > Add a property named yarn.nodemanager.volume-mounts.white-list; when the > container executor does mount checking, only the allowed directories or their > sub-directories can be mounted.
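A minimal sketch of the proposed check, assuming the property from the description (yarn.nodemanager.volume-mounts.white-list) supplies the parent directories. The helper below is illustrative, not the container-executor's actual logic; canonicalization is the important step, since a raw string-prefix check could be defeated with `..` segments or symlinks:

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

public class MountWhitelist {
    // A requested mount is allowed only if its canonical path equals, or
    // lives under, one of the administrator-configured whitelist parents.
    static boolean isAllowed(String requested, List<String> whitelist)
            throws IOException {
        // Canonicalize to resolve ".." and symlinks before comparing.
        String canonical = new File(requested).getCanonicalPath();
        for (String parent : whitelist) {
            String p = new File(parent).getCanonicalPath();
            if (canonical.equals(p)
                    || canonical.startsWith(p + File.separator)) {
                return true;
            }
        }
        return false;
    }
}
```

Appending `File.separator` to the parent before the prefix test prevents `/data/shared-evil` from matching a whitelist entry of `/data/shared`.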
[jira] [Assigned] (YARN-6943) Update Yarn to YARN in documentation
[ https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi reassigned YARN-6943: Assignee: Chetna Chaudhari > Update Yarn to YARN in documentation > > > Key: YARN-6943 > URL: https://issues.apache.org/jira/browse/YARN-6943 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Chetna Chaudhari >Priority: Minor > Labels: newbie > > Based on the discussion with [~templedf] in YARN-6757 the official case of > YARN is YARN, not Yarn, so we should update all the md files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6943) Update Yarn to YARN in documentation
[ https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172106#comment-16172106 ] Miklos Szegedi commented on YARN-6943: -- Thank you for signing up. I assigned it to you [~chetna]. > Update Yarn to YARN in documentation > > > Key: YARN-6943 > URL: https://issues.apache.org/jira/browse/YARN-6943 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Chetna Chaudhari >Priority: Minor > Labels: newbie > > Based on the discussion with [~templedf] in YARN-6757 the official case of > YARN is YARN, not Yarn, so we should update all the md files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.009.patch Attached ver.9 patch, (hopefully) fixed Jenkins issues > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172158#comment-16172158 ] Botong Huang commented on YARN-6962: The testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not related and is being handled in YARN-7196. > Add support for updateContainers when allocating using FederationInterceptor > > > Key: YARN-6962 > URL: https://issues.apache.org/jira/browse/YARN-6962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch > > > Container update is introduced in YARN-5221. Federation Interceptor needs to > support it when splitting (merging) the allocate request (response).
[jira] [Comment Edited] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172158#comment-16172158 ] Botong Huang edited comment on YARN-6962 at 9/19/17 6:45 PM: - testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not related, and being handled in Yarn-7196. was (Author: botong): testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not related, and being handled in Yarn7196. > Add support for updateContainers when allocating using FederationInterceptor > > > Key: YARN-6962 > URL: https://issues.apache.org/jira/browse/YARN-6962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch > > > Container update is introduced in YARN-5221. Federation Interceptor needs to > support it when splitting (merging) the allocate request (response). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7215) REST API to list all deployed services by the same user
Eric Yang created YARN-7215: --- Summary: REST API to list all deployed services by the same user Key: YARN-7215 URL: https://issues.apache.org/jira/browse/YARN-7215 Project: Hadoop YARN Issue Type: Bug Components: api, applications Reporter: Eric Yang Assignee: Eric Yang In Slider, it is possible to list the deployed applications of the same user by using: slider list This API can help the UI display applications and services deployed by the same user. ApiServer does not have the ability to list all applications/services at this time. The API requires a fast response because listing all applications is a common UI operation. ApiServer-deployed applications persist their configuration in HDFS, similar to Slider, but using a directory listing to display deployed applications might impose too much overhead on the NameNode. We may want to use an alternative storage mechanism to cache deployed application configuration and accelerate the response time of listing deployed applications.
[jira] [Created] (YARN-7216) Missing ability to list configuration vs status
Eric Yang created YARN-7216: --- Summary: Missing ability to list configuration vs status Key: YARN-7216 URL: https://issues.apache.org/jira/browse/YARN-7216 Project: Hadoop YARN Issue Type: Bug Components: api, applications Reporter: Eric Yang Assignee: Eric Yang API Server has /ws/v1/services/{service_name}. This REST end point returns a Services object which contains both configuration and status. When status or macro-based parameters change in the Services object, it can confuse UI code into making configuration changes. The suggestion is to preserve a copy of the configuration object independent of the status object. This gives the UI the ability to change service configuration and update it. Similar to Ambari, it might provide better information if we had the following separate REST end points: {code} /ws/v1/services/{service_name}/config /ws/v1/services/{service_name}/status {code}
[jira] [Commented] (YARN-4048) Linux kernel panic under strict CPU limits(on CentOS/RHEL 6.x)
[ https://issues.apache.org/jira/browse/YARN-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172195#comment-16172195 ] Ruslan Dautkhanov commented on YARN-4048: - Information on this linux kernel panic from RHEL CASE 01901460. They have analyzed the crash file dumped during panic and came up with below analysis of the problem. {noformat} Kwon, Daniel on Aug 02 2017 at 11:25 PM -06:00 Hi, This is Daniel Kwon from Kernel team. I'm working with the case owner to support your case. The system was crashed due to hard lockup which means there was a process holding a CPU for more than 60 seconds. -- CPUS: 88 DATE: Fri Jul 28 14:37:44 2017 UPTIME: 15 days, 16:42:29 LOAD AVERAGE: 109.16, 39.97, 15.17 TASKS: 3728 NODENAME: pc1udahad14 RELEASE: 2.6.32-696.3.2.el6.x86_64 VERSION: #1 SMP Wed Jun 7 11:51:39 EDT 2017 MACHINE: x86_64 (2397 Mhz) MEMORY: 511.9 GB PANIC: "Kernel panic - not syncing: Hard LOCKUP" -- Checking the runqueue time shows that there were many CPUs not updated lately which means processes were holding those CPUs for long. -- crash> runq -t | grep CPU | sort -k3r | awk 'NR==1{now=strtonum("0x"$3)}1{printf"%s\t%7.2fs behind\n",$0,(now-strtonum("0x"$3))/10}' CPU 25: 4d0f5bec32555 0.00s behind CPU 2: 4d0f5bec32015 0.00s behind <... 
cut ...> CPU 4: 4d0f5bb9984fb 0.05s behind CPU 61: 4d0f5bb8f5619 0.05s behind CPU 57: 4d0f5bb83f85d 0.05s behind CPU 17: 4d0f5bb78ad7d 0.06s behind CPU 12: 4d0f5ba972dbd 0.07s behind CPU 48: 4d0f5ba8b4980 0.07s behind CPU 84: 4d0f5ba72ca7e 0.07s behind CPU 13: 4d0f5966bef24 0.68s behind CPU 15: 4d0f58a123cfd 0.88s behind CPU 54: 4d0f5832e5754 1.00s behind CPU 62: 4d0f581d593b7 1.02s behind CPU 49: 4d0f54868608e 1.99s behind CPU 52: 4d0f5480bd287 1.99s behind CPU 24: 4d0eb096f58da 45.99s behind CPU 35: 4d0e730040f57 62.52s behind CPU 85: 4d0e42eaeaea0 75.43s behind CPU 46: 4d0e3287e8aae 79.83s behind CPU 1: 4d0e1072119ac 88.98s behind CPU 45: 4d0e061ec766a 91.75s behind CPU 60: 4d0db70bc6ad6 112.98s behind CPU 6: 4d0db002d7b9b 114.87s behind CPU 14: 4d0d9679efaad 121.72s behind CPU 9: 4d0d938f74e97 122.50s behind CPU 51: 4d0d912c6d77e 123.14s behind CPU 5: 4d0d807a6de65 127.63s behind CPU 53: 4d0d80637174f 127.65s behind CPU 70: 4d0d78c599c8e 129.69s behind CPU 44: 4d0d75602d3c3 130.61s behind CPU 3: 4d0d6fe84455e 132.07s behind CPU 0: 4d0d6f1c22d11 132.29s behind CPU 47: 4d0d6f16a2e95 132.29s behind CPU 64: 4d0d6f06851ee 132.31s behind CPU 59: 4d0d6da9596d6 132.68s behind CPU 23: 4d0d6d89ecaaa 132.71s behind CPU 22: 4d0d6c8ad9dd2 132.98s behind CPU 67: 4d0d6c853e44d 132.98s behind -- I have checked the two longest holders which are CPU 22 and CPU 67. The process on CPU 67 was awaiting for runqueue lock for the current CPU. 
-- crash> runq -c 67 CPU 67 RUNQUEUE: 8841616f6ec0 CURRENT: PID: 40639 TASK: 885c89c5b520 COMMAND: "java" RT PRIO_ARRAY: 8841616f7048 [ 0] PID: 271 TASK: 8840266c7520 COMMAND: "migration/67" [ 0] PID: 274 TASK: 8840266d6ab0 COMMAND: "watchdog/67" CFS RB_ROOT: 8841616f6f58 [120] PID: 422 TASK: 884026392ab0 COMMAND: "events/67" [120] PID: 7857 TASK: 888005966040 COMMAND: "kondemand/67" crash> bt 40639 PID: 40639 TASK: 885c89c5b520 CPU: 67 COMMAND: "java" #0 [8841616e6e90] crash_nmi_callback at 81036726 #1 [8841616e6ea0] notifier_call_chain at 81551085 #2 [8841616e6ee0] atomic_notifier_call_chain at 815510ea #3 [8841616e6ef0] notify_die at 810acd0e #4 [8841616e6f20] do_nmi at 8154ec09 #5 [8841616e6f50] nmi at 8154e5b3 [exception RIP: wait_for_rqlock+0x31] <... cut ...> --- --- #6 [887fd461feb8] wait_for_rqlock at 8105d751 #7 [887fd461fec0] do_exit at 81081dbc #8 [887fd461ff40] do_group_exit at 810820c8 #9 [887fd461ff70] sys_exit_group at 81082157 #10 [887fd461ff80] system_call_fastpath at 8100b0d2 <... cut ...> -- The process on CPU 22 was also waiting for runqueue lock. This time it was waiting for the runqueue lock for a specific task which is java (40639) that was running on CPU 67. -- crash> runq -c 22 CPU 22 RUNQUEUE: 884161416ec0 CURRENT: PID: 8944 TASK: 88801833eab0 COMMAND: "cmf-agent" RT PRIO_ARRAY: 884161417048 [no tasks queued] CFS RB_
[jira] [Updated] (YARN-7216) Missing ability to list configuration vs status
[ https://issues.apache.org/jira/browse/YARN-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7216: Description: API Server has /ws/v1/services/{service_name}. This REST end point returns Services object which contains both configuration and status. When status or macro based parameters changed in Services object, it can confuse UI code to making configuration changes. The suggestion is to preserve a copy of configuration object independent of status object. This gives UI ability to change services configuration and update configuration. Similar to Ambari, it might provide better information if we have the following separated REST end points: {code} /ws/v1/services/[service_name]/config /ws/v1/services/[service_name]/status {code} was: API Server has /ws/v1/services/{service_name}. This REST end point returns Services object which contains both configuration and status. When status or macro based parameters changed in Services object, it can confuse UI code to making configuration changes. The suggestion is to preserve a copy of configuration object independent of status object. This gives UI ability to change services configuration and update configuration. Similar to Ambari, it might provide better information if we have the following separated REST end points: {code} /ws/v1/services/{service_name}/config /ws/v1/services/{service_name}/status {code} > Missing ability to list configuration vs status > --- > > Key: YARN-7216 > URL: https://issues.apache.org/jira/browse/YARN-7216 > Project: Hadoop YARN > Issue Type: Bug > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > API Server has /ws/v1/services/{service_name}. This REST end point returns > Services object which contains both configuration and status. When status or > macro based parameters changed in Services object, it can confuse UI code to > making configuration changes. 
The suggestion is to preserve a copy of > configuration object independent of status object. This gives UI ability to > change services configuration and update configuration. > Similar to Ambari, it might provide better information if we have the > following separated REST end points: > {code} > /ws/v1/services/[service_name]/config > /ws/v1/services/[service_name]/status > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172217#comment-16172217 ] Jason Lowe commented on YARN-4266: -- Thanks for updating the patch! The YarnConfigurationFields and TestDockerContainerRuntime failures are related. On a related note, TestDockerContainerRuntime fails on my RHEL7 box because my user account is in group wheel. I could see this test failing similarly for others. Do we really want to limit it to gid>=100 by default? If so, we may want to account for this in the unit test and adjust the threshold setting appropriately so we're not failing on the wrong thing in the test. Nit: In DockerLinuxContainerRuntime it would be nice if it were consistent with the treatment of other YarnConfiguration constants. Other uses just qualify them with YarnConfiguration rather than importing them directly. > Allow users to enter containers as UID:GID pair instead of by username > -- > > Key: YARN-4266 > URL: https://issues.apache.org/jira/browse/YARN-4266 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: luhuichun > Attachments: YARN-4266.001.patch, YARN-4266.001.patch, > YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, > YARN-4266-branch-2.8.001.patch > > > Docker provides a mechanism (the --user switch) that enables us to specify > the user the container processes should run as. We use this mechanism today > when launching docker containers. In non-secure mode, we run the docker > container based on > `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in > secure mode, as the submitting user.
However, this mechanism breaks down with > a large number of 'pre-created' images which don't necessarily have the users > available within the image. Examples of such images include shared images > that need to be used by multiple users. We need a way in which we can allow a > pre-defined set of users to run containers based on existing images, without > using the --user switch. There are some implications of disabling this user > squashing that we'll need to work through : log aggregation, artifact > deletion etc., -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
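As a hedged illustration of the --user mechanism discussed above (not the actual YARN-4266 patch), a runtime could build the docker run command line with a numeric uid:gid pair, so the user does not need an /etc/passwd entry inside the image:

```java
import java.util.ArrayList;
import java.util.List;

public class DockerRunUser {
    // Hypothetical sketch: build docker run arguments with --user as a
    // numeric uid:gid pair rather than a username, avoiding the user
    // lookup inside pre-created images that lack the submitting user.
    static List<String> dockerRunArgs(String image, long uid, long gid) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--user");
        cmd.add(uid + ":" + gid); // numeric form needs no /etc/passwd entry
        cmd.add(image);
        return cmd;
    }
}
```

The resulting list would typically be handed to a ProcessBuilder (or, in YARN's case, to container-executor) rather than joined into a shell string.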
[jira] [Updated] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7215: Issue Type: Sub-task (was: Bug) Parent: YARN-7054 > REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > > slider list > > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7216) Missing ability to list configuration vs status
[ https://issues.apache.org/jira/browse/YARN-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7216: Issue Type: Sub-task (was: Bug) Parent: YARN-7054 > Missing ability to list configuration vs status > --- > > Key: YARN-7216 > URL: https://issues.apache.org/jira/browse/YARN-7216 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > API Server has /ws/v1/services/{service_name}. This REST end point returns > Services object which contains both configuration and status. When status or > macro based parameters changed in Services object, it can confuse UI code to > making configuration changes. The suggestion is to preserve a copy of > configuration object independent of status object. This gives UI ability to > change services configuration and update configuration. > Similar to Ambari, it might provide better information if we have the > following separated REST end points: > {code} > /ws/v1/services/[service_name]/config > /ws/v1/services/[service_name]/status > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted
[ https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172224#comment-16172224 ] Hadoop QA commented on YARN-7001: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red}499m 26s{color} | {color:red} Docker failed to build yetus/hadoop:tp-30407. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7001 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887873/YARN-7001.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17513/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > If shared cache upload is terminated in the middle, the temp file will never > be deleted > --- > > Key: YARN-7001 > URL: https://issues.apache.org/jira/browse/YARN-7001 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Sen Zhao > Attachments: YARN-7001.001.patch > > > There is a missing deleteTempFile(tempPath); > {code} > tempPath = new Path(directoryPath, getTemporaryFileName(actualPath)); > if (!uploadFile(actualPath, tempPath)) { > LOG.warn("Could not copy the file to the shared cache at " + > tempPath); > return false; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
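The missing cleanup described in YARN-7001 can be sketched with a try/finally around the copy, using plain java.io as a stand-in for Hadoop's FileSystem/Path APIs; the method and names below are illustrative, not the shared-cache uploader's real code:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class SharedCacheUploadSketch {
    // If the copy into the temp file fails, or the upload is interrupted
    // partway (an exception), the finally block removes the temp file so
    // it is never leaked -- the deleteTempFile(tempPath) missing from the
    // snippet quoted above.
    static boolean uploadWithCleanup(File source, File tempTarget)
            throws IOException {
        boolean renamed = false;
        try {
            Files.copy(source.toPath(), tempTarget.toPath());
            // Promote the temp file to its final cache name.
            renamed = tempTarget.renameTo(
                new File(tempTarget.getParent(), source.getName() + ".cached"));
            return renamed;
        } finally {
            if (!renamed) {
                tempTarget.delete(); // cleanup on any failure path
            }
        }
    }
}
```

Because the delete lives in a `finally`, it also covers the case where the copy throws rather than returning false, which is the "terminated in the middle" scenario from the summary.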
[jira] [Commented] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Cont
[ https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172229#comment-16172229 ] Jason Lowe commented on YARN-6968: -- Any update on this? It would be nice to have Yetus stop complaining about the latent findbugs warning. > Hard coded reference to an absolute pathname in > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext) > - > > Key: YARN-6968 > URL: https://issues.apache.org/jira/browse/YARN-6968 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi > > This could be done after YARN-6757 is checked in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Conta
[ https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned YARN-6968: - Assignee: Eric Badger (was: Miklos Szegedi) from Miklos on YARN-7025 bq. Eric Badger, Would you like to work on YARN-6968? Feel free to assign it to yourself. Assigning to myself > Hard coded reference to an absolute pathname in > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext) > - > > Key: YARN-6968 > URL: https://issues.apache.org/jira/browse/YARN-6968 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Eric Badger > > This could be done after YARN-6757 is checked in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172244#comment-16172244 ] Hadoop QA commented on YARN-6620: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 15 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 1s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 79 new + 479 unchanged - 24 fixed = 558 total (was 503) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 27s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 523 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 46s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 52s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 79m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-6620 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887944/YARN-6620.009.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml shellcheck shel
[jira] [Updated] (YARN-7210) Some fixes related to Registry DNS
[ https://issues.apache.org/jira/browse/YARN-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-7210: -- Attachment: YARN-7210.yarn-native-services.03.patch > Some fixes related to Registry DNS > -- > > Key: YARN-7210 > URL: https://issues.apache.org/jira/browse/YARN-7210 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-7210.yarn-native-services.01.patch, > YARN-7210.yarn-native-services.02.patch, > YARN-7210.yarn-native-services.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7210) Some fixes related to Registry DNS
[ https://issues.apache.org/jira/browse/YARN-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172276#comment-16172276 ] Jian He commented on YARN-7210: --- I added one more fix to allow an empty launchCommand. The UTs are actually passing locally; running again. > Some fixes related to Registry DNS > -- > > Key: YARN-7210 > URL: https://issues.apache.org/jira/browse/YARN-7210 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-7210.yarn-native-services.01.patch, > YARN-7210.yarn-native-services.02.patch, > YARN-7210.yarn-native-services.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7217) PUT method for update service for Service API doesn't function correctly
Eric Yang created YARN-7217: --- Summary: PUT method for update service for Service API doesn't function correctly Key: YARN-7217 URL: https://issues.apache.org/jira/browse/YARN-7217 Project: Hadoop YARN Issue Type: Task Components: api, applications Reporter: Eric Yang The PUT method for the updateService API provides multiple functions: # Stopping a service. # Starting a service. # Increasing or decreasing the number of containers. The overloading is buggy, depending on how the configuration should be applied. Scenario 1 A user retrieves a Service object from a getService call, and the Service object contains state: STARTED. The user would like to increase the number of containers for the deployed service. The JSON has been updated to increase the container count. The PUT method does not actually increase the container count. Scenario 2 A user retrieves a Service object from a getService call, and the Service object contains state: STOPPED. The user would like to make an environment configuration change. The configuration does not get updated after the PUT method. It is possible to address this by rearranging the logic of START/STOP after the configuration update. However, there are other potential combinations that can break the PUT method. For example, a user may want to make configuration changes but not restart the service until a later time. The alternative is to separate the PUT method into a PUT method for configuration vs. state. This increases the number of actions that can be performed. A new API could look like: {code} @PUT /ws/v1/services/[service_name]/config Request Data: { "name":"[service_name]", "number_of_containers": 5 } {code} {code} @PUT /ws/v1/services/[service_name]/state Request data: { "name": "[service_name]", "state": "STOPPED|STARTED" } {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
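The split proposed above can be sketched as a small client-side helper. This is a hypothetical sketch in Python, not ApiServer code: the endpoint paths come from the proposal in the issue, while the helper names and payload construction are assumptions for illustration only.

```python
import json

# Hypothetical helpers for the proposed split endpoints. The paths
# /ws/v1/services/[name]/config and /ws/v1/services/[name]/state are taken
# from the proposal above; everything else here is illustrative only.

def config_request(service_name, number_of_containers):
    """Path and body for the proposed container-flexing PUT."""
    path = "/ws/v1/services/%s/config" % service_name
    body = json.dumps({"name": service_name,
                       "number_of_containers": number_of_containers})
    return path, body

def state_request(service_name, state):
    """Path and body for the proposed start/stop PUT."""
    if state not in ("STARTED", "STOPPED"):
        raise ValueError("state must be STARTED or STOPPED")
    path = "/ws/v1/services/%s/state" % service_name
    body = json.dumps({"name": service_name, "state": state})
    return path, body

print(config_request("sleeper-service", 5))
print(state_request("sleeper-service", "STOPPED"))
```

Separating the payloads this way makes each PUT unambiguous: flexing never races with a start/stop transition, which is exactly the failure mode described in the two scenarios.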
[jira] [Updated] (YARN-6142) Support rolling upgrade between 2.x and 3.x
[ https://issues.apache.org/jira/browse/YARN-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-6142: -- Target Version/s: 3.0.0 (was: 3.0.0-beta1) Thanks Ray. Let's move this to 3.0.0 GA then. > Support rolling upgrade between 2.x and 3.x > --- > > Key: YARN-6142 > URL: https://issues.apache.org/jira/browse/YARN-6142 > Project: Hadoop YARN > Issue Type: Task > Components: rolling upgrade >Affects Versions: 3.0.0-alpha2 >Reporter: Andrew Wang >Assignee: Ray Chiang >Priority: Blocker > > Counterpart JIRA to HDFS-11096. We need to: > * examine YARN and MR's JACC report for binary and source incompatibilities > * run the [PB > differ|https://issues.apache.org/jira/browse/HDFS-11096?focusedCommentId=15816405&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15816405] > that Sean wrote for HDFS-11096 for the YARN PBs. > * sanity test some rolling upgrades between 2.x and 3.x. Ideally these are > automated and something we can run upstream. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-4266: -- Attachment: YARN-4266.005.patch Thanks for the review, [~jlowe] bq. The YarnConfigurationFields and TestDockerContainerRuntime failures are related. Fixed the tests bq. On a related noted, my RHEL7 box TestDockerContainerRuntime fails because my user account is in group wheel. I could see this test failing for others similarly. Do we really want to limit it to gid>=100 by default? If so, we may want to account for this in the unit test and adjust the threshold setting appropriately so we're not failing on the wrong thing in the test. I think making the uid and gid lower limits 1 and 1 should be ok. This makes everything open from the start, but allows admins to define limits if they want certain levels of users not to be allowed to run containers. So setting the uid and gid to 1 and 1. bq. Nit: In DockerLinuxContainerRuntime it would be nice if it was consistent with the treatment of other YarnConfiguration constants. Other uses just qualify them with YarnConfiguration rather than import them directly. Fixed > Allow users to enter containers as UID:GID pair instead of by username > -- > > Key: YARN-4266 > URL: https://issues.apache.org/jira/browse/YARN-4266 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: luhuichun > Attachments: YARN-4266.001.patch, YARN-4266.001.patch, > YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, > YARN-4266.005.patch, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, > YARN-4266-branch-2.8.001.patch > > > Docker provides a mechanism (the --user switch) that enables us to specify > the user the container processes should run as. 
We use this mechanism today > when launching docker containers . In non-secure mode, we run the docker > container based on > `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in > secure mode, as the submitting user. However, this mechanism breaks down with > a large number of 'pre-created' images which don't necessarily have the users > available within the image. Examples of such images include shared images > that need to be used by multiple users. We need a way in which we can allow a > pre-defined set of users to run containers based on existing images, without > using the --user switch. There are some implications of disabling this user > squashing that we'll need to work through : log aggregation, artifact > deletion etc., -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
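The threshold discussion above can be illustrated with a tiny sketch. This is hypothetical Python, not the patch's Java: it only mirrors the idea of configurable uid/gid floors defaulting to 1 and 1, under which ordinary users pass even when they belong to a low-numbered group such as wheel (gid 10 on RHEL), while an admin can raise the floor to restrict who may run containers.

```python
# Hypothetical sketch of the limit check discussed above: a container
# request is rejected when the requested uid or gid falls below a
# configured floor. With the proposed defaults of 1/1, only uid/gid 0
# would be blocked; the old gid>=100 default rejected wheel-group users.

def uid_gid_allowed(uid, gid, uid_floor=1, gid_floor=1):
    return uid >= uid_floor and gid >= gid_floor

print(uid_gid_allowed(1000, 10))                  # wheel user, default floors
print(uid_gid_allowed(1000, 10, gid_floor=100))   # same user, old default
```

This matches Jason's observation: with a floor of 100, the test (and any wheel-group user) fails for the wrong reason, whereas floors of 1/1 are open by default but still admin-tightenable.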
[jira] [Updated] (YARN-6570) No logs were found for running application, running container
[ https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-6570: - Attachment: YARN-6570-branch-2.8.002.patch Fixed the unit test failure and checkstyle warning in the 002 patch for branch-2.8. > No logs were found for running application, running container > - > > Key: YARN-6570 > URL: https://issues.apache.org/jira/browse/YARN-6570 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Junping Du >Priority: Critical > Fix For: 2.9.0, 3.0.0-beta1, 3.1.0 > > Attachments: YARN-6570-branch-2.8.001.patch, > YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, > YARN-6570-v3.patch > > > 1. Obtain running containers from the following CLI for a running application: > yarn container -list appattempt > 2. Could not fetch logs > {code} > Can not find any log file matching the pattern: ALL for the container > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
Eric Yang created YARN-7218: --- Summary: ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2 Key: YARN-7218 URL: https://issues.apache.org/jira/browse/YARN-7218 Project: Hadoop YARN Issue Type: Sub-task Components: api, applications Reporter: Eric Yang Assignee: Eric Yang In YARN-6626, there is a desire to have the ability to run the ApiServer REST API in the Resource Manager; this would eliminate the requirement to deploy another daemon service for submitting docker applications. In YARN-5698, a new UI has been implemented as a separate web application. There are some problems in this arrangement that can cause conflicts in how Java sessions are managed. The root context of the Resource Manager web application is /ws. This is hard-coded in the startWebapp method in ResourceManager.java, which means all session management is applied to web URLs under the /ws prefix. /ui2 is independent of the /ws context, therefore session management code doesn't apply to /ui2. This could become a session management problem if servlet-based code is introduced into the /ui2 web application. The ApiServer code base is designed as a separate web application. There is no easy way to inject a separate web application into the same /ws context because the ResourceManager is already set up to bind to RMWebServices. Unless the ApiServer code is moved into RMWebServices, they will not share the same session management. The alternative solution is to keep the ApiServer prefix URL independent of the /ws context. However, this would be a departure from the YARN web services naming convention. The ApiServer could be loaded as a separate web application in the Resource Manager jetty server. One possible proposal is /app/v1/services. This would keep the ApiServer code modular and independent from the Resource Manager. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk
Ray Chiang created YARN-7219: Summary: Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk Key: YARN-7219 URL: https://issues.apache.org/jira/browse/YARN-7219 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Affects Versions: 3.0.0-beta1 Reporter: Ray Chiang Priority: Critical For yarn_service_protos.proto, we have the following code in (branch-2.8.0, branch-2.8, branch-2) {noformat} message AllocateRequestProto { repeated ResourceRequestProto ask = 1; repeated ContainerIdProto release = 2; optional ResourceBlacklistRequestProto blacklist_request = 3; optional int32 response_id = 4; optional float progress = 5; repeated ContainerResourceIncreaseRequestProto increase_request = 6; repeated UpdateContainerRequestProto update_requests = 7; } {noformat} For yarn_service_protos.proto, we have the following code in (trunk) {noformat} message AllocateRequestProto { repeated ResourceRequestProto ask = 1; repeated ContainerIdProto release = 2; optional ResourceBlacklistRequestProto blacklist_request = 3; optional int32 response_id = 4; optional float progress = 5; repeated UpdateContainerRequestProto update_requests = 6; } {noformat} Notes * YARN-3866 was the original JIRA for container resizing. * YARN-5221 is what introduced the incompatible change. * In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by "Addendum patch to YARN-3866: fix incompatible API change." * There was a similar API fix done in YARN-6071. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
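The incompatibility above is visible directly in the wire format. As a minimal hand-rolled sketch (not using the protobuf library): every serialized field is prefixed with a key encoding (field_number << 3) | wire_type, so moving update_requests between tag 7 (branch-2) and tag 6 (trunk) changes its key byte — and in branch-2 the tag-6 key already belongs to increase_request, so the same bytes decode as a different field.

```python
# Minimal illustration of protobuf field keys: each field on the wire is
# prefixed with (field_number << 3) | wire_type. Embedded/repeated message
# fields use wire type 2 (length-delimited).

LENGTH_DELIMITED = 2

def field_key(field_number, wire_type=LENGTH_DELIMITED):
    return (field_number << 3) | wire_type

# update_requests in trunk (tag 6) vs branch-2/branch-2.8 (tag 7):
print(hex(field_key(6)))  # trunk key
print(hex(field_key(7)))  # branch-2 key
# In branch-2, the tag-6 key belongs to increase_request, so a trunk
# writer's update_requests would be read by a branch-2 peer as
# increase_request rather than as an unknown field.
```

This is why the renumbering is worse than a simple unknown-field skip: both tag-6 fields are repeated length-delimited messages, so the mismatch is silent.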
[jira] [Commented] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk
[ https://issues.apache.org/jira/browse/YARN-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172328#comment-16172328 ] Ray Chiang commented on YARN-7219: -- Will updating the update_requests field to 7 be enough to fix the compatibility issue? [~asuresh] or [~djp], any comment? > Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk > - > > Key: YARN-7219 > URL: https://issues.apache.org/jira/browse/YARN-7219 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 3.0.0-beta1 >Reporter: Ray Chiang >Priority: Critical > > For yarn_service_protos.proto, we have the following code in > (branch-2.8.0, branch-2.8, branch-2) > {noformat} > message AllocateRequestProto { > repeated ResourceRequestProto ask = 1; > repeated ContainerIdProto release = 2; > optional ResourceBlacklistRequestProto blacklist_request = 3; > optional int32 response_id = 4; > optional float progress = 5; > repeated ContainerResourceIncreaseRequestProto increase_request = 6; > repeated UpdateContainerRequestProto update_requests = 7; > } > {noformat} > For yarn_service_protos.proto, we have the following code in > (trunk) > {noformat} > message AllocateRequestProto { > repeated ResourceRequestProto ask = 1; > repeated ContainerIdProto release = 2; > optional ResourceBlacklistRequestProto blacklist_request = 3; > optional int32 response_id = 4; > optional float progress = 5; > repeated UpdateContainerRequestProto update_requests = 6; > } > {noformat} > Notes > * YARN-3866 was the original JIRA for container resizing. > * YARN-5221 is what introduced the incompatible change. > * In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by > "Addendum patch to YARN-3866: fix incompatible API change." > * There was a similar API fix done in YARN-6071. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7215: Description: In Slider, it is possible to list deployed applications from the same user by using: {code} slider list {code} This API can help the UI display applications and services deployed by the same user. ApiServer does not have the ability to list all applications/services at this time. This API requires a fast response to list all applications because it is a common UI operation. ApiServer-deployed applications persist configuration in HDFS similar to Slider, but using a directory listing to display deployed applications might impose too much overhead on the namenode. We may want to use an alternative storage mechanism to cache deployed application configurations to accelerate the response time of listing deployed applications. was: In Slider, it is possible to list deployed applications from the same user by using: slider list This API can help the UI display applications and services deployed by the same user. ApiServer does not have the ability to list all applications/services at this time. This API requires a fast response to list all applications because it is a common UI operation. ApiServer-deployed applications persist configuration in HDFS similar to Slider, but using a directory listing to display deployed applications might impose too much overhead on the namenode. We may want to use an alternative storage mechanism to cache deployed application configurations to accelerate the response time of listing deployed applications. 
> REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > {code} > slider list > {code} > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Contai
[ https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-6968: -- Attachment: YARN-6968.001.patch Attaching a patch that makes the cgroups root directory a NM config in yarn-site.xml > Hard coded reference to an absolute pathname in > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext) > - > > Key: YARN-6968 > URL: https://issues.apache.org/jira/browse/YARN-6968 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Miklos Szegedi >Assignee: Eric Badger > Attachments: YARN-6968.001.patch > > > This could be done after YARN-6757 is checked in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly
[ https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172355#comment-16172355 ] Jian He commented on YARN-7217: --- Thanks Eric, how about calling it spec instead of config? Because the spec itself has a config field, which would be confusing. {code} @PUT /ws/v1/services/[service_name]/spec {code} In the request body, the name field can be optional, since it can be retrieved from the URL path > PUT method for update service for Service API doesn't function correctly > > > Key: YARN-7217 > URL: https://issues.apache.org/jira/browse/YARN-7217 > Project: Hadoop YARN > Issue Type: Task > Components: api, applications >Reporter: Eric Yang > > The PUT method for updateService API provides multiple functions: > # Stopping a service. > # Start a service. > # Increase or decrease number of containers. > The overloading is buggy depending on how the configuration should be applied. > Scenario 1 > A user retrieves Service object from getService call, and the Service object > contains state: STARTED. The user would like to increase number of > containers for the deployed service. The JSON has been updated to increase > container count. The PUT method does not actually increase container count. > Scenario 2 > A user retrieves Service object from getService call, and the Service object > contains state: STOPPED. The user would like to make a environment > configuration change. The configuration does not get updated after PUT > method. > This is possible to address by rearranging the logic of START/STOP after > configuration update. However, there are other potential combinations that > can break PUT method. For example, user like to make configuration changes, > but not yet restart the service until a later time. > The alternative is to separate the PUT method into PUT method for > configuration vs status. This increase the number of action that can be > performed. 
New API could look like: > {code} > @PUT > /ws/v1/services/[service_name]/config > Request Data: > { > "name":"[service_name]", > "number_of_containers": 5 > } > {code} > {code} > @PUT > /ws/v1/services/[service_name]/state > Request data: > { > "name": "[service_name]", > "state": "STOPPED|STARTED" > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7220) Use apidoc for REST API documentation
Eric Yang created YARN-7220: --- Summary: Use apidoc for REST API documentation Key: YARN-7220 URL: https://issues.apache.org/jira/browse/YARN-7220 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Eric Yang Assignee: Eric Yang More REST APIs are being developed in Hadoop, and it would be great to standardize on the method of generating REST API documentation. Several methods are in use today: Swagger YAML, Javadoc, wiki pages, and JIRA comments. The most frequently used methods are JIRA comments and wiki pages. Both are prone to data loss with the passage of time, so we need an easier approach to maintaining REST API documentation. Swagger YAML can also fall out of sync with reality if new methods are added to the Java code directly. Javadoc annotations seem like a good approach to maintaining REST API documentation. Both the Jersey and Atlassian communities have maven plugins to help generate REST API documentation, but those maven plugins have ceased to function. After searching online for REST API documentation tools for a bit, [apidoc|http://apidocjs.com/] is one library that stands out. This could be the ideal approach to managing Hadoop REST API documentation. It supports javadoc-like annotations and generates clear documentation of schema changes. If this is accepted, I will add the apidoc installation to the dev-support Dockerfile, and make pom.xml changes for the javadoc plugin to ignore the custom tags. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly
[ https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172359#comment-16172359 ] Eric Yang commented on YARN-7217: - [~jianhe] Agree, spec will make this less confusing. > PUT method for update service for Service API doesn't function correctly > > > Key: YARN-7217 > URL: https://issues.apache.org/jira/browse/YARN-7217 > Project: Hadoop YARN > Issue Type: Task > Components: api, applications >Reporter: Eric Yang > > The PUT method for updateService API provides multiple functions: > # Stopping a service. > # Start a service. > # Increase or decrease number of containers. > The overloading is buggy depending on how the configuration should be applied. > Scenario 1 > A user retrieves Service object from getService call, and the Service object > contains state: STARTED. The user would like to increase number of > containers for the deployed service. The JSON has been updated to increase > container count. The PUT method does not actually increase container count. > Scenario 2 > A user retrieves Service object from getService call, and the Service object > contains state: STOPPED. The user would like to make a environment > configuration change. The configuration does not get updated after PUT > method. > This is possible to address by rearranging the logic of START/STOP after > configuration update. However, there are other potential combinations that > can break PUT method. For example, user like to make configuration changes, > but not yet restart the service until a later time. > The alternative is to separate the PUT method into PUT method for > configuration vs status. This increase the number of action that can be > performed. 
New API could look like: > {code} > @PUT > /ws/v1/services/[service_name]/config > Request Data: > { > "name":"[service_name]", > "number_of_containers": 5 > } > {code} > {code} > @PUT > /ws/v1/services/[service_name]/state > Request data: > { > "name": "[service_name]", > "state": "STOPPED|STARTED" > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172369#comment-16172369 ] Jian He commented on YARN-7215: --- Another approach is that we can simply get the list of services from the RM with a type filter set to "yarn-service"; in fact, I was trying to implement that but ran into a bug, YARN-7076. > REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > {code} > slider list > {code} > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
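The alternative Jian He describes — asking the RM itself — maps onto the existing RM cluster apps REST endpoint, which supports an applicationTypes query parameter. The sketch below only builds the request URL; the RM address is a placeholder, and it assumes services are submitted with application type "yarn-service".

```python
from urllib.parse import urlencode

# Sketch of listing services through the RM's cluster apps REST API with a
# type filter. The /ws/v1/cluster/apps endpoint and its applicationTypes
# parameter are part of the RM web services; the host below is a placeholder.

def list_services_url(rm_address, app_type="yarn-service"):
    query = urlencode({"applicationTypes": app_type})
    return "http://%s/ws/v1/cluster/apps?%s" % (rm_address, query)

print(list_services_url("rm.example.com:8088"))
```

Because the RM already tracks running applications in memory, this avoids the namenode directory-listing overhead the issue description worries about, at the cost of only covering applications the RM still knows about.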
[jira] [Updated] (YARN-6943) Update Yarn to YARN in documentation
[ https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari updated YARN-6943: --- Attachment: YARN-6943-1.patch Please review the attached patch. > Update Yarn to YARN in documentation > > > Key: YARN-6943 > URL: https://issues.apache.org/jira/browse/YARN-6943 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Chetna Chaudhari >Priority: Minor > Labels: newbie > Attachments: YARN-6943-1.patch > > > Based on the discussion with [~templedf] in YARN-6757 the official case of > YARN is YARN, not Yarn, so we should update all the md files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold
[ https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari reassigned YARN-6333: -- Assignee: Chetna Chaudhari > Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and > fairSharePreemptionThreshold > -- > > Key: YARN-6333 > URL: https://issues.apache.org/jira/browse/YARN-6333 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Chetna Chaudhari > Labels: newbie++ > > Default values of them are not mentioned in doc. For example, default value > of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means the min share > preemption won't happen until you set a meaningful value. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
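The defaults are easiest to document against a concrete allocation file. A hedged fair-scheduler.xml sketch (queue name and values are made up; the allocation-file timeouts are expressed in seconds, while the Long.MAX_VALUE default mentioned above is the internal millisecond value):

```xml
<?xml version="1.0"?>
<!-- Illustrative fragment only; queue name and numbers are invented. -->
<allocations>
  <queue name="research">
    <minResources>4096 mb,4 vcores</minResources>
    <!-- Seconds a queue may sit below its min share before preempting.
         Left unset, the default is effectively infinite (Long.MAX_VALUE ms),
         so min share preemption never fires until a value is configured. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
    <!-- Seconds below the fair-share threshold before preemption may fire. -->
    <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
    <!-- Fraction of fair share that starts the fair-share timeout clock. -->
    <fairSharePreemptionThreshold>0.5</fairSharePreemptionThreshold>
  </queue>
</allocations>
```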
[jira] [Created] (YARN-7221) Add security check for privileged docker container
Eric Yang created YARN-7221: --- Summary: Add security check for privileged docker container Key: YARN-7221 URL: https://issues.apache.org/jira/browse/YARN-7221 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Yang When a Docker container runs with privileges, the majority use case is to have a program start as root and then drop privileges to another user, e.g. httpd starting privileged to bind to port 80, then dropping to the www user. # We should add a security check for submitting users, to verify they have "sudo" access before running a privileged container. # We should remove --user=uid:gid for privileged containers. Docker can be launched with both the --privileged=true and --user=uid:gid flags. With that combination of parameters, the user will not be able to become root: every docker exec command is dropped to the uid:gid user instead of being granted privileges. A user can still gain root privileges if the container file system contains files that grant extra power, but such an image is considered dangerous, and a non-privileged user can launch a container with special bits set to acquire the same level of root power. Hence we lose control of which images may run with --privileged and who has sudo rights to use privileged container images. As a result, we should check for sudo access and then decide whether to parameterize --privileged=true OR --user=uid:gid. This will avoid leading developers down the wrong path. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
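The proposed either/or decision can be sketched as a small policy function. This is a sketch of the check described above, not existing YARN code: `sudo_users` stands in for whatever mechanism the cluster would use to record sudo rights, which is an assumption of this sketch.

```python
def docker_run_flags(user, want_privileged, sudo_users, uid, gid):
    """Pick exactly one of --privileged=true or --user=uid:gid.

    sudo_users is a stand-in for the cluster's record of sudo rights
    (an assumption of this sketch, not an existing YARN setting).
    """
    if want_privileged:
        if user not in sudo_users:
            raise PermissionError(
                "%s has no sudo access; privileged container denied" % user)
        # No --user flag: the container keeps root, as privileged images expect.
        return ["--privileged=true"]
    # Non-privileged containers are forced to the submitting user's uid:gid.
    return ["--user=%d:%d" % (uid, gid)]
```

Treating the two flags as mutually exclusive avoids the broken combination the description calls out, where --privileged=true plus --user=uid:gid silently strips the root access the image depends on.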
[jira] [Created] (YARN-7222) Merge org.apache.hadoop.yarn.server.resourcemanager.NodeManager with MockNM
Yufei Gu created YARN-7222: -- Summary: Merge org.apache.hadoop.yarn.server.resourcemanager.NodeManager with MockNM Key: YARN-7222 URL: https://issues.apache.org/jira/browse/YARN-7222 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.1.0 Reporter: Yufei Gu The existence of org.apache.hadoop.yarn.server.resourcemanager.NodeManager is confusing. It exists only for RM testing and is basically another MockNM. There is no Javadoc for the class, which easily leads people to mistake it for a real NodeManager. I suggest merging it with MockNM. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6570) No logs were found for running application, running container
[ https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172425#comment-16172425 ] Hadoop QA commented on YARN-6570: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-2.8 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 20s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 36s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} branch-2.8 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | 
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 59s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 86m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels | | | hadoop.yarn.server.nodemanager.TestNodeManagerReboot | | Timed out junit tests | org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown | | | org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater | | | org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:c2d96dd | | JIRA Issue | YARN-6570 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887966/YARN-6570-branch-2.8.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux da7fbbb6 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2.8 / a81167e | | Default Java | 1.7.0_151 | | findbugs | v3.0.0 | | unit | https://builds.ap
[jira] [Updated] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold
[ https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari updated YARN-6333: --- Attachment: YARN-6333-1.patch > Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and > fairSharePreemptionThreshold > -- > > Key: YARN-6333 > URL: https://issues.apache.org/jira/browse/YARN-6333 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Chetna Chaudhari > Labels: newbie++ > Attachments: YARN-6333-1.patch > > > Default values of them are not mentioned in doc. For example, default value > of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means the min share > preemption won't happen until you set a meaningful value. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7212) [Atsv2] TimelineSchemaCreator fails to create flowrun table causes RegionServer down!
[ https://issues.apache.org/jira/browse/YARN-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-7212: Issue Type: Sub-task (was: Bug) Parent: YARN-7213 > [Atsv2] TimelineSchemaCreator fails to create flowrun table causes > RegionServer down! > - > > Key: YARN-7212 > URL: https://issues.apache.org/jira/browse/YARN-7212 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Rohith Sharma K S > > *Hbase-2.0* officially support *hadoop-alpha* compilations. So I was trying > to build and test with HBase-2.0. But table schema creation fails and causes > RegionServer to shutdown with following error > {noformat} > Caused by: java.lang.NoSuchMethodError: > org.apache.hadoop.hbase.Tag.asList([BII)Ljava/util/List; > at > org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.getCurrentAggOp(FlowScanner.java:250) > at > org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.nextInternal(FlowScanner.java:226) > at > org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.next(FlowScanner.java:145) > at > org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:132) > at > org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75) > at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:973) > at > org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2252) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2672) > {noformat} > Since HBase-2.0 community is ready to release Hadoop-3.x compatible versions, > ATSv2 also need to support HBase-2.0 versions. For this, we need to take up a > task of test and validate HBase-2.0 issues! -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172434#comment-16172434 ] Eric Yang commented on YARN-7215: - [~jianhe] How does RM handle a service that is in stopped state? Stopped slider application does not have any record in resource manager. Same slider application can have multiple Application ID when the application has been restarted. Slider uses HDFS file to persist the paused application, but having resource manager to crawl through lists of HDFS directories to find stopped service seems like potential load attack to namenode. It would be better to have the operational record index, and cached by well known mechanism like a SOLR collection. This also reduces having to brew another random read/write, low latency, index, cache mechanism in YARN. Both HBase and SOLR have solved random read/write on top of HDFS with some success. It would be better to use existing libraries that have been baked for several years than inventing something new for specialized purpose. > REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > {code} > slider list > {code} > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. 
We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172434#comment-16172434 ] Eric Yang edited comment on YARN-7215 at 9/19/17 10:19 PM: --- [~jianhe] How does RM handle a service that is in stopped state? Stopped slider application does not have any record in resource manager. Same slider application can have multiple Application ID when the application has been restarted. Slider uses HDFS file to persist the paused application, but having resource manager to crawl through lists of HDFS directories to find stopped service seems like potential load attack to namenode. It would be better to have the operational record index, and cached by well known mechanism like a SOLR collection. This also reduces having to brew another random read/write, low latency, index, cache mechanism in YARN. Both HBase and SOLR have solved random read/write on top of HDFS with some success. It would be better to use existing libraries that have been baked for several years than inventing something new for specialized purpose. was (Author: eyang): [~jianhe] How does RM handle a service that is in stopped state? Stopped slider application does not have any record in resource manager. Same slider application can have multiple Application ID when the application has been restarted. Slider uses HDFS file to persist the paused application, but having resource manager to crawl through lists of HDFS directories to find stopped service seems like potential load attack to namenode. It would be better to have the operational record index, and cached by well known mechanism like a SOLR collection. This also reduces having to brew another random read/write, low latency, index, cache mechanism in YARN. Both HBase and SOLR have solved random read/write on top of HDFS with some success. 
It would be better to we use existing libraries that have been baked for several years than inventing something new for specialized purpose. > REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > {code} > slider list > {code} > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6943) Update Yarn to YARN in documentation
[ https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172385#comment-16172385 ] Chetna Chaudhari edited comment on YARN-6943 at 9/19/17 10:20 PM: -- Thanks [~haibo.chen] and [~miklos.szeg...@cloudera.com]. Please review the attached patch. was (Author: chetna): Please review the attached patch. > Update Yarn to YARN in documentation > > > Key: YARN-6943 > URL: https://issues.apache.org/jira/browse/YARN-6943 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Chetna Chaudhari >Priority: Minor > Labels: newbie > Attachments: YARN-6943-1.patch > > > Based on the discussion with [~templedf] in YARN-6757 the official case of > YARN is YARN, not Yarn, so we should update all the md files. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7221) Add security check for privileged docker container
[ https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172454#comment-16172454 ] Eric Badger commented on YARN-7221: --- Linking YARN-4266 as a blocker, since that is the JIRA that will add the code necessary for the user to run as a uid:gid pair. I agree that this will break privileged containers, since it will force them into their uid:gid pair instead of root > Add security check for privileged docker container > -- > > Key: YARN-7221 > URL: https://issues.apache.org/jira/browse/YARN-7221 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang > > When a docker is running with privileges, majority of the use case is to have > some program running with root then drop privileges to another user. i.e. > httpd to start with privileged and bind to port 80, then drop privileges to > www user. > # We should add security check for submitting users, to verify they have > "sudo" access to run privileged container. > # We should remove --user=uid:gid for privileged containers. > > Docker can be launched with --privileged=true, and --user=uid:gid flag. With > this parameter combinations, user will not have access to become root user. > All docker exec command will be drop to uid:gid user to run instead of > granting privileges. User can gain root privileges if container file system > contains files that give user extra power, but this type of image is > considered as dangerous. Non-privileged user can launch container with > special bits to acquire same level of root power. Hence, we lose control of > which image should be run with --privileges, and who have sudo rights to use > privileged container images. As the result, we should check for sudo access > then decide to parameterize --privileged=true OR --user=uid:gid. This will > avoid leading developer down the wrong path. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold
[ https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172457#comment-16172457 ] Hadoop QA commented on YARN-6333: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 17m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-6333 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887981/YARN-6333-1.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 9c43277230d9 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 51edaac | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17525/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and > fairSharePreemptionThreshold > -- > > Key: YARN-6333 > URL: https://issues.apache.org/jira/browse/YARN-6333 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Chetna Chaudhari > Labels: newbie++ > Attachments: YARN-6333-1.patch > > > Default values of them are not mentioned in doc. For example, default value > of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means the min share > preemption won't happen until you set a meaningful value. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172482#comment-16172482 ] Jian He commented on YARN-7215: --- bq. How does RM handle a service that is in stopped state? Actually, RM today already remembers the stopped apps in ZooKeeper, it also has its own way to lookup the applications. I'm not suggesting making RM do any more reads/writes. What is the scope of this jira ? By the description, it looks only to support the old slider list, the slider was also looking up from RM, it wasn't reading from HDFS. > REST API to list all deployed services by the same user > --- > > Key: YARN-7215 > URL: https://issues.apache.org/jira/browse/YARN-7215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Reporter: Eric Yang >Assignee: Eric Yang > > In Slider, it is possible to list deployed applications from the same user by > using: > {code} > slider list > {code} > This API can help UI to display application and services deployed by the same > user. > Apiserver does not have ability to list all applications/services at this > time. This API requires fast response to list all applications because it is > a common UI operation. ApiServer deployed applications persist configuration > in HDFS similar to slider, but using directory listing to display deployed > application might cost too much overhead to namenode. We may want to use > alternative storage mechanism to cache deployed application configuration to > accelerate the response time of list deployed applications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7223) Document GPU isolation feature
Wangda Tan created YARN-7223: Summary: Document GPU isolation feature Key: YARN-7223 URL: https://issues.apache.org/jira/browse/YARN-7223 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7224) Support GPU isolation for docker container
Wangda Tan created YARN-7224: Summary: Support GPU isolation for docker container Key: YARN-7224 URL: https://issues.apache.org/jira/browse/YARN-7224 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6119) Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172488#comment-16172488 ] Chetna Chaudhari commented on YARN-6119: [~kasha]: This method was removed as a part of [YARN-6040|https://issues.apache.org/jira/browse/YARN-6040] > Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest > -- > > Key: YARN-6119 > URL: https://issues.apache.org/jira/browse/YARN-6119 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: Karthik Kambatla >Priority: Minor > Labels: newbie++ > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6119) Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari reassigned YARN-6119: -- Assignee: Chetna Chaudhari > Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest > -- > > Key: YARN-6119 > URL: https://issues.apache.org/jira/browse/YARN-6119 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: Karthik Kambatla >Assignee: Chetna Chaudhari >Priority: Minor > Labels: newbie++ > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk
[ https://issues.apache.org/jira/browse/YARN-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172491#comment-16172491 ] Ray Chiang commented on YARN-7219: -- Similar fix > Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk > - > > Key: YARN-7219 > URL: https://issues.apache.org/jira/browse/YARN-7219 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 3.0.0-beta1 >Reporter: Ray Chiang >Priority: Critical > > For yarn_service_protos.proto, we have the following code in > (branch-2.8.0, branch-2.8, branch-2) > {noformat} > message AllocateRequestProto { > repeated ResourceRequestProto ask = 1; > repeated ContainerIdProto release = 2; > optional ResourceBlacklistRequestProto blacklist_request = 3; > optional int32 response_id = 4; > optional float progress = 5; > repeated ContainerResourceIncreaseRequestProto increase_request = 6; > repeated UpdateContainerRequestProto update_requests = 7; > } > {noformat} > For yarn_service_protos.proto, we have the following code in > (trunk) > {noformat} > message AllocateRequestProto { > repeated ResourceRequestProto ask = 1; > repeated ContainerIdProto release = 2; > optional ResourceBlacklistRequestProto blacklist_request = 3; > optional int32 response_id = 4; > optional float progress = 5; > repeated UpdateContainerRequestProto update_requests = 6; > } > {noformat} > Notes > * YARN-3866 was the original JIRA for container resizing. > * YARN-5221 is what introduced the incompatible change. > * In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by > "Addendum patch to YARN-3866: fix incompatible API change." > * There was a similar API fix done in YARN-6071. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
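One hedged way to reconcile the two definitions (a sketch, not the committed fix) is to keep the branch-2 tag numbering on trunk and retire tag 6 with the `reserved` syntax supported by current protoc, so old clients that still send tag 6 are not misread:

```proto
message AllocateRequestProto {
  repeated ResourceRequestProto ask = 1;
  repeated ContainerIdProto release = 2;
  optional ResourceBlacklistRequestProto blacklist_request = 3;
  optional int32 response_id = 4;
  optional float progress = 5;
  reserved 6;  // was ContainerResourceIncreaseRequestProto increase_request
  repeated UpdateContainerRequestProto update_requests = 7;
}
```

The incompatibility stems from reusing tag 6 with a different message type; protobuf identifies fields on the wire by tag number, so the same tag must never change type across releases that are expected to interoperate.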
[jira] [Assigned] (YARN-6499) Remove the doc about Schedulable#redistributeShare()
[ https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari reassigned YARN-6499: -- Assignee: Chetna Chaudhari > Remove the doc about Schedulable#redistributeShare() > - > > Key: YARN-6499 > URL: https://issues.apache.org/jira/browse/YARN-6499 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Chetna Chaudhari >Priority: Trivial > Labels: newbie++ > > Schedulable#redistributeShare() has been removed since YARN-187. We need to > remove the doc about it as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6499) Remove the doc about Schedulable#redistributeShare()
[ https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari updated YARN-6499: --- Attachment: YARN-6499.patch > Remove the doc about Schedulable#redistributeShare() > - > > Key: YARN-6499 > URL: https://issues.apache.org/jira/browse/YARN-6499 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Chetna Chaudhari >Priority: Trivial > Labels: newbie++ > Attachments: YARN-6499.patch > > > Schedulable#redistributeShare() has been removed since YARN-187. We need to > remove the doc about it as well.
[jira] [Assigned] (YARN-6169) container-executor message on empty configuration file can be improved
[ https://issues.apache.org/jira/browse/YARN-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari reassigned YARN-6169: -- Assignee: Chetna Chaudhari > container-executor message on empty configuration file can be improved > -- > > Key: YARN-6169 > URL: https://issues.apache.org/jira/browse/YARN-6169 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Miklos Szegedi >Assignee: Chetna Chaudhari >Priority: Trivial > Labels: newbie > > If the configuration file is empty, we get the following error message: > {{Invalid configuration provided in /root/etc/hadoop/container-executor.cfg}} > This does not provide enough detail to figure out what the issue is at > first glance. We should use something like 'Empty configuration file > provided...' > {code} > if (cfg->size == 0) { > fprintf(ERRORFILE, "Invalid configuration provided in %s\n", file_name); > exit(INVALID_CONFIG_FILE); > } > {code}
[jira] [Commented] (YARN-4500) Missing default config values in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172498#comment-16172498 ] Chetna Chaudhari commented on YARN-4500: [~lewuathe]: Are you working on it? If not, can I pick it up? > Missing default config values in yarn-default.xml > - > > Key: YARN-4500 > URL: https://issues.apache.org/jira/browse/YARN-4500 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 2.6.2 >Reporter: Tianyin Xu >Assignee: Kai Sasaki > Labels: oct16-easy > Attachments: YARN-4500.01.patch > > > The docs > [yarn-default.xml|https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml] > miss the default values of the following parameters: > {{yarn.web-proxy.address}} > {{yarn.ipc.client.factory.class}} > {{yarn.ipc.server.factory.class}} > {{yarn.ipc.record.factory.class}} > Here we go, > {code:title=YarnConfiguration.java|borderStyle=solid} > 97 /** Factory to create client IPC classes.*/ > 98 public static final String IPC_CLIENT_FACTORY_CLASS = > 99 IPC_PREFIX + "client.factory.class"; > 100 public static final String DEFAULT_IPC_CLIENT_FACTORY_CLASS = > 101 "org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl"; > 102 > 103 /** Factory to create server IPC classes.*/ > 104 public static final String IPC_SERVER_FACTORY_CLASS = > 105 IPC_PREFIX + "server.factory.class"; > 106 public static final String DEFAULT_IPC_SERVER_FACTORY_CLASS = > 107 "org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl"; > 108 > 109 /** Factory to create serializeable records.*/ > 110 public static final String IPC_RECORD_FACTORY_CLASS = > 111 IPC_PREFIX + "record.factory.class"; > 112 public static final String DEFAULT_IPC_RECORD_FACTORY_CLASS = > 113 "org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl"; > ... 
> 1119 /** The address for the web proxy.*/ > 1120 public static final String PROXY_ADDRESS = > 1121 PROXY_PREFIX + "address"; > 1122 public static final int DEFAULT_PROXY_PORT = 9099; > 1123 public static final String DEFAULT_PROXY_ADDRESS = > 1124 "0.0.0.0:" + DEFAULT_PROXY_PORT; > {code}
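For reference, entries along these lines (values taken from the YarnConfiguration constants quoted above; the description text is illustrative, and the final patch may choose to document the defaults differently) could fill the gap in yarn-default.xml:

```xml
<property>
  <description>Factory to create client IPC classes.</description>
  <name>yarn.ipc.client.factory.class</name>
  <value>org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl</value>
</property>

<property>
  <description>The address for the web proxy.</description>
  <name>yarn.web-proxy.address</name>
  <value>0.0.0.0:9099</value>
</property>
```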
[jira] [Updated] (YARN-7026) Fair scheduler docs should explain what happens when no placement rules are specified
[ https://issues.apache.org/jira/browse/YARN-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetna Chaudhari updated YARN-7026: --- Labels: documentation (was: ) > Fair scheduler docs should explain what happens when no placement rules are > specified > - > > Key: YARN-7026 > URL: https://issues.apache.org/jira/browse/YARN-7026 > Project: Hadoop YARN > Issue Type: Improvement > Components: docs >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton > Labels: documentation >
[jira] [Commented] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Cont
[ https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172507#comment-16172507 ] Hadoop QA commented on YARN-6968: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 55s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 50s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 45s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 4 new + 225 unchanged - 0 fixed = 229 total (was 225) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 35s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 34s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 28s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-6968 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887973/YARN-6968.001.pa
[jira] [Commented] (YARN-7196) Fix finicky TestContainerManager tests
[ https://issues.apache.org/jira/browse/YARN-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172518#comment-16172518 ] Arun Suresh commented on YARN-7196: --- [~djp] / [~wangda], what do you think of the latest patch? > Fix finicky TestContainerManager tests > -- > > Key: YARN-7196 > URL: https://issues.apache.org/jira/browse/YARN-7196 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7196.002.patch, YARN-7196.patch > > > The test case {{testContainerUpdateExecTypeGuaranteedToOpportunistic}} seems to > fail every once in a while. Maybe we have to change the way the event is > triggered.
[jira] [Commented] (YARN-6499) Remove the doc about Schedulable#redistributeShare()
[ https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172527#comment-16172527 ] Yufei Gu commented on YARN-6499: +1. Thanks for working on this. > Remove the doc about Schedulable#redistributeShare() > - > > Key: YARN-6499 > URL: https://issues.apache.org/jira/browse/YARN-6499 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Reporter: Yufei Gu >Assignee: Chetna Chaudhari >Priority: Trivial > Labels: newbie++ > Attachments: YARN-6499.patch > > > Schedulable#redistributeShare() has been removed since YARN-187. We need to > remove the doc about it as well.
[jira] [Commented] (YARN-6499) Remove the doc about Schedulable#redistributeShare()
[ https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172538#comment-16172538 ] Hadoop QA commented on YARN-6499: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | 
{color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 54s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:71bbb86 | | JIRA Issue | YARN-6499 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887986/YARN-6499.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 28957bf3a9d3 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 51edaac | | Default Java | 1.8.0_144 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/17526/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/17526/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/17526/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Remove the doc about Schedulable#redistributeShare() > - > >
[jira] [Created] (YARN-7225) Add queue and partition info to RM audit log
Jonathan Hung created YARN-7225: --- Summary: Add queue and partition info to RM audit log Key: YARN-7225 URL: https://issues.apache.org/jira/browse/YARN-7225 Project: Hadoop YARN Issue Type: Improvement Reporter: Jonathan Hung Right now the RM audit log has fields such as user, IP, and resource. Having queue and partition would be useful for resource tracking.
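For illustration only (the KEY=VALUE style mimics existing RM audit entries, but the specific keys, operation name, and IDs below are hypothetical, with QUEUE and PARTITION appended as this issue proposes), an augmented audit entry might look like:

```text
USER=alice  IP=10.1.2.3  OPERATION=AM Allocated Container  RESULT=SUCCESS  APPID=application_1505829200000_0042  QUEUE=root.analytics  PARTITION=gpu
```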