[jira] [Created] (YARN-7212) [Atsv2] TimelineSchemaCreator fails to create flowrun table causes RegionServer down!

2017-09-19 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7212:
---

 Summary: [Atsv2] TimelineSchemaCreator fails to create flowrun 
table causes RegionServer down!
 Key: YARN-7212
 URL: https://issues.apache.org/jira/browse/YARN-7212
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Rohith Sharma K S


*HBase-2.0* officially supports *hadoop-alpha* compilations, so I was trying to 
build and test with HBase-2.0. However, table schema creation fails and causes 
the RegionServer to shut down with the following error:
{noformat}
Caused by: java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.Tag.asList([BII)Ljava/util/List;
at 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.getCurrentAggOp(FlowScanner.java:250)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.nextInternal(FlowScanner.java:226)
at 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.next(FlowScanner.java:145)
at 
org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:132)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:973)
at 
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2252)
at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2672)
{noformat}
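For reference, the vanished method is an HBase 1.x-era Tag helper that the 
ATSv2 flow-run coprocessor was compiled against. An illustrative fragment (the 
two calls target different HBase lines and cannot compile together; the TagUtil 
location is our reading of the HBase 2.0 sources, not taken from this issue):
{code:java}
import java.util.List;
import org.apache.hadoop.hbase.Tag;
import org.apache.hadoop.hbase.TagUtil;

byte[] tagBytes = new byte[0];  // stand-in for a cell's tag bytes

// HBase 1.x: Tag is a concrete class with a static parser; this is the
// call FlowScanner.getCurrentAggOp() makes and HBase 2.0 no longer has:
List<Tag> tags1x = Tag.asList(tagBytes, 0, tagBytes.length);

// HBase 2.0: Tag became an interface; the equivalent parser appears to
// live in TagUtil, hence the NoSuchMethodError at region flush time:
List<Tag> tags20 = TagUtil.asList(tagBytes, 0, tagBytes.length);
{code}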

Since the HBase-2.0 community is getting ready to release Hadoop-3.x compatible 
versions, ATSv2 also needs to support HBase-2.0. To that end, we need to take up 
the task of testing and validating ATSv2 against HBase-2.0 (see YARN-7213).








[jira] [Created] (YARN-7213) [Atsv2] Test and validate HBase-2.0 with Atsv2

2017-09-19 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7213:
---

 Summary: [Atsv2] Test and validate HBase-2.0 with Atsv2
 Key: YARN-7213
 URL: https://issues.apache.org/jira/browse/YARN-7213
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Rohith Sharma K S


HBase-2.0 officially supports hadoop-alpha compilations. This JIRA is to keep 
track of HBase-2.0 integration issues.






[jira] [Updated] (YARN-7213) [Atsv2] Test and validate HBase-2.0.x with Atsv2

2017-09-19 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7213:

Summary: [Atsv2] Test and validate HBase-2.0.x with Atsv2  (was: [Atsv2] 
Test and validate HBase-2.0 with Atsv2)

> [Atsv2] Test and validate HBase-2.0.x with Atsv2
> 
>
> Key: YARN-7213
> URL: https://issues.apache.org/jira/browse/YARN-7213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Rohith Sharma K S
>
> HBase-2.0 officially supports hadoop-alpha compilations. This JIRA is to keep 
> track of HBase-2.0 integration issues.






[jira] [Updated] (YARN-7213) [Atsv2] Test and validate HBase-2.0.x with Atsv2

2017-09-19 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7213:

Description: HBase-2.0.x officially supports hadoop-alpha compilations, and 
the community is getting ready for the Hadoop-beta release so that HBase can 
ship versions compatible with Hadoop-beta. This JIRA is to keep track of 
HBase-2.0 integration issues.  (was: HBase-2.0 officially supports hadoop-alpha 
compilations. This JIRA is to keep track of HBase-2.0 integration issues.)

> [Atsv2] Test and validate HBase-2.0.x with Atsv2
> 
>
> Key: YARN-7213
> URL: https://issues.apache.org/jira/browse/YARN-7213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Rohith Sharma K S
>
> HBase-2.0.x officially supports hadoop-alpha compilations, and the community 
> is getting ready for the Hadoop-beta release so that HBase can ship versions 
> compatible with Hadoop-beta. This JIRA is to keep track of HBase-2.0 
> integration issues.






[jira] [Created] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)
zhangshilong created YARN-7214:
--

 Summary: duplicated container completed To AM
 Key: YARN-7214
 URL: https://issues.apache.org/jira/browse/YARN-7214
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha3, 2.7.1
 Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
Reporter: zhangshilong


env: hadoop 2.7.1 with RM recovery and NM recovery enabled
case:
A Spark app (app1) is running at least one container (named c1) on NM1.
1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
because NM1 is lost.
3. NM1 restarts and registers with the RM (c1 is in the register request), but 
the RM considers NM1 lost and does not handle the containers reported by NM1.
4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
request), so c1 is never removed from NM1's context.
5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
recovered, and the RM sends a c1-completed message to app1's AM. So app1 
receives a duplicated c1-completed event.
Since the Spark AM allocates a new container whenever it receives a 
container-completed event from the RM, this results in an extra container.







[jira] [Commented] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171270#comment-16171270
 ] 

zhangshilong commented on YARN-7214:


3. In RMNodeImpl.AddNodeTransition:
{code:java}
 public static class AddNodeTransition implements
  SingleArcTransition<RMNodeImpl, RMNodeEvent> {

@Override
public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
  // Inform the scheduler
  RMNodeStartedEvent startEvent = (RMNodeStartedEvent) event;
List<NMContainerStatus> containers = null;

  NodeId nodeId = rmNode.nodeId;
  RMNode previousRMNode =
  rmNode.context.getInactiveRMNodes().remove(nodeId);
  if (previousRMNode != null) {
rmNode.updateMetricsForRejoinedNode(previousRMNode.getState());
  } else {
NodeId unknownNodeId =
NodesListManager.createUnknownNodeId(nodeId.getHost());
previousRMNode =
rmNode.context.getInactiveRMNodes().remove(unknownNodeId);
if (previousRMNode != null) {
  ClusterMetrics.getMetrics().decrDecommisionedNMs();
}
// Increment activeNodes explicitly because this is a new node.
ClusterMetrics.getMetrics().incrNumActiveNodes();
containers = startEvent.getNMContainerStatuses();
if (containers != null && !containers.isEmpty()) {
  for (NMContainerStatus container : containers) {
if (container.getContainerState() == ContainerState.RUNNING ||
container.getContainerState() == ContainerState.SCHEDULED) {
  rmNode.launchedContainers.add(container.getContainerId());
}
  }
}
  }

  if (null != startEvent.getRunningApplications()) {
for (ApplicationId appId : startEvent.getRunningApplications()) {
  handleRunningAppOnNode(rmNode, rmNode.context, appId, rmNode.nodeId);
}
  }

  rmNode.context.getDispatcher().getEventHandler()
.handle(new NodeAddedSchedulerEvent(rmNode, containers));
  rmNode.context.getDispatcher().getEventHandler().handle(
new NodesListManagerEvent(
NodesListManagerEventType.NODE_USABLE, rmNode));
}
  }
{code}

4. In NodeStatusUpdaterImpl.java:
  Before registering, getNMContainerStatuses() is called, so the completed 
container is put into recentlyStoppedContainers.
  In the register request, the completed containers are sent to the RM.
{code:java}
  public void addCompletedContainer(ContainerId containerId) {
synchronized (recentlyStoppedContainers) {
  removeVeryOldStoppedContainersFromCache();
  if (!recentlyStoppedContainers.containsKey(containerId)) {
recentlyStoppedContainers.put(containerId,
System.currentTimeMillis() + durationToTrackStoppedContainers);
  }
}
  }
{code}
On a normal heartbeat, getContainerStatuses() is called. The completed 
container is not put into containerStatuses because it is already in 
recentlyStoppedContainers, so it is not sent to the RM again.
{code:java}
protected List<ContainerStatus> getContainerStatuses() throws IOException {
List<ContainerStatus> containerStatuses = new ArrayList<ContainerStatus>();
for (Container container : this.context.getContainers().values()) {
  ContainerId containerId = container.getContainerId();
  ApplicationId applicationId = containerId.getApplicationAttemptId()
  .getApplicationId();
  org.apache.hadoop.yarn.api.records.ContainerStatus containerStatus =
  container.cloneAndGetContainerStatus();
  if (containerStatus.getState() == ContainerState.COMPLETE) {
if (isApplicationStopped(applicationId)) {
  if (LOG.isDebugEnabled()) {
LOG.debug(applicationId + " is completing, " + " remove "
+ containerId + " from NM context.");
  }
  context.getContainers().remove(containerId);
  pendingCompletedContainers.put(containerId, containerStatus);
} else {
  if (!isContainerRecentlyStopped(containerId)) {
pendingCompletedContainers.put(containerId, containerStatus);
  }
}
// Adding to finished containers cache. Cache will keep it around at
// least for #durationToTrackStoppedContainers duration. In the
// subsequent call to stop container it will get removed from cache.
addCompletedContainer(containerId);
  } else {
containerStatuses.add(containerStatus);
  }
}

containerStatuses.addAll(pendingCompletedContainers.values());

if (LOG.isDebugEnabled()) {
  LOG.debug("Sending out " + containerStatuses.size()
  + " container statuses: " + containerStatuses);
}
return containerStatuses;
  }
{code}




> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.

[jira] [Commented] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171322#comment-16171322
 ] 

zhangshilong commented on YARN-7214:


In my opinion, containers in recentlyStoppedContainers could be removed from 
the NM context once the NM has heartbeated normally with the RM, as sketched 
below.
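A minimal sketch of that idea (hypothetical, not an actual patch; field and 
method names are assumed from the NodeStatusUpdaterImpl code quoted in the 
previous comment):
{code:java}
// Hypothetical cleanup after a successful heartbeat: anything tracked in
// recentlyStoppedContainers has already been reported to the RM at least
// once, so it could be purged from the NM context. An RM restart could
// then no longer recover and re-complete it.
private void purgeRecentlyStoppedFromContext() {
  synchronized (recentlyStoppedContainers) {
    for (ContainerId containerId : recentlyStoppedContainers.keySet()) {
      context.getContainers().remove(containerId);
    }
  }
}
{code}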

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Updated] (YARN-6916) Moving logging APIs over to slf4j in hadoop-yarn-server-common

2017-09-19 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-6916:
---
Attachment: YARN-6916.005.patch

+1 LGTM. Attaching rebased patch again.

> Moving logging APIs over to slf4j in hadoop-yarn-server-common
> --
>
> Key: YARN-6916
> URL: https://issues.apache.org/jira/browse/YARN-6916
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: YARN-6712.01.patch, YARN-6916.002.patch, 
> YARN-6916.003.patch, YARN-6916.004.patch, YARN-6916.005.patch
>
>







[jira] [Commented] (YARN-6878) TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the args to assertEqual() in the wrong order

2017-09-19 Thread Sen Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171360#comment-16171360
 ] 

Sen Zhao commented on YARN-6878:


The failed tests are unrelated. [~templedf], can you give me some advice about 
the latest patch?

> TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the 
> args to assertEqual() in the wrong order
> --
>
> Key: YARN-6878
> URL: https://issues.apache.org/jira/browse/YARN-6878
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: Sen Zhao
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-6878.001.patch, YARN-6878.002.patch, 
> YARN-6878.003.patch
>
>
> The expected value should come before the actual value.  It would be nice to 
> add some assert messages as well.
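For anyone picking this up, a minimal illustration of the JUnit convention (the 
message and values are made up, not taken from the test):
{code:java}
import static org.junit.Assert.assertEquals;

// assertEquals(message, expected, actual): the expected value comes first,
// so a failure reads "expected:<default> but was:<...>", not the reverse.
assertEquals("Wrong default node label expression",
    "default", queue.getDefaultNodeLabelExpression());
{code}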






[jira] [Updated] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangshilong updated YARN-7214:
---
Attachment: screenshot-1.png

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Issue Comment Deleted] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread rangjiaheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rangjiaheng updated YARN-7214:
--
Comment: was deleted

(was: aa )

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Commented] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread rangjiaheng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171391#comment-16171391
 ] 

rangjiaheng commented on YARN-7214:
---

aa 

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Commented] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171396#comment-16171396
 ] 

zhangshilong commented on YARN-7214:


!screenshot-1.png!
Generally:
1. The NM completes a container (c) and reports it to the RM.
2. The RM sends c to the AM, telling the AM that c is completed.
3. The RM sends c back to the NM, telling the NM that c can be removed from 
the NM context.
If the RM restarts before step 3, c stays in the NM context forever.
If the RM restarts again, c is reported to the AM as a duplicated completed 
container.
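Step 3 runs on the NM side of the heartbeat. A sketch of that spot (recalled 
from branch-2.7's NodeStatusUpdaterImpl; names may differ slightly):
{code:java}
// The heartbeat response carries the RM's ack; only then does the NM drop
// completed containers from its context. If the RM restarts before sending
// this ack, c is never removed.
NodeHeartbeatResponse response = resourceTracker.nodeHeartbeat(request);
removeOrTrackCompletedContainersFromContext(
    response.getContainersToBeRemovedFromNM());
{code}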

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Commented] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread rangjiaheng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171402#comment-16171402
 ] 

rangjiaheng commented on YARN-7214:
---

We found this problem in a Spark Streaming application, a long-running 
application with a fixed number of containers; after the NM was lost, the NM 
restarted, and the RM restarted, one extra container was allocated.

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Comment Edited] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread rangjiaheng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171402#comment-16171402
 ] 

rangjiaheng edited comment on YARN-7214 at 9/19/17 9:32 AM:


We found this problem in a Spark Streaming application, a long-running 
application with a fixed number of containers; after the NM was lost, the NM 
restarted, and the RM restarted, a duplicated container was allocated.


was (Author: neomatrix):
We found this problem in a Spark Streaming application, a long-running 
application with a fixed number of containers; after the NM was lost, the NM 
restarted, and the RM restarted, one extra container was allocated.

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Comment Edited] (YARN-7214) duplicated container completed To AM

2017-09-19 Thread zhangshilong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171396#comment-16171396
 ] 

zhangshilong edited comment on YARN-7214 at 9/19/17 9:38 AM:
-

!screenshot-1.png!
Generally:
1. The NM completes a container (c) and reports it to the RM.
2. The RM sends c to the AM, telling the AM that c is completed.
3. The RM sends c back to the NM, telling the NM that c can be removed from 
the NM context.
If the RM restarts before step 3, c is later reported to the AM as a 
duplicated completed container.


was (Author: zsl2007):
!screenshot-1.png!
Generally:
1. The NM completes a container (c) and reports it to the RM.
2. The RM sends c to the AM, telling the AM that c is completed.
3. The RM sends c back to the NM, telling the NM that c can be removed from 
the NM context.
If the RM restarts before step 3, c stays in the NM context forever.
If the RM restarts again, c is reported to the AM as a duplicated completed 
container.

> duplicated container completed To AM
> 
>
> Key: YARN-7214
> URL: https://issues.apache.org/jira/browse/YARN-7214
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1, 3.0.0-alpha3
> Environment: hadoop 2.7.1  rm recovery and nm recovery enabled
>Reporter: zhangshilong
> Attachments: screenshot-1.png
>
>
> env: hadoop 2.7.1 with RM recovery and NM recovery enabled
> case:
> A Spark app (app1) is running at least one container (named c1) on NM1.
> 1. NM1 crashes, and the RM marks NM1 as expired after 10 minutes.
> 2. The RM removes all containers on NM1 (in RMNodeImpl), and app1 receives a 
> c1-completed message. However, the RM cannot send c1 (to be removed) to NM1 
> because NM1 is lost.
> 3. NM1 restarts and registers with the RM (c1 is in the register request), 
> but the RM considers NM1 lost and does not handle the containers reported by 
> NM1.
> 4. NM1 does not include c1 in its heartbeats (c1 is not in the heartbeat 
> request), so c1 is never removed from NM1's context.
> 5. The RM restarts and NM1 re-registers with the RM. Now c1 is handled and 
> recovered, and the RM sends a c1-completed message to app1's AM. So app1 
> receives a duplicated c1-completed event.
> Since the Spark AM allocates a new container whenever it receives a 
> container-completed event from the RM, this results in an extra container.






[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted

2017-09-19 Thread Sen Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171520#comment-16171520
 ] 

Sen Zhao commented on YARN-7001:


Hi, [~miklos.szeg...@cloudera.com]. I would like to take this issue, and I will 
submit a patch.

> If shared cache upload is terminated in the middle, the temp file will never 
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>
> There is a missing deleteTempFile(tempPath);
> {code}
>   tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
>   if (!uploadFile(actualPath, tempPath)) {
> LOG.warn("Could not copy the file to the shared cache at " + 
> tempPath);
> return false;
>   }
> {code}
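The fix would presumably be a one-liner in the failure branch (a sketch only, 
reusing the deleteTempFile helper named above):
{code}
  tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
  if (!uploadFile(actualPath, tempPath)) {
    LOG.warn("Could not copy the file to the shared cache at " + tempPath);
    deleteTempFile(tempPath); // clean up the partial upload before bailing out
    return false;
  }
{code}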






[jira] [Updated] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted

2017-09-19 Thread Sen Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sen Zhao updated YARN-7001:
---
Attachment: YARN-7001.001.patch

> If shared cache upload is terminated in the middle, the temp file will never 
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
> Attachments: YARN-7001.001.patch
>
>
> There is a missing deleteTempFile(tempPath);
> {code}
>   tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
>   if (!uploadFile(actualPath, tempPath)) {
> LOG.warn("Could not copy the file to the shared cache at " + 
> tempPath);
> return false;
>   }
> {code}






[jira] [Assigned] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted

2017-09-19 Thread Sen Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sen Zhao reassigned YARN-7001:
--

Assignee: Sen Zhao

> If shared cache upload is terminated in the middle, the temp file will never 
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Sen Zhao
> Attachments: YARN-7001.001.patch
>
>
> There is a missing deleteTempFile(tempPath);
> {code}
>   tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
>   if (!uploadFile(actualPath, tempPath)) {
> LOG.warn("Could not copy the file to the shared cache at " + 
> tempPath);
> return false;
>   }
> {code}






[jira] [Commented] (YARN-6878) TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the args to assertEqual() in the wrong order

2017-09-19 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171671#comment-16171671
 ] 

Daniel Templeton commented on YARN-6878:


LGTM +1

> TestCapacityScheduler.testDefaultNodeLabelExpressionQueueConfig() has the 
> args to assertEqual() in the wrong order
> --
>
> Key: YARN-6878
> URL: https://issues.apache.org/jira/browse/YARN-6878
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: Sen Zhao
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-6878.001.patch, YARN-6878.002.patch, 
> YARN-6878.003.patch
>
>
> The expected value should come before the actual value.  It would be nice to 
> add some assert messages as well.






[jira] [Commented] (YARN-6991) "Kill application" button does not show error if other user tries to kill the application for secure cluster

2017-09-19 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171804#comment-16171804
 ] 

Sunil G commented on YARN-6991:
---

Patch looks fine to me.

> "Kill application" button does not show error if other user tries to kill the 
> application for secure cluster
> 
>
> Key: YARN-6991
> URL: https://issues.apache.org/jira/browse/YARN-6991
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Suma Shivaprasad
> Attachments: YARN-6991.001.patch, YARN-6991.002.patch, 
> YARN-6991.003.patch
>
>
> 1. Submit an application as user 1.
> 2. Log into the RM UI as user 2.
> 3. Kill the application submitted by user 1.
> 4. Even though the application does not get killed, no error/info dialog box 
> is shown to tell user 2 that they do not have permission to kill another 
> user's application.






[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171878#comment-16171878
 ] 

Daniel Templeton commented on YARN-7135:


Looks like there were some whitespace issues.  Try running _git diff --check_.  
Otherwise looks good.

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}
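The main reason, for anyone wondering: if lock() itself throws, the finally 
block calls unlock() on a lock the thread never acquired, which throws 
IllegalMonitorStateException and masks the original failure. A self-contained 
sketch of the correct pattern:
{code}
import java.util.concurrent.locks.ReentrantLock;

public class LockOrder {
  private final ReentrantLock lock = new ReentrantLock();

  public void doWork() {
    lock.lock();  // acquire outside the try, so unlock() always pairs
    try {         // with a lock() that actually succeeded
      // ... critical section ...
    } finally {
      lock.unlock();
    }
  }
}
{code}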






[jira] [Updated] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username

2017-09-19 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-4266:
--
Attachment: YARN-4266.004.patch

[~jlowe], thanks for the review!

bq. These comments don't match the code:

Fixed the code to match the comments.

bq. Should we handle ExitCodeException or other types of exceptions that might 
happen (e.g.: "no such user" type of errors) explicitly when running the id 
command so we can provide a better debug experience, or is the exception 
message enough info to debug issues?

ContainerExecutionException doesn't have a constructor with both a string and a 
throwable, so I just removed the string part. That way it will correctly parse 
the information in the throwable that comes from the failed command.

bq. Also I found it odd that getUserIdInfo and getGroupIdInfo take a parameter 
for the id command but these methods are highly dependent upon the "right" 
parameter being passed in order to function properly. They are each only called 
in one place, and IMHO there's no reason to make this parameterized given the 
parsing code needs the corresponding parameter to be correct. We should just 
remove the parameter and have it passed directly.

Yep, good call. Removed the parameter and hardcoded the "-u" and "-G" into the 
respective methods.
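For reference, the hardcoded shape would presumably look like this (a sketch 
assuming Hadoop's Shell utility; the real method lives in the Docker runtime 
code and may differ):
{code}
import java.io.IOException;
import org.apache.hadoop.util.Shell;

// "id -u <user>" prints the numeric UID; getGroupIdInfo would hardcode "-G"
// the same way. An ExitCodeException from execute() surfaces "no such user".
private String getUserIdInfo(String userName) throws IOException {
  Shell.ShellCommandExecutor shexec =
      new Shell.ShellCommandExecutor(new String[] {"id", "-u", userName});
  shexec.execute();
  return shexec.getOutput().trim();
}
{code}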

> Allow users to enter containers as UID:GID pair instead of by username
> --
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: luhuichun
> Attachments: YARN-4266.001.patch, YARN-4266.001.patch, 
> YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,






[jira] [Updated] (YARN-7201) Add more sophisticated example YARN service

2017-09-19 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7201:

Attachment: YARN-7201.yarn-native-services.006.patch

Correction to artifact image name.

> Add more sophisticated example YARN service
> ---
>
> Key: YARN-7201
> URL: https://issues.apache.org/jira/browse/YARN-7201
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
> Attachments: YARN-7201.yarn-native-services.001.patch, 
> YARN-7201.yarn-native-services.002.patch, 
> YARN-7201.yarn-native-services.003.patch, 
> YARN-7201.yarn-native-services.004.patch, 
> YARN-7201.yarn-native-services.005.patch, 
> YARN-7201.yarn-native-services.006.patch
>
>
> We can show case the following capabilities in the YARN service examples:
> #  Description of the service
> #  Component dependencies
> #  How to mount HDFS volume via NFS Gateway
> # Enable privileged container
> # Quick link to web application
> # Queue to submit
> # Use private docker registry






[jira] [Updated] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread weiyuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weiyuan updated YARN-7135:
--
Attachment: YARN-7135.002.patch

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread weiyuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171994#comment-16171994
 ] 

weiyuan commented on YARN-7135:
---

[~templedf], thanks for your suggestion, I updated the patch again. 

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171993#comment-16171993
 ] 

Hadoop QA commented on YARN-7135:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-7135 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7135 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887928/YARN-7135.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17515/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Issue Comment Deleted] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread weiyuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weiyuan updated YARN-7135:
--
Comment: was deleted

(was: [~templedf], thanks for your suggestion, I updated the patch again. )

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Updated] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread weiyuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weiyuan updated YARN-7135:
--
Attachment: YARN-7135.003.patch

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch, 
> YARN-7135.003.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172005#comment-16172005
 ] 

Hadoop QA commented on YARN-7135:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-7135 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7135 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887932/YARN-7135.003.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17516/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch, 
> YARN-7135.003.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}






[jira] [Commented] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172006#comment-16172006
 ] 

Hadoop QA commented on YARN-4266:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
2s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 10s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 227 unchanged - 0 fixed = 232 total (was 227) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 43s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 27s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
|   | 
hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-4266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887908/YARN-4266.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 5620da87c15a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3a20deb |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://bu

[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted

2017-09-19 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172026#comment-16172026
 ] 

Miklos Szegedi commented on YARN-7001:
--

Thank you, [~Sen Zhao] for the patch. Could you add a unit test?

> If shared cache upload is terminated in the middle, the temp file will never 
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Sen Zhao
> Attachments: YARN-7001.001.patch
>
>
> There is a missing deleteTempFile(tempPath);
> {code}
>   tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
>   if (!uploadFile(actualPath, tempPath)) {
> LOG.warn("Could not copy the file to the shared cache at " + 
> tempPath);
> return false;
>   }
> {code}






[jira] [Commented] (YARN-6916) Moving logging APIs over to slf4j in hadoop-yarn-server-common

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172032#comment-16172032
 ] 

Hadoop QA commented on YARN-6916:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}499m 
37s{color} | {color:red} Docker failed to build yetus/hadoop:tp-16710. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6916 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887852/YARN-6916.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17512/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Moving logging APIs over to slf4j in hadoop-yarn-server-common
> --
>
> Key: YARN-6916
> URL: https://issues.apache.org/jira/browse/YARN-6916
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: YARN-6712.01.patch, YARN-6916.002.patch, 
> YARN-6916.003.patch, YARN-6916.004.patch, YARN-6916.005.patch
>
>







[jira] [Commented] (YARN-6943) Update Yarn to YARN in documentation

2017-09-19 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172042#comment-16172042
 ] 

Haibo Chen commented on YARN-6943:
--

[~chetna] I have added you as a contributor to YARN. You should have the 
permission to submit a patch now. Let me know if you still cannot.

> Update Yarn to YARN in documentation
> 
>
> Key: YARN-6943
> URL: https://issues.apache.org/jira/browse/YARN-6943
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Priority: Minor
>  Labels: newbie
>
> Based on the discussion with [~templedf] in YARN-6757 the official case of 
> YARN is YARN, not Yarn, so we should update all the md files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor

2017-09-19 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-6962:
---
Attachment: YARN-6962.v2.patch

v2 patch: unit test added. 

> Add support for updateContainers when allocating using FederationInterceptor
> 
>
> Key: YARN-6962
> URL: https://issues.apache.org/jira/browse/YARN-6962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch
>
>
> Container update is introduced in YARN-5221. Federation Interceptor needs to 
> support it when splitting (merging) the allocate request (response).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172060#comment-16172060
 ] 

Hadoop QA commented on YARN-7034:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
43s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 24s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7034 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887386/YARN-7034.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d560dae92c19 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 595d478 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/17517/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/17517/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17517/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17517/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (YARN-7135) Clean up lock-try order in common scheduler code

2017-09-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172065#comment-16172065
 ] 

Wangda Tan commented on YARN-7135:
--

[~v123582] / [~templedf],

Apologies for my late responses; I just checked some resources. 

From \[1\],
{code}
Assuming that lock is a ReentrantLock, then it makes no real difference, since 
lock() does not throw any checked exceptions.

The Java documentation, however, leaves lock() outside the try block in the 
ReentrantLock example. The reason for this is that an unchecked exception in 
lock() should not lead to unlock() incorrectly being called. Whether 
correctness is a concern in the presence of an unchecked exception in lock() of 
all things, that is another discussion altogether.

It is a good coding practice in general to keep things like try blocks as 
fine-grained as possible. 
{code}

You can also check that Java's ReentrantLock.lock doesn't throw any 
exception; see \[2\].

I think updating all ReentrantLock usages in the YARN RM package might be 
overkill and will potentially cause lots of conflicts when we want to backport 
(unless we backport this patch to all active branches). 

Instead, I suggest keeping all ReentrantLock usages as-is and only updating 
the lock pattern when necessary.

\[1\] https://stackoverflow.com/questions/10868423/lock-lock-before-try
\[2\] 
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/ReentrantLock.html
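
To make the difference concrete, here is a self-contained illustration (my own example, not from the patch): if lock() failed with an unchecked exception inside the try block, the finally clause would call unlock() on a lock the thread never acquired, which throws IllegalMonitorStateException and masks the original failure.

{code}
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
  private static final ReentrantLock LOCK = new ReentrantLock();

  // Recommended pattern: acquire the lock before entering try, so
  // unlock() in finally runs only if the acquisition succeeded.
  static void recommended() {
    LOCK.lock();
    try {
      // ... critical section ...
    } finally {
      LOCK.unlock();
    }
  }

  // Anti-pattern from the description: if lock() threw before acquiring,
  // the finally block would call unlock() without holding the lock and
  // throw IllegalMonitorStateException, hiding the original error.
  static void antiPattern() {
    try {
      LOCK.lock();
      // ... critical section ...
    } finally {
      LOCK.unlock();
    }
  }

  public static void main(String[] args) {
    recommended();
    antiPattern(); // behaves the same here, but is fragile as explained above
  }
}
{code}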

> Clean up lock-try order in common scheduler code
> 
>
> Key: YARN-7135
> URL: https://issues.apache.org/jira/browse/YARN-7135
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: weiyuan
>  Labels: newbie
> Attachments: YARN-7135.001.patch, YARN-7135.002.patch, 
> YARN-7135.003.patch
>
>
> There are many places that follow the pattern:{code}try {
>   lock.lock();
>   ...
> } finally {
>   lock.unlock();
> }{code}
> There are a couple of reasons that's a bad idea.  The correct pattern 
> is:{code}lock.lock();
> try {
>   ...
> } finally {
>   lock.unlock();
> }{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor

2017-09-19 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172086#comment-16172086
 ] 

Miklos Szegedi commented on YARN-7034:
--

The unit test is a flaky one. See YARN-7145 (Identify potential flaky unit tests).

> DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client 
> environment variables to container-executor
> -
>
> Key: YARN-7034
> URL: https://issues.apache.org/jira/browse/YARN-7034
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Critical
> Attachments: YARN-7034.000.patch, YARN-7034.001.patch, 
> YARN-7034.002.patch, YARN-7034.003.patch, YARN-7034.004.patch, 
> YARN-7034.005.patch, YARN-7034.006.patch, YARN-7034.branch-2.000.patch, 
> YARN-7034.branch-2.004.patch, YARN-7034.branch-2.005.patch, 
> YARN-7034.branch-2.006.patch, YARN-7034.branch-2.8.000.patch, 
> YARN-7034.branch-2.8.004.patch, YARN-7034.branch-2.8.005.patch, 
> YARN-7034.branch-2.8.006.patch
>
>
> This behavior is unnecessary since there is nothing that is used from the 
> environment right now. One option is to whitelist these variables before 
> passing them. Are there any known use cases that justify this?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7034) DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client environment variables to container-executor

2017-09-19 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172087#comment-16172087
 ] 

Miklos Szegedi commented on YARN-7034:
--

[~shaneku...@gmail.com], do you have any other comments?

> DefaultLinuxContainerRuntime and DockerLinuxContainerRuntime sends client 
> environment variables to container-executor
> -
>
> Key: YARN-7034
> URL: https://issues.apache.org/jira/browse/YARN-7034
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Critical
> Attachments: YARN-7034.000.patch, YARN-7034.001.patch, 
> YARN-7034.002.patch, YARN-7034.003.patch, YARN-7034.004.patch, 
> YARN-7034.005.patch, YARN-7034.006.patch, YARN-7034.branch-2.000.patch, 
> YARN-7034.branch-2.004.patch, YARN-7034.branch-2.005.patch, 
> YARN-7034.branch-2.006.patch, YARN-7034.branch-2.8.000.patch, 
> YARN-7034.branch-2.8.004.patch, YARN-7034.branch-2.8.005.patch, 
> YARN-7034.branch-2.8.006.patch
>
>
> This behavior is unnecessary since there is nothing that is used from the 
> environment right now. One option is to whitelist these variables before 
> passing them. Are there any known use cases that justify this?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172097#comment-16172097
 ] 

Hadoop QA commented on YARN-6962:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
44s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 16s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6962 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887937/YARN-6962.v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 77a2eab87357 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 31b5840 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/17518/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/17518/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17518/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17518/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts

2017-09-19 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172102#comment-16172102
 ] 

Miklos Szegedi commented on YARN-5534:
--

Thank you, [~eyang], for sharing your thoughts. Sorry, I am confused: are you 
suggesting making the whitelist visible to more users, or less visible?

> Allow whitelisted volume mounts 
> 
>
> Key: YARN-5534
> URL: https://issues.apache.org/jira/browse/YARN-5534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: luhuichun
>Assignee: Shane Kumpf
> Attachments: YARN-5534.001.patch, YARN-5534.002.patch, 
> YARN-5534.003.patch
>
>
> 1. Introduction
> Mounting files or directories from the host is one way of passing 
> configuration and other information into a docker container. 
> We could allow the user to set a list of mounts in the environment of 
> ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). 
> These would be mounted read-only to the specified target locations. This has 
> been resolved in YARN-4595.
> 2. Problem Definition
> But mounting arbitrary volumes into a Docker container can be a security risk.
> 3. Possible Solutions
> One approach to providing safe mounts is to allow the cluster administrator 
> to configure a set of parent directories as whitelisted mounting directories.
>  Add a property named yarn.nodemanager.volume-mounts.white-list; when the 
> container executor does mount checking, only the allowed directories or 
> their sub-directories can be mounted. 
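
As a sketch of what such a check might look like (the property name follows the proposal above; the path logic below is illustrative, not the actual container-executor code):

{code}
import java.nio.file.Path;
import java.nio.file.Paths;

public class MountWhitelistCheck {
  // Property name from the proposal above.
  static final String WHITELIST_KEY =
      "yarn.nodemanager.volume-mounts.white-list";

  // Allow a requested mount only if it is a whitelisted directory or a
  // sub-directory of one.
  static boolean isAllowed(String requestedMount, String[] whitelistedParents) {
    Path requested = Paths.get(requestedMount).normalize();
    for (String parent : whitelistedParents) {
      if (requested.startsWith(Paths.get(parent).normalize())) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    String[] whitelist = {"/etc/hadoop", "/data/shared"};
    System.out.println(isAllowed("/data/shared/conf", whitelist)); // true
    System.out.println(isAllowed("/etc/passwd", whitelist));       // false
  }
}
{code}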



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6943) Update Yarn to YARN in documentation

2017-09-19 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi reassigned YARN-6943:


Assignee: Chetna Chaudhari

> Update Yarn to YARN in documentation
> 
>
> Key: YARN-6943
> URL: https://issues.apache.org/jira/browse/YARN-6943
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chetna Chaudhari
>Priority: Minor
>  Labels: newbie
>
> Based on the discussion with [~templedf] in YARN-6757 the official case of 
> YARN is YARN, not Yarn, so we should update all the md files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6943) Update Yarn to YARN in documentation

2017-09-19 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172106#comment-16172106
 ] 

Miklos Szegedi commented on YARN-6943:
--

Thank you for signing up. I assigned it to you [~chetna].

> Update Yarn to YARN in documentation
> 
>
> Key: YARN-6943
> URL: https://issues.apache.org/jira/browse/YARN-6943
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chetna Chaudhari
>Priority: Minor
>  Labels: newbie
>
> Based on the discussion with [~templedf] in YARN-6757 the official case of 
> YARN is YARN, not Yarn, so we should update all the md files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups

2017-09-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-6620:
-
Attachment: YARN-6620.009.patch

Attached ver.9 patch, which (hopefully) fixes the Jenkins issues.

> [YARN-6223] NM Java side code changes to support isolate GPU devices by using 
> CGroups
> -
>
> Key: YARN-6620
> URL: https://issues.apache.org/jira/browse/YARN-6620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6620.001.patch, YARN-6620.002.patch, 
> YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, 
> YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, 
> YARN-6620.009.patch
>
>
> This JIRA plans to add support for:
> 1) GPU configuration for NodeManagers
> 2) Isolation in CGroups (Java side)
> 3) NM restart and recovery of allocated GPU devices



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor

2017-09-19 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172158#comment-16172158
 ] 

Botong Huang commented on YARN-6962:


The testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not 
related; it is being handled in YARN-7196. 

> Add support for updateContainers when allocating using FederationInterceptor
> 
>
> Key: YARN-6962
> URL: https://issues.apache.org/jira/browse/YARN-6962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch
>
>
> Container update is introduced in YARN-5221. Federation Interceptor needs to 
> support it when splitting (merging) the allocate request (response).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6962) Add support for updateContainers when allocating using FederationInterceptor

2017-09-19 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172158#comment-16172158
 ] 

Botong Huang edited comment on YARN-6962 at 9/19/17 6:45 PM:
-

The testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not 
related; it is being handled in YARN-7196. 


was (Author: botong):
testContainerUpdateExecTypeOpportunisticToGuaranteed failure is not related, 
and being handled in Yarn7196. 

> Add support for updateContainers when allocating using FederationInterceptor
> 
>
> Key: YARN-6962
> URL: https://issues.apache.org/jira/browse/YARN-6962
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6962.v1.patch, YARN-6962.v2.patch
>
>
> Container update is introduced in YARN-5221. Federation Interceptor needs to 
> support it when splitting (merging) the allocate request (response).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7215:
---

 Summary: REST API to list all deployed services by the same user
 Key: YARN-7215
 URL: https://issues.apache.org/jira/browse/YARN-7215
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, applications
Reporter: Eric Yang
Assignee: Eric Yang


In Slider, it is possible to list deployed applications from the same user by 
using:


slider list


This API can help the UI display applications and services deployed by the 
same user.
ApiServer does not have the ability to list all applications/services at this 
time.  This API requires a fast response because listing all applications is a 
common UI operation.  ApiServer-deployed applications persist their 
configuration in HDFS, similar to Slider, but using a directory listing to 
display deployed applications might put too much overhead on the NameNode.  We 
may want to use an alternative storage mechanism to cache deployed application 
configuration and accelerate the response time of listing deployed applications.
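
As one illustrative shape for that cache (entirely hypothetical; the class and method names are made up), a small in-memory index keyed by user, updated on deploy/destroy instead of listing HDFS on every call:

{code}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class DeployedServiceCache {
  // user -> names of that user's deployed services
  private final Map<String, List<String>> servicesByUser =
      new ConcurrentHashMap<>();

  void onDeploy(String user, String serviceName) {
    servicesByUser
        .computeIfAbsent(user, u -> new CopyOnWriteArrayList<>())
        .add(serviceName);
  }

  void onDestroy(String user, String serviceName) {
    List<String> services = servicesByUser.get(user);
    if (services != null) {
      services.remove(serviceName);
    }
  }

  // Fast path for the "list services" UI call; no NameNode round trip.
  List<String> list(String user) {
    return servicesByUser.getOrDefault(user, Collections.emptyList());
  }

  public static void main(String[] args) {
    DeployedServiceCache cache = new DeployedServiceCache();
    cache.onDeploy("ambari-qa", "sleeper-service");
    System.out.println(cache.list("ambari-qa")); // [sleeper-service]
  }
}
{code}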



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7216) Missing ability to list configuration vs status

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7216:
---

 Summary: Missing ability to list configuration vs status
 Key: YARN-7216
 URL: https://issues.apache.org/jira/browse/YARN-7216
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, applications
Reporter: Eric Yang
Assignee: Eric Yang


API Server has /ws/v1/services/{service_name}.  This REST end point returns a 
Services object which contains both configuration and status.  When status or 
macro-based parameters change in the Services object, it can confuse UI code 
into making configuration changes.  The suggestion is to preserve a copy of the 
configuration object independent of the status object.  This gives the UI the 
ability to change the service configuration and update it.

Similar to Ambari, it might provide better information if we had the following 
separate REST end points:

{code}
/ws/v1/services/{service_name}/config
/ws/v1/services/{service_name}/status
{code}
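
A hypothetical JAX-RS sketch of the split (resource names, wiring, and return payloads are illustrative only, not the actual ApiServer code):

{code}
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Response;

@Path("/ws/v1/services/{service_name}")
public class ServiceResource {

  // Returns only the user-supplied configuration, without runtime status,
  // so the UI can round-trip it safely for edits.
  @GET
  @Path("/config")
  public Response getConfig(@PathParam("service_name") String serviceName) {
    return Response.ok(/* configuration object for serviceName */).build();
  }

  // Returns only the live status (container counts, states, resolved macros).
  @GET
  @Path("/status")
  public Response getStatus(@PathParam("service_name") String serviceName) {
    return Response.ok(/* status object for serviceName */).build();
  }
}
{code}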




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4048) Linux kernel panic under strict CPU limits(on CentOS/RHEL 6.x)

2017-09-19 Thread Ruslan Dautkhanov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172195#comment-16172195
 ] 

Ruslan Dautkhanov commented on YARN-4048:
-

Information on this Linux kernel panic from RHEL case 01901460.
They analyzed the crash file dumped during the panic and came up with the 
analysis of the problem below.

{noformat}
Kwon, Daniel on Aug 02 2017 at 11:25 PM -06:00 
Hi, 

This is Daniel Kwon from Kernel team. I'm working with the case owner to 
support your case. 

The system was crashed due to hard lockup which means there was a process 
holding a CPU for more than 60 seconds. 

-- 
CPUS: 88 
DATE: Fri Jul 28 14:37:44 2017 
UPTIME: 15 days, 16:42:29 
LOAD AVERAGE: 109.16, 39.97, 15.17 
TASKS: 3728 
NODENAME: pc1udahad14 
RELEASE: 2.6.32-696.3.2.el6.x86_64 
VERSION: #1 SMP Wed Jun 7 11:51:39 EDT 2017 
MACHINE: x86_64 (2397 Mhz) 
MEMORY: 511.9 GB 
PANIC: "Kernel panic - not syncing: Hard LOCKUP" 
-- 

Checking the runqueue time shows that there were many CPUs not updated lately 
which means processes were holding those CPUs for long. 

-- 
crash> runq -t | grep CPU | sort -k3r | awk 
'NR==1{now=strtonum("0x"$3)}1{printf"%s\t%7.2fs 
behind\n",$0,(now-strtonum("0x"$3))/10}' 
CPU 25: 4d0f5bec32555 0.00s behind 
CPU 2: 4d0f5bec32015 0.00s behind 
<... cut ...> 
CPU 4: 4d0f5bb9984fb 0.05s behind 
CPU 61: 4d0f5bb8f5619 0.05s behind 
CPU 57: 4d0f5bb83f85d 0.05s behind 
CPU 17: 4d0f5bb78ad7d 0.06s behind 
CPU 12: 4d0f5ba972dbd 0.07s behind 
CPU 48: 4d0f5ba8b4980 0.07s behind 
CPU 84: 4d0f5ba72ca7e 0.07s behind 
CPU 13: 4d0f5966bef24 0.68s behind 
CPU 15: 4d0f58a123cfd 0.88s behind 
CPU 54: 4d0f5832e5754 1.00s behind 
CPU 62: 4d0f581d593b7 1.02s behind 
CPU 49: 4d0f54868608e 1.99s behind 
CPU 52: 4d0f5480bd287 1.99s behind 
CPU 24: 4d0eb096f58da 45.99s behind 
CPU 35: 4d0e730040f57 62.52s behind 
CPU 85: 4d0e42eaeaea0 75.43s behind 
CPU 46: 4d0e3287e8aae 79.83s behind 
CPU 1: 4d0e1072119ac 88.98s behind 
CPU 45: 4d0e061ec766a 91.75s behind 
CPU 60: 4d0db70bc6ad6 112.98s behind 
CPU 6: 4d0db002d7b9b 114.87s behind 
CPU 14: 4d0d9679efaad 121.72s behind 
CPU 9: 4d0d938f74e97 122.50s behind 
CPU 51: 4d0d912c6d77e 123.14s behind 
CPU 5: 4d0d807a6de65 127.63s behind 
CPU 53: 4d0d80637174f 127.65s behind 
CPU 70: 4d0d78c599c8e 129.69s behind 
CPU 44: 4d0d75602d3c3 130.61s behind 
CPU 3: 4d0d6fe84455e 132.07s behind 
CPU 0: 4d0d6f1c22d11 132.29s behind 
CPU 47: 4d0d6f16a2e95 132.29s behind 
CPU 64: 4d0d6f06851ee 132.31s behind 
CPU 59: 4d0d6da9596d6 132.68s behind 
CPU 23: 4d0d6d89ecaaa 132.71s behind 
CPU 22: 4d0d6c8ad9dd2 132.98s behind 
CPU 67: 4d0d6c853e44d 132.98s behind 
-- 

I have checked the two longest holders which are CPU 22 and CPU 67. 

The process on CPU 67 was awaiting for runqueue lock for the current CPU. 

-- 
crash> runq -c 67 
CPU 67 RUNQUEUE: 8841616f6ec0 
CURRENT: PID: 40639 TASK: 885c89c5b520 COMMAND: "java" 
RT PRIO_ARRAY: 8841616f7048 
[ 0] PID: 271 TASK: 8840266c7520 COMMAND: "migration/67" 
[ 0] PID: 274 TASK: 8840266d6ab0 COMMAND: "watchdog/67" 
CFS RB_ROOT: 8841616f6f58 
[120] PID: 422 TASK: 884026392ab0 COMMAND: "events/67" 
[120] PID: 7857 TASK: 888005966040 COMMAND: "kondemand/67" 
crash> bt 40639 
PID: 40639 TASK: 885c89c5b520 CPU: 67 COMMAND: "java" 
#0 [8841616e6e90] crash_nmi_callback at 81036726 
#1 [8841616e6ea0] notifier_call_chain at 81551085 
#2 [8841616e6ee0] atomic_notifier_call_chain at 815510ea 
#3 [8841616e6ef0] notify_die at 810acd0e 
#4 [8841616e6f20] do_nmi at 8154ec09 
#5 [8841616e6f50] nmi at 8154e5b3 
[exception RIP: wait_for_rqlock+0x31] 
<... cut ...> 
---  --- 
#6 [887fd461feb8] wait_for_rqlock at 8105d751 
#7 [887fd461fec0] do_exit at 81081dbc 
#8 [887fd461ff40] do_group_exit at 810820c8 
#9 [887fd461ff70] sys_exit_group at 81082157 
#10 [887fd461ff80] system_call_fastpath at 8100b0d2 
<... cut ...> 
-- 

The process on CPU 22 was also waiting for runqueue lock. This time it was 
waiting for the runqueue lock for a specific task which is java (40639) that 
was running on CPU 67. 

-- 
crash> runq -c 22 
CPU 22 RUNQUEUE: 884161416ec0 
CURRENT: PID: 8944 TASK: 88801833eab0 COMMAND: "cmf-agent" 
RT PRIO_ARRAY: 884161417048 
[no tasks queued] 
CFS RB_

[jira] [Updated] (YARN-7216) Missing ability to list configuration vs status

2017-09-19 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7216:

Description: 
API Server has /ws/v1/services/{service_name}.  This REST end point returns a 
Services object which contains both configuration and status.  When status or 
macro-based parameters change in the Services object, it can confuse UI code 
into making configuration changes.  The suggestion is to preserve a copy of the 
configuration object independent of the status object.  This gives the UI the 
ability to change the service configuration and update it.

Similar to Ambari, it might provide better information if we had the following 
separate REST end points:

{code}
 /ws/v1/services/[service_name]/config
 /ws/v1/services/[service_name]/status
{code}


  was:
API Server has /ws/v1/services/{service_name}.  This REST end point returns 
Services object which contains both configuration and status.  When status or 
macro based parameters changed in Services object, it can confuse UI code to 
making configuration changes.  The suggestion is to preserve a copy of 
configuration object independent of status object.  This gives UI ability to 
change services configuration and update configuration.

Similar to Ambari, it might provide better information if we have the following 
separated REST end points:

{code}
/ws/v1/services/{service_name}/config
/ws/v1/services/{service_name}/status
{code}



> Missing ability to list configuration vs status
> ---
>
> Key: YARN-7216
> URL: https://issues.apache.org/jira/browse/YARN-7216
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> API Server has /ws/v1/services/{service_name}.  This REST end point returns a 
> Services object which contains both configuration and status.  When status or 
> macro-based parameters change in the Services object, it can confuse UI code 
> into making configuration changes.  The suggestion is to preserve a copy of 
> the configuration object independent of the status object.  This gives the UI 
> the ability to change the service configuration and update it.
> Similar to Ambari, it might provide better information if we had the 
> following separate REST end points:
> {code}
>  /ws/v1/services/[service_name]/config
>  /ws/v1/services/[service_name]/status
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username

2017-09-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172217#comment-16172217
 ] 

Jason Lowe commented on YARN-4266:
--

Thanks for updating the patch!

The YarnConfigurationFields and TestDockerContainerRuntime failures are related.

On a related note, TestDockerContainerRuntime fails on my RHEL7 box because my 
user account is in group wheel.  I could see this test failing similarly for 
others.  Do we really want to limit it to gid>=100 by default?  If so, we may 
want to account for this in the unit test and adjust the threshold setting 
appropriately so we're not failing on the wrong thing in the test.
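
To make the concern concrete, here is a tiny sketch of the kind of threshold check under discussion (the names and defaults are illustrative, not the actual DockerLinuxContainerRuntime code; on RHEL the wheel group is gid 10):

{code}
public class UserRemappingCheck {
  // Reject container launches for uids/gids below configured thresholds.
  static boolean allowed(int uid, int gid, int uidThreshold, int gidThreshold) {
    return uid >= uidThreshold && gid >= gidThreshold;
  }

  public static void main(String[] args) {
    // A user in group wheel (gid 10) fails a gid>=100 default...
    System.out.println(allowed(1000, 10, 1, 100)); // false
    // ...but passes with relaxed 1/1 lower limits.
    System.out.println(allowed(1000, 10, 1, 1));   // true
  }
}
{code}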

Nit: In DockerLinuxContainerRuntime it would be nice if it were consistent with 
the treatment of other YarnConfiguration constants.  Other uses just qualify 
them with YarnConfiguration rather than importing them directly.


> Allow users to enter containers as UID:GID pair instead of by username
> --
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: luhuichun
> Attachments: YARN-4266.001.patch, YARN-4266.001.patch, 
> YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7215:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-7054

> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> 
> slider list
> 
> This API can help the UI display applications and services deployed by the 
> same user.
> ApiServer does not have the ability to list all applications/services at this 
> time.  This API requires a fast response because listing all applications is 
> a common UI operation.  ApiServer-deployed applications persist their 
> configuration in HDFS, similar to Slider, but using a directory listing to 
> display deployed applications might put too much overhead on the NameNode.  
> We may want to use an alternative storage mechanism to cache deployed 
> application configuration and accelerate the response time of listing 
> deployed applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7216) Missing ability to list configuration vs status

2017-09-19 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7216:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-7054

> Missing ability to list configuration vs status
> ---
>
> Key: YARN-7216
> URL: https://issues.apache.org/jira/browse/YARN-7216
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> API Server has /ws/v1/services/{service_name}.  This REST end point returns a 
> Services object which contains both configuration and status.  When status or 
> macro-based parameters change in the Services object, it can confuse UI code 
> into making configuration changes.  The suggestion is to preserve a copy of 
> the configuration object independent of the status object.  This gives the UI 
> the ability to change the service configuration and update it.
> Similar to Ambari, it might provide better information if we had the 
> following separate REST end points:
> {code}
>  /ws/v1/services/[service_name]/config
>  /ws/v1/services/[service_name]/status
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7001) If shared cache upload is terminated in the middle, the temp file will never be deleted

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172224#comment-16172224
 ] 

Hadoop QA commented on YARN-7001:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}499m 
26s{color} | {color:red} Docker failed to build yetus/hadoop:tp-30407. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7001 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887873/YARN-7001.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17513/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> If shared cache upload is terminated in the middle, the temp file will never 
> be deleted
> ---
>
> Key: YARN-7001
> URL: https://issues.apache.org/jira/browse/YARN-7001
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Sen Zhao
> Attachments: YARN-7001.001.patch
>
>
> There is a missing deleteTempFile(tempPath);
> {code}
>   tempPath = new Path(directoryPath, getTemporaryFileName(actualPath));
>   if (!uploadFile(actualPath, tempPath)) {
> LOG.warn("Could not copy the file to the shared cache at " + 
> tempPath);
> return false;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Cont

2017-09-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172229#comment-16172229
 ] 

Jason Lowe commented on YARN-6968:
--

Any update on this?  It would be nice to have Yetus stop complaining about the 
latent findbugs warning.

> Hard coded reference to an absolute pathname in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
> -
>
> Key: YARN-6968
> URL: https://issues.apache.org/jira/browse/YARN-6968
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>
> This could be done after YARN-6757 is checked in.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Conta

2017-09-19 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger reassigned YARN-6968:
-

Assignee: Eric Badger  (was: Miklos Szegedi)

From Miklos on YARN-7025:
bq. Eric Badger, Would you like to work on YARN-6968? Feel free to assign it to 
yourself.

Assigning to myself

> Hard coded reference to an absolute pathname in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
> -
>
> Key: YARN-6968
> URL: https://issues.apache.org/jira/browse/YARN-6968
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Eric Badger
>
> This could be done after YARN-6757 is checked in.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172244#comment-16172244
 ] 

Hadoop QA commented on YARN-6620:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
1s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 79 new + 479 unchanged - 24 fixed = 558 total (was 503) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
27s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 523 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
46s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 52s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6620 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887944/YARN-6620.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  shellcheck  shel

[jira] [Updated] (YARN-7210) Some fixes related to Registry DNS

2017-09-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7210:
--
Attachment: YARN-7210.yarn-native-services.03.patch

> Some fixes related to Registry DNS
> --
>
> Key: YARN-7210
> URL: https://issues.apache.org/jira/browse/YARN-7210
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7210.yarn-native-services.01.patch, 
> YARN-7210.yarn-native-services.02.patch, 
> YARN-7210.yarn-native-services.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7210) Some fixes related to Registry DNS

2017-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172276#comment-16172276
 ] 

Jian He commented on YARN-7210:
---

I added one more fix to allow an empty launchCommand.
The UTs are actually passing locally; running again. 

> Some fixes related to Registry DNS
> --
>
> Key: YARN-7210
> URL: https://issues.apache.org/jira/browse/YARN-7210
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7210.yarn-native-services.01.patch, 
> YARN-7210.yarn-native-services.02.patch, 
> YARN-7210.yarn-native-services.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7217:
---

 Summary: PUT method for update service for Service API doesn't 
function correctly
 Key: YARN-7217
 URL: https://issues.apache.org/jira/browse/YARN-7217
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, applications
Reporter: Eric Yang


The PUT method for the updateService API provides multiple functions:

# Stopping a service.
# Starting a service.
# Increasing or decreasing the number of containers.

The overloading is buggy, depending on how the configuration should be applied.

Scenario 1
A user retrieves the Service object from a getService call, and the Service 
object contains state: STARTED.  The user would like to increase the number of 
containers for the deployed service.  The JSON has been updated to increase 
the container count.  The PUT method does not actually increase the container 
count.

Scenario 2
A user retrieves the Service object from a getService call, and the Service 
object contains state: STOPPED.  The user would like to make an environment 
configuration change.  The configuration does not get updated after the PUT 
method.

This is possible to address by rearranging the START/STOP logic to run after 
the configuration update.  However, there are other potential combinations 
that can break the PUT method.  For example, a user may want to make 
configuration changes but not restart the service until a later time.

The alternative is to separate the PUT method into one PUT method for 
configuration and one for status.  This increases the number of actions that 
can be performed.  The new API could look like:

{code}
@PUT
/ws/v1/services/[service_name]/config

Request Data:
{
  "name":"[service_name]",
  "number_of_containers": 5
}
{code}

{code}
@PUT
/ws/v1/services/[service_name]/state

Request data:
{
  "name": "[service_name]",
  "state": "STOPPED|STARTED"
}
{code}
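
For illustration, a minimal client call against the proposed state endpoint (the host, port, and service name are made up, and the endpoint itself is the proposal above, not an existing API):

{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class StopServiceExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical endpoint following the sketch above.
    URL url = new URL(
        "http://apiserver.example.com:9191/ws/v1/services/sleeper-service/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    String body = "{\"name\": \"sleeper-service\", \"state\": \"STOPPED\"}";
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}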



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6142) Support rolling upgrade between 2.x and 3.x

2017-09-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6142:
--
Target Version/s: 3.0.0  (was: 3.0.0-beta1)

Thanks Ray. Let's move this to 3.0.0 GA then.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: YARN-6142
> URL: https://issues.apache.org/jira/browse/YARN-6142
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: rolling upgrade
>Affects Versions: 3.0.0-alpha2
>Reporter: Andrew Wang
>Assignee: Ray Chiang
>Priority: Blocker
>
> Counterpart JIRA to HDFS-11096. We need to:
> * examine YARN and MR's  JACC report for binary and source incompatibilities
> * run the [PB 
> differ|https://issues.apache.org/jira/browse/HDFS-11096?focusedCommentId=15816405&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15816405]
>  that Sean wrote for HDFS-11096 for the YARN PBs.
> * sanity test some rolling upgrades between 2.x and 3.x. Ideally these are 
> automated and something we can run upstream.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4266) Allow users to enter containers as UID:GID pair instead of by username

2017-09-19 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-4266:
--
Attachment: YARN-4266.005.patch

Thanks for the review, [~jlowe]

bq. The YarnConfigurationFields and TestDockerContainerRuntime failures are 
related.
Fixed the tests

bq. On a related noted, my RHEL7 box TestDockerContainerRuntime fails because 
my user account is in group wheel. I could see this test failing for others 
similarly. Do we really want to limit it to gid>=100 by default? If so, we may 
want to account for this in the unit test and adjust the threshold setting 
appropriately so we're not failing on the wrong thing in the test.
I think making the uid and gid lower limits 1 and 1 should be ok. This makes 
everything open from the start, but allows admins to define limits if they 
want to prevent certain ranges of users from running containers. So I am 
setting the uid and gid lower limits to 1 and 1. 

bq. Nit: In DockerLinuxContainerRuntime it would be nice if it was consistent 
with the treatment of other YarnConfiguration constants. Other uses just 
qualify them with YarnConfiguration rather than import them directly.
Fixed

> Allow users to enter containers as UID:GID pair instead of by username
> --
>
> Key: YARN-4266
> URL: https://issues.apache.org/jira/browse/YARN-4266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: luhuichun
> Attachments: YARN-4266.001.patch, YARN-4266.001.patch, 
> YARN-4266.002.patch, YARN-4266.003.patch, YARN-4266.004.patch, 
> YARN-4266.005.patch, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, 
> YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, 
> YARN-4266-branch-2.8.001.patch
>
>
> Docker provides a mechanism (the --user switch) that enables us to specify 
> the user the container processes should run as. We use this mechanism today 
> when launching docker containers . In non-secure mode, we run the docker 
> container based on 
> `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in 
> secure mode, as the submitting user. However, this mechanism breaks down with 
> a large number of 'pre-created' images which don't necessarily have the users 
> available within the image. Examples of such images include shared images 
> that need to be used by multiple users. We need a way in which we can allow a 
> pre-defined set of users to run containers based on existing images, without 
> using the --user switch. There are some implications of disabling this user 
> squashing that we'll need to work through : log aggregation, artifact 
> deletion etc.,



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6570) No logs were found for running application, running container

2017-09-19 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-6570:
-
Attachment: YARN-6570-branch-2.8.002.patch

Fixed the unit test failure and checkstyle warning in the 002 patch for branch-2.8.

> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for a running application:
>  yarn container -list appattempt
> 2. Could not fetch logs:
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7218:
---

 Summary: ApiServer REST API naming convention /ws/v1 is already 
used in Hadoop v2
 Key: YARN-7218
 URL: https://issues.apache.org/jira/browse/YARN-7218
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, applications
Reporter: Eric Yang
Assignee: Eric Yang


In YARN-6626, there is a desire to be able to run the ApiServer REST API in the 
Resource Manager; this would eliminate the requirement to deploy another daemon 
service for submitting docker applications.  In YARN-5698, a new UI has been 
implemented as a separate web application.  There are some problems in this 
arrangement that can cause conflicts in how Java sessions are managed.  The 
root context of the Resource Manager web application is /ws.  This is hard 
coded in the startWebapp method in ResourceManager.java, which means all 
session management is applied to web URLs under the /ws prefix.  /ui2 is 
independent of the /ws context, so the session management code does not apply 
to /ui2.  This could become a session management problem if servlet-based code 
is introduced into the /ui2 web application.

The ApiServer code base is designed as a separate web application.  There is no 
easy way to inject a separate web application into the same /ws context because 
ResourceManager is already set up to bind to RMWebServices.  Unless the 
ApiServer code is moved into RMWebServices, the two will not share the same 
session management.

The alternative solution is to keep the ApiServer prefix URL independent of the 
/ws context.  This would be a departure from the YARN web services naming 
convention, but it allows ApiServer to be loaded as a separate web application 
in the Resource Manager jetty server.  One possible proposal is 
/app/v1/services.  This keeps the ApiServer code modular and independent from 
the Resource Manager.
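
As a rough sketch of that arrangement (plain Jetty API rather than the actual 
RM wiring; the servlet and port are placeholders):

{code}
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

public class ApiServerWebApp {
  // Stand-in for the real ApiServer REST resource.
  static class PingServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws java.io.IOException {
      resp.setContentType("application/json");
      resp.getWriter().println("{\"status\":\"ok\"}");
    }
  }

  public static void main(String[] args) throws Exception {
    Server server = new Server(8088);
    // A separate context path keeps ApiServer sessions independent of /ws.
    ServletContextHandler ctx =
        new ServletContextHandler(ServletContextHandler.SESSIONS);
    ctx.setContextPath("/app/v1/services");
    ctx.addServlet(new ServletHolder(new PingServlet()), "/*");
    server.setHandler(ctx);
    server.start();
    server.join();
  }
}
{code}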



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk

2017-09-19 Thread Ray Chiang (JIRA)
Ray Chiang created YARN-7219:


 Summary: Fix AllocateRequestProto difference between 
branch-2/branch-2.8 and trunk
 Key: YARN-7219
 URL: https://issues.apache.org/jira/browse/YARN-7219
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Affects Versions: 3.0.0-beta1
Reporter: Ray Chiang
Priority: Critical


For yarn_service_protos.proto, we have the following code in
(branch-2.8.0, branch-2.8, branch-2)

{noformat}
message AllocateRequestProto {
  repeated ResourceRequestProto ask = 1;
  repeated ContainerIdProto release = 2;
  optional ResourceBlacklistRequestProto blacklist_request = 3;
  optional int32 response_id = 4;
  optional float progress = 5;
  repeated ContainerResourceIncreaseRequestProto increase_request = 6;
  repeated UpdateContainerRequestProto update_requests = 7;
}
{noformat}

For yarn_service_protos.proto, we have the following code in
(trunk)

{noformat}
message AllocateRequestProto {
  repeated ResourceRequestProto ask = 1;
  repeated ContainerIdProto release = 2;
  optional ResourceBlacklistRequestProto blacklist_request = 3;
  optional int32 response_id = 4;
  optional float progress = 5;
  repeated UpdateContainerRequestProto update_requests = 6;
}
{noformat}

Notes
* YARN-3866 was the original JIRA for container resizing.
* YARN-5221 is what introduced the incompatible change.
* In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by 
"Addendum patch to YARN-3866: fix incompatible API change."
* There was a similar API fix done in YARN-6071.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk

2017-09-19 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172328#comment-16172328
 ] 

Ray Chiang commented on YARN-7219:
--

Will updating the update_requests field to 7 be enough to fix the 
compatibility issue?  [~asuresh] or [~djp], any comment?
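
For illustration, the trunk message after such a renumbering might look like 
the following (a sketch only; protoc 2.5, which Hadoop builds against, 
predates the reserved keyword, so the retired tag can only be noted in a 
comment):

{noformat}
message AllocateRequestProto {
  repeated ResourceRequestProto ask = 1;
  repeated ContainerIdProto release = 2;
  optional ResourceBlacklistRequestProto blacklist_request = 3;
  optional int32 response_id = 4;
  optional float progress = 5;
  // Tag 6 was increase_request in branch-2; left unused here so that
  // update_requests keeps the same wire tag (7) on both lines.
  repeated UpdateContainerRequestProto update_requests = 7;
}
{noformat}

Since protobuf identifies fields by tag number on the wire, matching tag 7 on 
both branches would keep update_requests compatible across a rolling upgrade.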

> Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk
> -
>
> Key: YARN-7219
> URL: https://issues.apache.org/jira/browse/YARN-7219
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Priority: Critical
>
> For yarn_service_protos.proto, we have the following code in
> (branch-2.8.0, branch-2.8, branch-2)
> {noformat}
> message AllocateRequestProto {
>   repeated ResourceRequestProto ask = 1;
>   repeated ContainerIdProto release = 2;
>   optional ResourceBlacklistRequestProto blacklist_request = 3;
>   optional int32 response_id = 4;
>   optional float progress = 5;
>   repeated ContainerResourceIncreaseRequestProto increase_request = 6;
>   repeated UpdateContainerRequestProto update_requests = 7;
> }
> {noformat}
> For yarn_service_protos.proto, we have the following code in
> (trunk)
> {noformat}
> message AllocateRequestProto {
>   repeated ResourceRequestProto ask = 1;
>   repeated ContainerIdProto release = 2;
>   optional ResourceBlacklistRequestProto blacklist_request = 3;
>   optional int32 response_id = 4;
>   optional float progress = 5;
>   repeated UpdateContainerRequestProto update_requests = 6;
> }
> {noformat}
> Notes
> * YARN-3866 was the original JIRA for container resizing.
> * YARN-5221 is what introduced the incompatible change.
> * In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by 
> "Addendum patch to YARN-3866: fix incompatible API change."
> * There was a similar API fix done in YARN-6071.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7215:

Description: 
In Slider, it is possible to list deployed applications from the same user by 
using:

{code}
slider list
{code}

This API can help the UI display applications and services deployed by the same 
user.
ApiServer does not have the ability to list all applications/services at this 
time.  This API requires a fast response because listing is a common UI 
operation.  ApiServer-deployed applications persist configuration in HDFS, 
similar to Slider, but using a directory listing to display deployed 
applications might put too much overhead on the namenode.  We may want to use 
an alternative storage mechanism to cache deployed application configuration 
and accelerate the response time of listing deployed applications.

  was:
In Slider, it is possible to list deployed applications from the same user by 
using:


slider list


This API can help UI to display application and services deployed by the same 
user.
Apiserver does not have ability to list all applications/services at this time. 
 This API requires fast response to list all applications because it is a 
common UI operation.  ApiServer deployed applications persist configuration in 
HDFS similar to slider, but using directory listing to display deployed 
application might cost too much overhead to namenode.  We may want to use 
alternative storage mechanism to cache deployed application configuration to 
accelerate the response time of list deployed applications.


> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> {code}
> slider list
> {code}
> This API can help the UI display applications and services deployed by the 
> same user.
> ApiServer does not have the ability to list all applications/services at 
> this time.  This API requires a fast response because listing is a common UI 
> operation.  ApiServer-deployed applications persist configuration in HDFS, 
> similar to Slider, but using a directory listing to display deployed 
> applications might put too much overhead on the namenode.  We may want to 
> use an alternative storage mechanism to cache deployed application 
> configuration and accelerate the response time of listing deployed 
> applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Contai

2017-09-19 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-6968:
--
Attachment: YARN-6968.001.patch

Attaching a patch that makes the cgroups root directory an NM config in 
yarn-site.xml.

> Hard coded reference to an absolute pathname in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
> -
>
> Key: YARN-6968
> URL: https://issues.apache.org/jira/browse/YARN-6968
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Eric Badger
> Attachments: YARN-6968.001.patch
>
>
> This could be done after YARN-6757 is checked in.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172355#comment-16172355
 ] 

Jian He commented on YARN-7217:
---

Thanks Eric, how about calling it spec instead of config?  The spec itself 
has a config field, which would be confusing:
{code} 
@PUT
/ws/v1/services/[service_name]/spec
{code}

In the request body, the name field can be optional, since it can be 
retrieved from the URL path.

> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>
> The PUT method for the updateService API provides multiple functions:
> # Stop a service.
> # Start a service.
> # Increase or decrease the number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves a Service object from the getService call, and the Service 
> object contains state: STARTED.  The user would like to increase the number 
> of containers for the deployed service.  The JSON has been updated to 
> increase the container count.  The PUT method does not actually increase the 
> container count.
> Scenario 2
> A user retrieves a Service object from the getService call, and the Service 
> object contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This could be addressed by rearranging the logic so that START/STOP is 
> applied after the configuration update.  However, other combinations can 
> still break the PUT method.  For example, a user may want to make 
> configuration changes but not restart the service until a later time.
> The alternative is to split the single PUT method into separate PUT methods 
> for configuration and for state.  This increases the number of actions that 
> can be performed.  The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
>   "name":"[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7220) Use apidoc for REST API documentation

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7220:
---

 Summary: Use apidoc for REST API documentation
 Key: YARN-7220
 URL: https://issues.apache.org/jira/browse/YARN-7220
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Reporter: Eric Yang
Assignee: Eric Yang


More REST APIs are being developed in Hadoop, and it would be great to 
standardize on the method of generating REST API documentation.

Several methods are in use today:
Swagger YAML
Javadoc
Wiki pages
JIRA comments

The most frequently used methods are JIRA comments and Wiki pages.  Both are 
prone to data loss with the passage of time, so we need a more effortless 
approach to maintaining REST API documentation.  Swagger YAML can also fall out 
of sync with reality if new methods are added to the Java code directly.  
Javadoc annotation seems like a good approach to maintaining REST API 
documentation.  Both the Jersey and Atlassian communities have maven plugins to 
help generate REST API documentation, but those plugins have ceased to 
function.  After searching online for a bit, [apidoc|http://apidocjs.com/] is 
one library that stands out.  This could be the ideal approach to managing the 
Hadoop REST API documentation.  It supports javadoc-like annotations and 
generates beautiful schema-change documentation.
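
For example, an apidoc-annotated endpoint could look roughly like this 
(illustrative only, reusing the service API discussed in YARN-7217):

{code}
/**
 * @api {put} /ws/v1/services/:service_name/state Update the state of a service
 * @apiName UpdateServiceState
 * @apiGroup Service
 * @apiParam {String} service_name Name of the service.
 * @apiParam {String="STARTED","STOPPED"} state Desired state of the service.
 * @apiSuccess {String} state The resulting state of the service.
 */
{code}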

If this is accepted, I will add the apidoc installation to the dev-support 
Dockerfile, plus pom.xml changes for the javadoc plugin to ignore the custom tags.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7217) PUT method for update service for Service API doesn't function correctly

2017-09-19 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172359#comment-16172359
 ] 

Eric Yang commented on YARN-7217:
-

[~jianhe] Agree, spec will make this less confusing.

> PUT method for update service for Service API doesn't function correctly
> 
>
> Key: YARN-7217
> URL: https://issues.apache.org/jira/browse/YARN-7217
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, applications
>Reporter: Eric Yang
>
> The PUT method for the updateService API provides multiple functions:
> # Stop a service.
> # Start a service.
> # Increase or decrease the number of containers.
> The overloading is buggy depending on how the configuration should be applied.
> Scenario 1
> A user retrieves a Service object from the getService call, and the Service 
> object contains state: STARTED.  The user would like to increase the number 
> of containers for the deployed service.  The JSON has been updated to 
> increase the container count.  The PUT method does not actually increase the 
> container count.
> Scenario 2
> A user retrieves a Service object from the getService call, and the Service 
> object contains state: STOPPED.  The user would like to make an environment 
> configuration change.  The configuration does not get updated after the PUT 
> method.
> This could be addressed by rearranging the logic so that START/STOP is 
> applied after the configuration update.  However, other combinations can 
> still break the PUT method.  For example, a user may want to make 
> configuration changes but not restart the service until a later time.
> The alternative is to split the single PUT method into separate PUT methods 
> for configuration and for state.  This increases the number of actions that 
> can be performed.  The new API could look like:
> {code}
> @PUT
> /ws/v1/services/[service_name]/config
> Request Data:
> {
>   "name":"[service_name]",
>   "number_of_containers": 5
> }
> {code}
> {code}
> @PUT
> /ws/v1/services/[service_name]/state
> Request data:
> {
>   "name": "[service_name]",
>   "state": "STOPPED|STARTED"
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172369#comment-16172369
 ] 

Jian He commented on YARN-7215:
---

Another approach: we can simply get the list of services from the RM with a 
type filter set to "yarn-service".  In fact, I was trying to implement that 
but ran into a bug, YARN-7076.
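
For illustration, a minimal sketch of that lookup with the Java client API 
(assuming the services are submitted with application type "yarn-service"):

{code}
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListYarnServices {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();
    // RM-side filter: only applications submitted with type "yarn-service".
    List<ApplicationReport> services =
        client.getApplications(Collections.singleton("yarn-service"));
    for (ApplicationReport report : services) {
      System.out.println(report.getName() + " "
          + report.getYarnApplicationState());
    }
    client.stop();
  }
}
{code}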

> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> {code}
> slider list
> {code}
> This API can help UI to display application and services deployed by the same 
> user.
> Apiserver does not have ability to list all applications/services at this 
> time.  This API requires fast response to list all applications because it is 
> a common UI operation.  ApiServer deployed applications persist configuration 
> in HDFS similar to slider, but using directory listing to display deployed 
> application might cost too much overhead to namenode.  We may want to use 
> alternative storage mechanism to cache deployed application configuration to 
> accelerate the response time of list deployed applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6943) Update Yarn to YARN in documentation

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari updated YARN-6943:
---
Attachment: YARN-6943-1.patch

Please review the attached patch. 

> Update Yarn to YARN in documentation
> 
>
> Key: YARN-6943
> URL: https://issues.apache.org/jira/browse/YARN-6943
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chetna Chaudhari
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-6943-1.patch
>
>
> Based on the discussion with [~templedf] in YARN-6757 the official case of 
> YARN is YARN, not Yarn, so we should update all the md files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari reassigned YARN-6333:
--

Assignee: Chetna Chaudhari

> Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and 
> fairSharePreemptionThreshold
> --
>
> Key: YARN-6333
> URL: https://issues.apache.org/jira/browse/YARN-6333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>  Labels: newbie++
>
> Their default values are not mentioned in the doc.  For example, the default 
> value of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means min 
> share preemption won't happen until you set a meaningful value. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7221) Add security check for privileged docker container

2017-09-19 Thread Eric Yang (JIRA)
Eric Yang created YARN-7221:
---

 Summary: Add security check for privileged docker container
 Key: YARN-7221
 URL: https://issues.apache.org/jira/browse/YARN-7221
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Eric Yang


When a docker container is running with privileges, the majority use case is to 
have some program start as root and then drop privileges to another user, e.g. 
httpd starts privileged to bind to port 80, then drops privileges to the www 
user.

# We should add a security check for submitting users, to verify they have 
"sudo" access to run a privileged container.
# We should remove --user=uid:gid for privileged containers.
 
Docker can be launched with the --privileged=true and --user=uid:gid flags.  
With this parameter combination, the user will not be able to become the root 
user: every docker exec command is dropped to the uid:gid user instead of being 
granted privileges.  A user can still gain root privileges if the container 
file system contains files that grant extra power, but this type of image is 
considered dangerous.  A non-privileged user can launch a container with 
special bits set to acquire the same level of root power.  Hence, we lose 
control of which images may run with --privileged and who has sudo rights to 
use privileged container images.  As a result, we should check for sudo access 
and then decide whether to parameterize --privileged=true OR --user=uid:gid.  
This will avoid leading developers down the wrong path.
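
For illustration, one way the submit-time check could be prototyped (a sketch 
only; the group name and the use of id -Gn are assumptions standing in for a 
real "sudo access" check, which would likely live in container-executor):

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;

public class PrivilegedContainerCheck {
  // Hypothetical admin group standing in for "has sudo access".
  private static final String PRIVILEGED_GROUP = "docker-privileged";

  /** Returns true if the submitting user belongs to the privileged group. */
  static boolean canRunPrivileged(String user) throws Exception {
    Process p = new ProcessBuilder("id", "-Gn", user).start();
    try (BufferedReader r =
        new BufferedReader(new InputStreamReader(p.getInputStream()))) {
      String line = r.readLine();
      return line != null
          && Arrays.asList(line.trim().split("\\s+")).contains(PRIVILEGED_GROUP);
    }
  }

  public static void main(String[] args) throws Exception {
    // Check first, then pass --privileged=true OR --user=uid:gid, never both.
    System.out.println(canRunPrivileged(args[0])
        ? "launch with --privileged=true"
        : "launch with --user=uid:gid");
  }
}
{code}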



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7222) Merge org.apache.hadoop.yarn.server.resourcemanager.NodeManager with MockNM

2017-09-19 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-7222:
--

 Summary: Merge 
org.apache.hadoop.yarn.server.resourcemanager.NodeManager with MockNM
 Key: YARN-7222
 URL: https://issues.apache.org/jira/browse/YARN-7222
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.1.0
Reporter: Yufei Gu


The existence of org.apache.hadoop.yarn.server.resourcemanager.NodeManager is 
confusing.  It exists only for RM testing and is basically another MockNM.  
There is no Javadoc for the class, which easily leads people to take it for a 
real NodeManager.  I suggest merging it with MockNM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172425#comment-16172425
 ] 

Hadoop QA commented on YARN-6570:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
20s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
36s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 59s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels |
|   | hadoop.yarn.server.nodemanager.TestNodeManagerReboot |
| Timed out junit tests | 
org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown |
|   | org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater |
|   | org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-6570 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887966/YARN-6570-branch-2.8.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux da7fbbb6 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.8 / a81167e |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| unit | 
https://builds.ap

[jira] [Updated] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari updated YARN-6333:
---
Attachment: YARN-6333-1.patch

> Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and 
> fairSharePreemptionThreshold
> --
>
> Key: YARN-6333
> URL: https://issues.apache.org/jira/browse/YARN-6333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>  Labels: newbie++
> Attachments: YARN-6333-1.patch
>
>
> Their default values are not mentioned in the doc.  For example, the default 
> value of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means min 
> share preemption won't happen until you set a meaningful value. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7212) [Atsv2] TimelineSchemaCreator fails to create flowrun table causes RegionServer down!

2017-09-19 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7212:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-7213

> [Atsv2] TimelineSchemaCreator fails to create flowrun table causes 
> RegionServer down!
> -
>
> Key: YARN-7212
> URL: https://issues.apache.org/jira/browse/YARN-7212
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>
> *Hbase-2.0* officially support *hadoop-alpha* compilations. So I was trying 
> to build and test with HBase-2.0. But table schema creation fails and causes 
> RegionServer to shutdown with following error
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.hadoop.hbase.Tag.asList([BII)Ljava/util/List;
> at 
> org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.getCurrentAggOp(FlowScanner.java:250)
> at 
> org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.nextInternal(FlowScanner.java:226)
> at 
> org.apache.hadoop.yarn.server.timelineservice.storage.flow.FlowScanner.next(FlowScanner.java:145)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:132)
> at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
> at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:973)
> at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2252)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2672)
> {noformat}
> Since HBase-2.0 community is ready to release Hadoop-3.x compatible versions, 
> ATSv2 also need to support HBase-2.0 versions. For this, we need to take up a 
> task of test and validate HBase-2.0 issues! 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172434#comment-16172434
 ] 

Eric Yang commented on YARN-7215:
-

[~jianhe] How does RM handle a service that is in stopped state?  Stopped 
slider application does not have any record in resource manager.  Same slider 
application can have multiple Application ID when the application has been 
restarted.  Slider uses HDFS file to persist the paused application, but having 
resource manager to crawl through lists of HDFS directories to find stopped 
service seems like potential load attack to namenode.  It would be better to 
have the operational record index, and cached by well known mechanism like a 
SOLR collection.  This also reduces having to brew another random read/write, 
low latency, index, cache mechanism in YARN.  Both HBase and SOLR have solved 
random read/write on top of HDFS with some success.  It would be better to we 
use existing libraries that have been baked for several years than inventing 
something new for specialized purpose.

> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> {code}
> slider list
> {code}
> This API can help the UI display applications and services deployed by the 
> same user.
> ApiServer does not have the ability to list all applications/services at 
> this time.  This API requires a fast response because listing is a common UI 
> operation.  ApiServer-deployed applications persist configuration in HDFS, 
> similar to Slider, but using a directory listing to display deployed 
> applications might put too much overhead on the namenode.  We may want to 
> use an alternative storage mechanism to cache deployed application 
> configuration and accelerate the response time of listing deployed 
> applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172434#comment-16172434
 ] 

Eric Yang edited comment on YARN-7215 at 9/19/17 10:19 PM:
---

[~jianhe] How does the RM handle a service that is in the stopped state?  A 
stopped slider application does not have any record in the resource manager.  
The same slider application can have multiple Application IDs when the 
application has been restarted.  Slider uses an HDFS file to persist the paused 
application, but having the resource manager crawl through lists of HDFS 
directories to find stopped services seems like a potential load attack on the 
namenode.  It would be better to have an index of the operational records, 
cached by a well-known mechanism such as a SOLR collection.  This also avoids 
having to brew another random read/write, low-latency index and cache mechanism 
in YARN.  Both HBase and SOLR have solved random read/write on top of HDFS with 
some success.  It would be better to use existing libraries that have been 
baked for several years than to invent something new for a specialized purpose.


was (Author: eyang):
[~jianhe] How does RM handle a service that is in stopped state?  Stopped 
slider application does not have any record in resource manager.  Same slider 
application can have multiple Application ID when the application has been 
restarted.  Slider uses HDFS file to persist the paused application, but having 
resource manager to crawl through lists of HDFS directories to find stopped 
service seems like potential load attack to namenode.  It would be better to 
have the operational record index, and cached by well known mechanism like a 
SOLR collection.  This also reduces having to brew another random read/write, 
low latency, index, cache mechanism in YARN.  Both HBase and SOLR have solved 
random read/write on top of HDFS with some success.  It would be better to we 
use existing libraries that have been baked for several years than inventing 
something new for specialized purpose.

> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> {code}
> slider list
> {code}
> This API can help the UI display applications and services deployed by the 
> same user.
> ApiServer does not have the ability to list all applications/services at 
> this time.  This API requires a fast response because listing is a common UI 
> operation.  ApiServer-deployed applications persist configuration in HDFS, 
> similar to Slider, but using a directory listing to display deployed 
> applications might put too much overhead on the namenode.  We may want to 
> use an alternative storage mechanism to cache deployed application 
> configuration and accelerate the response time of listing deployed 
> applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6943) Update Yarn to YARN in documentation

2017-09-19 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172385#comment-16172385
 ] 

Chetna Chaudhari edited comment on YARN-6943 at 9/19/17 10:20 PM:
--

Thanks [~haibo.chen] and [~miklos.szeg...@cloudera.com]. Please review the 
attached patch. 


was (Author: chetna):
Please review the attached patch. 

> Update Yarn to YARN in documentation
> 
>
> Key: YARN-6943
> URL: https://issues.apache.org/jira/browse/YARN-6943
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chetna Chaudhari
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-6943-1.patch
>
>
> Based on the discussion with [~templedf] in YARN-6757 the official case of 
> YARN is YARN, not Yarn, so we should update all the md files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2017-09-19 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172454#comment-16172454
 ] 

Eric Badger commented on YARN-7221:
---

Linking YARN-4266 as a blocker, since that is the JIRA that will add the code 
necessary for the user to run as a uid:gid pair. I agree that this will break 
privileged containers, since it will force them into their uid:gid pair instead 
of root.

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>
> When a docker container is running with privileges, the majority use case is 
> to have some program start as root and then drop privileges to another user, 
> e.g. httpd starts privileged to bind to port 80, then drops privileges to 
> the www user.
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run a privileged container.
> # We should remove --user=uid:gid for privileged containers.
>  
> Docker can be launched with the --privileged=true and --user=uid:gid flags.  
> With this parameter combination, the user will not be able to become the 
> root user: every docker exec command is dropped to the uid:gid user instead 
> of being granted privileges.  A user can still gain root privileges if the 
> container file system contains files that grant extra power, but this type 
> of image is considered dangerous.  A non-privileged user can launch a 
> container with special bits set to acquire the same level of root power.  
> Hence, we lose control of which images may run with --privileged and who has 
> sudo rights to use privileged container images.  As a result, we should 
> check for sudo access and then decide whether to parameterize 
> --privileged=true OR --user=uid:gid.  This will avoid leading developers 
> down the wrong path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6333) Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and fairSharePreemptionThreshold

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172457#comment-16172457
 ] 

Hadoop QA commented on YARN-6333:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6333 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887981/YARN-6333-1.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 9c43277230d9 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 51edaac |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17525/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Improve doc for minSharePreemptionTimeout, fairSharePreemptionTimeout and 
> fairSharePreemptionThreshold
> --
>
> Key: YARN-6333
> URL: https://issues.apache.org/jira/browse/YARN-6333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>  Labels: newbie++
> Attachments: YARN-6333-1.patch
>
>
> Their default values are not mentioned in the doc.  For example, the default 
> value of minSharePreemptionTimeout is {{Long.MAX_VALUE}}, which means min 
> share preemption won't happen until you set a meaningful value. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user

2017-09-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172482#comment-16172482
 ] 

Jian He commented on YARN-7215:
---

bq. How does RM handle a service that is in stopped state?
Actually, the RM today already remembers stopped apps in ZooKeeper, and it has 
its own way to look up applications.  I'm not suggesting making the RM do any 
more reads/writes.
What is the scope of this jira?  By the description, it looks like it only 
needs to support the old slider list; slider was also looking up from the RM, 
it wasn't reading from HDFS.

> REST API to list all deployed services by the same user
> ---
>
> Key: YARN-7215
> URL: https://issues.apache.org/jira/browse/YARN-7215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by 
> using:
> {code}
> slider list
> {code}
> This API can help the UI display applications and services deployed by the 
> same user.
> ApiServer does not have the ability to list all applications/services at 
> this time.  This API requires a fast response because listing is a common UI 
> operation.  ApiServer-deployed applications persist configuration in HDFS, 
> similar to Slider, but using a directory listing to display deployed 
> applications might put too much overhead on the namenode.  We may want to 
> use an alternative storage mechanism to cache deployed application 
> configuration and accelerate the response time of listing deployed 
> applications.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7223) Document GPU isolation feature

2017-09-19 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-7223:


 Summary: Document GPU isolation feature
 Key: YARN-7223
 URL: https://issues.apache.org/jira/browse/YARN-7223
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7224) Support GPU isolation for docker container

2017-09-19 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-7224:


 Summary: Support GPU isolation for docker container
 Key: YARN-7224
 URL: https://issues.apache.org/jira/browse/YARN-7224
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6119) Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest

2017-09-19 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172488#comment-16172488
 ] 

Chetna Chaudhari commented on YARN-6119:


[~kasha]: This method was removed as a part of 
[YARN-6040|https://issues.apache.org/jira/browse/YARN-6040]

> Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest
> --
>
> Key: YARN-6119
> URL: https://issues.apache.org/jira/browse/YARN-6119
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Karthik Kambatla
>Priority: Minor
>  Labels: newbie++
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6119) Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari reassigned YARN-6119:
--

Assignee: Chetna Chaudhari

> Add javadoc for SchedulerApplicationAttempt#getNextResourceRequest
> --
>
> Key: YARN-6119
> URL: https://issues.apache.org/jira/browse/YARN-6119
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Karthik Kambatla
>Assignee: Chetna Chaudhari
>Priority: Minor
>  Labels: newbie++
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7219) Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk

2017-09-19 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172491#comment-16172491
 ] 

Ray Chiang commented on YARN-7219:
--

Similar fix

> Fix AllocateRequestProto difference between branch-2/branch-2.8 and trunk
> -
>
> Key: YARN-7219
> URL: https://issues.apache.org/jira/browse/YARN-7219
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-beta1
>Reporter: Ray Chiang
>Priority: Critical
>
> For yarn_service_protos.proto, we have the following code in
> (branch-2.8.0, branch-2.8, branch-2)
> {noformat}
> message AllocateRequestProto {
>   repeated ResourceRequestProto ask = 1;
>   repeated ContainerIdProto release = 2;
>   optional ResourceBlacklistRequestProto blacklist_request = 3;
>   optional int32 response_id = 4;
>   optional float progress = 5;
>   repeated ContainerResourceIncreaseRequestProto increase_request = 6;
>   repeated UpdateContainerRequestProto update_requests = 7;
> }
> {noformat}
> For yarn_service_protos.proto, we have the following code in
> (trunk)
> {noformat}
> message AllocateRequestProto {
>   repeated ResourceRequestProto ask = 1;
>   repeated ContainerIdProto release = 2;
>   optional ResourceBlacklistRequestProto blacklist_request = 3;
>   optional int32 response_id = 4;
>   optional float progress = 5;
>   repeated UpdateContainerRequestProto update_requests = 6;
> }
> {noformat}
> Notes
> * YARN-3866 was the original JIRA for container resizing.
> * YARN-5221 is what introduced the incompatible change.
> * In branch-2/branch-2.8/branch-2.8.0, this protobuf change was undone by 
> "Addendum patch to YARN-3866: fix incompatible API change."
> * There was a similar API fix done in YARN-6071.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6499) Remove the doc about Schedulable#redistributeShare()

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari reassigned YARN-6499:
--

Assignee: Chetna Chaudhari

> Remove the doc about Schedulable#redistributeShare() 
> -
>
> Key: YARN-6499
> URL: https://issues.apache.org/jira/browse/YARN-6499
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>Priority: Trivial
>  Labels: newbie++
>
> Schedulable#redistributeShare() has been removed since YARN-187. We need to 
> remove the doc about it as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6499) Remove the doc about Schedulable#redistributeShare()

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari updated YARN-6499:
---
Attachment: YARN-6499.patch

> Remove the doc about Schedulable#redistributeShare() 
> -
>
> Key: YARN-6499
> URL: https://issues.apache.org/jira/browse/YARN-6499
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>Priority: Trivial
>  Labels: newbie++
> Attachments: YARN-6499.patch
>
>
> Schedulable#redistributeShare() has been removed since YARN-187. We need to 
> remove the doc about it as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6169) container-executor message on empty configuration file can be improved

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari reassigned YARN-6169:
--

Assignee: Chetna Chaudhari

> container-executor message on empty configuration file can be improved
> --
>
> Key: YARN-6169
> URL: https://issues.apache.org/jira/browse/YARN-6169
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Miklos Szegedi
>Assignee: Chetna Chaudhari
>Priority: Trivial
>  Labels: newbie
>
> If the configuration file is empty, we get the following error message:
> {{Invalid configuration provided in /root/etc/hadoop/container-executor.cfg}}
> This does not provide enough detail to figure out what the issue is at 
> first glance. We should use something like 'Empty configuration file 
> provided...'
> {code}
>   if (cfg->size == 0) {
> fprintf(ERRORFILE, "Invalid configuration provided in %s\n", file_name);
> exit(INVALID_CONFIG_FILE);
>   }
> {code}
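
A minimal sketch of the suggested rewording, reusing the {{ERRORFILE}} and {{INVALID_CONFIG_FILE}} identifiers from the snippet above (the exact message text is an assumption, not a committed patch):

{code}
  if (cfg->size == 0) {
    // An empty file is a distinct, easily diagnosed failure mode, so name it
    // explicitly instead of emitting the generic "Invalid configuration" text.
    fprintf(ERRORFILE, "Empty configuration file provided in %s\n", file_name);
    exit(INVALID_CONFIG_FILE);
  }
{code}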



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4500) Missing default config values in yarn-default.xml

2017-09-19 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172498#comment-16172498
 ] 

Chetna Chaudhari commented on YARN-4500:


[~lewuathe]: Are you still working on this? If not, can I pick it up?

> Missing default config values in yarn-default.xml
> -
>
> Key: YARN-4500
> URL: https://issues.apache.org/jira/browse/YARN-4500
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Tianyin Xu
>Assignee: Kai Sasaki
>  Labels: oct16-easy
> Attachments: YARN-4500.01.patch
>
>
> The docs at 
> [yarn-default.xml|https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml]
>  are missing the default values of the following parameters: 
> {{yarn.web-proxy.address}}
> {{yarn.ipc.client.factory.class}}
> {{yarn.ipc.server.factory.class}}
> {{yarn.ipc.record.factory.class}}
> Here are the relevant constants in the source:
> {code:title=YarnConfiguration.java|borderStyle=solid}
>   97   /** Factory to create client IPC classes.*/
>   98   public static final String IPC_CLIENT_FACTORY_CLASS =
>   99 IPC_PREFIX + "client.factory.class";
>  100   public static final String DEFAULT_IPC_CLIENT_FACTORY_CLASS =
>  101   "org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl";
>  102 
>  103   /** Factory to create server IPC classes.*/
>  104   public static final String IPC_SERVER_FACTORY_CLASS =
>  105 IPC_PREFIX + "server.factory.class";
>  106   public static final String DEFAULT_IPC_SERVER_FACTORY_CLASS =
>  107   "org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl";
>  108 
>  109   /** Factory to create serializeable records.*/
>  110   public static final String IPC_RECORD_FACTORY_CLASS =
>  111 IPC_PREFIX + "record.factory.class";
>  112   public static final String DEFAULT_IPC_RECORD_FACTORY_CLASS =
>  113   "org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl";
>  ...
>  1119   /** The address for the web proxy.*/
>  1120   public static final String PROXY_ADDRESS =
>  1121 PROXY_PREFIX + "address";
>  1122   public static final int DEFAULT_PROXY_PORT = 9099;
>  1123   public static final String DEFAULT_PROXY_ADDRESS =
>  1124 "0.0.0.0:" + DEFAULT_PROXY_PORT;
> {code}
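
A sketch of the yarn-default.xml entries the report asks for, with names and values copied from the YarnConfiguration constants quoted above (whether the docs should publish these hardcoded fallbacks as literal values, or list the keys with empty values plus a description, is a judgment call left to the patch):

{code:xml}
<property>
  <description>Factory used to create client IPC classes.</description>
  <name>yarn.ipc.client.factory.class</name>
  <value>org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl</value>
</property>
<property>
  <description>Factory used to create server IPC classes.</description>
  <name>yarn.ipc.server.factory.class</name>
  <value>org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl</value>
</property>
<property>
  <description>Factory used to create serializable records.</description>
  <name>yarn.ipc.record.factory.class</name>
  <value>org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl</value>
</property>
<property>
  <description>The address for the web proxy.</description>
  <name>yarn.web-proxy.address</name>
  <value>0.0.0.0:9099</value>
</property>
{code}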



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7026) Fair scheduler docs should explain what happens when no placement rules are specified

2017-09-19 Thread Chetna Chaudhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetna Chaudhari updated YARN-7026:
---
Labels: documentation  (was: )

> Fair scheduler docs should explain what happens when no placement rules are 
> specified
> -
>
> Key: YARN-7026
> URL: https://issues.apache.org/jira/browse/YARN-7026
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>  Labels: documentation
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6968) Hard coded reference to an absolute pathname in org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(Cont

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172507#comment-16172507
 ] 

Hadoop QA commented on YARN-6968:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
55s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 4 new + 225 unchanged - 0 fixed = 229 total (was 225) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
34s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 28s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6968 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887973/YARN-6968.001.pa

[jira] [Commented] (YARN-7196) Fix finicky TestContainerManager tests

2017-09-19 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172518#comment-16172518
 ] 

Arun Suresh commented on YARN-7196:
---

[~djp] / [~wangda], what do you think of the latest patch?

> Fix finicky TestContainerManager tests
> --
>
> Key: YARN-7196
> URL: https://issues.apache.org/jira/browse/YARN-7196
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-7196.002.patch, YARN-7196.patch
>
>
> The test case {{testContainerUpdateExecTypeGuaranteedToOpportunistic}} seems 
> to fail every once in a while. We may have to change the way the event is 
> triggered.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6499) Remove the doc about Schedulable#redistributeShare()

2017-09-19 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172527#comment-16172527
 ] 

Yufei Gu commented on YARN-6499:


+1. Thanks for working on this.

> Remove the doc about Schedulable#redistributeShare() 
> -
>
> Key: YARN-6499
> URL: https://issues.apache.org/jira/browse/YARN-6499
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Chetna Chaudhari
>Priority: Trivial
>  Labels: newbie++
> Attachments: YARN-6499.patch
>
>
> Schedulable#redistributeShare() was removed in YARN-187. We need to remove 
> the documentation for it as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6499) Remove the doc about Schedulable#redistributeShare()

2017-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172538#comment-16172538
 ] 

Hadoop QA commented on YARN-6499:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 25s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6499 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12887986/YARN-6499.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 28957bf3a9d3 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 51edaac |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/17526/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17526/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17526/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Remove the doc about Schedulable#redistributeShare() 
> -
>
>   

[jira] [Created] (YARN-7225) Add queue and partition info to RM audit log

2017-09-19 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-7225:
---

 Summary: Add queue and partition info to RM audit log
 Key: YARN-7225
 URL: https://issues.apache.org/jira/browse/YARN-7225
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jonathan Hung


Right now the RM audit log has fields such as user, IP, resource, etc. Having 
queue and partition information would also be useful for resource tracking.
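
A hypothetical illustration, assuming the existing tab-separated key=value audit format simply gains two more pairs (the QUEUE and NODELABEL key names and all values below are placeholders, not a committed format):

{noformat}
USER=alice  IP=10.0.0.12  OPERATION=Submit Application Request  TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1505804544000_0042  QUEUE=analytics  NODELABEL=gpu
{noformat}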



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


