[jira] [Updated] (YARN-6820) Restrict read access to timelineservice v2 data
[ https://issues.apache.org/jira/browse/YARN-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vrushali C updated YARN-6820:
-
    Attachment: YARN-6820-YARN-5355.002.patch

Attaching patch 002, updated as per review recommendations.

> Restrict read access to timelineservice v2 data
> -
>
> Key: YARN-6820
> URL: https://issues.apache.org/jira/browse/YARN-6820
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Vrushali C
> Assignee: Vrushali C
> Labels: yarn-5355-merge-blocker
> Attachments: YARN-6820-YARN-5355.0001.patch, YARN-6820-YARN-5355.002.patch
>
> Need to provide a way to restrict read access in ATSv2. Not all users should
> be able to read all entities. On the flip side, some folks may not need any
> read restrictions, so we need to provide a way to disable this access
> restriction as well.
> Initially this access restriction could be done in a simple way via a
> whitelist of users allowed to read data. That set of users can read all data,
> no other user can read any data. It can be turned off so that all users can
> read all data.
> It could be stored in a "domain" table in HBase perhaps. Or a configuration
> setting for the cluster. Or something else that's simple enough. ATSv1 has a
> concept of domain for isolating users for reading. It would be good to keep
> that in consideration.
> In ATSv1, a domain offers a namespace for the Timeline Server, allowing users
> to host multiple entities, isolating them from other users and applications.
> A “Domain” in ATSv1 primarily stores owner info, read and write ACL
> information, and created and modified timestamp information. Each Domain is
> identified by an ID which must be unique across all users in the YARN cluster.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6947) The implementation of Schedulable#getResourceUsage is so inefficient that it can reduce scheduling performance
[ https://issues.apache.org/jira/browse/YARN-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YunFan Zhou updated YARN-6947:
--
    Description:
Each time the FairScheduler assigns a container, it checks whether the resources used by the queue exceed its max share. However, the current calculation of a queue's resource usage is particularly inefficient: it recursively iterates over all child nodes, with high time complexity. We can refactor this logic using a lazy-update approach.

{code:java}
@Override
public Resource assignContainer(FSSchedulerNode node) {
  Resource assigned = Resources.none();

  // If this queue is over its limit, reject
  if (!assignContainerPreCheck(node)) {
    return assigned;
  }
{code}
{code:java}
/**
 * Helper method to check if the queue should attempt assigning resources
 *
 * @return true if check passes (can assign) or false otherwise
 */
boolean assignContainerPreCheck(FSSchedulerNode node) {
  if (node.getReservedContainer() != null) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Assigning container failed on node '" + node.getNodeName()
          + " because it has reserved containers.");
    }
    return false;
  } else if (!Resources.fitsIn(getResourceUsage(), maxShare)) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Assigning container failed on node '" + node.getNodeName()
          + " because queue resource usage is larger than MaxShare: "
          + dumpState());
    }
    return false;
  } else {
    return true;
  }
}
{code}
{code:java}
@Override
public Resource getResourceUsage() {
  Resource usage = Resources.createResource(0);
  readLock.lock();
  try {
    for (FSQueue child : childQueues) {
      Resources.addTo(usage, child.getResourceUsage());
    }
  } finally {
    readLock.unlock();
  }
  return usage;
}
{code}

was:
{code:java}
@Override
public Resource assignContainer(FSSchedulerNode node) {
  Resource assigned = Resources.none();

  // If this queue is over its limit, reject
  if (!assignContainerPreCheck(node)) {
    return assigned;
  }
{code}

> The implementation of Schedulable#getResourceUsage is so inefficient that it
> can reduce scheduling performance
> -
>
> Key: YARN-6947
> URL: https://issues.apache.org/jira/browse/YARN-6947
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Reporter: YunFan Zhou
> Priority: Critical
>
> Each time the FairScheduler assigns a container, it checks whether the
> resources used by the queue exceed its max share. However, the current
> calculation of a queue's resource usage is particularly inefficient: it
> recursively iterates over all child nodes, with high time complexity.
> We can refactor this logic using a lazy-update approach.
> {code:java}
> @Override
> public Resource assignContainer(FSSchedulerNode node) {
>   Resource assigned = Resources.none();
>
>   // If this queue is over its limit, reject
>   if (!assignContainerPreCheck(node)) {
>     return assigned;
>   }
> {code}
> {code:java}
> /**
>  * Helper method to check if the queue should attempt assigning resources
>  *
>  * @return true if check passes (can assign) or false otherwise
>  */
> boolean assignContainerPreCheck(FSSchedulerNode node) {
>   if (node.getReservedContainer() != null) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Assigning container failed on node '" + node.getNodeName()
>           + " because it has reserved containers.");
>     }
>     return false;
>   } else if (!Resources.fitsIn(getResourceUsage(), maxShare)) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Assigning container failed on node '" + node.getNodeName()
>           + " because queue resource usage is larger than MaxShare: "
>           + dumpState());
>     }
>     return false;
>   } else {
>     return true;
>   }
> }
> {code}
> {code:java}
> @Override
> public Resource getResourceUsage() {
>   Resource usage = Resources.createResource(0);
>   readLock.lock();
>   try {
>     for (FSQueue child : childQueues) {
>       Resources.addTo(usage, child.getResourceUsage());
>     }
>   } finally {
>     readLock.unlock();
>   }
>   return usage;
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6947) The implementation of Schedulable#getResourceUsage is so inefficient that it can reduce scheduling performance
[ https://issues.apache.org/jira/browse/YARN-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yufei Gu resolved YARN-6947.
-
    Resolution: Duplicate

> The implementation of Schedulable#getResourceUsage is so inefficient that it
> can reduce scheduling performance
> -
>
> Key: YARN-6947
> URL: https://issues.apache.org/jira/browse/YARN-6947
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Reporter: YunFan Zhou
> Priority: Critical
>
> Each time the FairScheduler assigns a container, it checks whether the
> resources used by the queue exceed its max share. However, the current
> calculation of a queue's resource usage is particularly inefficient: it
> recursively iterates over all child nodes, with high time complexity.
> We can refactor this logic using a lazy-update approach.
> {code:java}
> @Override
> public Resource assignContainer(FSSchedulerNode node) {
>   Resource assigned = Resources.none();
>
>   // If this queue is over its limit, reject
>   if (!assignContainerPreCheck(node)) {
>     return assigned;
>   }
> {code}
> {code:java}
> /**
>  * Helper method to check if the queue should attempt assigning resources
>  *
>  * @return true if check passes (can assign) or false otherwise
>  */
> boolean assignContainerPreCheck(FSSchedulerNode node) {
>   if (node.getReservedContainer() != null) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Assigning container failed on node '" + node.getNodeName()
>           + " because it has reserved containers.");
>     }
>     return false;
>   } else if (!Resources.fitsIn(getResourceUsage(), maxShare)) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Assigning container failed on node '" + node.getNodeName()
>           + " because queue resource usage is larger than MaxShare: "
>           + dumpState());
>     }
>     return false;
>   } else {
>     return true;
>   }
> }
> {code}
> {code:java}
> @Override
> public Resource getResourceUsage() {
>   Resource usage = Resources.createResource(0);
>   readLock.lock();
>   try {
>     for (FSQueue child : childQueues) {
>       Resources.addTo(usage, child.getResourceUsage());
>     }
>   } finally {
>     readLock.unlock();
>   }
>   return usage;
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
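The lazy-update approach proposed in YARN-6947 can be sketched as follows. This is an illustrative, simplified example only (the class and method names are hypothetical, not the actual FSQueue API): instead of recursively summing child usage on every getResourceUsage() call, each allocation or release propagates its delta up the queue tree once, so reads become O(1).

```java
import java.util.ArrayList;
import java.util.List;

public class LazyUsageSketch {
    static class Queue {
        final Queue parent;
        final List<Queue> children = new ArrayList<>();
        private long usedMemory; // cached aggregate, kept current on writes

        Queue(Queue parent) {
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }

        // O(depth) write: bubble the delta up to the root.
        void addUsage(long delta) {
            for (Queue q = this; q != null; q = q.parent) {
                q.usedMemory += delta;
            }
        }

        // O(1) read: no recursion over child queues.
        long getResourceUsage() {
            return usedMemory;
        }
    }

    static long demo() {
        Queue root = new Queue(null);
        Queue a = new Queue(root);
        Queue b = new Queue(root);
        a.addUsage(1024);   // container allocated in queue a
        b.addUsage(2048);   // container allocated in queue b
        a.addUsage(-1024);  // container in queue a released
        return root.getResourceUsage(); // 2048
    }

    public static void main(String[] args) {
        System.out.println("root usage = " + demo());
    }
}
```

The trade-off is that writes become O(depth of the tree), but queue trees are shallow while scheduling reads are extremely frequent, which is why the report calls the recursive read a scheduling bottleneck.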
[jira] [Comment Edited] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114043#comment-16114043 ]

Sunil G edited comment on YARN-6788 at 8/4/17 8:13 AM:
---
Thanks [~templedf]. Quick clarification on a few points:

bq. the findbugs warning is worth fixing
The findbugs warning pointed out by Jenkins is one that already exists on YARN-3926. I have fixed it as part of this patch, hence no findbugs warnings are shown with the patch (please refer to the *Patch Compile Tests* part in Jenkins).

bq. don't forget to move TestResourceUtils into yarn-api
There were two fundamental reasons why it is kept in yarn-common:
# ResourceUtils opens files like {{resource-types.xml}}, or any other files, using {{ConfigurationProvider}}. The default ConfigurationProvider class is {{org.apache.hadoop.yarn.LocalConfigurationProvider}}, but this is compiled with the yarn-common package. Due to this, we can't compile TestResourceUtils when it is in yarn-api, since that package is built first per the hadoop-yarn pom (yarn-common is built after yarn-api).
# A bunch of sample resource files are also added as {{testResources}} in the yarn-common pom.xml. I could point to the same dir from the yarn-api pom, or would need to copy/duplicate these resources. This is something we can do (point the lookup to yarn-common resources for junit tests).

I think point 1 is a little tricky and we can leave this test file in yarn-common for now. I could add a comment and details in this file for reference.

bq. checkstyle issue in ResourceUtils
I could handle this in the next patch. I will wait for your comment on the above point before sharing the next patch.

was (Author: sunilg):
Thanks [~templedf] Quick clarification for few points: bq.the findbugs warning is worth fixing Current findbugs warnings pointed by Jenkins is the one which is existing at YARN-3926. I have fixed as part of this patch, hence there are no findbugs warning shown with patch (Please refer *Patch Compile Tests* part in jenkins) bq.don't forget to move TestResourceUtils into yarn-api There were two fundamental reason why its kept in yarn-common # ResourceUtils open files such {{resource-types.xml}} or any other files using {{ConfigurationProvider}}. Default ConfigurationProvider class is {{org.apache.hadoop.yarn.LocalConfigurationProvider}}. But this compiles with yarn-common package. Due to this, we can't compile TestResourceUtils when its in yarn-api since this package is first build as per hadoop-yarn pom (yarn-common is built post yarn-api) # A bunch of sample resource files are added as {{testResources}} in yarn-common pom.xml. I can hard point to same dir from yarn-api or need to copy/duplicate these resource. This something doing (point lookup to yarn-common resources for junit tests) I think point 1 is little tricky and we can leave this file in yarn-common for now. I could add a comment and detail in this file for reference. bq.checkstyle issue in ResourceUtils s I could handle this in next patch. I guess I will wait for your comment for above point before sharing next patch.
> Improve performance of resource profile branch > -- > > Key: YARN-6788 > URL: https://issues.apache.org/jira/browse/YARN-6788 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G >Priority: Blocker > Attachments: YARN-6788-YARN-3926.001.patch, > YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, > YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, > YARN-6788-YARN-3926.006.patch, YARN-6788-YARN-3926.007.patch, > YARN-6788-YARN-3926.008.patch, YARN-6788-YARN-3926.009.patch, > YARN-6788-YARN-3926.010.patch, YARN-6788-YARN-3926.011.patch, > YARN-6788-YARN-3926.012.patch, YARN-6788-YARN-3926.013.patch, > YARN-6788-YARN-3926.014.patch, YARN-6788-YARN-3926.015.patch, > YARN-6788-YARN-3926.016.patch, YARN-6788-YARN-3926.017.patch, > YARN-6788-YARN-3926.018.patch, YARN-6788-YARN-3926.019.patch, > YARN-6788-YARN-3926.020.patch, YARN-6788-YARN-3926.021.patch, > YARN-6788-YARN-3926.022.patch, YARN-6788-YARN-3926.022.patch > > > Currently we could see a 15% performance delta with this branch. > Few performance improvements to improve the same. > Also this patch will handle > [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] > from [~leftnoteasy]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6920) Fix TestNMClient failure due to YARN-6706
[ https://issues.apache.org/jira/browse/YARN-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114006#comment-16114006 ]

Jian He commented on YARN-6920:
---
I see, thanks for the explanation. So it is guaranteed that the Guaranteed container will be started. But I was wondering if this will cause unnecessary churn, like:
1) CONTAINER_COMPLETED sent
2) opportunistic container started
3) SCHEDULE_CONTAINER sent
4) opportunistic container killed to make room for the original upgrading container

If the above can occur, we could eliminate it by skipping the check for whether an opportunistic container should be launched, so the container upgrade can happen more smoothly.

> Fix TestNMClient failure due to YARN-6706
> -
>
> Key: YARN-6920
> URL: https://issues.apache.org/jira/browse/YARN-6920
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Attachments: YARN-6920.001.patch, YARN-6920.002.patch, YARN-6920.003.patch, YARN-6920.004.patch
>
> Looks like {{TestNMClient}} has been failing for a while. Opening this JIRA
> to track the fix.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6920) Fix TestNMClient failure due to YARN-6706
[ https://issues.apache.org/jira/browse/YARN-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114021#comment-16114021 ]

Arun Suresh commented on YARN-6920:
---
True, the sequence of events you mentioned can happen. I am hoping that once YARN-5972 is completed, freezing and later thawing the opportunistic container using the cgroups freezer / docker pause, rather than simply killing it, will ensure no work is lost (we have seen good results in production on Windows). I can raise a JIRA to optimize the above path and keep it open till we finish YARN-5972. That way, I can get some data and see if an optimization is required. Thoughts?

> Fix TestNMClient failure due to YARN-6706
> -
>
> Key: YARN-6920
> URL: https://issues.apache.org/jira/browse/YARN-6920
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Attachments: YARN-6920.001.patch, YARN-6920.002.patch, YARN-6920.003.patch, YARN-6920.004.patch
>
> Looks like {{TestNMClient}} has been failing for a while. Opening this JIRA
> to track the fix.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6947) The implementation of Schedulable#getResourceUsage is so inefficient that it can reduce scheduling performance
YunFan Zhou created YARN-6947:
-

Summary: The implementation of Schedulable#getResourceUsage is so inefficient that it can reduce scheduling performance
Key: YARN-6947
URL: https://issues.apache.org/jira/browse/YARN-6947
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: YunFan Zhou
Priority: Critical

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114043#comment-16114043 ]

Sunil G commented on YARN-6788:
---
Thanks [~templedf]. Quick clarification on a few points:

bq. the findbugs warning is worth fixing
The findbugs warning pointed out by Jenkins is one that already exists on YARN-3926. I have fixed it as part of this patch, hence no findbugs warnings are shown with the patch (please refer to the *Patch Compile Tests* part in Jenkins).

bq. don't forget to move TestResourceUtils into yarn-api
There were two fundamental reasons why it is kept in yarn-common:
# ResourceUtils opens files such as {{resource-types.xml}}, or any other files, using {{ConfigurationProvider}}. The default ConfigurationProvider class is {{org.apache.hadoop.yarn.LocalConfigurationProvider}}, but this is compiled with the yarn-common package. Due to this, we can't compile TestResourceUtils when it is in yarn-api, since that package is built first per the hadoop-yarn pom (yarn-common is built after yarn-api).
# A bunch of sample resource files are added as {{testResources}} in the yarn-common pom.xml. I could hard-point to the same dir from yarn-api, or would need to copy/duplicate these resources. This is something we could do (point the lookup to yarn-common resources for junit tests).

I think point 1 is a little tricky and we can leave this file in yarn-common for now. I could add a comment and details in this file for reference.

bq. checkstyle issue in ResourceUtils
I could handle this in the next patch. I will wait for your comment on the above point before sharing the next patch.
> Improve performance of resource profile branch > -- > > Key: YARN-6788 > URL: https://issues.apache.org/jira/browse/YARN-6788 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G >Priority: Blocker > Attachments: YARN-6788-YARN-3926.001.patch, > YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, > YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, > YARN-6788-YARN-3926.006.patch, YARN-6788-YARN-3926.007.patch, > YARN-6788-YARN-3926.008.patch, YARN-6788-YARN-3926.009.patch, > YARN-6788-YARN-3926.010.patch, YARN-6788-YARN-3926.011.patch, > YARN-6788-YARN-3926.012.patch, YARN-6788-YARN-3926.013.patch, > YARN-6788-YARN-3926.014.patch, YARN-6788-YARN-3926.015.patch, > YARN-6788-YARN-3926.016.patch, YARN-6788-YARN-3926.017.patch, > YARN-6788-YARN-3926.018.patch, YARN-6788-YARN-3926.019.patch, > YARN-6788-YARN-3926.020.patch, YARN-6788-YARN-3926.021.patch, > YARN-6788-YARN-3926.022.patch, YARN-6788-YARN-3926.022.patch > > > Currently we could see a 15% performance delta with this branch. > Few performance improvements to improve the same. > Also this patch will handle > [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] > from [~leftnoteasy]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6873) Moving logging APIs over to slf4j in hadoop-yarn-server-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113991#comment-16113991 ]

Wenxin He commented on YARN-6873:
-
[~Cyl], since HADOOP-14706 is committed, could you use the helper method {{isLog4jLogger}} in your patch?

> Moving logging APIs over to slf4j in
> hadoop-yarn-server-applicationhistoryservice
> -
>
> Key: YARN-6873
> URL: https://issues.apache.org/jira/browse/YARN-6873
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Yeliang Cang
> Assignee: Yeliang Cang
> Attachments: YARN-6873.001.patch, YARN-6873.002.patch, YARN-6873.003.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6938) Add a flag to indicate whether timeline server ACL is enabled
[ https://issues.apache.org/jira/browse/YARN-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YunFan Zhou updated YARN-6938:
--
    Component/s: timelineserver

> Add a flag to indicate whether timeline server ACL is enabled
> -
>
> Key: YARN-6938
> URL: https://issues.apache.org/jira/browse/YARN-6938
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: timelineserver
> Reporter: YunFan Zhou
> Assignee: YunFan Zhou

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6947) The implementation of Schedulable#getResourceUsage is so inefficient that it can reduce scheduling performance
[ https://issues.apache.org/jira/browse/YARN-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YunFan Zhou updated YARN-6947:
--
    Description:
{code:java}
@Override
public Resource assignContainer(FSSchedulerNode node) {
  Resource assigned = Resources.none();

  // If this queue is over its limit, reject
  if (!assignContainerPreCheck(node)) {
    return assigned;
  }
{code}

> The implementation of Schedulable#getResourceUsage is so inefficient that it
> can reduce scheduling performance
> -
>
> Key: YARN-6947
> URL: https://issues.apache.org/jira/browse/YARN-6947
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Reporter: YunFan Zhou
> Priority: Critical
>
> {code:java}
> @Override
> public Resource assignContainer(FSSchedulerNode node) {
>   Resource assigned = Resources.none();
>
>   // If this queue is over its limit, reject
>   if (!assignContainerPreCheck(node)) {
>     return assigned;
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6820) Restrict read access to timelineservice v2 data
[ https://issues.apache.org/jira/browse/YARN-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114028#comment-16114028 ]

Vrushali C edited comment on YARN-6820 at 8/4/17 7:07 AM:
--
Attaching patch 002, updated as per review recommendations.

I have added two new classes: TimelineReaderWhitelistAuthorizationFilterInitializer and TimelineReaderWhitelistAuthorizationFilter. These are similar to other filter classes in Hadoop. These names feel a bit too lengthy to me; I am wondering if/how to make them shorter.

The filter class now uses AccessControlList to determine whether a user should be allowed. It also checks for admins and allows them to read timeline service v2 data.

I have added unit tests that check users and groups set in the config, similar to the way the yarn admin ACL config params are set. I also ran the other unit tests for the timeline v2 reader webservices and saw that these filters are being invoked.

Thanks [~jrottinghuis] for helping me wade through the code base this afternoon. I will be out for the next 3 days, so will respond to review suggestions after Monday afternoon. (I am yet to update the documentation for this. Will do so in either this jira or the documentation jira YARN-6047.)

was (Author: vrushalic):
Attaching patch 002 , updated as per review recommendations.

> Restrict read access to timelineservice v2 data
> -
>
> Key: YARN-6820
> URL: https://issues.apache.org/jira/browse/YARN-6820
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Vrushali C
> Assignee: Vrushali C
> Labels: yarn-5355-merge-blocker
> Attachments: YARN-6820-YARN-5355.0001.patch, YARN-6820-YARN-5355.002.patch
>
> Need to provide a way to restrict read access in ATSv2. Not all users should
> be able to read all entities. On the flip side, some folks may not need any
> read restrictions, so we need to provide a way to disable this access
> restriction as well.
> Initially this access restriction could be done in a simple way via a
> whitelist of users allowed to read data. That set of users can read all data,
> no other user can read any data. It can be turned off so that all users can
> read all data.
> It could be stored in a "domain" table in HBase perhaps. Or a configuration
> setting for the cluster. Or something else that's simple enough. ATSv1 has a
> concept of domain for isolating users for reading. It would be good to keep
> that in consideration.
> In ATSv1, a domain offers a namespace for the Timeline Server, allowing users
> to host multiple entities, isolating them from other users and applications.
> A “Domain” in ATSv1 primarily stores owner info, read and write ACL
> information, and created and modified timestamp information. Each Domain is
> identified by an ID which must be unique across all users in the YARN cluster.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
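The whitelist-plus-admins read check described above can be sketched as follows. This is a minimal illustrative example only (the class name and constructor are hypothetical, not the actual TimelineReaderWhitelistAuthorizationFilter API): configured readers and admins may read, everyone else is rejected, and the restriction can be disabled entirely.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class WhitelistReadAuthSketch {
    private final boolean enabled;
    private final Set<String> allowedReaders;
    private final Set<String> admins;

    WhitelistReadAuthSketch(boolean enabled, Set<String> readers,
        Set<String> admins) {
        this.enabled = enabled;
        this.allowedReaders = readers;
        this.admins = admins;
    }

    // Admins and whitelisted readers may read; with the restriction
    // disabled, everyone may read.
    boolean canRead(String user) {
        if (!enabled) {
            return true;
        }
        return admins.contains(user) || allowedReaders.contains(user);
    }

    public static void main(String[] args) {
        WhitelistReadAuthSketch auth = new WhitelistReadAuthSketch(
            true,
            new HashSet<>(Arrays.asList("alice", "bob")),
            new HashSet<>(Arrays.asList("yarnadmin")));
        System.out.println(auth.canRead("alice"));     // true: whitelisted
        System.out.println(auth.canRead("yarnadmin")); // true: admin
        System.out.println(auth.canRead("mallory"));   // false: not listed
    }
}
```

In the actual patch the equivalent check would presumably be backed by Hadoop's AccessControlList (which also handles group membership), but the decision logic is the same shape.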
[jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114094#comment-16114094 ]

Yufei Gu commented on YARN-6361:
--
You can take it.

> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big
> queues
> -
>
> Key: YARN-6361
> URL: https://issues.apache.org/jira/browse/YARN-6361
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Miklos Szegedi
> Assignee: Yufei Gu
> Priority: Minor
> Attachments: dispatcherthread.png, threads.png
>
> FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy.
> Most of the time is spent in FairShareComparator.compare. We could improve
> this by doing the calculations outside the sort loop ({{O\(n\)}}) and sorting
> by the precomputed fixed number inside it ({{O(n*log\(n\))}} comparisons).
> This could be a performance issue when there is a huge number of applications
> in a single queue. The attachments show the performance impact when there are
> 10k applications in one queue.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114061#comment-16114061 ]

YunFan Zhou commented on YARN-6361:
---
[~yufeigu] Hi Yufei, I would like to work on this JIRA. If you have not yet started working on it, please let me know whether I can take it over. Thank you.

> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big
> queues
> -
>
> Key: YARN-6361
> URL: https://issues.apache.org/jira/browse/YARN-6361
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Miklos Szegedi
> Assignee: Yufei Gu
> Priority: Minor
> Attachments: dispatcherthread.png, threads.png
>
> FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy.
> Most of the time is spent in FairShareComparator.compare. We could improve
> this by doing the calculations outside the sort loop ({{O\(n\)}}) and sorting
> by the precomputed fixed number inside it ({{O(n*log\(n\))}} comparisons).
> This could be a performance issue when there is a huge number of applications
> in a single queue. The attachments show the performance impact when there are
> 10k applications in one queue.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
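The optimization proposed in YARN-6361 — compute each app's comparison key once, O(n), then sort by the cached key instead of recomputing shares inside the comparator, which runs O(n log n) times — can be sketched like this. All names here are illustrative stand-ins, not the actual FairShareComparator code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PrecomputedSortSketch {
    static class App {
        final String id;
        final double demand; // stands in for the expensive fair-share inputs
        App(String id, double demand) { this.id = id; this.demand = demand; }
    }

    static class Keyed {
        final App app;
        final double key; // computed once, outside the sort loop
        Keyed(App app, double key) { this.app = app; this.key = key; }
    }

    // Stand-in for the expensive part of the comparator; the real code
    // would combine usage, weights, min share, etc.
    static double expensiveKey(App a) {
        return a.demand;
    }

    static List<String> sortApps(List<App> apps) {
        List<Keyed> keyed = new ArrayList<>();
        for (App a : apps) {
            keyed.add(new Keyed(a, expensiveKey(a))); // O(n) key computation
        }
        // O(n log n) comparisons, but each compare is now a cheap double compare.
        keyed.sort(Comparator.comparingDouble((Keyed k) -> k.key));
        List<String> order = new ArrayList<>();
        for (Keyed k : keyed) {
            order.add(k.app.id);
        }
        return order;
    }

    public static void main(String[] args) {
        List<App> apps = new ArrayList<>();
        apps.add(new App("app2", 2.0));
        apps.add(new App("app1", 1.0));
        apps.add(new App("app3", 3.0));
        System.out.println(sortApps(apps)); // [app1, app2, app3]
    }
}
```

With 10k apps in one queue, this moves the heavy share arithmetic from roughly n log n comparator invocations down to n key computations, which matches the attached profiling observation that FairShareComparator.compare dominates CPU time.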
[jira] [Updated] (YARN-6951) Fix debug log when Resource handler chain is enabled
[ https://issues.apache.org/jira/browse/YARN-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Wang updated YARN-6951:
-
    Description:
{code:title=LinuxContainerExecutor.java}
... ...
    if (LOG.isDebugEnabled()) {
      LOG.debug("Resource handler chain enabled = "
          + (resourceHandlerChain == null));
    }
... ...
{code}
I think it is just a typo. When resourceHandlerChain is not null, it should print the log "Resource handler chain enabled = true".

was:
{code title=LinuxContainerExecutor.java} ... ... if (LOG.isDebugEnabled()) { LOG.debug("Resource handler chain enabled = " + (resourceHandlerChain == null)); } ... ... {code} I think it is just a typo.When resourceHandlerChain is not null, print the log "Resource handler chain enabled = true".

> Fix debug log when Resource handler chain is enabled
> -
>
> Key: YARN-6951
> URL: https://issues.apache.org/jira/browse/YARN-6951
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yang Wang
>
> {code:title=LinuxContainerExecutor.java}
> ... ...
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Resource handler chain enabled = "
>           + (resourceHandlerChain == null));
>     }
> ... ...
> {code}
> I think it is just a typo. When resourceHandlerChain is not null, it should
> print the log "Resource handler chain enabled = true".

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6951) Fix debug log when Resource handler chain is enabled
[ https://issues.apache.org/jira/browse/YARN-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Wang reassigned YARN-6951:
---
    Assignee: Yang Wang

> Fix debug log when Resource handler chain is enabled
> -
>
> Key: YARN-6951
> URL: https://issues.apache.org/jira/browse/YARN-6951
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yang Wang
> Assignee: Yang Wang
> Attachments: YARN-6951.001.patch
>
> {code:title=LinuxContainerExecutor.java}
> ... ...
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Resource handler chain enabled = "
>           + (resourceHandlerChain == null));
>     }
> ... ...
> {code}
> I think it is just a typo. When resourceHandlerChain is not null, it should
> print the log "Resource handler chain enabled = true".

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6133) [ATSv2 Security] Renew delegation token for app automatically if an app collector is active
[ https://issues.apache.org/jira/browse/YARN-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114235#comment-16114235 ]

Rohith Sharma K S commented on YARN-6133:
-
Thanks @varun for the patch. Some comments:
# The token is renewed just 10 seconds before expiry. Should this be increased?
# TimelineCollectorManager has introduced a synchronized block. This is not necessary, right?
# The renewer thread count is 1. Given that the load on the NM is not much, one thread can handle the renewals, but I would suggest keeping it at 50?

> [ATSv2 Security] Renew delegation token for app automatically if an app
> collector is active
> -
>
> Key: YARN-6133
> URL: https://issues.apache.org/jira/browse/YARN-6133
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-5355-merge-blocker
> Attachments: YARN-6133-YARN-5355.01.patch, YARN-6133-YARN-5355.02.patch, YARN-6133-YARN-5355.03.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6951) Fix debug log when Resource handler chain is enabled
Yang Wang created YARN-6951: --- Summary: Fix debug log when Resource handler chain is enabled Key: YARN-6951 URL: https://issues.apache.org/jira/browse/YARN-6951 Project: Hadoop YARN Issue Type: Bug Reporter: Yang Wang {code:title=LinuxContainerExecutor.java} ... ... if (LOG.isDebugEnabled()) { LOG.debug("Resource handler chain enabled = " + (resourceHandlerChain == null)); } ... ... {code} I think it is just a typo. When resourceHandlerChain is not null, the log should print "Resource handler chain enabled = true". -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6951) Fix debug log when Resource handler chain is enabled
[ https://issues.apache.org/jira/browse/YARN-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated YARN-6951: Attachment: YARN-6951.001.patch > Fix debug log when Resource handler chain is enabled > > > Key: YARN-6951 > URL: https://issues.apache.org/jira/browse/YARN-6951 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang > Attachments: YARN-6951.001.patch > > > {code:title=LinuxContainerExecutor.java} > ... ... > if (LOG.isDebugEnabled()) { > LOG.debug("Resource handler chain enabled = " + (resourceHandlerChain > == null)); > } > ... ... > {code} > I think it is just a typo. When resourceHandlerChain is not null, the log > should print "Resource handler chain enabled = true". -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
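The fix being proposed is a one-character flip of the null check. A minimal sketch (the `describe` helper is illustrative, not a method of the real LinuxContainerExecutor):

```java
// Sketch of the corrected condition from YARN-6951. The original snippet
// logged "enabled = true" exactly when resourceHandlerChain was null;
// flipping the comparison to != null makes the message match reality.
public class DebugLogFix {
    public static String describe(Object resourceHandlerChain) {
        // Fixed: report true only when a handler chain is actually present.
        return "Resource handler chain enabled = " + (resourceHandlerChain != null);
    }

    public static void main(String[] args) {
        System.out.println(describe(new Object())); // chain present
        System.out.println(describe(null));         // chain absent
    }
}
```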
[jira] [Commented] (YARN-6133) [ATSv2 Security] Renew delegation token for app automatically if an app collector is active
[ https://issues.apache.org/jira/browse/YARN-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114248#comment-16114248 ] Varun Saxena commented on YARN-6133: bq. Token is renewed just before 10 seconds. Should it be increased? What do you suggest? 10 seconds should be enough, as we renew only in the DT manager, i.e. internally in the NM. The token doesn't need to go to the AM. Right? bq. TimelineCollectorManager has introduced synchronized block. This is not necessary right.? This is to avoid a race between the collector stopping and the renewal timer expiring, so that an additional renewal timer is not set unnecessarily. It has no functional impact even if we do set one, because it just won't find the collector on expiry. But I thought it better to avoid it altogether. Thoughts? bq. Renewer threads count is 1. Given load on NM not much, one thread can renew it. But I would suggest to keep it to 50? How many active collectors do we expect on one NM? Token renewal and token generation are not very heavy tasks either. Assuming we have 1000 active apps in, say, a large 5000-node cluster, the AMs will be distributed across multiple nodes, so it is unlikely you will have more than 4-5 app collectors running on any NM at a particular moment. And even then it is unlikely that all collectors will have their token renewals expire at the same moment. There are no guarantees, but it is unlikely. We may, however, have a situation where we launch AMs on a particular node partition; in that case there might be some hotspotting, as in multiple app collectors on one node. But even there, 50 might be too many, I think. We can keep a value higher than 1 if you have concerns with only 1 thread, maybe 3-5. Keep it configurable with a default of 3 or 5? 
> [ATSv2 Security] Renew delegation token for app automatically if an app > collector is active > --- > > Key: YARN-6133 > URL: https://issues.apache.org/jira/browse/YARN-6133 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-5355-merge-blocker > Attachments: YARN-6133-YARN-5355.01.patch, > YARN-6133-YARN-5355.02.patch, YARN-6133-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
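The scheme under discussion (renew each app collector's token a fixed margin before expiry, from a small configurable renewer pool) can be sketched roughly as follows. All names here are illustrative, not the identifiers used in the actual YARN-6133 patch:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hedged sketch of delegation-token renewal scheduling: renew 10 seconds
// before expiry, using a small thread pool whose size is configurable
// (the 1 vs. 50 vs. 3-5 question debated above).
public class TokenRenewalSketch {
    static final long RENEWAL_MARGIN_MS = 10_000L; // "renew 10 seconds early"
    private final ScheduledExecutorService renewers;

    public TokenRenewalSketch(int renewerThreads) { // e.g. configurable, default 3-5
        this.renewers = Executors.newScheduledThreadPool(renewerThreads);
    }

    // How long to wait before renewing a token that expires at expiryMs.
    static long renewalDelayMs(long expiryMs, long nowMs) {
        return Math.max(0L, expiryMs - nowMs - RENEWAL_MARGIN_MS);
    }

    public void scheduleRenewal(String appId, long expiryMs) {
        long delay = renewalDelayMs(expiryMs, System.currentTimeMillis());
        renewers.schedule(() -> renewToken(appId), delay, TimeUnit.MILLISECONDS);
    }

    private void renewToken(String appId) {
        // Placeholder: the real code renews via the NM-internal DT manager.
    }
}
```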
[jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114258#comment-16114258 ] YunFan Zhou commented on YARN-6361: --- [~yufeigu] Thank you, Yufei. On this question, and on optimizing FairScheduler scheduling performance in general, I have the following ideas, which I have applied to our production environment. Scheduling performance is good, and the container assignment rate can reach 5000 ~ 1 per second when the aggregate resource demand of the cluster is high. Here's what I do: * Avoid frequent sorting; it is pointless and a waste of time to sort before each container assignment, because after each assignment the child nodes of the queue basically stay in order. And we don't really need to guarantee that fair shares are strictly honored: even if we sort before each container assignment, *FSQueue#demand* was updated in the previous *FairScheduler#update* cycle, so the demand value is not real time, which already means the fair share is not strict. So we can instead sort all the queues in the *FairScheduler#update* cycle (by default every 0.5 s), which is worth doing. Since we cannot achieve a strictly fair share anyway, why not sacrifice some fair-scheduler semantics in exchange for better performance? * Improve the performance of the *Schedulable#getResourceUsage* calculation, making its complexity O(1). Besides these, there are several smaller but especially useful optimization points. I don't know whether you can accept this; if so, I will list a few more detailed points later. 
> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: Yufei Gu >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and sorting > by a precomputed number inside it instead {{O(n*log\(n\))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
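The O(1) {{Schedulable#getResourceUsage}} idea amounts to incremental bookkeeping: maintain a running aggregate that is adjusted on allocate/release instead of summing over children on every read. A hedged, single-dimension simplification (memory only; the real {{Resource}} type tracks several dimensions, and the class name here is invented):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simplification of the caching idea: adjust a running total
// on each container allocation/release so that reads never walk the list
// of child apps/queues.
class CachedUsageQueue {
    private final AtomicLong usedMemoryMb = new AtomicLong();

    void containerAllocated(long memMb) { usedMemoryMb.addAndGet(memMb); }
    void containerReleased(long memMb)  { usedMemoryMb.addAndGet(-memMb); }

    // O(1): the aggregate is maintained incrementally, not recomputed.
    long getResourceUsageMb() { return usedMemoryMb.get(); }
}
```

With this in place, the comparator used by the periodic sort can read usage cheaply, which is what makes sorting once per update cycle (rather than per assignment) attractive.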
[jira] [Assigned] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YunFan Zhou reassigned YARN-6361: - Assignee: YunFan Zhou (was: Yufei Gu) > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and sorting > by a precomputed number inside it instead {{O(n*log\(n\))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING
lujie created YARN-6948: --- Summary: Invalid event: ATTEMPT_ADDED at FINAL_SAVING Key: YARN-6948 URL: https://issues.apache.org/jira/browse/YARN-6948 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.8.0 Reporter: lujie When I send a kill command to a running job and check the logs, I find this exception: {code:java} 2017-08-03 01:35:20,485 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: ATTEMPT_ADDED at FINAL_SAVING at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6950) Invalid event: LAUNCH_FAILED at FAILED
lujie created YARN-6950: --- Summary: Invalid event: LAUNCH_FAILED at FAILED Key: YARN-6950 URL: https://issues.apache.org/jira/browse/YARN-6950 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.6.0 Reporter: lujie An RMAppAttemptImpl fails for some reason; meanwhile the AM fails to launch a container and sends a LAUNCH_FAILED event, which the state machine cannot handle: {code:java} 2017-07-05 03:33:09,013 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: LAUNCH_FAILED at FAILED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
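These "Invalid event X at state Y" reports all stem from the same mechanism: YARN's attempt/resource state machines are driven by a transition table keyed on (current state, event), and an event with no entry for the current state raises an exception. A toy illustration (the class and method names below are invented for the sketch, not YARN's StateMachineFactory API):

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a transition-table state machine. An event that has no
// registered transition from the current state triggers an error analogous
// to YARN's InvalidStateTransitonException.
class TinyStateMachine {
    private final Map<String, String> transitions = new HashMap<>();
    private String state;

    TinyStateMachine(String initial) { this.state = initial; }

    void addTransition(String from, String event, String to) {
        transitions.put(from + "/" + event, to);
    }

    void handle(String event) {
        String next = transitions.get(state + "/" + event);
        if (next == null) { // e.g. LAUNCH_FAILED arriving while already FAILED
            throw new IllegalStateException("Invalid event: " + event + " at " + state);
        }
        state = next;
    }

    String getState() { return state; }
}
```

The fix for each such JIRA is typically either to add the missing transition or to register the event as ignorable in that state.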
[jira] [Commented] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources
[ https://issues.apache.org/jira/browse/YARN-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114150#comment-16114150 ] Yang Wang commented on YARN-5621: - {code:title=LinuxContainerExecutor.java} protected void createSymlinkAsUser(String user, File privateScriptFile, String userScriptFile) throws PrivilegedOperationException { String runAsUser = getRunAsUser(user); ... ... {code} I think we should use containerUser instead of runAsUser here, because it may cause an "Invalid command" error in container-executor when getRunAsUser returns nonsecureLocalUser. > Support LinuxContainerExecutor to create symlinks for continuously localized > resources > -- > > Key: YARN-5621 > URL: https://issues.apache.org/jira/browse/YARN-5621 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Jian He >Assignee: Jian He > Labels: oct16-hard > Attachments: YARN-5621.1.patch, YARN-5621.2.patch, YARN-5621.3.patch, > YARN-5621.4.patch, YARN-5621.5.patch > > > When new resources are localized, a new symlink needs to be created for each > localized resource. This is the change for the LinuxContainerExecutor to > create the symlinks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources
[ https://issues.apache.org/jira/browse/YARN-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114150#comment-16114150 ] Yang Wang edited comment on YARN-5621 at 8/4/17 9:08 AM: - {code:title=LinuxContainerExecutor.java} protected void createSymlinkAsUser(String user, File privateScriptFile, String userScriptFile) throws PrivilegedOperationException { String runAsUser = getRunAsUser(user); ... ... {code} Hi [~jianhe], I think we should use containerUser instead of runAsUser here, because it may cause an "Invalid command" error in container-executor when getRunAsUser returns nonsecureLocalUser. was (Author: fly_in_gis): {code:title=LinuxContainerExecutor.java} protected void createSymlinkAsUser(String user, File privateScriptFile, String userScriptFile) throws PrivilegedOperationException { String runAsUser = getRunAsUser(user); ... ... {code} I think we should use containerUser instead of runAsUser here. Because it may cause "Invalid command" in container-executor when getRunAsUser return nonsecureLocalUser. > Support LinuxContainerExecutor to create symlinks for continuously localized > resources > -- > > Key: YARN-5621 > URL: https://issues.apache.org/jira/browse/YARN-5621 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Jian He >Assignee: Jian He > Labels: oct16-hard > Attachments: YARN-5621.1.patch, YARN-5621.2.patch, YARN-5621.3.patch, > YARN-5621.4.patch, YARN-5621.5.patch > > > When new resources are localized, a new symlink needs to be created for each > localized resource. This is the change for the LinuxContainerExecutor to > create the symlinks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6949) Invalid event: LOCALIZED at LOCALIZED
lujie created YARN-6949: --- Summary: Invalid event: LOCALIZED at LOCALIZED Key: YARN-6949 URL: https://issues.apache.org/jira/browse/YARN-6949 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.8.0 Reporter: lujie While a job is running, I stop the NodeManager on one machine for some reason. When I then check the logs to see the running state, I find many InvalidStateTransitionExceptions: {code:java} org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: LOCALIZATION_FAILED at LOCALIZED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource.handle(LocalizedResource.java:198) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.handle(LocalResourcesTrackerImpl.java:194) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.handle(LocalResourcesTrackerImpl.java:58) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.processHeartbeat(ResourceLocalizationService.java:1058) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:720) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:355) at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:48) at 
org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:63) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6726) Fix issues with docker commands executed by container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114160#comment-16114160 ] Sunil G commented on YARN-6726: --- Sorry for pitching in late here. Some minor comments: # It would be better if we could write to LOGFILE or ERRORFILE about any {{regex_match}} failure from the {{validate_docker_image_name}} method. It could help us get more information about regex failures, if any. # A suggestion: {{validate_docker_image_name}} could also take {{regex_str}} as input. In that case we could reuse this method for any future regex matching. # validate_container_id could take a const param. # I think I am missing something. Could you please share why we need a prefix of UTILS here; is this a standard? {{#ifndef _UTILS_STRING_UTILS_H_}} > Fix issues with docker commands executed by container-executor > -- > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6726.001.patch, YARN-6726.002.patch > > > docker inspect, rm, stop, etc. are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114705#comment-16114705 ] Sunil G commented on YARN-6788: --- Thank you very much for thorough reviews and commit [~templedf] and [~leftnoteasy]. Really appreciate the same. > Improve performance of resource profile branch > -- > > Key: YARN-6788 > URL: https://issues.apache.org/jira/browse/YARN-6788 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G >Priority: Blocker > Fix For: YARN-3926 > > Attachments: YARN-6788-YARN-3926.001.patch, > YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, > YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, > YARN-6788-YARN-3926.006.patch, YARN-6788-YARN-3926.007.patch, > YARN-6788-YARN-3926.008.patch, YARN-6788-YARN-3926.009.patch, > YARN-6788-YARN-3926.010.patch, YARN-6788-YARN-3926.011.patch, > YARN-6788-YARN-3926.012.patch, YARN-6788-YARN-3926.013.patch, > YARN-6788-YARN-3926.014.patch, YARN-6788-YARN-3926.015.patch, > YARN-6788-YARN-3926.016.patch, YARN-6788-YARN-3926.017.patch, > YARN-6788-YARN-3926.018.patch, YARN-6788-YARN-3926.019.patch, > YARN-6788-YARN-3926.020.patch, YARN-6788-YARN-3926.021.patch, > YARN-6788-YARN-3926.022.patch, YARN-6788-YARN-3926.022.patch > > > Currently we could see a 15% performance delta with this branch. > Few performance improvements to improve the same. > Also this patch will handle > [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] > from [~leftnoteasy]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6871) Add additional deSelects params in getAppReport
[ https://issues.apache.org/jira/browse/YARN-6871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114710#comment-16114710 ] Giovanni Matteo Fumarola commented on YARN-6871: Thanks [~tanujnay] for the patch. A few comments: * I still think the formatting is not correctly set. Let me sync with you offline. * You have a few checkstyle warnings. The [LineLength]s may be fixed with the correct formatting. * You have 2 unit tests that timed out. Please validate on your dev box that these tests pass successfully with your patch. > Add additional deSelects params in getAppReport > --- > > Key: YARN-6871 > URL: https://issues.apache.org/jira/browse/YARN-6871 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager, router >Reporter: Giovanni Matteo Fumarola >Assignee: Tanuj Nayak > Attachments: YARN-6871.002.patch, YARN-6871.proto.patch > > > This jira tracks the effort to add additional deSelect params to the > GetAppReport to make it lighter and faster. > With the current one we are facing scalability issues. > E.g. with ~500 applications running, the AppReport can reach up to 300MB in > size due to the {{ResourceRequest}} in the {{AppInfo}}. > Yarn RM will return the new result faster, and it will use fewer compute cycles > to create the report, improving the YARN RM's and Client's > performance. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-3254: --- Attachment: YARN-3254-005.patch > HealthReport should include disk full information > - > > Key: YARN-3254 > URL: https://issues.apache.org/jira/browse/YARN-3254 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Akira Ajisaka >Assignee: Suma Shivaprasad > Fix For: 3.0.0-beta1 > > Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot > 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch, > YARN-3254-003.patch, YARN-3254-004.patch, YARN-3254-005.patch > > > When a NodeManager's local disk gets almost full, the NodeManager sends a > health report to ResourceManager that "local/log dir is bad" and the message > is displayed on ResourceManager Web UI. It's difficult for users to detect > why the dir is bad. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114935#comment-16114935 ] Suma Shivaprasad edited comment on YARN-3254 at 8/4/17 8:30 PM: Updated the patch to display the exact root cause for the disk error/capacity exceeded cases. These error diagnostics were already available in DirectoryCollection earlier but were ignored and not surfaced in the health report. Along with the ratio of disks marked as unhealthy for local/log dirs, the reason why each of them was marked unhealthy will be surfaced in the health report. Sample errors below {noformat} 1/1 local-dirs have errors: [ /invalidDir1 : Cannot create directory: /invalidDir1 ] 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /hadoop-3.0.0-beta1-SNAPSHOT/logs/userlogs : used space above threshold of 1.0% ] {noformat} was (Author: suma.shivaprasad): Updated the patch to display the exact root cause for the disk error/capacity exceeded cases. These error diagnostics were already available in DirectoryCollection earlier but was ignored and not surfaced in the health report. Along with the ratio of disks marked as unhealthy for local/log dirs, the reason why each of them was marked unhealthy will be surfaced in health report. 
Sample errors below {noformat} 1/1 local-dirs have errors: [ /invalidDir1 : Cannot create directory: /invalidDir1 ] 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /hadoop-3.0.0-beta1-SNAPSHOT/logs/userlogs : used space above threshold of 1.0% ] > HealthReport should include disk full information > - > > Key: YARN-3254 > URL: https://issues.apache.org/jira/browse/YARN-3254 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Akira Ajisaka >Assignee: Suma Shivaprasad > Fix For: 3.0.0-beta1 > > Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot > 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch, > YARN-3254-003.patch, YARN-3254-004.patch, YARN-3254-005.patch > > > When a NodeManager's local disk gets almost full, the NodeManager sends a > health report to ResourceManager that "local/log dir is bad" and the message > is displayed on ResourceManager Web UI. It's difficult for users to detect > why the dir is bad. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-6955: --- Attachment: YARN-6955.v1.patch > Concurrent registerAM thread in Federation Interceptor > -- > > Key: YARN-6955 > URL: https://issues.apache.org/jira/browse/YARN-6955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6955.v1.patch > > > The timeout between AM and AMRMProxy is shorter than the timeout + failover > between FederationInterceptor (AMRMProxy) and RM. When the first register > thread in FI is blocked because of an RM failover, the AM can time out and resend > the register call, leading to two outstanding register calls inside FI. > Eventually, when the RM comes back up, one thread succeeds in registering and the > other thread gets an application-already-registered exception. FI should swallow > the exception and return success back to the AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6820) Restrict read access to timelineservice v2 data
[ https://issues.apache.org/jira/browse/YARN-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114990#comment-16114990 ] Jason Lowe commented on YARN-6820: -- Thanks for updating the patch! The javadoc errors are relevant: {noformat} [ERROR] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:2119: error: bad HTML entity [ERROR] * The name for setting that lists the users & groups who are allowed to [ERROR] ^ [ERROR] /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:2122: error: bad HTML entity [ERROR] * It will allow this list of users & groups to read the data {noformat} There's no default value constant for TIMELINE_SERVICE_READ_AUTH_ENABLED but there probably should be one. DEFAULT_TIMELINE_SERVICE_READ_ALLOWED_USERS is defined but never used. I think it'd be simpler to always have an admin acl (so no need for null check), initializing it with a default value of an empty string if the YARN_ADMIN_ACL property is not set. It would be nice to have a unit test that verifies that even if a user not in the whitelist tries to perform a read it will be allowed if the master enable is off. > Restrict read access to timelineservice v2 data > > > Key: YARN-6820 > URL: https://issues.apache.org/jira/browse/YARN-6820 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-5355-merge-blocker > Attachments: YARN-6820-YARN-5355.0001.patch, > YARN-6820-YARN-5355.002.patch > > > Need to provide a way to restrict read access in ATSv2. Not all users should > be able to read all entities. On the flip side, some folks may not need any > read restrictions, so we need to provide a way to disable this access > restriction as well. 
> Initially this access restriction could be done in a simple way via a > whitelist of users allowed to read data. That set of users can read all data, > no other user can read any data. Can be turned off for all users to read all > data. > Could be stored in a "domain" table in hbase perhaps. Or a configuration > setting for the cluster. Or something else that's simple enough. ATSv1 has a > concept of domain for isolating users for reading. Would be good to keep that > in consideration. > In ATSv1, domain offers a namespace for Timeline server allowing users to > host multiple entities, isolating them from other users and applications. A > “Domain” in ATSv1 primarily stores owner info, read and write ACL > information, and created and modified timestamp information. Each Domain is > identified by an ID which must be unique across all users in the YARN cluster. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
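The behavior under review (a master enable plus a user whitelist, with reads always allowed when the enable is off) can be sketched as below. This is a hedged illustration with invented names; it is not the actual patch's YarnConfiguration keys or classes, and admin-ACL handling from the review is omitted for brevity:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Illustrative-only model of the proposed ATSv2 read authorization: when
// the master enable is off, every user may read all data; when on, only
// whitelisted users may read.
class TimelineReadAuthSketch {
    private final boolean readAuthEnabled;
    private final Set<String> allowedUsers;

    TimelineReadAuthSketch(boolean readAuthEnabled, Collection<String> allowedUsers) {
        this.readAuthEnabled = readAuthEnabled;
        this.allowedUsers = new HashSet<>(allowedUsers);
    }

    boolean canRead(String user) {
        if (!readAuthEnabled) {
            return true; // master enable off: all users can read all data
        }
        return allowedUsers.contains(user);
    }
}
```

The last review point above corresponds to asserting that `canRead` returns true for a non-whitelisted user whenever the enable flag is false.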
[jira] [Resolved] (YARN-6944) The comment about ResourceManager#createPolicyMonitors lies
[ https://issues.apache.org/jira/browse/YARN-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu resolved YARN-6944. Resolution: Duplicate > The comment about ResourceManager#createPolicyMonitors lies > --- > > Key: YARN-6944 > URL: https://issues.apache.org/jira/browse/YARN-6944 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.1, 3.0.0-alpha3 >Reporter: Yufei Gu >Priority: Trivial > > {code} > // creating monitors that handle preemption > createPolicyMonitors(); > {code} > Monitors don't handle preemption. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6952) Enable scheduling monitor in FS
[ https://issues.apache.org/jira/browse/YARN-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114904#comment-16114904 ] Yufei Gu commented on YARN-6952: Uploaded patch v1. Filed YARN-6954 to remove interface PreemptableResourceScheduler. > Enable scheduling monitor in FS > --- > > Key: YARN-6952 > URL: https://issues.apache.org/jira/browse/YARN-6952 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, resourcemanager >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6952.001.patch > > > {{SchedulingEditPolicy#init}} doesn't need to take interface > {{PreemptableResourceScheduler}} as the scheduler input. A ResourceScheduler > is good enough. With that change, fair scheduler is able to use scheduling > monitor(e.g. invariant checks) as CS does. Further more, there is no need for > interface {{PreemptableResourceScheduler}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
Botong Huang created YARN-6955: -- Summary: Concurrent registerAM thread in Federation Interceptor Key: YARN-6955 URL: https://issues.apache.org/jira/browse/YARN-6955 Project: Hadoop YARN Issue Type: Bug Reporter: Botong Huang Assignee: Botong Huang Priority: Minor The timeout between AM and AMRMProxy is shorter than the timeout + failover between FederationInterceptor (AMRMProxy) and RM. When the first register thread in FI is blocked because of an RM failover, the AM can time out and resend the register call, leading to two outstanding register calls inside FI. Eventually, when the RM comes back up, one thread's register succeeds and the other thread gets an "application already registered" exception. FI should swallow the exception and return success to the AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
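The proposed behavior — first register wins, a duplicate register is swallowed and reported as success — can be sketched as follows. This is a simplified stand-in, not the actual FederationInterceptor code; the class and method names are illustrative:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for the behavior proposed above (names are illustrative,
// not the actual FederationInterceptor code): the first register call performs
// the real registration; a concurrent duplicate, which the RM would reject as
// "application already registered", is swallowed so both threads report
// success back to the AM.
public class RegisterOnce {
  private final AtomicBoolean registered = new AtomicBoolean(false);

  /** Returns true ("success") for both the first and any duplicate register. */
  public boolean registerApplicationMaster() {
    if (registered.compareAndSet(false, true)) {
      // First thread wins and performs the real registration with the RM.
      return true;
    }
    // A later thread: the RM would throw "application already registered";
    // swallow that outcome and report success, since the registration the
    // AM wanted has in fact happened.
    return true;
  }
}
```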
[jira] [Updated] (YARN-6033) Add support for sections in container-executor configuration file
[ https://issues.apache.org/jira/browse/YARN-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6033: - Attachment: YARN-6033.009.patch Attached ver.009 patch, fixed warnings. > Add support for sections in container-executor configuration file > - > > Key: YARN-6033 > URL: https://issues.apache.org/jira/browse/YARN-6033 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-6033.003.patch, YARN-6033.004.patch, > YARN-6033.005.patch, YARN-6033.006.patch, YARN-6033.007.patch, > YARN-6033.008.patch, YARN-6033.009.patch, YARN-6033-YARN-5673.001.patch, > YARN-6033-YARN-5673.002.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-6955: --- Attachment: (was: YARN-6955.v1.patch) > Concurrent registerAM thread in Federation Interceptor > -- > > Key: YARN-6955 > URL: https://issues.apache.org/jira/browse/YARN-6955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > > The timeout between AM and AMRMProxy is shorter than the timeout + failOver > between FederationInterceptor (AMRMProxy) and RM. When the first register > thread in FI is blocked because of an RM failover, AM can timeout and resend > register call, leading to two outstanding register call inside FI. > Eventually when RM comes back up, one thread succeeds register and the other > thread got an application already registered exception. FI should swallow the > exception and return success back to AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114654#comment-16114654 ] Yufei Gu commented on YARN-6361: Thanks for taking this, [~daemon]. This JIRA is dedicated to the performance issue in FSLeafQueue.fetchAppsWithDemand. We can always open new JIRAs or use existing ones for other performance issues. YARN-4090 is for improving the performance of the Schedulable#getResourceUsage calculation, which is your second point. > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and sorting > by the cached values inside {{(O(n*log\(n\)))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
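The optimization discussed here — hoisting the expensive per-app calculation out of the comparator so it runs O(n) times instead of once per comparison — can be sketched like this (illustrative code, not the actual FairScheduler classes):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the optimization (not the actual FairScheduler code):
// compute each app's sort key once, O(n), then sort on the cached key, so the
// comparator does no resource math during the O(n log n) comparisons.
public class CachedSort {
  public static class App {
    final String name;
    final long usage; // stand-in for an expensive getResourceUsage() result
    public App(String name, long usage) { this.name = name; this.usage = usage; }
    public String name() { return name; }
  }

  public static List<App> sortByCachedUsage(List<App> apps) {
    // O(n): evaluate the expensive metric once per app, outside the sort loop.
    Map<App, Long> cachedKey = new HashMap<>();
    for (App a : apps) {
      cachedKey.put(a, a.usage);
    }
    // O(n log n) comparisons, each a cheap map lookup instead of a recomputation.
    List<App> sorted = new ArrayList<>(apps);
    sorted.sort(Comparator.comparingLong(cachedKey::get));
    return sorted;
  }
}
```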
[jira] [Comment Edited] (YARN-6789) new api to get all supported resources from RM
[ https://issues.apache.org/jira/browse/YARN-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114722#comment-16114722 ] Sunil G edited comment on YARN-6789 at 8/4/17 5:58 PM: --- Submitting patch to run jenkins. This is an initial version of patch for api improvements. cc/[~leftnoteasy] [~templedf] was (Author: sunilg): Submitting patch to run jenkins. This is an initial version of patch for api improvements. cc/[~leftnoteasy] > new api to get all supported resources from RM > -- > > Key: YARN-6789 > URL: https://issues.apache.org/jira/browse/YARN-6789 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-6789-YARN-3926.001.patch > > > It will be better to provide an api to get all supported resource types from > RM. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6726) Fix issues with docker commands executed by container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114746#comment-16114746 ] Hadoop QA commented on YARN-6726: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | 
{color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 12s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6726 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880436/YARN-6726.003.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux dc1077390d7d 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 02bf328 | | Default Java | 1.8.0_131 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16712/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16712/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> Fix issues with docker commands executed by container-executor > -- > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6726.001.patch, YARN-6726.002.patch, > YARN-6726.003.patch > > > docker inspect, rm, stop, etc are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6934) ResourceUtils.checkMandatoryResources() should also ensure that no min or max is set for vcores or memory
[ https://issues.apache.org/jira/browse/YARN-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-6934. Resolution: Invalid > ResourceUtils.checkMandatoryResources() should also ensure that no min or max > is set for vcores or memory > - > > Key: YARN-6934 > URL: https://issues.apache.org/jira/browse/YARN-6934 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton > Labels: newbie++ > Attachments: YARN-6934.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6934) ResourceUtils.checkMandatoryResources() should also ensure that no min or max is set for vcores or memory
[ https://issues.apache.org/jira/browse/YARN-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114863#comment-16114863 ] Daniel Templeton commented on YARN-6934: Thanks for the patch, [~maniraj...@gmail.com]. I just took a closer look at the code, and it looks like I was wrong when I filed this JIRA. {{setMinimumAllocationForMandatoryResources()}} and {{setMaximumAllocationForMandatoryResources()}} explicitly allow for the min and max to be set on CPU and memory. I'm going to close it as invalid. I have added you as a contributor for the YARN project now, so you can assign JIRAs to yourself in the future. YARN-6933? > ResourceUtils.checkMandatoryResources() should also ensure that no min or max > is set for vcores or memory > - > > Key: YARN-6934 > URL: https://issues.apache.org/jira/browse/YARN-6934 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton > Labels: newbie++ > Attachments: YARN-6934.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient by caching resource usage
[ https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4090: --- Summary: Make Collections.sort() more efficient by caching resource usage (was: Make Collections.sort() more efficient by ) > Make Collections.sort() more efficient by caching resource usage > > > Key: YARN-4090 > URL: https://issues.apache.org/jira/browse/YARN-4090 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Xianyin Xin >Assignee: zhangshilong > Attachments: sampling1.jpg, sampling2.jpg, YARN-4090.001.patch, > YARN-4090.002.patch, YARN-4090.003.patch, YARN-4090.004.patch, > YARN-4090.005.patch, YARN-4090.006.patch, YARN-4090-preview.patch, > YARN-4090-TestResult.pdf > > > Collections.sort() consumes too much time in a scheduling round. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6726) Fix issues with docker commands executed by container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6726: -- Attachment: YARN-6726.003.patch > Fix issues with docker commands executed by container-executor > -- > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6726.001.patch, YARN-6726.002.patch, > YARN-6726.003.patch > > > docker inspect, rm, stop, etc are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6726) Fix issues with docker commands executed by container-executor
[ https://issues.apache.org/jira/browse/YARN-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114829#comment-16114829 ] Shane Kumpf commented on YARN-6726: --- Thanks for the review, [~sunilg]. I have attached a new patch that addresses your comments. {quote} Could you please to share why we need a prefix of UTILS here, is this a standard. #ifndef UTILS_STRING_UTILS_H {quote} I don't know if it's a standard, but I've seen the convention elsewhere, and it aligns with YARN-6852 that Wangda had called out above. It is the relative path to the file (utils/strings-utils.h). > Fix issues with docker commands executed by container-executor > -- > > Key: YARN-6726 > URL: https://issues.apache.org/jira/browse/YARN-6726 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6726.001.patch, YARN-6726.002.patch, > YARN-6726.003.patch > > > docker inspect, rm, stop, etc are issued through container-executor. Commands > other than docker run are not functioning properly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient by
[ https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4090: --- Summary: Make Collections.sort() more efficient by (was: Make Collections.sort() more efficient in FSParentQueue.java) > Make Collections.sort() more efficient by > -- > > Key: YARN-4090 > URL: https://issues.apache.org/jira/browse/YARN-4090 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Xianyin Xin >Assignee: zhangshilong > Attachments: sampling1.jpg, sampling2.jpg, YARN-4090.001.patch, > YARN-4090.002.patch, YARN-4090.003.patch, YARN-4090.004.patch, > YARN-4090.005.patch, YARN-4090.006.patch, YARN-4090-preview.patch, > YARN-4090-TestResult.pdf > > > Collections.sort() consumes too much time in a scheduling round. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6892) Improve API implementation in Resources and DominantResourceCalculator in align to ResourceInformation
[ https://issues.apache.org/jira/browse/YARN-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114721#comment-16114721 ] Sunil G commented on YARN-6892: --- YARN-6788 is committed. Submitting patch for jenkins. > Improve API implementation in Resources and DominantResourceCalculator in > align to ResourceInformation > -- > > Key: YARN-6892 > URL: https://issues.apache.org/jira/browse/YARN-6892 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-6892-YARN-3926.001.patch > > > In YARN-3926, APIs in Resources and DRC spend significant CPU cycles in most > of their methods. For better performance, it's better to improve the APIs, as > the resource types order is defined at the system level (the ResourceUtils class ensures > this post YARN-6788) > This work is preceding to YARN-6788 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6953) Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources()
Daniel Templeton created YARN-6953: -- Summary: Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources() Key: YARN-6953 URL: https://issues.apache.org/jira/browse/YARN-6953 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: YARN-3926 Reporter: Daniel Templeton Priority: Minor The {{setMinimumAllocationForMandatoryResources()}} and {{setMaximumAllocationForMandatoryResources()}} methods are quite convoluted. They'd be much simpler if they just handled CPU and memory manually instead of trying to be clever about doing it in a loop. There are also issues, such as the log warning always talking about memory or the last element of the inner array being a copy of the first element. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
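A hedged sketch of the suggested cleanup, with illustrative names (not the real ResourceUtils code): handling the two mandatory resources explicitly, rather than via a parallel-array loop, lets each warning name the resource that actually triggered it instead of always talking about memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of the suggested cleanup (names are illustrative, not the
// real ResourceUtils code): memory and vcores are handled explicitly, so the
// warning names the offending resource instead of always saying "memory".
public class MandatoryResourceMins {
  static final String MEMORY = "memory-mb";
  static final String VCORES = "vcores";

  static long sanitize(String resource, long configured) {
    if (configured < 0) {
      // The message names the resource that actually triggered it.
      System.err.println("Invalid minimum allocation for " + resource
          + ": " + configured + "; using 0");
      return 0;
    }
    return configured;
  }

  public static Map<String, Long> minimumAllocations(long memoryMb, long vcores) {
    Map<String, Long> mins = new LinkedHashMap<>();
    mins.put(MEMORY, sanitize(MEMORY, memoryMb));
    mins.put(VCORES, sanitize(VCORES, vcores));
    return mins;
  }
}
```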
[jira] [Updated] (YARN-6952) Enable scheduling monitor in FS
[ https://issues.apache.org/jira/browse/YARN-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6952: --- Attachment: YARN-6952.001.patch > Enable scheduling monitor in FS > --- > > Key: YARN-6952 > URL: https://issues.apache.org/jira/browse/YARN-6952 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, resourcemanager >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6952.001.patch > > > {{SchedulingEditPolicy#init}} doesn't need to take interface > {{PreemptableResourceScheduler}} as the scheduler input. A ResourceScheduler > is good enough. With that change, fair scheduler is able to use scheduling > monitor(e.g. invariant checks) as CS does. Further more, there is no need for > interface {{PreemptableResourceScheduler}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6934) ResourceUtils.checkMandatoryResources() should also ensure that no min or max is set for vcores or memory
[ https://issues.apache.org/jira/browse/YARN-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-6934: --- Attachment: YARN-6934.001.patch > ResourceUtils.checkMandatoryResources() should also ensure that no min or max > is set for vcores or memory > - > > Key: YARN-6934 > URL: https://issues.apache.org/jira/browse/YARN-6934 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton > Labels: newbie++ > Attachments: YARN-6934.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-65) Reduce RM app memory footprint once app has completed
[ https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114793#comment-16114793 ] Hadoop QA commented on YARN-65: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 35s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 
35s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 27s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 13 new + 137 unchanged - 1 fixed = 150 total (was 138) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 43m 30s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-65 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880287/YARN-65.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f4b5a072a392 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 02bf328 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16711/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16711/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16711/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Reduce RM app memory footprint once app has completed > - > > Key: YARN-65 > URL: https://issues.apache.org/jira/browse/YARN-65 >
[jira] [Created] (YARN-6952) Enable scheduling monitor in FS
Yufei Gu created YARN-6952: -- Summary: Enable scheduling monitor in FS Key: YARN-6952 URL: https://issues.apache.org/jira/browse/YARN-6952 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler, resourcemanager Reporter: Yufei Gu Assignee: Yufei Gu {{SchedulingEditPolicy#init}} doesn't need to take interface {{PreemptableResourceScheduler}} as the scheduler input. A ResourceScheduler is good enough. With that change, the fair scheduler is able to use the scheduling monitor (e.g. invariant checks) as CS does. Furthermore, there is no need for the interface {{PreemptableResourceScheduler}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
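The signature change proposed above can be illustrated with simplified stand-ins for the real interfaces (these are not the actual YARN types): `init` accepts the broad `ResourceScheduler`, so any scheduler — the fair scheduler included, which does not implement `PreemptableResourceScheduler` — can drive a scheduling monitor such as an invariant check.

```java
// Simplified stand-ins for the real interfaces (not the actual YARN types),
// illustrating the proposed change: init() takes the broad ResourceScheduler,
// so any scheduler can plug into a scheduling monitor.
public class MonitorSketch {
  interface ResourceScheduler {
    String getName();
  }

  interface SchedulingEditPolicy {
    void init(ResourceScheduler scheduler); // was: PreemptableResourceScheduler
    String editSchedule();
  }

  static class InvariantCheckPolicy implements SchedulingEditPolicy {
    private ResourceScheduler scheduler;

    @Override
    public void init(ResourceScheduler s) {
      this.scheduler = s;
    }

    @Override
    public String editSchedule() {
      // A real monitor would verify scheduler invariants here.
      return "checked " + scheduler.getName();
    }
  }
}
```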
[jira] [Commented] (YARN-6934) ResourceUtils.checkMandatoryResources() should also ensure that no min or max is set for vcores or memory
[ https://issues.apache.org/jira/browse/YARN-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114849#comment-16114849 ] Manikandan R commented on YARN-6934: Based on the discussion, attached a patch for review. It also contains the changes required for YARN-6933, as it is closely related. > ResourceUtils.checkMandatoryResources() should also ensure that no min or max > is set for vcores or memory > - > > Key: YARN-6934 > URL: https://issues.apache.org/jira/browse/YARN-6934 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton > Labels: newbie++ > Attachments: YARN-6934.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6954) Remove interface PreemptableResourceScheduler
Yufei Gu created YARN-6954: -- Summary: Remove interface PreemptableResourceScheduler Key: YARN-6954 URL: https://issues.apache.org/jira/browse/YARN-6954 Project: Hadoop YARN Issue Type: Improvement Components: capacity scheduler, resourcemanager Reporter: Yufei Gu Once YARN-6952 is done, the only place that references the interface PreemptableResourceScheduler is the Capacity Scheduler. We could remove PreemptableResourceScheduler for simplicity. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114935#comment-16114935 ] Suma Shivaprasad commented on YARN-3254: Updated the patch to display the exact root cause for the disk error/capacity exceeded cases. These error diagnostics were already available in DirectoryCollection earlier but were ignored and not surfaced in the health report. Along with the ratio of disks marked as unhealthy for local/log dirs, the reason why each of them was marked unhealthy will be surfaced in the health report. Sample errors below: {noformat} 1/1 local-dirs have errors: [ /invalidDir1 : Cannot create directory: /invalidDir1 ] 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /hadoop-3.0.0-beta1-SNAPSHOT/logs/userlogs : used space above threshold of 1.0% ] {noformat} > HealthReport should include disk full information > - > > Key: YARN-3254 > URL: https://issues.apache.org/jira/browse/YARN-3254 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Akira Ajisaka >Assignee: Suma Shivaprasad > Fix For: 3.0.0-beta1 > > Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot > 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch, > YARN-3254-003.patch, YARN-3254-004.patch > > > When a NodeManager's local disk gets almost full, the NodeManager sends a > health report to ResourceManager that "local/log dir is bad" and the message > is displayed on ResourceManager Web UI. It's difficult for users to detect > why the dir is bad. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-6955: --- Attachment: YARN-6955.v1.patch > Concurrent registerAM thread in Federation Interceptor > -- > > Key: YARN-6955 > URL: https://issues.apache.org/jira/browse/YARN-6955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6955.v1.patch > > > The timeout between the AM and AMRMProxy is shorter than the timeout + failover > between FederationInterceptor (AMRMProxy) and the RM. When the first register > thread in FI is blocked because of an RM failover, the AM can time out and resend > the register call, leading to two outstanding register calls inside FI. > Eventually, when the RM comes back up, one thread succeeds in registering and the other > thread gets an application-already-registered exception. FI should swallow the > exception and return success back to the AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6413) Decouple Yarn Registry API from ZK
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ellen Hui updated YARN-6413: Attachment: 0003-Registry-API-api-only.patch This patch only adds the API, records, types, and exceptions; the implementation has not been touched. It compiles on top of yarn-native-services f1a358e178e. [~jianhe], can you please take a look at the latest patch and let me know if this way of splitting the interface will work for you? > Decouple Yarn Registry API from ZK > -- > > Key: YARN-6413 > URL: https://issues.apache.org/jira/browse/YARN-6413 > Project: Hadoop YARN > Issue Type: Improvement > Components: amrmproxy, api, resourcemanager >Reporter: Ellen Hui >Assignee: Ellen Hui > Attachments: 0001-Registry-API-v2.patch, 0002-Registry-API-v2.patch, > 0003-Registry-API-api-only.patch > > > Right now the Yarn Registry API (defined in the RegistryOperations interface) > is a very thin layer over Zookeeper. This jira proposes changing the > interface to abstract away the implementation details so that we can write a > FS-based implementation of the registry service, which will be used to > support AMRMProxy HA. > The new interface will use register/delete/resolve APIs instead of > Zookeeper-specific operations like mknode. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
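The register/delete/resolve abstraction proposed above can be sketched as a small backend-neutral interface with an in-memory stand-in for the eventual FS-based implementation. This is an illustrative sketch only: the names RegistryClient and InMemoryRegistry, and the string-valued records, are assumptions for the example, not the actual YARN-6413 API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backend-neutral registry surface: callers see only
// register/resolve/delete, never ZK-specific operations like mknode.
interface RegistryClient {
    void register(String path, String record);   // create or update a record
    String resolve(String path);                 // look up a record; null if absent
    void delete(String path);                    // remove a record
}

// An in-memory implementation, standing in for a ZK- or FS-backed one.
// Swapping the backend requires no change to callers of RegistryClient.
class InMemoryRegistry implements RegistryClient {
    private final Map<String, String> records = new HashMap<>();

    @Override
    public void register(String path, String record) {
        records.put(path, record);
    }

    @Override
    public String resolve(String path) {
        return records.get(path);
    }

    @Override
    public void delete(String path) {
        records.remove(path);
    }
}
```

A ZK-backed or FS-backed implementation would then differ only in how records are persisted, which is the point of decoupling the API from ZK.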
[jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115205#comment-16115205 ] YunFan Zhou commented on YARN-6361: --- [~yufeigu] Thank you, Yufei. Sorry, I'm off the subject. But either way, improving the efficiency of *fetchAppsWithDemand* is something that must be done. I have thought about the optimal method for two days and tested my thoughts today. *Thank you very much for your confidence in me.* > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and > sorting by the precomputed fixed values inside it instead {{(O(n*log\(n\)))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
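The optimization described in the issue (move the per-app calculation out of the comparator so it runs O(n) times instead of O(n log n) times) can be illustrated as below. AppDemand, computeKey, and the key formula are hypothetical stand-ins for FairScheduler's real FSAppAttempt and FairShareComparator logic, not the actual code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: precompute each app's expensive sort key once,
// then sort by the cached primitive value.
class AppDemand {
    final String appId;
    final long usage;
    final long weight;
    double cachedKey;       // precomputed sort key, filled in before sorting

    AppDemand(String appId, long usage, long weight) {
        this.appId = appId;
        this.usage = usage;
        this.weight = weight;
    }
}

class DemandSorter {
    // Stand-in for the expensive per-app computation the comparator would
    // otherwise redo on every single comparison.
    static double computeKey(AppDemand a) {
        return (double) a.usage / Math.max(1, a.weight);
    }

    static List<AppDemand> fetchAppsWithDemand(List<AppDemand> apps) {
        List<AppDemand> sorted = new ArrayList<>(apps);
        // O(n): compute each key exactly once, outside the sort loop.
        for (AppDemand a : sorted) {
            a.cachedKey = computeKey(a);
        }
        // O(n log n) comparisons, but each one just reads a cached double.
        sorted.sort(Comparator.comparingDouble(a -> a.cachedKey));
        return sorted;
    }
}
```

With 10k apps per queue, the saving comes from computeKey running 10k times rather than once per comparison inside the sort.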
[jira] [Comment Edited] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115205#comment-16115205 ] YunFan Zhou edited comment on YARN-6361 at 8/5/17 1:49 AM: --- [~yufeigu] Thank Yufei. Sorry, I'm off the subject. But either way, the efficiency of raising the *fetchAppsWithDemand* is something that must be done. I have thought about the optimal method for two days and tested my thoughts today. Thank you very much for your confidence in me. was (Author: daemon): [~yufeigu] Thank Yufei. Sorry, I'm off the subject. But either way, the efficiency of raising the *fetchAppsWithDemand *is something that must be done. I have thought about the optimal method for two days and tested my thoughts today. Thank you very much for your confidence in me. > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and > sorting by the precomputed fixed values inside it instead {{(O(n*log\(n\)))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5938) Refactoring OpportunisticContainerAllocator to use SchedulerRequestKey instead of Priority and other misc fixes
[ https://issues.apache.org/jira/browse/YARN-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5938: -- Fix Version/s: 2.9.0 > Refactoring OpportunisticContainerAllocator to use SchedulerRequestKey > instead of Priority and other misc fixes > --- > > Key: YARN-5938 > URL: https://issues.apache.org/jira/browse/YARN-5938 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5938.001.patch, YARN-5938.002.patch, > YARN-5938.003.patch, YARN-5938-YARN-5085.001.patch, > YARN-5938-YARN-5085.002.patch, YARN-5938-YARN-5085.003.patch, > YARN-5938-YARN-5085.004.patch > > > Minor code re-organization to do the following: > # The OpportunisticContainerAllocatorAMService currently allocates outside > the ApplicationAttempt lock maintained by the ApplicationMasterService. This > should happen inside the lock. > # Refactored out some code to simplify the allocate() method. > # Removed some unused fields inside the OpportunisticContainerAllocator. > # Re-organized some of the code in the > OpportunisticContainerAllocatorAMService::allocate method to make it a bit > more readable. > # Moved SchedulerRequestKey to a new package, so it can be used by the > OpportunisticContainerAllocator/Context. > # Moved all usages of Priority in the OpportunisticContainerAllocator -> > SchedulerRequestKey. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6952) Enable scheduling monitor in FS
[ https://issues.apache.org/jira/browse/YARN-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115156#comment-16115156 ] Hadoop QA commented on YARN-6952: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 
35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 65 unchanged - 1 fixed = 66 total (was 66) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 43m 3s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 67m 49s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6952 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880453/YARN-6952.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 78d21be2ade1 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f44b349 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16717/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/16717/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16717/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Updated] (YARN-6802) Add Max AM Resource and AM Resource Usage to Leaf Queue View in FairScheduler WebUI
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6802: --- Attachment: YARN-6802.branch-2.001.patch > Add Max AM Resource and AM Resource Usage to Leaf Queue View in FairScheduler > WebUI > --- > > Key: YARN-6802 > URL: https://issues.apache.org/jira/browse/YARN-6802 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: YunFan Zhou >Assignee: YunFan Zhou > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > YARN-6802.001.patch, YARN-6802.002.patch, YARN-6802.003.patch, > YARN-6802.branch-2.001.patch > > > The RM web UI should support viewing leaf queue AM resource usage. > !screenshot-2.png! > I will upload my patch later. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-65) Reduce RM app memory footprint once app has completed
[ https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115160#comment-16115160 ] Hadoop QA commented on YARN-65: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 
43s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 31s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 13 new + 137 unchanged - 1 fixed = 150 total (was 138) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 47m 7s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-65 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880287/YARN-65.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux c2b2bfa9805e 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f44b349 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16718/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16718/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16718/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Reduce RM app memory footprint once app has completed > - > > Key: YARN-65 > URL: https://issues.apache.org/jira/browse/YARN-65 >
[jira] [Commented] (YARN-6945) Display the ACL of leaf queue in RM scheduler page
[ https://issues.apache.org/jira/browse/YARN-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115185#comment-16115185 ] YunFan Zhou commented on YARN-6945: --- [~Naganarasimha] Hi, Naganarasimha G R. Any suggestions? > Display the ACL of leaf queue in RM scheduler page > -- > > Key: YARN-6945 > URL: https://issues.apache.org/jira/browse/YARN-6945 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: YunFan Zhou >Assignee: YunFan Zhou > Attachments: screenshot-1.png, YARN-6945.001.patch > > > {code:java} > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1501748492123_0298 to YARN : User yarn cannot submit > applications to queue root.jack > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1785) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > {code} > Sometimes, when we submit our application to a queue, the submission may fail > because we have no permission to submit to the corresponding queue. > But there is no place apart from fair-scheduler.xml to see the ACL of the queue; > you can't see such information on the RM scheduler page. > So, I want to add the *aclSubmitApps* and *aclAdministerApps* information to the RM > scheduler page. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6952) Enable scheduling monitor in FS
[ https://issues.apache.org/jira/browse/YARN-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115162#comment-16115162 ] Yufei Gu commented on YARN-6952: The test failure is unrelated. > Enable scheduling monitor in FS > --- > > Key: YARN-6952 > URL: https://issues.apache.org/jira/browse/YARN-6952 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, resourcemanager >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6952.001.patch > > > {{SchedulingEditPolicy#init}} doesn't need to take interface > {{PreemptableResourceScheduler}} as the scheduler input. A ResourceScheduler > is good enough. With that change, fair scheduler is able to use scheduling > monitor(e.g. invariant checks) as CS does. Further more, there is no need for > interface {{PreemptableResourceScheduler}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6802) Add Max AM Resource and AM Resource Usage to Leaf Queue View in FairScheduler WebUI
[ https://issues.apache.org/jira/browse/YARN-6802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115161#comment-16115161 ] Yufei Gu commented on YARN-6802: Uploaded the branch-2 patch and committed it to branch-2. > Add Max AM Resource and AM Resource Usage to Leaf Queue View in FairScheduler > WebUI > --- > > Key: YARN-6802 > URL: https://issues.apache.org/jira/browse/YARN-6802 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.2 >Reporter: YunFan Zhou >Assignee: YunFan Zhou > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > YARN-6802.001.patch, YARN-6802.002.patch, YARN-6802.003.patch, > YARN-6802.branch-2.001.patch > > > The RM web UI should support viewing leaf queue AM resource usage. > !screenshot-2.png! > I will upload my patch later. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115167#comment-16115167 ] Subru Krishnan commented on YARN-6955: -- Thanks [~botong] for surfacing this issue. The patch looks mostly good (pending Yetus warnings fix) except that we should save the registration request only if _this.amRegistrationRequest == null_. > Concurrent registerAM thread in Federation Interceptor > -- > > Key: YARN-6955 > URL: https://issues.apache.org/jira/browse/YARN-6955 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6955.v1.patch > > > The timeout between the AM and AMRMProxy is shorter than the timeout + failover > between FederationInterceptor (AMRMProxy) and the RM. When the first register > thread in FI is blocked because of an RM failover, the AM can time out and resend > the register call, leading to two outstanding register calls inside FI. > Eventually, when the RM comes back up, one thread succeeds in registering and the other > thread gets an application-already-registered exception. FI should swallow the > exception and return success back to the AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
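The behavior discussed in this review — save the registration request only on the first call, and swallow the already-registered error so a retrying AM still gets a success response — might look roughly like the following. This is a simplified, single-threaded illustration; the class name, the string request/response types, and the IllegalStateException standing in for the real already-registered exception are all assumptions for the sketch, not FederationInterceptor code.

```java
// Hypothetical sketch of an idempotent register path in an interceptor.
class RegisterOnceInterceptor {
    private String amRegistrationRequest;   // saved only on the first call
    private String cachedResponse;          // last successful response
    private int upstreamCalls;              // counts calls that reached the fake RM

    synchronized String registerApplicationMaster(String request) {
        // Save the registration request only if none was saved yet
        // (the point raised in the review comment above).
        if (amRegistrationRequest == null) {
            amRegistrationRequest = request;
        }
        try {
            cachedResponse = callResourceManager(amRegistrationRequest);
        } catch (IllegalStateException alreadyRegistered) {
            // The RM reports the app as already registered, meaning an
            // earlier register call succeeded: swallow the exception and
            // return the cached success instead of failing the AM.
            // (Simplified: assumes a success was cached before the retry.)
        }
        return cachedResponse;
    }

    // Fake RM: succeeds on the first register, then throws an
    // "already registered" error on any subsequent attempt.
    private String callResourceManager(String request) {
        upstreamCalls++;
        if (upstreamCalls > 1) {
            throw new IllegalStateException("application already registered");
        }
        return "REGISTERED";
    }

    int getUpstreamCalls() { return upstreamCalls; }
}
```

With this shape, a timed-out AM that resends register gets the same success response as the original call, instead of seeing the RM's already-registered failure.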
[jira] [Commented] (YARN-6033) Add support for sections in container-executor configuration file
[ https://issues.apache.org/jira/browse/YARN-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115181#comment-16115181 ] Miklos Szegedi commented on YARN-6033: -- Sorry, I think I found one more in the latest patch. {code}
// free an entry set of values
void free_values(char** values) {
  if (*values != NULL) {
    free(*values);
  }
  if (values != NULL) {
    free(values);
  }
}
{code} If I understand correctly, this does not free all values, just the first value. This is expected if the items come from strtok, so this would definitely deserve a comment. Moreover, if strtok finds a delimiter on the first character, the first value points inside the string, so free will crash and leak the memory. > Add support for sections in container-executor configuration file > - > > Key: YARN-6033 > URL: https://issues.apache.org/jira/browse/YARN-6033 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-6033.003.patch, YARN-6033.004.patch, > YARN-6033.005.patch, YARN-6033.006.patch, YARN-6033.007.patch, > YARN-6033.008.patch, YARN-6033.009.patch, YARN-6033-YARN-5673.001.patch, > YARN-6033-YARN-5673.002.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YunFan Zhou updated YARN-6361: -- Priority: Major (was: Minor) > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and > sorting by the precomputed fixed values inside it instead {{(O(n*log\(n\)))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115205#comment-16115205 ] YunFan Zhou edited comment on YARN-6361 at 8/5/17 1:49 AM: --- [~yufeigu] Thank Yufei. Sorry, I'm off the subject. But either way, the efficiency of raising the *fetchAppsWithDemand *is something that must be done. I have thought about the optimal method for two days and tested my thoughts today. Thank you very much for your confidence in me. was (Author: daemon): [~yufeigu] Thank Yufei. Sorry, I'm off the subject. But either way, the efficiency of raising the *fetchAppsWithDemand *is something that must be done. I have thought about the optimal method for two days and tested my thoughts today. *Thank you very much for your confidence in me.* > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop {{(O\(n\))}} and > sorting by the precomputed fixed values inside it instead {{(O(n*log\(n\)))}}. This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6949) Invalid event: LOCALIZATION_FAILED at LOCALIZED
[ https://issues.apache.org/jira/browse/YARN-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115220#comment-16115220 ] lujie commented on YARN-6949: - I checked the log and also found some NullPointerExceptions: {code:java} java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:505) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.getPathForLocalization(ResourceLocalizationService.java:1131) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.processHeartbeat(ResourceLocalizationService.java:1093) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:720) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:355) at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:48) {code} > Invalid event: LOCALIZATION_FAILED at LOCALIZED > --- > > Key: YARN-6949 > URL: https://issues.apache.org/jira/browse/YARN-6949 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.8.0 >Reporter: lujie > > While a job was running, I stopped a NodeManager on one machine. Then I checked > the logs to see the running state, and I found many > InvalidStateTransitionExceptions: > {code:java} > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > LOCALIZATION_FAILED at LOCALIZED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > 
at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource.handle(LocalizedResource.java:198) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.handle(LocalResourcesTrackerImpl.java:194) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.handle(LocalResourcesTrackerImpl.java:58) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.processHeartbeat(ResourceLocalizationService.java:1058) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:720) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:355) > at > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:48) > at > org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:63) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5966) AMRMClient changes to support ExecutionType update
[ https://issues.apache.org/jira/browse/YARN-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5966: -- Fix Version/s: 2.9.0 > AMRMClient changes to support ExecutionType update > -- > > Key: YARN-5966 > URL: https://issues.apache.org/jira/browse/YARN-5966 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-5966.001.patch, YARN-5966.002.patch, > YARN-5966.003.patch, YARN-5966.004.patch, YARN-5966.005.patch, > YARN-5966.006.patch, YARN-5966.007.patch, YARN-5966.008.patch, > YARN-5966.008.patch, YARN-5966.wip.001.patch > > > {{AMRMClient}} changes to support change of container ExecutionType
[jira] [Commented] (YARN-5966) AMRMClient changes to support ExecutionType update
[ https://issues.apache.org/jira/browse/YARN-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115085#comment-16115085 ] Arun Suresh commented on YARN-5966: --- Committed this to branch-2 > AMRMClient changes to support ExecutionType update > -- > > Key: YARN-5966 > URL: https://issues.apache.org/jira/browse/YARN-5966 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-5966.001.patch, YARN-5966.002.patch, > YARN-5966.003.patch, YARN-5966.004.patch, YARN-5966.005.patch, > YARN-5966.006.patch, YARN-5966.007.patch, YARN-5966.008.patch, > YARN-5966.008.patch, YARN-5966.wip.001.patch > > > {{AMRMClient}} changes to support change of container ExecutionType
[jira] [Commented] (YARN-6811) [ATS1.5] All history logs should be kept under its own User Directory.
[ https://issues.apache.org/jira/browse/YARN-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115124#comment-16115124 ] Junping Du commented on YARN-6811: -- I have committed the patch to trunk. For branch-2, my cherry-pick hit several conflicts, and the build still fails even after I fixed them. [~rohithsharma], can you upload a patch for branch-2? > [ATS1.5] All history logs should be kept under its own User Directory. > --- > > Key: YARN-6811 > URL: https://issues.apache.org/jira/browse/YARN-6811 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineclient, timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-6811.01.patch, YARN-6811.02.patch > > > ATS1.5 allows storing history data in underlying FileSystem folder paths, i.e. > */active-dir* and */done-dir*. These base directories are protected from > unauthorized access to other users' data by setting the sticky bit on > /active-dir. > But object store filesystems such as WASB do not have user access control > on folders and files. When WASB is used as the underlying file system for > ATS1.5, the history data stored in the FS is accessible to all users. > *This would be a security risk.* > I would propose to keep history data under its own user directory, i.e. > */active-dir/$USER*. Even though this does not solve basic user access control > at the FS level, it provides the capability to plug in Apache Ranger policies > for each user's folder. One thing to note is that setting policies on each user > folder is an admin responsibility. But grouping all of one user's history data > under a single folder allows setting policies so that user access control is > achieved.
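The proposal above is essentially a path-layout change: instead of writing every application's history directly under the shared active dir, each user gets a subtree that a policy engine can target. A small sketch of the two layouts — names and method structure are illustrative, not the actual {{FileSystemTimelineWriter}} code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrates the layout change proposed in YARN-6811.
class TimelineDirLayout {
    // Current layout: every application's data sits directly under the shared
    // active dir, so FS-level isolation depends entirely on the sticky bit.
    static Path sharedLayout(String activeDir, String appId) {
        return Paths.get(activeDir, appId);
    }

    // Proposed layout: one subtree per user. A policy engine such as Apache Ranger
    // can then attach a per-user read policy to each /active-dir/$USER subtree.
    static Path perUserLayout(String activeDir, String user, String appId) {
        return Paths.get(activeDir, user, appId);
    }

    public static void main(String[] args) {
        System.out.println(sharedLayout("/active-dir", "application_1501234567890_0001"));
        System.out.println(perUserLayout("/active-dir", "alice", "application_1501234567890_0001"));
    }
}
```

As the comment notes, the grouping by itself enforces nothing on an object store like WASB; its value is that a per-user policy now has a single subtree to attach to.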
[jira] [Updated] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6777: -- Fix Version/s: 2.9.0 > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch, > YARN-6777.006.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors.
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115127#comment-16115127 ] Arun Suresh commented on YARN-6777: --- Committed to branch-2 > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch, > YARN-6777.006.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors.
[jira] [Updated] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6852: - Attachment: (was: YARN-6033.009.patch) > [YARN-6223] Native code changes to support isolate GPU devices by using > CGroups > --- > > Key: YARN-6852 > URL: https://issues.apache.org/jira/browse/YARN-6852 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6852.001.patch, YARN-6852.002.patch, > YARN-6852.003.patch > > > This JIRA plans to add support for: > 1) Isolation in CGroups (native side).
[jira] [Updated] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6852: - Attachment: YARN-6033.009.patch > [YARN-6223] Native code changes to support isolate GPU devices by using > CGroups > --- > > Key: YARN-6852 > URL: https://issues.apache.org/jira/browse/YARN-6852 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6852.001.patch, YARN-6852.002.patch, > YARN-6852.003.patch > > > This JIRA plans to add support for: > 1) Isolation in CGroups (native side).
[jira] [Commented] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115055#comment-16115055 ] Wangda Tan commented on YARN-6852: -- Hi Miklos, I really appreciate your thorough reviews, very helpful! I addressed most of your comments. A few items I haven't addressed in the updated patch: bq. Why do you have cgroup_cfg_section? You could eliminate it and get it all the time or just cache cgroups_root. I still prefer to keep it, since it can help us get more configs without changing the major code structure. bq. int input_argv_idx = 0; the first argument is the process name. Actually, argc and argv are modified in main.c before being passed to the modules; I removed the process name already: {code} +return handle_gpu_request(_cgroups_parameters, "gpu", argc - 1, + [1]); {code} Please let me know if you have any suggestions on the approach. bq. opts->keys = malloc(sizeof(char*) * (argc + 1)); Why argc+1 and not argc-1? Updated to argc. bq. required and has_values could be implemented as a bit array instead of a byte array. Another option ... Since container-executor is not a memory-intensive application, I would prefer to spend time on changing it when it is necessary or there are any safety concerns. :) bq. This pattern is C+0x. I think Varun mentioned this in YARN-6033; it is C99: https://stackoverflow.com/a/330867 bq. arr[idx] = n; There is no overflow check. This could also be exploitable. This might not be an issue, since we have already checked the input string once: {code} for (int i = 0; i < strlen(input); i++) { if (input[i] == ',') { n_numbers++; } } {code} bq. container_1 is an invalid container id in the unit tests. They will fail. Did you mean we should not fail the check? "container_1" is actually an invalid id in YARN. bq. There is no indentation after namespace ContainerExecutor I would prefer not to add extra indentation for namespaces. 
There are some discussions on SO: https://stackoverflow.com/questions/713698/c-namespaces-advice bq. static std::vector cgroups_parameters_invoked; I think you should consider std::string here. No need to malloc later bq. You do not clean up files in the unit tests, do you? Is there a reason? (TODO) Will include unit-test-related changes and cleanups in the next patch. Updated ver.003 patch. > [YARN-6223] Native code changes to support isolate GPU devices by using > CGroups > --- > > Key: YARN-6852 > URL: https://issues.apache.org/jira/browse/YARN-6852 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6852.001.patch, YARN-6852.002.patch, > YARN-6852.003.patch > > > This JIRA plans to add support for: > 1) Isolation in CGroups (native side).
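The overflow argument in the comment above — size the output array from a first pass that counts separators, then fill it in a second pass over the same string, so the fill index cannot exceed the allocation — can be sketched as follows. The real code under review is C in container-executor; this is a Java rendering for brevity, and the class and method names are illustrative only:

```java
import java.util.Arrays;

// Two-pass parse of a comma-separated number list, as discussed for
// container-executor: the array is sized by counting commas first, so the
// fill loop stays in bounds as long as both passes see the same string.
class CommaList {
    static int[] parse(String input) {
        int n = 1;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == ',') {
                n++;                      // counting pass: one slot per comma, plus one
            }
        }
        int[] arr = new int[n];           // allocation sized by the counting pass
        int idx = 0;
        for (String tok : input.split(",", -1)) {
            arr[idx++] = Integer.parseInt(tok.trim()); // fill pass; idx < n by construction
        }
        return arr;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("0,1,3"))); // [0, 1, 3]
    }
}
```

The safety property holds only because the same immutable string feeds both passes; in the C version, anything that mutates the buffer between the count and the fill would reopen the overflow question.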
[jira] [Updated] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6852: - Attachment: YARN-6852.003.patch > [YARN-6223] Native code changes to support isolate GPU devices by using > CGroups > --- > > Key: YARN-6852 > URL: https://issues.apache.org/jira/browse/YARN-6852 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6852.001.patch, YARN-6852.002.patch, > YARN-6852.003.patch > > > This JIRA plans to add support for: > 1) Isolation in CGroups (native side).
[jira] [Comment Edited] (YARN-6852) [YARN-6223] Native code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115055#comment-16115055 ] Wangda Tan edited comment on YARN-6852 at 8/4/17 10:46 PM: --- Hi Miklos, I really appreciate your thorough reviews, very helpful! I addressed most of your comments. A few items I haven't addressed in the updated patch: bq. Why do you have cgroup_cfg_section? You could eliminate it and get it all the time or just cache cgroups_root. I still prefer to keep it, since it can help us get more configs without changing the major code structure. bq. int input_argv_idx = 0; the first argument is the process name. Actually, argc and argv are modified in main.c before being passed to the modules; I removed the process name already: {code} +return handle_gpu_request(_cgroups_parameters, "gpu", argc - 1, + [1]); {code} Please let me know if you have any suggestions on the approach. bq. opts->keys = malloc(sizeof(char*) * (argc + 1)); Why argc+1 and not argc-1? Updated to argc. bq. required and has_values could be implemented as a bit array instead of a byte array. Another option ... Since container-executor is not a memory-intensive application, I would prefer to spend time on changing it when it is necessary or there are any safety concerns. :) bq. This pattern is C+0x. I think Varun mentioned this in YARN-6033; it is C99: https://stackoverflow.com/a/330867 bq. arr[idx] = n; There is no overflow check. This could also be exploitable. This might not be an issue, since we have already checked the input string once: {code} for (int i = 0; i < strlen(input); i++) { if (input[i] == ',') { n_numbers++; } } {code} bq. container_1 is an invalid container id in the unit tests. They will fail. Did you mean we should not fail the check? "container_1" is actually an invalid id in YARN. bq. There is no indentation after namespace ContainerExecutor I would prefer not to add extra indentation for namespaces. There are some discussions on SO: https://stackoverflow.com/questions/713698/c-namespaces-advice bq. static std::vector cgroups_parameters_invoked; I think you should consider std::string here. No need to malloc later bq. You do not clean up files in the unit tests, do you? Is there a reason? (TODO) Will include unit-test-related changes and cleanups in the next patch. Updated ver.003 patch. [~miklos.szeg...@cloudera.com], mind checking again? > [YARN-6223] Native code changes to support isolate GPU devices by using > CGroups > --- > > Key: YARN-6852 > URL:
[jira] [Commented] (YARN-6811) [ATS1.5] All history logs should be kept under its own User Directory.
[ https://issues.apache.org/jira/browse/YARN-6811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115099#comment-16115099 ] Hudson commented on YARN-6811: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12122 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/12122/]) YARN-6811. [ATS1.5] All history logs should be kept under its own User (junping_du: rev f44b349b813508f0f6d99ca10bddba683dedf6c4) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClientForATS1_5.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/test/java/org/apache/hadoop/yarn/server/timeline/TestEntityGroupFSTimelineStore.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml > [ATS1.5] All history logs should be kept under its own User Directory. > --- > > Key: YARN-6811 > URL: https://issues.apache.org/jira/browse/YARN-6811 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineclient, timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-6811.01.patch, YARN-6811.02.patch > > > ATS1.5 allows storing history data in underlying FileSystem folder paths, i.e. > */active-dir* and */done-dir*. These base directories are protected from > unauthorized access to other users' data by setting the sticky bit on > /active-dir. 
> But object store filesystems such as WASB do not have user access control > on folders and files. When WASB is used as the underlying file system for > ATS1.5, the history data stored in the FS is accessible to all users. > *This would be a security risk.* > I would propose to keep history data under its own user directory, i.e. > */active-dir/$USER*. Even though this does not solve basic user access control > at the FS level, it provides the capability to plug in Apache Ranger policies > for each user's folder. One thing to note is that setting policies on each user > folder is an admin responsibility. But grouping all of one user's history data > under a single folder allows setting policies so that user access control is > achieved.
[jira] [Commented] (YARN-6033) Add support for sections in container-executor configuration file
[ https://issues.apache.org/jira/browse/YARN-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115111#comment-16115111 ] Hadoop QA commented on YARN-6033: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 48s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 44s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 
44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 29s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 38s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6033 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880477/YARN-6033.009.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml cc | | uname | Linux f21460b5090a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f44b349 | | Default Java | 1.8.0_131 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16716/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16716/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add support for sections in container-executor configuration file > - > > Key: YARN-6033 > URL: https://issues.apache.org/jira/browse/YARN-6033 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-6033.003.patch, YARN-6033.004.patch, > YARN-6033.005.patch, YARN-6033.006.patch, YARN-6033.007.patch, > YARN-6033.008.patch, YARN-6033.009.patch, YARN-6033-YARN-5673.001.patch, > YARN-6033-YARN-5673.002.patch > >
[jira] [Commented] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115112#comment-16115112 ] Hadoop QA commented on YARN-3254: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 52s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 42s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 14 new + 35 unchanged - 0 fixed = 49 total (was 35) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 13s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 34m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-3254 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880458/YARN-3254-005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 6023976564c6 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f44b349 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/16714/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16714/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16714/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16714/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT
[jira] [Commented] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class
[ https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115117#comment-16115117 ] Arun Suresh commented on YARN-6776: --- Committed this to branch-2 > Refactor ApplicaitonMasterService to move actual processing logic to a > separate class > - > > Key: YARN-6776 > URL: https://issues.apache.org/jira/browse/YARN-6776 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-6776.001.patch, YARN-6776.002.patch, > YARN-6776.003.patch, YARN-6776.004.patch > > > Minor refactoring to move the processing logic of the > {{ApplicationMasterService}} into a separate class. > The per appattempt locking as well as the extraction of the appAttemptId etc. > will remain in the ApplicationMasterService
[jira] [Updated] (YARN-6776) Refactor ApplicaitonMasterService to move actual processing logic to a separate class
[ https://issues.apache.org/jira/browse/YARN-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6776: -- Fix Version/s: 2.9.0 > Refactor ApplicaitonMasterService to move actual processing logic to a > separate class > - > > Key: YARN-6776 > URL: https://issues.apache.org/jira/browse/YARN-6776 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-6776.001.patch, YARN-6776.002.patch, > YARN-6776.003.patch, YARN-6776.004.patch > > > Minor refactoring to move the processing logic of the > {{ApplicationMasterService}} into a separate class. > The per appattempt locking as well as the extraction of the appAttemptId etc. > will remain in the ApplicationMasterService
[jira] [Commented] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115131#comment-16115131 ] Hadoop QA commented on YARN-6955: - -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 50s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 13m 24s | trunk passed |
| +1 | compile | 1m 46s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 0m 49s | trunk passed |
| -1 | findbugs | 0m 39s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings. |
| +1 | javadoc | 0m 30s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 1m 38s | the patch passed |
| +1 | javac | 1m 38s | the patch passed |
| -0 | checkstyle | 0m 33s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) |
| +1 | mvnsite | 0m 48s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 0m 47s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) |
| +1 | javadoc | 0m 34s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 12s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 13m 31s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 47m 17s | |
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| | Inconsistent synchronization of org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.amRegistrationRequest; locked 50% of time. Unsynchronized access at FederationInterceptor.java:[line 305] |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6955 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12880476/YARN-6955.v1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 11c3c52aa14f 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
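The new FindBugs warning above ("inconsistent synchronization ... locked 50% of time") means a shared field is written under a lock but read without one. A minimal illustration of the standard fix — routing every access through the same lock — with a hypothetical holder class standing in for FederationInterceptor:

```java
// Illustrative fix for an "inconsistent synchronization" FindBugs warning:
// every read and write of the shared field goes through the same monitor.
// (Making the field volatile is the other common remedy for simple get/set.)
public class RegistrationHolder {
  private Object amRegistrationRequest; // guarded by "this"

  public synchronized void setRequest(Object request) {
    amRegistrationRequest = request;
  }

  // An unsynchronized getter here is exactly what FindBugs flags.
  public synchronized Object getRequest() {
    return amRegistrationRequest;
  }
}
```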
[jira] [Commented] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115142#comment-16115142 ] Suma Shivaprasad commented on YARN-3254: [~sunilg] Can you please review the updated patch? > HealthReport should include disk full information > - > > Key: YARN-3254 > URL: https://issues.apache.org/jira/browse/YARN-3254 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Akira Ajisaka >Assignee: Suma Shivaprasad > Fix For: 3.0.0-beta1 > > Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot > 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch, > YARN-3254-003.patch, YARN-3254-004.patch, YARN-3254-005.patch > > > When a NodeManager's local disk gets almost full, the NodeManager sends a > health report to the ResourceManager saying "local/log dir is bad", and the message > is displayed on the ResourceManager Web UI. It's difficult for users to work out > why the dir is bad. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
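The kind of detail the health report could carry is easy to compute from the JDK alone. A minimal sketch (not the patch's code; the method name and message format are illustrative) that builds a per-directory free-space line:

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DiskUsageReport {
  // Builds a human-readable usage line for one local/log dir — the sort of
  // information that could accompany "local/log dir is bad" in the report.
  public static String usageLine(String dir) {
    try {
      FileStore store = Files.getFileStore(Paths.get(dir));
      long total = store.getTotalSpace();
      long free = store.getUsableSpace();
      long pctFree = total == 0 ? 0 : (free * 100) / total;
      return dir + ": " + pctFree + "% free";
    } catch (IOException e) {
      return dir + ": usage unavailable";
    }
  }
}
```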
[jira] [Commented] (YARN-6634) [API] Refactor ResourceManager WebServices to make API explicit
[ https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115150#comment-16115150 ] Carlo Curino commented on YARN-6634: Thanks [~giovanni.fumarola] for the branch-2 version. The patch looks good to me; I committed this to branch-2 and I am closing this JIRA. > [API] Refactor ResourceManager WebServices to make API explicit > --- > > Key: YARN-6634 > URL: https://issues.apache.org/jira/browse/YARN-6634 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Subru Krishnan >Assignee: Giovanni Matteo Fumarola >Priority: Critical > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-6634-branch-2.v1.patch, YARN-6634.proto.patch, > YARN-6634.v1.patch, YARN-6634.v2.patch, YARN-6634.v3.patch, > YARN-6634.v4.patch, YARN-6634.v5.patch, YARN-6634.v6.patch, > YARN-6634.v7.patch, YARN-6634.v8.patch, YARN-6634.v9.patch > > > The RM exposes a few REST endpoints, but there is no clear API interface defined. > This makes it painful to build either clients or extension components like > Router (YARN-5412) that expose REST interfaces themselves. This jira proposes > adding an RM WebServices protocol similar to the one we have for RPC, i.e. > {{ApplicationClientProtocol}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
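The "explicit API" idea is to capture the REST surface as a Java interface, the way {{ApplicationClientProtocol}} does for RPC, so clients and the Router can program against one contract. A hypothetical sketch — the interface name, methods, and return types below are illustrative, not the committed API:

```java
// Hypothetical shape of an explicit RM WebServices protocol: both the real
// web service and a Router-style proxy would implement the same interface.
public interface RMWebServiceProtocolSketch {
  String getClusterInfo();
  String getApps(String user);
}

// Trivial in-memory implementation, only to show the contract in use.
class StubRMWebService implements RMWebServiceProtocolSketch {
  public String getClusterInfo() {
    return "{\"state\":\"STARTED\"}";
  }

  public String getApps(String user) {
    return "[]"; // no apps for this user in the stub
  }
}
```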
[jira] [Updated] (YARN-6955) Concurrent registerAM thread in Federation Interceptor
[ https://issues.apache.org/jira/browse/YARN-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-6955: - Issue Type: Sub-task (was: Bug) Parent: YARN-5597 > Concurrent registerAM thread in Federation Interceptor > -- > > Key: YARN-6955 > URL: https://issues.apache.org/jira/browse/YARN-6955 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-6955.v1.patch > > > The timeout between the AM and AMRMProxy is shorter than the timeout plus failover > time between FederationInterceptor (AMRMProxy) and the RM. When the first register > thread in FI is blocked because of an RM failover, the AM can time out and resend > the register call, leading to two outstanding register calls inside FI. > Eventually, when the RM comes back up, one thread's register succeeds and the other > thread gets an "application already registered" exception. FI should swallow the > exception and return success back to the AM in both threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
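The intended behavior can be sketched as follows; the class, exception type, and response format are hypothetical stand-ins for the FederationInterceptor internals, only the swallow-and-return-success shape comes from the description:

```java
// Sketch: when a second concurrent register finds the app already
// registered, swallow the exception and return the cached successful
// response instead of propagating the failure to the AM.
public class RegisterSketch {
  private volatile String cachedResponse;
  private boolean registered;

  public String registerApplicationMaster(String request) {
    try {
      cachedResponse = doRegister(request); // may throw if already registered
    } catch (IllegalStateException alreadyRegistered) {
      // The other thread's register succeeded; report success here too.
    }
    return cachedResponse;
  }

  private synchronized String doRegister(String request) {
    if (registered) {
      throw new IllegalStateException("Application Master is already registered");
    }
    registered = true;
    return "OK:" + request;
  }
}
```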
[jira] [Assigned] (YARN-3661) Basic Federation UI
[ https://issues.apache.org/jira/browse/YARN-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan reassigned YARN-3661: Assignee: Inigo Goiri (was: Giovanni Matteo Fumarola) > Basic Federation UI > > > Key: YARN-3661 > URL: https://issues.apache.org/jira/browse/YARN-3661 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Giovanni Matteo Fumarola >Assignee: Inigo Goiri > > The UIs provided by each RM give a correct "local" view of what is > running in that sub-cluster. In the context of federation, we need new > UIs that can track load, jobs, and users across sub-clusters. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6946) Upgrade JUnit from 4 to 5 in hadoop-yarn-common
[ https://issues.apache.org/jira/browse/YARN-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-6946: Attachment: YARN-6946.wip001.patch I'm trying to upgrade JUnit 4 to 5 in hadoop-yarn-common. Very hard work. > Upgrade JUnit from 4 to 5 in hadoop-yarn-common > --- > > Key: YARN-6946 > URL: https://issues.apache.org/jira/browse/YARN-6946 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: YARN-6946.wip001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
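Part of what makes this migration "very hard work" is that it is not purely mechanical. Since JUnit may not be on the classpath here, the sketch below mimics the two assertion styles with local helpers; the point is one concrete hazard of the 4 → 5 move, the changed position of the failure message:

```java
// JUnit 4: org.junit.Assert.assertEquals(String message, expected, actual)
// JUnit 5: org.junit.jupiter.api.Assertions.assertEquals(expected, actual, message)
// (The @Test annotation also moves from org.junit to org.junit.jupiter.api.)
// These helpers only imitate the signatures to show the argument-order flip.
public class JUnitMigrationSketch {
  static void assertEquals4(String message, Object expected, Object actual) {
    if (!expected.equals(actual)) {
      throw new AssertionError(message);
    }
  }

  static void assertEquals5(Object expected, Object actual, String message) {
    if (!expected.equals(actual)) {
      throw new AssertionError(message);
    }
  }
}
```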
[jira] [Comment Edited] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114258#comment-16114258 ] YunFan Zhou edited comment on YARN-6361 at 8/4/17 11:21 AM: [~yufeigu] Thanks Yufei. For this question, covering the optimization of FairScheduler's scheduling performance, I have the following ideas, which I have applied in our production environment. Scheduling performance is good: the container assignment rate can reach 5000 ~ 1 per second when the aggregate resource demand of the cluster is high. Here's what I do: * Avoid frequent sorting; it is pointless and a waste of time to re-sort before every container assignment, because after each assignment the child nodes of the queue basically stay in order. We also don't really need to guarantee that all fair shares are strictly honored: even if we sort before each container assignment, *FSQueue#demand* was last updated in the previous *FairScheduler#update* cycle, so the demand value is not real-time, which already means the sharing is not strictly fair. Instead, we can sort all the queues once per *FairScheduler#update* cycle (the default update interval is 0.5s), which is worth doing. Since we cannot achieve a strict fair share anyway, why not sacrifice some fair-scheduler semantics in exchange for better performance? * Improve the performance of the *Schedulable#getResourceUsage* calculation, reducing its complexity to O(1). There are several related smaller but especially useful optimization points here, and together they can guarantee that the cost of assigning a container is O(1). I don't know whether you can accept that; if you can, I will list a few more detailed points later. was (Author: daemon): [~yufeigu] Thank Yufei. 
For this question, including the optimization of the scheduling performance of the FairScheduler, I have the following ideas, and I apply these ideas to our production environment. The performance of the scheduling is ideal, and the speed of the assigning container can reach 5000 ~ 1 per second when aggregate resource requirements for the cluster is high. Here's what I do: * Avoid frequent ordering, and it's pointless and a waste of time to do a sequence before each assign container. Because, after each assignment, the whole child nodes of the queue are basically staying in order. And we don't really need to ensure that all of our fair shares is guaranteed, after all, even though we do a sort of order before each of the container's assignment because the *FSQueue#demand* is updated in the last time the *FairScheduler# update* cycle. So the value of demand is not real time, which also leads to the fact that we are not strictly and fairly shared. So, we can sort all the queues at the *FairScheduler#update* cycle, and we now have a default of 0.5 s per update cycle, which is worth doing. Since we have not been able to make a strict fair share, why don't we sacrifice some of our semantics of fair scheduler in exchange for better performance? * Improve the performance of the *Schedulable#getResourceUsage* calculation, making it complex in O(1). For one, there are several related smaller but especially useful optimization points. But I don't know if you can accept that. If you can accept it, I will list a few more detailed points later. > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: YunFan Zhou >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. 
> Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort loop ({{O\(n\)}} work) and > sorting on the precomputed, fixed keys inside it ({{O(n*log\(n\))}} comparisons, each O(1)). This could be a > performance issue when there is a huge number of applications in a single > queue. The attachments show the performance impact when there are 10k > applications in one queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
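The "calculations outside the sort loop" idea is a decorate-sort pattern: compute each schedulable's key once, then let the comparator do only cheap lookups. A minimal sketch under assumed names — {{expensiveUsage}} stands in for something like {{Schedulable#getResourceUsage}}:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrecomputedSortSketch {
  // Stand-in for an expensive per-app computation such as getResourceUsage().
  static long expensiveUsage(String appId) {
    return appId.length();
  }

  // Compute each key once (O(n)), then sort on the cached keys, so each of
  // the O(n log n) comparisons is O(1) instead of recomputing usage inside
  // the comparator on every compare call.
  static List<String> sortByUsage(List<String> apps) {
    Map<String, Long> usage = new HashMap<>();
    for (String app : apps) {
      usage.put(app, expensiveUsage(app));
    }
    List<String> sorted = new ArrayList<>(apps);
    sorted.sort(Comparator.comparingLong(usage::get));
    return sorted;
  }
}
```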
[jira] [Commented] (YARN-6946) Upgrade JUnit from 4 to 5 in hadoop-yarn-common
[ https://issues.apache.org/jira/browse/YARN-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114283#comment-16114283 ] Akira Ajisaka commented on YARN-6946: - I'm thinking it's too early to migrate JUnit from 4 to 5. JUnit 5 requires Java 8 or higher, so the migration affects only trunk, not branch-2. Most patches are currently backported to branch-2, and the backports would become much harder. > Upgrade JUnit from 4 to 5 in hadoop-yarn-common > --- > > Key: YARN-6946 > URL: https://issues.apache.org/jira/browse/YARN-6946 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: YARN-6946.wip001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6133) [ATSv2 Security] Renew delegation token for app automatically if an app collector is active
[ https://issues.apache.org/jira/browse/YARN-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114248#comment-16114248 ] Varun Saxena edited comment on YARN-6133 at 8/4/17 12:26 PM: - Thanks [~rohithsharma] for the review. bq. Token is renewed just before 10 seconds. Should it be increased? What do you suggest? 10 seconds should be enough, as we renew only in the DT manager, i.e. internally in the NM. The token doesn't need to go to the AM, right? bq. TimelineCollectorManager has introduced synchronized block. This is not necessary right.? This is to avoid a race between the collector stopping and the renewal timer expiring, so that an additional renewal timer is not set unnecessarily. It has no functional impact even if we do set it, because the timer just won't find the collector on expiry, but I thought it better to avoid it altogether. Thoughts? bq. Renewer threads count is 1. Given load on NM not much, one thread can renew it. But I would suggest to keep it to 50? How many active collectors do we expect on one NM? Token renewal and token generation are not very heavy tasks either. Assuming we have 1000 active apps in, say, a 5000-node cluster, the AMs will be distributed across many nodes, so it is unlikely that more than 4-5 app collectors will be running on any NM at a particular moment. Even then, it is unlikely that all collectors will have their token renewals expire at the same moment. There are no guarantees, but it is unlikely. We may have a situation where we launch AMs on a particular node partition, though; in that case there might be some hotspotting, i.e. multiple app collectors on one node. Even there, 50 seems too many. We can keep a value higher than 1 if you have concerns with only 1 thread, maybe 3-5. Keep it configurable with a default of 3 or 5? was (Author: varun_saxena): bq. Token is renewed just before 10 seconds. Should it be increased? What do you suggest? 10 seconds should be enough as we renew only in DT manager i.e. 
internally in NM. Token doesn't need to go to AM. Right? bq. TimelineCollectorManager has introduced synchronized block. This is not necessary right.? This is to avoid race between Collector stopping and renewal timer expiring. So that additional renewal timer is not set unnecessarily. Has no functional impact though even if we set because it just won't find collector on expiry. But I thought better to avoid it altogether. Thoughts? bq. Renewer threads count is 1. Given load on NM not much, one thread can renew it. But I would suggest to keep it to 50? How many active collectors do we expect in one NM? Token renewal and token generation is not a very heavy task as well. Assuming we have 1000 active apps in say a 5000 node large cluster, we will have AMs distributed across multiple nodes. So it is unlikely you will have more than 4-5 app collectors running in any NM at a particular moment. And even there it is unlikely that all collectors will have their token renewal expiry at the same moment. There are no guarantees though. But it is unlikely. We may have a situation wherein we launch AMs on a particular node partition though. In this case there might be some hotspotting, as in multiple app collectors on one node. But even there, 50 might be too many I think. We can keep a value higher than 1 though if you have concerns with only 1 thread, maybe 3-5. Keep it configurable with default 3 or 5? 
> [ATSv2 Security] Renew delegation token for app automatically if an app > collector is active > --- > > Key: YARN-6133 > URL: https://issues.apache.org/jira/browse/YARN-6133 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-5355-merge-blocker > Attachments: YARN-6133-YARN-5355.01.patch, > YARN-6133-YARN-5355.02.patch, YARN-6133-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
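The renewer design debated above — a small, configurable pool that fires each renewal a fixed margin before token expiry — can be sketched as follows. The class and method names are hypothetical, not the patch's code:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TokenRenewerSketch {
  private final ScheduledExecutorService renewers;

  // Pool size is the knob discussed above: configurable, small default (3-5).
  TokenRenewerSketch(int renewerThreads) {
    renewers = Executors.newScheduledThreadPool(renewerThreads);
  }

  // Fire the renewal a fixed margin (e.g. ~10s) before the expiry time;
  // clamp to zero if the token is already within the margin.
  static long renewalDelay(long nowMillis, long expiryMillis, long marginMillis) {
    return Math.max(0, expiryMillis - marginMillis - nowMillis);
  }

  void scheduleRenewal(Runnable renew, long expiryMillis, long marginMillis) {
    long delay = renewalDelay(System.currentTimeMillis(), expiryMillis, marginMillis);
    renewers.schedule(renew, delay, TimeUnit.MILLISECONDS);
  }

  void shutdown() {
    renewers.shutdownNow();
  }
}
```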
[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114336#comment-16114336 ] Greg Phillips commented on YARN-6930: - Whitelisting runtimes seems to be the best option. I would likely modify the way sandbox-mode is selected to rely on the runtime whitelist and the container environment instead of using {{yarn.nodemanager.runtime.linux.sandbox-mode}}. This would remove the redundant knob issue. > Admins should be able to explicitly enable specific LinuxContainerRuntime in > the NodeManager > > > Key: YARN-6930 > URL: https://issues.apache.org/jira/browse/YARN-6930 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Vinod Kumar Vavilapalli >Assignee: Shane Kumpf > > Today, in the java land, all LinuxContainerRuntimes are always enabled when > using LinuxContainerExecutor and the user can simply invoke anything that > he/she wants - default, docker, java-sandbox. > We should have a way for admins to explicitly enable only specific runtimes > that he/she decides for the cluster. And by default, we should have > everything other than the default one disabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
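A whitelist of this sort would presumably be a yarn-site.xml setting. A hypothetical fragment — the property name below is illustrative of the proposal being discussed, not a settled configuration key:

```xml
<!-- Hypothetical: admin explicitly enables only the runtimes this
     cluster should allow; everything else stays disabled. -->
<property>
  <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
  <value>default,docker</value>
</property>
```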
[jira] [Updated] (YARN-6951) Fix debug log when Resource handler chain is enabled
[ https://issues.apache.org/jira/browse/YARN-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-6951: -- Priority: Minor (was: Major) > Fix debug log when Resource handler chain is enabled > > > Key: YARN-6951 > URL: https://issues.apache.org/jira/browse/YARN-6951 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Minor > Labels: newbie++ > Attachments: YARN-6951.001.patch > > > {code:title=LinuxContainerExecutor.java} > ... ... > if (LOG.isDebugEnabled()) { > LOG.debug("Resource handler chain enabled = " + (resourceHandlerChain > == null)); > } > ... ... > {code} > I think it is just a typo. When resourceHandlerChain is not null, it should print the > log "Resource handler chain enabled = true". -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6951) Fix debug log when Resource handler chain is enabled
[ https://issues.apache.org/jira/browse/YARN-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114345#comment-16114345 ] Sunil G commented on YARN-6951: --- Looks fine. Will commit once Jenkins is run. > Fix debug log when Resource handler chain is enabled > > > Key: YARN-6951 > URL: https://issues.apache.org/jira/browse/YARN-6951 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Minor > Labels: newbie++ > Attachments: YARN-6951.001.patch > > > {code:title=LinuxContainerExecutor.java} > ... ... > if (LOG.isDebugEnabled()) { > LOG.debug("Resource handler chain enabled = " + (resourceHandlerChain > == null)); > } > ... ... > {code} > I think it is just a typo. When resourceHandlerChain is not null, it should print the > log "Resource handler chain enabled = true". -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
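The corrected condition from the description above, extracted into a small testable helper (the helper class is illustrative; in LinuxContainerExecutor the expression simply becomes `resourceHandlerChain != null`):

```java
public class ResourceHandlerChainLog {
  // Fixed condition: "enabled" should be true when the chain is non-null,
  // not when it is null as the original LOG.debug line reported.
  static String enabledMessage(Object resourceHandlerChain) {
    return "Resource handler chain enabled = " + (resourceHandlerChain != null);
  }
}
```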