[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109942#comment-15109942
 ] 

Wangda Tan commented on YARN-4479:
--

bq. Wangda Tan Would you mind if I take over new JIRA fixing all these issues?
+1 to fix this in a separated JIRA

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109985#comment-15109985
 ] 

Naganarasimha G R commented on YARN-4479:
-

Thanks for pointing out  [~wangda],
bq. Instead of using queue's configured ordering-policy for pending apps, it 
should use FifoOrderingPolicyForPendingApps.
Yes and having {{FairOrderingPolicy}} for the pending apps does not make sense 
as the {{getCachedUsed}} will be always zero . And as you said currently we can 
keep it fixed policy for now as there no real world need to make pending queues 
configurable.  So i am ok with this approach .

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110003#comment-15110003
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Filed a ticket YARN-4617  for handling above mentioned points.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109057#comment-15109057
 ] 

Jian He commented on YARN-4479:
---

[~rohithsharma], I think we had same discussion offline whether it's worth for 
pendingApps to have its own ordering policy. But it was then crossed off. What 
was the reason ? Is it because we thought pending apps and active apps should 
share most policy ? Now I agree this approach is better,  if we agree that 
pending apps can just be treated separately,   


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109636#comment-15109636
 ] 

Rohith Sharma K S commented on YARN-4479:
-

bq. But it was then crossed off. What was the reason ?
Yes, we discussed it offline and crossed off. Reason is
# After YARN-3873, it was assumed that both  active-&-pending should *always* 
share same policy. And in earlier 
[comment|https://issues.apache.org/jira/browse/YARN-4479?focusedCommentId=15108236=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15108236]
 in this jira  [~leftnoteasy] pointed out it should not be same.
# Another reason was, was thinking that to introduce a new configuration for 
pending ordering policy. Wangda suggests that we can have fixed ordering policy 
nevertheless of any active ordering policy.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108236#comment-15108236
 ] 

Wangda Tan commented on YARN-4479:
--

Thanks [~sunilg]/[~Naganarasimha],

I looked at existing code again, I still think we should make such logic 
existed in orderingPolicy only.

Such logic increases complexities of LeafQueue, and it has to handle the new 
added {{pendingOPForRecoveredApps}} In many places, such as:
{code}
765   if (application.isAttemptRecovering()) {
766 pendingOPForRecoveredApps.removeSchedulableEntity(application);
767   } else {
768 pendingOrderingPolicy.removeSchedulableEntity(application);
769   }
{code}
And
{code}
1566for (FiCaSchedulerApp pendingApp : pendingOPForRecoveredApps
1567.getSchedulableEntities()) {
1568  apps.add(pendingApp.getApplicationAttemptId());
1569}
1538for (FiCaSchedulerApp pendingApp : pendingOrderingPolicy1570
for (FiCaSchedulerApp pendingApp : pendingOrderingPolicy
{code}

In addition to this problem, I just noticed another issue of 
pending-ordering-policy introduced by YARN-3873.
It assumes queue's ordering-policy for pending apps should as same as 
ordering-policy for active apps. But actually it doesn't, for example, 
pending-ordering-policy of fair-ordering-policy should be FIFO instead of FAIR.

Some ideas on top of my mind to fix the above issue and cleanup code.
- Keep changes for SchedulerApplicationAttempt/CapacityScheduler in the patch: 
0006-YARN-4479.patch to set "isRecoverying" field
- Add a RecoveryComparator, and add a new FifoOrderingPolicyForPendingApps 
which extends FifoOrderingPolicy but uses RecoveryComparator
- Instead of using queue's configured ordering-policy for pending apps, it 
should use FifoOrderingPolicyForPendingApps.
with the change, user cannot configure ordering-policy for pending-apps, I 
didn't see a strong real life requirement for that now.

Thoughts?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108300#comment-15108300
 ] 

Sunil G commented on YARN-4479:
---

Thanks [~leftnoteasy] for the detailed comment.
Yes, I understood the reason behind the same when FairOrderingPolicy is used. 
Generally fine with approach suggested. However few minor points.

1.
bq.Keep changes for SchedulerApplicationAttempt/CapacityScheduler in the patch: 
0006-YARN-4479.patch to set "isRecoverying" field
We already have a recovering field in SchedulerApplicationAttempt. But if we 
make use of this, we need to reset the flag once application is moved to 
{{activeApplications}} list. So if we reset this flag, we will loose 
information whether this was a recovered app. As of now I am not seeing any use 
with this, but it may confuse. So cud we introduce a new flag/field for this to 
avoid confusion.

2. With new FifoOrderingPolicyForPendingApps, we are agreeing that Priority and 
Submission Time will be factor to decide pending apps ordering, correct?


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108340#comment-15108340
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Right, I think we can spin it as improvement since there were few JIRA went on 
top of this.
[~leftnoteasy] Would you mind if I take over new JIRA fixing all these issues?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108323#comment-15108323
 ] 

Wangda Tan commented on YARN-4479:
--

Hi [~sunilg]/[~rohithsharma],

Please note one of my comment above:
{code}
Instead of using queue's configured ordering-policy for pending apps, it should 
use FifoOrderingPolicyForPendingApps.
with the change, user cannot configure ordering-policy for pending-apps, I 
didn't see a strong real life requirement for that now.
{code}

I think we still need to treat ordering policy of pending apps specially, and 
with the new added FifoOrderingPolicyForPendingApps, we don't need to reset 
isRecoverying and other ordering policies don't need the isRecoverying field, 
so it won't affect performance.

bq. 2. With new FifoOrderingPolicyForPendingApps, we are agreeing that Priority 
and Submission Time will be factor to decide pending apps ordering, correct?
Yes, the only difference between FifoOrderingPolicyForPendingApps and 
FifoOrderingPolicy is FifoOrderingPolicyForPendingApps let recovering apps go 
first.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108310#comment-15108310
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Hi [~leftnoteasy] 
There were 2 approaches to solve this
# Handle this problem in upper layer i.e LeafQueue layer nevertheless of any 
ordering 
policy.[0006-YARN-4479.patch|https://issues.apache.org/jira/secure/attachment/12780671/0006-YARN-4479.patch]
# Handle in very bottom layer i.e specific to ordering policy. If any new 
ordering policy added then that has to take care of this JIRA problem. 
[0003-YARN-4479.patch|https://issues.apache.org/jira/secure/attachment/12779965/0003-YARN-4479.patch]
Issues in approach-2 is 
## Performance : It does comparison for wasAttemptRecovering on every addition 
or removal. When there are huge  number of applications, it is significantly 
impact.
## The flag need to reset while adding to activeApplication list.  

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108326#comment-15108326
 ] 

Rohith Sharma K S commented on YARN-4479:
-

bq. I think we still need to treat ordering policy of pending apps specially, 
and with the new added FifoOrderingPolicyForPendingApps, we don't need to reset 
isRecoverying and other ordering policies don't need the isRecoverying field, 
so it won't affect performance.
+1 for the approach, makesense to me

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108330#comment-15108330
 ] 

Sunil G commented on YARN-4479:
---

Yes. Makes sense to me. +1 for the approach.
I think we can spin off this in a new ticket, because few patches went on top 
of this change. So reverting may be complex.

Thoughts?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108033#comment-15108033
 ] 

Naganarasimha G R commented on YARN-4479:
-

I think you are referring to the approach similar to the one done in 
0002-YARN-4479.patch ? having additional logic in the comparator which checks 
whether the attempt was wasAttemptRunningEarlier. After discussion we tried to 
avoid it as unnecessary comparisions happen s even after recovery when 
comparing each app. If you have any other approach may be we can discuss further


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107994#comment-15107994
 ] 

Wangda Tan commented on YARN-4479:
--

Hi [~rohithsharma],

Apologize for my very late feedback. Instead of adding a new list of 
recovery-and-pending-apps, could we add this behavior (early submitted & 
running apps goes first) to our existing policy? Maintaining only one ordering 
policy in LeafQueue is easier.

Thoughts? [~jianhe]/[~Naganarasimha]/[~sunilg]

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108048#comment-15108048
 ] 

Sunil G commented on YARN-4479:
---

Yes [~leftnoteasy], as mentioned by [~Naganarasimha Garla] , this option came 
up as a possible solution. However, there were few complexities:

For this approach, we Needed a new {{RecoveryComparator}}. This has to be added 
to {{FifoOrderingPolicy}} also. RecoveryComparator was supposed to run with the 
information that whether this app was running prior to recovery. So a flag has 
to be added to FicaSchedulerApp, and then reset the same after first round of 
activation. Hence more complexities in various part of scheduler was needed for 
this approach. So a simpler approach is made in LeafQueue. Pls share your 
thoughts if we missed any in this approach.
[~rohithsharma] Could u pls add if I missed any point for this approach.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090233#comment-15090233
 ] 

Hudson commented on YARN-4479:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9075 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9075/])
YARN-4479. Change CS LeafQueue pendingOrderingPolicy to hornor recovered 
(jianhe: rev 109e528ef5d8df07443373751266b4417acc981a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationPriority.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085274#comment-15085274
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 43s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 39s 
{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 40s 
{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 29s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 25s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 209m 2s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
\\
\\
|| Subsystem || 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-06 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085297#comment-15085297
 ] 

Rohith Sharma K S commented on YARN-4479:
-

test cases are unrelated to this patch. These tests failures will be handled in 
YARN-4478

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-05 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083283#comment-15083283
 ] 

Sunil G commented on YARN-4479:
---

Patch looks good [~rohithsharma].

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-05 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083248#comment-15083248
 ] 

Rohith Sharma K S commented on YARN-4479:
-

updated 0005-YARN-4479 patch , kindly review it

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083611#comment-15083611
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 37s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 18s 
{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 21s 
{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 27s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 51s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 219m 58s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-05 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15084608#comment-15084608
 ] 

Jian He commented on YARN-4479:
---

+1, [~rohithsharma], could you see if the warnings are related

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081156#comment-15081156
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 54s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 36s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 32s 
{color} | {color:red} branch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 31s 
{color} | {color:red} Patch generated 1 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn (total was 357, now 353). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 6m 19s 
{color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn no findbugs output 
file (hadoop-yarn-project/hadoop-yarn/target/findbugsXml.xml) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 54s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 56s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 215m 20s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075860#comment-15075860
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Discussed offline with [~jianhe] for sync up and for the solution. The summary 
is follows and respective patch updated.
# For failed attempt, while recovering need not to add to scheduler. Necessary 
changes is done at RMAppAttemptImpl
# If attempts are added to scheduler means attempts were running before RM 
restart.
# Any recovering attempts are added to new ordering policy 
{{pendingOrderingPolicyRecovery}} which is given higher preference than 
{{pendingOrderingPolicy}} while activating the applications.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-31 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075862#comment-15075862
 ] 

Rohith Sharma K S commented on YARN-4479:
-

kindly review the updated patch.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075912#comment-15075912
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s 
{color} | {color:red} Patch generated 5 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 (total was 357, now 356). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 21s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 introduced 2 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 27s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 43s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 49s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.pendingOrderingPolicy;
 locked 87% of time  Unsynchronized access at LeafQueue.java:87% of time  
Unsynchronized access at LeafQueue.java:[line 1517] |
|  |  Inconsistent 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075149#comment-15075149
 ] 

Sunil G commented on YARN-4479:
---

Sorry I didnt meant FairScheduler, I was trying to mention 
{{FairOrderingPolicy}}.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075138#comment-15075138
 ] 

Sunil G commented on YARN-4479:
---

HI [~rohithsharma]
This new fix will also introduce RecoveryComparator to FairOrderingPolicy too. 
Is it needed? I think it can be tracked separate after checkin whether same 
pblm will arise with FairScheduler.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075452#comment-15075452
 ] 

Jian He commented on YARN-4479:
---

 sorry, I missed this part, the recovered app need to be respected first only 
for the LeafQueue#pendingOrderingPolicy, right ? for LeafQueue#orderingPolicy, 
this is not needed.
bq. Reference test case TestRMRestart#testRMRestartAppRunningAMFailed
I don't understand how this test case is related. 

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-30 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074779#comment-15074779
 ] 

Rohith Sharma K S commented on YARN-4479:
-

I think we can just relay on isAppRecovering flag which should be sufficient. 
And existing code in RMAppAttemptImpl can be there as-it-is(without patch). 
Only FAILED attempts are added to scheduler which will be removed in next event 
itself.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15075069#comment-15075069
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} Patch generated 7 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 (total was 274, now 275). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 18s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 introduced 1 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 50s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 34s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.PriorityComparator
 implements Comparator but not Serializable  At 
PriorityComparator.java:Serializable  At PriorityComparator.java:[lines 26-34] |
| JDK v1.8.0_66 Failed junit tests | 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-29 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074472#comment-15074472
 ] 

Jian He commented on YARN-4479:
---

- For finished attempt, I think we do not need to re-add into scheduler, so 
this whole code could be removed. 
{code}
  if (EnumSet.of(RMAppAttemptState.RUNNING, RMAppAttemptState.LAUNCHED)
  .contains(appAttempt.recoveredFinalState)) {
appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
appAttempt.getAppAttemptId(), false, true, true));
  } else {
appAttempt.scheduler.handle(new AppAttemptAddedSchedulerEvent(
appAttempt.getAppAttemptId(), false, true));
  }
 {code}
Accordinlgy in BaseFinalTransition, this code need to be invoked if 
recoveredFinalState == null
 {code}
 appAttempt.eventHandler.handle(new AppAttemptRemovedSchedulerEvent(
appAttemptId, finalAttemptState, keepContainersAcrossAppAttempts));
 {code}
 - With above change, we can assume that attempt added into scheduler should be 
running, so the extra field wasAttemptRunning in AppAttemptAddedSchedulerEvent 
is not needed, the existing isAttemptRecovering flag should be enough.
- I think [~Naganarasimha]'s suggestion make sense. we should consider 
FairComparator too. May be we can add a predefined comparator in 
AbstractComparatorOrderingPolicy with the recoveryComparator initialized  and 
force underlying implementations to use this ?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-29 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074705#comment-15074705
 ] 

Rohith Sharma K S commented on YARN-4479:
-

bq. For finished attempt, I think we do not need to re-add into scheduler, so 
this whole code could be removed.
While recovering application and attempts, If the last attempt is FAILED then 
from scheduler transfer state from previous attempt. So, whenever there is 
failed attempt, attempt has to be added to scheduler for obtaining the state. 
Reference test case {{TestRMRestart#testRMRestartAppRunningAMFailed}}

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-23 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070521#comment-15070521
 ] 

Naganarasimha G R commented on YARN-4479:
-

Hi [~rohithsharma],  Thanks for the patch,
New approach seems to be better than the older as it tries to avoid additional 
data structure used for the same purpose, but few points :
* If we consider for FairOrderingPolicy it first considers {{FairComparator}} 
and then the {{FifoComparator}}, so only if fairness is equal it will consider 
whether the application was already running, so would it be better to add 
additional comparator for recovery which can be used by both Fair and Fifo ?
* So it will be totally left to Ordering policy whether to consider the order 
of the recovered app based on submission time or not, so better to get that 
documented so that custom ordering policy can consider it.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-23 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070556#comment-15070556
 ] 

Rohith Sharma K S commented on YARN-4479:
-

I had 2 options in doing this in fifoordering policy. I took simpler approach 
to make working patch. Further improvements like this will/can be addressed in 
coming patches once initial approach is agreed upon.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070242#comment-15070242
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} Patch generated 10 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 (total was 373, now 379). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 3 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 3m 10s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91
 with JDK v1.7.0_91 generated 1 new issues (was 2, now 3). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 54s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 50s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 152m 28s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066292#comment-15066292
 ] 

Rohith Sharma K S commented on YARN-4479:
-

The patch does following things
# During attempt recovery, adds new flag which tells scheduler that was it 
RUNNING/LAUNCHED before RM restart
# For all the attempts during recovery, previous running attempts are added to 
separate pendingApplicationOrdering policy which is used to track applications 
which are running before RMRestart.
# When a  node registers, activate first which are running before RMRestart. 

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066317#comment-15066317
 ] 

Naganarasimha G R commented on YARN-4479:
-

Hi [~rohithsharma], 
As discussed offline, 
Assume b4 recovery Apps which were activated  : 
*A1*(Low),*A2*(Low),*A3*(Medium) and pending were *A4*(High) & *A5*(High)
Based on the current approach applications will be activated in the order 
*A4,A5,A3,A1,A2*
After your patch it will be  *A3,A1,A2,A4,A5*
So in a way its better than the existing approach but wanted to know whether 
the order of activation should be *A1, A2, A3* itself and not based on the 
ordering policy for the recovered apps, Thoughts ?
IIUC [~sunilg] also wanted to say the same ?


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066484#comment-15066484
 ] 

Naganarasimha G R commented on YARN-4479:
-

Thanks for the comments [~sunilg] & [~rohithsharma],
bq. This patch tries to activate all applications which were running before RM 
restart happened. 
IIUC the patch, it goes through the existing flow hence all applications will 
not be activated by default but only if queue's AM resource limit is available, 
app will get activated.

bq. 2. All containers which were running earlier will still continue, 
To elaborate further Based on the scenario which i had mentioned, Assume queue 
capacity is 120GB (for simplicity), and AM resource limit is 10%(=12GB) and AM 
resource : A1 = 8GB , A2 = 2GB, A3 = 2GB, A4 = 2Gb, A5 =2Gb. After recovery 
assume all nodes are not up and only 100 Gb is available So as per the code in 
patch A3, A2, A4 & A5 will get activated (8GB)  and A1 will not get activated 
though the app is running. Correct me if my understanding is wrong

bq. Being said all this points, I also feel that we may need to add more 
complex code to keep the same order as you proposed. So if there are no major 
impacts, I think the approach taken in this patch looks fine. Thoughts?
IIUC point 1 is same as with or without the patch so no issues, point 2 IIUC 
your assumption is wrong. ??All containers which were running earlier will 
still continue??
But the approach to the scenario which i mentioned is debate able, if it 
introduces too much complexity then we can skip but just wanted to share the 
scenario, as i said current approach is fine except for the scenario mentioned.

few nits/query in the patch
{code}
@@ -607,9 +612,24 @@ private synchronized void activateApplications() {
 Map userAmPartitionLimit =
 new HashMap();
 
-for (Iterator i = getPendingAppsOrderingPolicy()
-.getAssignmentIterator(); i.hasNext();) {
-  FiCaSchedulerApp application = i.next();
+for (Iterator i =
+getPendingAppsOrderingPolicyRecovery().getAssignmentIterator(); i
+.hasNext();) {
+  activateApplications(i, amPartitionLimit, userAmPartitionLimit);
+}
+
{code}

Is for loop required here as we are looping the iterator in overloaded 
{{activateApplications(fsApp, amPartitionLimit, userAmPartitionLimit)}}

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066282#comment-15066282
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Scenario which cause issue is
# Submitted the app-1 and app-2 with priority 5. Both applications are 
activated and RUNNING state.
# Submit app-3 with priority 6. This application is in pending state because of 
AMLimit.
# RM restarted, app-1 application is activated(it is behavior that for 1st 
application AMLimit is not considered) and app-2 and app-3 are in 
pendingOrderingPolicy
# AM re-registered for app-1 and app-2. Its state is now RUNNING. But app-2 and 
app-3 are still in pendingapplications.
# NodeManager re-registered with RM. As a result 1 application supposed to be 
get activated. Here, always app-3 get activated since app-3 priority is higher, 
but app-2 should get activated first since it is running before RMrestart.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066286#comment-15066286
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Attaching the patch , kindly review..

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066294#comment-15066294
 ] 

Sunil G commented on YARN-4479:
---

Adding to this,
Application which is in RUNNING state prior to restart will still be in RUNNING 
state. However, it will *NOT* be getting any new containers until its been made 
active again when other high priority applications are completed.

Patch generally looks fine, I will give some comments after checking patch more 
detail.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066368#comment-15066368
 ] 

Rohith Sharma K S commented on YARN-4479:
-

bq. but wanted to know whether the order of activation should be A1, A2, A3 
itself and not based on the ordering policy for the recovered apps, Thoughts ?
It should be based on the ordering policy only. At this stage, all 3 
applications A1,A2 and A3 are in same level of position. So activation should 
be based on the ordering policy implementation. In specific to priority, always 
highest priority application should be activated first.


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066378#comment-15066378
 ] 

Naganarasimha G R commented on YARN-4479:
-

In Most cases this would be sufficient, but consider a case where in A3 is an 
app with large number of containers and A1 and A2 are short jobs. May be after 
recovery all nodes have not registered, due to AM resource limits A1 and/or A2 
AM's resource if activated will exceed AM's resource limits, so only A3 will be 
running. Problem here is : there is no configurable ordering policy for 
recovered apps seperately, it applies the same policy so in rare cases it might 
lead to starvation ?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066408#comment-15066408
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Without RM restart also now apps can be starved if AMLimit is reached. 

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066402#comment-15066402
 ] 

Hadoop QA commented on YARN-4479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} Patch generated 10 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 (total was 321, now 325). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 3 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 29s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 introduced 2 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 34s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 29s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 150m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Inconsistent synchronization of 

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066415#comment-15066415
 ] 

Sunil G commented on YARN-4479:
---

Hi [~Naganarasimha]
bq.IUC Sunil G also wanted to say the same ?
I meant in slight different way. With existing approach, any application which 
was running earlier to RM restart will also be in RUNNING state. It may starve 
as per the scenario which you and Rohith mentioned, but the state of 
application will be running.

Also adding to the existing discussion, I would like to point out few things.
This patch tries to activate all applications which were running before RM 
restart happened. It may be getting activated with different order, but it is 
trying to put all these apps in the activated list of scheduler (App state will 
be RUNNING still). 
1. After restart with or without ordering, only highest priority app will be 
selected for scheduling from activated list. This same behavior is happening 
before RM restart also. SO there seems no impact with this. Pls correct me if I 
am wrong.
2. All containers which were running earlier will still continue, and all 
pending requests will be updated/refreshed and this is from 
ApplicationMasterService thread. So if all earlier running apps are 
activated,then behavior will be same from scheduler end, correct?

Being said all this points, I also feel that we may need to add more complex 
code to keep the same order as you proposed. So if there are no major impacts, 
I think the approach taken in this patch looks fine. Thoughts?

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066571#comment-15066571
 ] 

Naganarasimha G R commented on YARN-4479:
-

Thanks [~sunilg],
bq. Its debatable and I think with discussion we can conclude the approach here.
True its debate-able, but one more thing to be considered(/not missed) here A4 
and A5 gets activated even before A2 (as per the correction i mentioned).

bq. {{All containers which were running earlier will still continue}} 
I mistook what you meant, it seems like you got what i wanted to mention.



> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066519#comment-15066519
 ] 

Naganarasimha G R commented on YARN-4479:
-

small correction in the example : {{A1 = 8GB , A2 = 2GB, A3 = 2GB, A4 = 2Gb, A5 
=2Gb}}  => {{A1 = 2GB , A2 = 8GB, A3 = 2GB, A4 = 2Gb, A5 =2Gb}} 

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-21 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066553#comment-15066553
 ] 

Sunil G commented on YARN-4479:
---

Thanks [~Naganarasimha Garla] fr the comments.

bq.This patch tries to activate all applications which were running before RM 
restart happened
Being said this, yes its definitely depends on available AM limit after restart 
(I meant the positive case in my earlier comment where all cluster resource 
were available). 
I did think about the case when some NMs are not registered back, and limit is 
lesser. In that case, we will have app-A1 pending in the list to get activated. 
And this application will be the one which will be activated first if any space 
is available.  This ensures that high priority apps which were in the pending 
list will get containers, and app-A1 which were low in priority will wait. Even 
though A1 is activated, it has to wait till other high priority apps are done 
with its request. So A1 in pending list is may be fine provided other apps are 
completed sooner or failed NMs are up. But I am not saying its correct. Its 
debatable and I think with discussion we can conclude the approach here.

Also abt {{All containers which were running earlier will still continue}}, I 
meant about the live containers of apps which were running prior to restart. 
After restart, even for the pending apps (apps like A1) as mentioned in ur 
scenario, its running containers wont be killed. Am I missing something?


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch
>
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-17 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063619#comment-15063619
 ] 

Sunil G commented on YARN-4479:
---

Thanks [~rohithsharma] for raising this. As discussed offline, all running 
attempts during recovery (attempts state *NOT* in final states and *NOT* null) 
could be considered for this list. Such apps can be made activated first just 
to be in sync for the state before recovery. Now also  its possible that we may 
get starvation but it will be same as what it was happening before HA. 

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2015-12-17 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063610#comment-15063610
 ] 

Rohith Sharma K S commented on YARN-4479:
-

Thinking while recovering the applications, previous running applications 
should be added to separate ordering policy like 
{{pendingOrderingPolicyRecovery}} and use this while activating applications 
prior to pendingOrderingPolicy iteration.

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> Currently, same ordering policy is used for pending applications and active 
> applications. When priority is configured for an applications, during 
> recovery high priority application get activated first. It is possible that 
> low priority job was submitted and running state. 
> This causes low priority job in starvation after recovery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)