[jira] [Created] (YARN-2129) Add scheduling priority to the WindowsSecureContainerExecutor
Remus Rusanu created YARN-2129: -- Summary: Add scheduling priority to the WindowsSecureContainerExecutor Key: YARN-2129 URL: https://issues.apache.org/jira/browse/YARN-2129 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0 Reporter: Remus Rusanu Assignee: Remus Rusanu The WCE (YARN-1972) could and should honor NM_CONTAINER_EXECUTOR_SCHED_PRIORITY. -- This message was sent by Atlassian JIRA (v6.2#6252)
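For context, NM_CONTAINER_EXECUTOR_SCHED_PRIORITY is the YarnConfiguration key the other executors already consult; a minimal sketch of how the WCE could pick it up is below (the surrounding plumbing is illustrative, not the actual YARN-2129 patch):
{code}
// Sketch only: read the configured scheduling-priority adjustment so the WCE
// can apply it when launching containers; defaults to 0 (no adjustment).
int containerSchedPriorityAdjustment = conf.getInt(
    YarnConfiguration.NM_CONTAINER_EXECUTOR_SCHED_PRIORITY, 0);
boolean containerSchedPriorityIsSet =
    conf.get(YarnConfiguration.NM_CONTAINER_EXECUTOR_SCHED_PRIORITY) != null;
{code}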
[jira] [Commented] (YARN-2030) Use StateMachine to simplify handleStoreEvent() in RMStateStore
[ https://issues.apache.org/jira/browse/YARN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019588#comment-14019588 ] Jian He commented on YARN-2030: --- I meant add an abstract getProto method. > Use StateMachine to simplify handleStoreEvent() in RMStateStore > --- > > Key: YARN-2030 > URL: https://issues.apache.org/jira/browse/YARN-2030 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junping Du >Assignee: Binglin Chang > Attachments: YARN-2030.v1.patch, YARN-2030.v2.patch, > YARN-2030.v3.patch > > > Now the logic to handle different store events in handleStoreEvent() is as > following: > {code} > if (event.getType().equals(RMStateStoreEventType.STORE_APP) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > ... > try { > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.REMOVE_APP)) { > ... > } else { > ... > } > } > {code} > This is not only confuse people but also led to mistake easily. We may > leverage state machine to simply this even no state transitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2030) Use StateMachine to simplify handleStoreEvent() in RMStateStore
[ https://issues.apache.org/jira/browse/YARN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019581#comment-14019581 ] Jian He commented on YARN-2030: --- I see, thanks for the update. Maybe just promote getProto() to the abstract class, given this record is used internally by RM only? should be fine? > Use StateMachine to simplify handleStoreEvent() in RMStateStore > --- > > Key: YARN-2030 > URL: https://issues.apache.org/jira/browse/YARN-2030 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junping Du >Assignee: Binglin Chang > Attachments: YARN-2030.v1.patch, YARN-2030.v2.patch, > YARN-2030.v3.patch > > > Now the logic to handle different store events in handleStoreEvent() is as > following: > {code} > if (event.getType().equals(RMStateStoreEventType.STORE_APP) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > ... > try { > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.REMOVE_APP)) { > ... > } else { > ... > } > } > {code} > This is not only confuse people but also led to mistake easily. We may > leverage state machine to simply this even no state transitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019574#comment-14019574 ] Jian He commented on YARN-1365: --- bq. The option I see is we pass in a flag to AppAttemptAddedSchedulerEvent that tells the scheduler not to issue ATTEMPT_ADDED. Makes sense. Anubhav, do you want to comment on YARN-1368 so that I can fix it, or do you want to include the fix here? > ApplicationMasterService to allow Register and Unregister of an app that was > running before restart > --- > > Key: YARN-1365 > URL: https://issues.apache.org/jira/browse/YARN-1365 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Bikas Saha >Assignee: Anubhav Dhoot > Attachments: YARN-1365.001.patch, YARN-1365.002.patch, > YARN-1365.003.patch, YARN-1365.initial.patch > > > For an application that was running before restart, the > ApplicationMasterService currently throws an exception when the app tries to > make the initial register or final unregister call. These should succeed and > the RMApp state machine should transition to completed like normal. > Unregistration should succeed for an app that the RM considers complete since > the RM may have died after saving completion in the store but before > notifying the AM that the AM is free to exit. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2122) In AllocationFileLoaderService, the reloadThread should be created in init() and started in start()
[ https://issues.apache.org/jira/browse/YARN-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019565#comment-14019565 ] Karthik Kambatla commented on YARN-2122: {code} public void serviceInit(Configuration conf) { this.allocFile = getAllocationFile(conf); super.init(conf); reloadThread = new Thread() { public void run() { {code} - We should call super.serviceInit instead of super.init, and that call should be the last statement of the method. - Creation of reloadThread should be guarded by if (allocFile != null) {code} public void serviceStart() { if (allocFile == null) { return; } reloadThread.start(); super.start(); } {code} - It is good practice to call super.serviceStart() at the end of this method (not super.start()). So, we should probably not check for (allocFile == null) and instead call reloadThread.start() after checking (allocFile != null). - super.serviceStop instead of super.stop - For the findbugs warning, I see why we don't have to do any additional synchronization. Can we add a findbugs exclusion? > In AllocationFileLoaderService, the reloadThread should be created in init() > and started in start() > --- > > Key: YARN-2122 > URL: https://issues.apache.org/jira/browse/YARN-2122 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Robert Kanter > Attachments: YARN-2122.patch, YARN-2122.patch > > > AllcoationFileLoaderService has this reloadThread that is currently created > and started in start(). Instead, it should be created in init() and started > in start(). -- This message was sent by Atlassian JIRA (v6.2#6252)
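Putting Karthik's points together, the suggested shape is roughly the following (a sketch only; the reload logic and field names are elided/assumed, not taken from the attached patch):
{code}
@Override
public void serviceInit(Configuration conf) throws Exception {
  this.allocFile = getAllocationFile(conf);
  if (allocFile != null) {
    // create, but do not start, the reload thread here
    reloadThread = new Thread() {
      @Override
      public void run() {
        // ... periodically check allocFile and reload the allocations ...
      }
    };
    reloadThread.setName("AllocationFileReloader");
    reloadThread.setDaemon(true);
  }
  super.serviceInit(conf); // parent call goes last
}

@Override
public void serviceStart() throws Exception {
  if (allocFile != null) {
    reloadThread.start();
  }
  super.serviceStart(); // likewise, call the parent last
}
{code}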
[jira] [Commented] (YARN-1514) Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA
[ https://issues.apache.org/jira/browse/YARN-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019561#comment-14019561 ] Karthik Kambatla commented on YARN-1514: The patch looks like a good first-cut. Could you add the configuration options as you mentioned? > Utility to benchmark ZKRMStateStore#loadState for ResourceManager-HA > > > Key: YARN-1514 > URL: https://issues.apache.org/jira/browse/YARN-1514 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tsuyoshi OZAWA >Assignee: Tsuyoshi OZAWA > Fix For: 2.5.0 > > Attachments: YARN-1514.wip.patch > > > ZKRMStateStore is very sensitive to ZNode-related operations as discussed in > YARN-1307, YARN-1378 and so on. Especially, ZKRMStateStore#loadState is > called when RM-HA cluster does failover. Therefore, its execution time > impacts failover time of RM-HA. > We need utility to benchmark time execution time of ZKRMStateStore#loadStore > as development tool. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1874) Cleanup: Move RMActiveServices out of ResourceManager into its own file
[ https://issues.apache.org/jira/browse/YARN-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019558#comment-14019558 ] Karthik Kambatla commented on YARN-1874: Barely skimmed through the patch, changes look reasonable. However, it would be easier if we could split this into smaller patches. At the least, RMContext related parts could be done in a separate JIRA first. > Cleanup: Move RMActiveServices out of ResourceManager into its own file > --- > > Key: YARN-1874 > URL: https://issues.apache.org/jira/browse/YARN-1874 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Karthik Kambatla >Assignee: Tsuyoshi OZAWA > Attachments: YARN-1874.1.patch, YARN-1874.2.patch, YARN-1874.3.patch, > YARN-1874.4.patch > > > As [~vinodkv] noticed on YARN-1867, ResourceManager is hard to maintain. We > should move RMActiveServices out to make it more manageable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019541#comment-14019541 ] Karthik Kambatla commented on YARN-1424: In my opinion, DUMMY_APPLICATION_RESOURCE_USAGE_REPORT should have zeroes for all resource-related values as opposed to -1. In case people fetch these reports and try to accumulate statistics, -1 would throw their math off. The app_id itself has -1 in it, so I don't see a risk of it being misconstrued. > RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to > return when attempt not active > > > Key: YARN-1424 > URL: https://issues.apache.org/jira/browse/YARN-1424 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Sandy Ryza >Assignee: Ray Chiang >Priority: Minor > Labels: newbie > Attachments: YARN1424-01.patch > > > RMAppImpl has a DUMMY_APPLICATION_RESOURCE_USAGE_REPORT to return when the > caller of createAndGetApplicationReport doesn't have access. > RMAppAttemptImpl should have something similar for > getApplicationResourceUsageReport. > It also might make sense to put the dummy report into > ApplicationResourceUsageReport and allow both to use it. > A test would also be useful to verify that > RMAppAttemptImpl#getApplicationResourceUsageReport doesn't return null if the > scheduler doesn't have a report to return. -- This message was sent by Atlassian JIRA (v6.2#6252)
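For illustration, the zeroed report could be precomputed along these lines (a sketch; the exact ApplicationResourceUsageReport.newInstance arguments are assumed rather than copied from the attached patch):
{code}
// Sketch: an all-zero usage report to hand back when the attempt is not active.
private static final ApplicationResourceUsageReport DUMMY_APPLICATION_RESOURCE_USAGE_REPORT =
    ApplicationResourceUsageReport.newInstance(0, 0,
        Resources.createResource(0, 0),   // used
        Resources.createResource(0, 0),   // reserved
        Resources.createResource(0, 0));  // needed
{code}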
[jira] [Commented] (YARN-2074) Preemption of AM containers shouldn't count towards AM failures
[ https://issues.apache.org/jira/browse/YARN-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019510#comment-14019510 ] Wangda Tan commented on YARN-2074: -- Can we populate "attemptFailureCount" to the AM when the AM registers? This should be a valuable fix; other applications besides MR can benefit from this too. I think it's better to create a separate JIRA to track this. [~jianhe], [~mayank_bansal] Any thoughts? > Preemption of AM containers shouldn't count towards AM failures > --- > > Key: YARN-2074 > URL: https://issues.apache.org/jira/browse/YARN-2074 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Vinod Kumar Vavilapalli >Assignee: Jian He > Attachments: YARN-2074.1.patch, YARN-2074.2.patch, YARN-2074.3.patch > > > One orthogonal concern with issues like YARN-2055 and YARN-2022 is that AM > containers getting preempted shouldn't count towards AM failures and thus > shouldn't eventually fail applications. > We should explicitly handle AM container preemption/kill as a separate issue > and not count it towards the limit on AM failures. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2120) Coloring queues running over minShare on RM Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019494#comment-14019494 ] Siqi Li commented on YARN-2120: --- [~ashwinshankar77] I have attached a new screenshot, which retains the original color format for fairShare. Additionally, it uses blue and red border colors to indicate whether a queue exceeds its minShare or maxShare. If minShare is not set, the blue border is not displayed. I will run some tests and upload the patch shortly. > Coloring queues running over minShare on RM Scheduler page > -- > > Key: YARN-2120 > URL: https://issues.apache.org/jira/browse/YARN-2120 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: 1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png, > YARN-2120.v1.patch > > > Today RM Scheduler page shows FairShare, Used, Used (over fair share) and > MaxCapacity. > Since fairShare is displaying with dotted line, I think we can stop > displaying orange when a queue over its fairshare. > It would be better to show a queue running over minShare with orange color, > so that we know queue is running more than its min share. > Also, we can display a queue running at maxShare with red color. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2120) Coloring queues running over minShare on RM Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019488#comment-14019488 ] Hadoop QA commented on YARN-2120: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12648584/1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3918//console This message is automatically generated. > Coloring queues running over minShare on RM Scheduler page > -- > > Key: YARN-2120 > URL: https://issues.apache.org/jira/browse/YARN-2120 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: 1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png, > YARN-2120.v1.patch > > > Today RM Scheduler page shows FairShare, Used, Used (over fair share) and > MaxCapacity. > Since fairShare is displaying with dotted line, I think we can stop > displaying orange when a queue over its fairshare. > It would be better to show a queue running over minShare with orange color, > so that we know queue is running more than its min share. > Also, we can display a queue running at maxShare with red color. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2128) SchedulerApplicationAttempt's amResource should be normalized instead of fetching from ApplicationSubmissionContext directly
[ https://issues.apache.org/jira/browse/YARN-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019486#comment-14019486 ] Hadoop QA commented on YARN-2128: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12648575/YARN-2128.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3917//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3917//console This message is automatically generated. > SchedulerApplicationAttempt's amResource should be normalized instead of > fetching from ApplicationSubmissionContext directly > > > Key: YARN-2128 > URL: https://issues.apache.org/jira/browse/YARN-2128 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-2128.patch > > > The amResource should be normalized. > {code} > ApplicationSubmissionContext appSubmissionContext = > rmContext.getRMApps().get(applicationAttemptId.getApplicationId()) > .getApplicationSubmissionContext(); > if (appSubmissionContext != null) { > amResource = appSubmissionContext.getResource(); > unmanagedAM = appSubmissionContext.getUnmanagedAM(); > } > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2120) Coloring queues running over minShare on RM Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2120: -- Attachment: (was: AD45B623-9F14-420B-B1FB-1186E2B5EC4A.png) > Coloring queues running over minShare on RM Scheduler page > -- > > Key: YARN-2120 > URL: https://issues.apache.org/jira/browse/YARN-2120 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: 1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png, > YARN-2120.v1.patch > > > Today RM Scheduler page shows FairShare, Used, Used (over fair share) and > MaxCapacity. > Since fairShare is displaying with dotted line, I think we can stop > displaying orange when a queue over its fairshare. > It would be better to show a queue running over minShare with orange color, > so that we know queue is running more than its min share. > Also, we can display a queue running at maxShare with red color. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2120) Coloring queues running over minShare on RM Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2120: -- Attachment: 1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png > Coloring queues running over minShare on RM Scheduler page > -- > > Key: YARN-2120 > URL: https://issues.apache.org/jira/browse/YARN-2120 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: 1595210E-C2D6-42D3-8C1D-73DFF1614CF9.png, > YARN-2120.v1.patch > > > Today RM Scheduler page shows FairShare, Used, Used (over fair share) and > MaxCapacity. > Since fairShare is displaying with dotted line, I think we can stop > displaying orange when a queue over its fairshare. > It would be better to show a queue running over minShare with orange color, > so that we know queue is running more than its min share. > Also, we can display a queue running at maxShare with red color. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2128) SchedulerApplicationAttempt's amResource should be normalized instead of fetching from ApplicationSubmissionContext directly
[ https://issues.apache.org/jira/browse/YARN-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-2128: -- Attachment: YARN-2128.patch > SchedulerApplicationAttempt's amResource should be normalized instead of > fetching from ApplicationSubmissionContext directly > > > Key: YARN-2128 > URL: https://issues.apache.org/jira/browse/YARN-2128 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wei Yan >Assignee: Wei Yan > Attachments: YARN-2128.patch > > > The amResource should be normalized. > {code} > ApplicationSubmissionContext appSubmissionContext = > rmContext.getRMApps().get(applicationAttemptId.getApplicationId()) > .getApplicationSubmissionContext(); > if (appSubmissionContext != null) { > amResource = appSubmissionContext.getResource(); > unmanagedAM = appSubmissionContext.getUnmanagedAM(); > } > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2128) SchedulerApplicationAttempt's amResource should be normalized instead of fetching from ApplicationSubmissionContext directly
Wei Yan created YARN-2128: - Summary: SchedulerApplicationAttempt's amResource should be normalized instead of fetching from ApplicationSubmissionContext directly Key: YARN-2128 URL: https://issues.apache.org/jira/browse/YARN-2128 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan The amResource should be normalized. {code} ApplicationSubmissionContext appSubmissionContext = rmContext.getRMApps().get(applicationAttemptId.getApplicationId()) .getApplicationSubmissionContext(); if (appSubmissionContext != null) { amResource = appSubmissionContext.getResource(); unmanagedAM = appSubmissionContext.getUnmanagedAM(); } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
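A hedged sketch of the direction the summary suggests is below; Resources.normalize is a real utility, but the calculator and min/max/increment values would have to be supplied by the scheduler, so those names are placeholders rather than the actual patch:
{code}
// Sketch only: normalize the AM resource instead of using the raw submission value.
// 'calculator', 'minAlloc', 'maxAlloc' and 'incrAlloc' are assumed to be provided
// by the scheduler at this point.
if (appSubmissionContext != null) {
  amResource = Resources.normalize(calculator,
      appSubmissionContext.getResource(), minAlloc, maxAlloc, incrAlloc);
  unmanagedAM = appSubmissionContext.getUnmanagedAM();
}
{code}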
[jira] [Updated] (YARN-2126) The FSLeafQueue.amResourceUsage shouldn't be updated when an Application removed before it runs AM
[ https://issues.apache.org/jira/browse/YARN-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-2126: -- Description: When an application is removed, the FSLeafQueue updates its amResourceUsage. {code} if (runnableAppScheds.remove(app.getAppSchedulable())) { // Update AM resource usage if (app.getAMResource() != null) { Resources.subtractFrom(amResourceUsage, app.getAMResource()); } return true; } {code} If an application is removed before it has a chance to start its AM, the amResourceUsage shouldn't be updated. was: When an application is removed, the FSLeafQueue updates its amResourceUsage. {code} if (runnableAppScheds.remove(app.getAppSchedulable())) { // Update AM resource usage if (app.getAMResource() != null) { Resources.subtractFrom(amResourceUsage, app.getAMResource()); } return true; } {code} If an application is removed before it has a change to start its AM, the amResourceUsage shouldn't be updated. > The FSLeafQueue.amResourceUsage shouldn't be updated when an Application > removed before it runs AM > -- > > Key: YARN-2126 > URL: https://issues.apache.org/jira/browse/YARN-2126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wei Yan >Assignee: Wei Yan > Fix For: 2.5.0 > > > When an application is removed, the FSLeafQueue updates its amResourceUsage. > {code} > if (runnableAppScheds.remove(app.getAppSchedulable())) { > // Update AM resource usage > if (app.getAMResource() != null) { > Resources.subtractFrom(amResourceUsage, app.getAMResource()); > } > return true; > } > {code} > If an application is removed before it has a chance to start its AM, the > amResourceUsage shouldn't be updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
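One possible shape of the fix, sketched from the description above (it assumes a flag such as isAmRunning() on the application that records whether the AM was ever launched; names are illustrative, not the committed change):
{code}
// Sketch: only subtract the AM resource if it was actually added to amResourceUsage,
// i.e. the AM had started before the application was removed.
if (runnableAppScheds.remove(app.getAppSchedulable())) {
  if (app.isAmRunning() && app.getAMResource() != null) {
    Resources.subtractFrom(amResourceUsage, app.getAMResource());
  }
  return true;
}
{code}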
[jira] [Commented] (YARN-2127) Move YarnUncaughtExceptionHandler into Hadoop common
[ https://issues.apache.org/jira/browse/YARN-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019165#comment-14019165 ] Steve Loughran commented on YARN-2127: -- I can incorporate this into YARN-679 easily enough - I just wanted to flag it as one of the actions I'd like to do. The YARN-679 service launcher does not depend on it - but its throwable-catching logic would be flawed without it. > Move YarnUncaughtExceptionHandler into Hadoop common > > > Key: YARN-2127 > URL: https://issues.apache.org/jira/browse/YARN-2127 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Priority: Minor > Original Estimate: 0.5h > Remaining Estimate: 0.5h > > Create a superclass of {{YarnUncaughtExceptionHandler}} in the hadoop-common > code (retaining the original for compatibility). > This would be available for any hadoop application to use, and the YARN-679 > launcher could automatically set up the handler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2127) Move YarnUncaughtExceptionHandler into Hadoop common
Steve Loughran created YARN-2127: Summary: Move YarnUncaughtExceptionHandler into Hadoop common Key: YARN-2127 URL: https://issues.apache.org/jira/browse/YARN-2127 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.4.0 Reporter: Steve Loughran Priority: Minor Create a superclass of {{YarnUncaughtExceptionHandler}} in the hadoop-common code (retaining the original for compatibility). This would be available for any hadoop application to use, and the YARN-679 launcher could automatically set up the handler. -- This message was sent by Atlassian JIRA (v6.2#6252)
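Roughly, the common superclass could look like the sketch below, assuming the behavior being pulled up matches what the YARN handler already does (log the throwable; bail out fast on an Error). The class name and exit code are illustrative:
{code}
// Sketch: a generic handler in hadoop-common that YarnUncaughtExceptionHandler could extend.
public class HadoopUncaughtExceptionHandler implements Thread.UncaughtExceptionHandler {
  private static final Log LOG = LogFactory.getLog(HadoopUncaughtExceptionHandler.class);

  @Override
  public void uncaughtException(Thread t, Throwable e) {
    if (e instanceof Error) {
      // JVM state is suspect; halt rather than attempting a clean shutdown.
      ExitUtil.halt(-1, "Thread " + t + " threw an Error: " + e);
    } else {
      LOG.error("Thread " + t + " threw an Exception", e);
    }
  }
}
{code}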
[jira] [Updated] (YARN-2030) Use StateMachine to simplify handleStoreEvent() in RMStateStore
[ https://issues.apache.org/jira/browse/YARN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated YARN-2030: Attachment: YARN-2030.v3.patch Thanks for the comments [~djp] and [~jianhe]. I updated the patch to make ApplicationAttemptStateData and ApplicationStateData abstract classes. bq. Accordingly storeApplicationStateInternal can take in ApplicationStateData instead of ApplicationStateDataPBImpl as the argument to avoid the type cast. I tried to change the updateApplicationAttemptStateInternal parameter type from PBImpl to the abstract records, but it looks like some RMStateStore implementations (FileSystemRMStateStore and ZKRMStateStore) require the parameter to be a PBImpl (so they can use toProto to serialize). > Use StateMachine to simplify handleStoreEvent() in RMStateStore > --- > > Key: YARN-2030 > URL: https://issues.apache.org/jira/browse/YARN-2030 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junping Du >Assignee: Binglin Chang > Attachments: YARN-2030.v1.patch, YARN-2030.v2.patch, > YARN-2030.v3.patch > > > Now the logic to handle different store events in handleStoreEvent() is as > following: > {code} > if (event.getType().equals(RMStateStoreEventType.STORE_APP) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > ... > try { > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.REMOVE_APP)) { > ... > } else { > ... > } > } > {code} > This is not only confuse people but also led to mistake easily. We may > leverage state machine to simply this even no state transitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
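For context, the shape under discussion is roughly the following (a sketch; the factory parameters and the proto type's package are omitted/assumed rather than taken from the v3 patch):
{code}
// Sketch: ApplicationStateData as an abstract record that exposes getProto(),
// so RMStateStore implementations can serialize without casting to the PBImpl.
public abstract class ApplicationStateData {
  // the existing newInstance(...) factory and getters would remain here
  public abstract ApplicationStateDataProto getProto();
}
{code}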
[jira] [Updated] (YARN-2126) The FSLeafQueue.amResourceUsage shouldn't be updated when an Application removed before it runs AM
[ https://issues.apache.org/jira/browse/YARN-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-2126: -- Fix Version/s: 2.5.0 > The FSLeafQueue.amResourceUsage shouldn't be updated when an Application > removed before it runs AM > -- > > Key: YARN-2126 > URL: https://issues.apache.org/jira/browse/YARN-2126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wei Yan >Assignee: Wei Yan > Fix For: 2.5.0 > > > When an application is removed, the FSLeafQueue updates its amResourceUsage. > {code} > if (runnableAppScheds.remove(app.getAppSchedulable())) { > // Update AM resource usage > if (app.getAMResource() != null) { > Resources.subtractFrom(amResourceUsage, app.getAMResource()); > } > return true; > } > {code} > If an application is removed before it has a change to start its AM, the > amResourceUsage shouldn't be updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1977) Add tests on getApplicationRequest with filtering start time range
[ https://issues.apache.org/jira/browse/YARN-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018901#comment-14018901 ] Hudson commented on YARN-1977: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5651 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5651/]) YARN-1977. Add tests on getApplicationRequest with filtering start time range. (Contributed by Junping Du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600644) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java > Add tests on getApplicationRequest with filtering start time range > -- > > Key: YARN-1977 > URL: https://issues.apache.org/jira/browse/YARN-1977 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Junping Du >Priority: Minor > Fix For: 2.5.0 > > Attachments: YARN-1977.patch > > > There is no unit test to verify if request with start time range works to get > right application list, we should add it. -- This message was sent by Atlassian JIRA (v6.2#6252)
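The added coverage is essentially of this form (a sketch; the setter and assertion details are assumed from the description, not copied from the committed test):
{code}
// Sketch: ask for applications whose start time falls inside a given range and
// verify only the matching ones come back.
GetApplicationsRequest request = GetApplicationsRequest.newInstance();
request.setStartRange(submitTimeBegin, submitTimeEnd); // assumed setter for the start-time range
GetApplicationsResponse response = rmService.getApplications(request);
Assert.assertEquals("Only apps started inside the range should be returned",
    1, response.getApplicationList().size());
{code}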
[jira] [Created] (YARN-2126) The FSLeafQueue.amResourceUsage shouldn't be updated when an Application removed before it runs AM
Wei Yan created YARN-2126: - Summary: The FSLeafQueue.amResourceUsage shouldn't be updated when an Application removed before it runs AM Key: YARN-2126 URL: https://issues.apache.org/jira/browse/YARN-2126 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan When an application is removed, the FSLeafQueue updates its amResourceUsage. {code} if (runnableAppScheds.remove(app.getAppSchedulable())) { // Update AM resource usage if (app.getAMResource() != null) { Resources.subtractFrom(amResourceUsage, app.getAMResource()); } return true; } {code} If an application is removed before it has a chance to start its AM, the amResourceUsage shouldn't be updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018858#comment-14018858 ] Hudson commented on YARN-2061: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1792 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1792/]) YARN-2061. Revisit logging levels in ZKRMStateStore. (Ray Chiang via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600498) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java > Revisit logging levels in ZKRMStateStore > - > > Key: YARN-2061 > URL: https://issues.apache.org/jira/browse/YARN-2061 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Ray Chiang >Priority: Minor > Labels: newbie > Fix For: 2.5.0 > > Attachments: YARN2061-01.patch > > > ZKRMStateStore has a few places where it is logging at the INFO level. We > should change these to DEBUG or TRACE level messages. -- This message was sent by Atlassian JIRA (v6.2#6252)
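The change amounts to demoting chatty per-operation messages; where a message is still useful for debugging, the usual guarded pattern applies (illustrative snippet, not a line from the patch; the variables are placeholders):
{code}
// Sketch: INFO-level chatter demoted to guarded DEBUG.
if (LOG.isDebugEnabled()) {
  LOG.debug("Storing state for app " + appId + " at znode " + nodeCreatePath);
}
{code}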
[jira] [Commented] (YARN-2119) DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT
[ https://issues.apache.org/jira/browse/YARN-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018857#comment-14018857 ] Hudson commented on YARN-2119: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1792 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1792/]) YARN-2119. DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT. (Anubhav Dhoot via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600484) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/test/java/org/apache/hadoop/yarn/server/webproxy/TestWebAppProxyServer.java > DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT > --- > > Key: YARN-2119 > URL: https://issues.apache.org/jira/browse/YARN-2119 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.5.0 > > Attachments: YARN-2119.patch > > > The fix for [YARN-1590|https://issues.apache.org/jira/browse/YARN-1590] > introduced an method to get web proxy bind address with the incorrect default > port. Because all the users of the method (only 1 user) ignores the port, its > not breaking anything yet. Fixing it in case someone else uses this in the > future. -- This message was sent by Atlassian JIRA (v6.2#6252)
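In other words, the default address constant should be derived from the port constant so the two cannot drift apart; a sketch of the intended wiring (the literal values here are illustrative, not necessarily the real defaults):
{code}
// Sketch: tie DEFAULT_PROXY_ADDRESS to DEFAULT_PROXY_PORT instead of hard-coding another port.
public static final int DEFAULT_PROXY_PORT = 9099; // illustrative value
public static final String DEFAULT_PROXY_ADDRESS = "0.0.0.0:" + DEFAULT_PROXY_PORT;
{code}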
[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated YARN-1972: --- Description: h1. Windows Secure Container Executor (WCE) YARN-1063 adds the necessary infrastructure to launch a process as a domain user as a solution for the problem of having a security boundary between processes executed in YARN containers and the Hadoop services. The WCE is a container executor that leverages the winutils capabilities introduced in YARN-1063 and launches containers as an OS process running as the job submitter user. A description of the S4U infrastructure used by YARN-1063 and of the alternatives considered can be read on that JIRA. The WCE is based on the DefaultContainerExecutor. It relies on the DCE to drive the flow of execution, but it overrides some methods to the effect of: * changes the DCE-created user cache directories to be owned by the job user and by the nodemanager group. * changes the actual container run command to use the 'createAsUser' command of winutils task instead of 'create' * runs the localization as a standalone process instead of an in-process Java method call. This in turn relies on the winutils createAsUser feature to run the localization as the job user. When compared to LinuxContainerExecutor (LCE), the WCE has some minor differences: * it does not delegate the creation of the user cache directories to the native implementation. * it does not require special handling to be able to delete user files. The approach on the WCE came from practical trial and error. I had to iron out some issues around the Windows script shell limitations (command line length) to get it to work, the biggest issue being the huge CLASSPATH that is commonplace in Hadoop container executions. The job container itself is already dealing with this via a so-called 'classpath jar'; see HADOOP-8899 and YARN-316 for details. For the WCE localizer, launched as a separate container, the same issue had to be resolved, and I used the same 'classpath jar' approach. h2. Deployment Requirements To use the WCE, one needs to set `yarn.nodemanager.container-executor.class` to `org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor` and set `yarn.nodemanager.windows-secure-container-executor.group` to a Windows security group name that the nodemanager service principal is a member of (the equivalent of the LCE's `yarn.nodemanager.linux-container-executor.group`). Unlike the LCE, the WCE does not require any configuration outside of Hadoop's own yarn-site.xml. For the WCE to work, the nodemanager must run as a service principal that is a member of the local Administrators group or as LocalSystem. This is derived from the need to invoke the LoadUserProfile API, which mentions these requirements in its specification. This is in addition to the SE_TCB privilege mentioned in YARN-1063, but this requirement automatically implies that the SE_TCB privilege is held by the nodemanager. For the Linux speakers in the audience, the requirement is basically to run the NM as root. h2. Dedicated high privilege Service Due to the high privilege required by the WCE, we had discussed the need to isolate the high-privilege operations into a separate process, an 'executor' service that is solely responsible for starting the containers (including the localizer). The NM would have to authenticate, authorize and communicate with this service via an IPC mechanism and use this service to launch the containers.
I still believe we'll end up deploying such a service, but the effort to onboard such a new platform-specific service on the project is not trivial. was: Windows Secure Container Executor (WCE) YARN-1063 adds the necessary infrasturcture to launch a process as a domain user as a solution for the problem of having a security boundary between processes executed in YARN containers and the Hadoop services. The WCE is a container executor that leverages the winutils capabilities introduced in YARN-1063 and launches containers as an OS process running as the job submitter user. A description of the S4U infrastructure used by YARN-1063 alternatives considered can be read on that JIRA. The WCE is based on the DefaultContainerExecutor. It relies on the DCE to drive the flow of execution, but it overwrrides some emthods to the effect of: - change the DCE created user cache directories to be owned by the job user and by the nodemanager group. - changes the actual container run command to use the 'createAsUser' command of winutils task instead of 'create' - runs the localization as standalone process instead of an in-process Java method call. This in turn relies on the winutil createAsUser feature to run the localization as the job user. When compared to LinuxContainerExecutor (LCE), the WCE has some minor differences: - it doe
[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated YARN-1972: --- Description: Windows Secure Container Executor (WCE) YARN-1063 adds the necessary infrastructure to launch a process as a domain user as a solution for the problem of having a security boundary between processes executed in YARN containers and the Hadoop services. The WCE is a container executor that leverages the winutils capabilities introduced in YARN-1063 and launches containers as an OS process running as the job submitter user. A description of the S4U infrastructure used by YARN-1063 and of the alternatives considered can be read on that JIRA. The WCE is based on the DefaultContainerExecutor. It relies on the DCE to drive the flow of execution, but it overrides some methods to the effect of: - changes the DCE-created user cache directories to be owned by the job user and by the nodemanager group. - changes the actual container run command to use the 'createAsUser' command of winutils task instead of 'create' - runs the localization as a standalone process instead of an in-process Java method call. This in turn relies on the winutils createAsUser feature to run the localization as the job user. When compared to LinuxContainerExecutor (LCE), the WCE has some minor differences: - it does not delegate the creation of the user cache directories to the native implementation. - it does not require special handling to be able to delete user files. The approach on the WCE came from practical trial and error. I had to iron out some issues around the Windows script shell limitations (command line length) to get it to work, the biggest issue being the huge CLASSPATH that is commonplace in Hadoop container executions. The job container itself is already dealing with this via a so-called 'classpath jar'; see HADOOP-8899 and YARN-316 for details. For the WCE localizer, launched as a separate container, the same issue had to be resolved, and I used the same 'classpath jar' approach. Deployment Requirements --- To use the WCE, one needs to set `yarn.nodemanager.container-executor.class` to `org.apache.hadoop.yarn.server.nodemanager.WindowsSecureContainerExecutor` and set `yarn.nodemanager.windows-secure-container-executor.group` to a Windows security group name that the nodemanager service principal is a member of (the equivalent of the LCE's `yarn.nodemanager.linux-container-executor.group`). Unlike the LCE, the WCE does not require any configuration outside of Hadoop's own yarn-site.xml. For the WCE to work, the nodemanager must run as a service principal that is a member of the local Administrators group or as LocalSystem. This is derived from the need to invoke the LoadUserProfile API, which mentions these requirements in its specification. This is in addition to the SE_TCB privilege mentioned in YARN-1063, but this requirement automatically implies that the SE_TCB privilege is held by the nodemanager. For the Linux speakers in the audience, the requirement is basically to run the NM as root. Dedicated high privilege Service Due to the high privilege required by the WCE, we had discussed the need to isolate the high-privilege operations into a separate process, an 'executor' service that is solely responsible for starting the containers (including the localizer). The NM would have to authenticate, authorize and communicate with this service via an IPC mechanism and use this service to launch the containers.
I still believe we'll end up deploying such a service, but the effort to onboard such a new platform-specific service on the project is not trivial. was: This work item represents the Java side changes required to implement a secure windows container executor, based on the YARN-1063 changes on native/winutils side. Necessary changes include leveraging the winutils task createas to launch the container process as the required user and a secure localizer (launch localization as a separate process running as the container user). > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch, YARN-1972.2.patch > > > Windows Secure Container Executor (WCE) > > YARN-1063 adds the necessary infrasturcture to launch a process as a domain > user as a solution for the problem of having a security boundary between > processes executed in YARN containers and the Hadoop ser
[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018798#comment-14018798 ] Hudson commented on YARN-2061: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1765 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1765/]) YARN-2061. Revisit logging levels in ZKRMStateStore. (Ray Chiang via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600498) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java > Revisit logging levels in ZKRMStateStore > - > > Key: YARN-2061 > URL: https://issues.apache.org/jira/browse/YARN-2061 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Ray Chiang >Priority: Minor > Labels: newbie > Fix For: 2.5.0 > > Attachments: YARN2061-01.patch > > > ZKRMStateStore has a few places where it is logging at the INFO level. We > should change these to DEBUG or TRACE level messages. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2119) DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT
[ https://issues.apache.org/jira/browse/YARN-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018797#comment-14018797 ] Hudson commented on YARN-2119: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1765 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1765/]) YARN-2119. DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT. (Anubhav Dhoot via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600484) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/test/java/org/apache/hadoop/yarn/server/webproxy/TestWebAppProxyServer.java > DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT > --- > > Key: YARN-2119 > URL: https://issues.apache.org/jira/browse/YARN-2119 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.5.0 > > Attachments: YARN-2119.patch > > > The fix for [YARN-1590|https://issues.apache.org/jira/browse/YARN-1590] > introduced an method to get web proxy bind address with the incorrect default > port. Because all the users of the method (only 1 user) ignores the port, its > not breaking anything yet. Fixing it in case someone else uses this in the > future. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2030) Use StateMachine to simplify handleStoreEvent() in RMStateStore
[ https://issues.apache.org/jira/browse/YARN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018758#comment-14018758 ] Junping Du commented on YARN-2030: -- bq. Junping Du, can you help with the review and commit ? thx. Sure, I am glad to help here. [~decster], we want an abstract class here because we want to provide as simple an interface as possible to the end user (or AM), who can simply call ApplicationStateData.newInstance() without involving the complexity of ApplicationStateDataPBImpl. Does that make sense? > Use StateMachine to simplify handleStoreEvent() in RMStateStore > --- > > Key: YARN-2030 > URL: https://issues.apache.org/jira/browse/YARN-2030 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Junping Du >Assignee: Binglin Chang > Attachments: YARN-2030.v1.patch, YARN-2030.v2.patch > > > Now the logic to handle different store events in handleStoreEvent() is as > following: > {code} > if (event.getType().equals(RMStateStoreEventType.STORE_APP) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > ... > try { > if (event.getType().equals(RMStateStoreEventType.STORE_APP)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT) > || event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT)) { > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > ... > if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) { > ... > } else { > ... > } > } > ... > } else if (event.getType().equals(RMStateStoreEventType.REMOVE_APP)) { > ... > } else { > ... > } > } > {code} > This is not only confuse people but also led to mistake easily. We may > leverage state machine to simply this even no state transitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1977) Add tests on getApplicationRequest with filtering start time range
[ https://issues.apache.org/jira/browse/YARN-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018709#comment-14018709 ] Hadoop QA commented on YARN-1977: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12641524/YARN-1977.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3915//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3915//console This message is automatically generated. > Add tests on getApplicationRequest with filtering start time range > -- > > Key: YARN-1977 > URL: https://issues.apache.org/jira/browse/YARN-1977 > Project: Hadoop YARN > Issue Type: Test >Reporter: Junping Du >Assignee: Junping Du >Priority: Minor > Fix For: 2.4.1 > > Attachments: YARN-1977.patch > > > There is no unit test to verify if request with start time range works to get > right application list, we should add it. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2061) Revisit logging levels in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018682#comment-14018682 ] Hudson commented on YARN-2061: -- FAILURE: Integrated in Hadoop-Yarn-trunk #574 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/574/]) YARN-2061. Revisit logging levels in ZKRMStateStore. (Ray Chiang via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600498) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java > Revisit logging levels in ZKRMStateStore > - > > Key: YARN-2061 > URL: https://issues.apache.org/jira/browse/YARN-2061 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Ray Chiang >Priority: Minor > Labels: newbie > Fix For: 2.5.0 > > Attachments: YARN2061-01.patch > > > ZKRMStateStore has a few places where it is logging at the INFO level. We > should change these to DEBUG or TRACE level messages. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2119) DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT
[ https://issues.apache.org/jira/browse/YARN-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018681#comment-14018681 ] Hudson commented on YARN-2119: -- FAILURE: Integrated in Hadoop-Yarn-trunk #574 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/574/]) YARN-2119. DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT. (Anubhav Dhoot via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1600484) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/test/java/org/apache/hadoop/yarn/server/webproxy/TestWebAppProxyServer.java > DEFAULT_PROXY_ADDRESS should use DEFAULT_PROXY_PORT > --- > > Key: YARN-2119 > URL: https://issues.apache.org/jira/browse/YARN-2119 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.5.0 > > Attachments: YARN-2119.patch > > > The fix for [YARN-1590|https://issues.apache.org/jira/browse/YARN-1590] > introduced an method to get web proxy bind address with the incorrect default > port. Because all the users of the method (only 1 user) ignores the port, its > not breaking anything yet. Fixing it in case someone else uses this in the > future. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018642#comment-14018642 ] Remus Rusanu commented on YARN-1972: There is more feedback to address (DRY between LCE and WCE localization launch, proper place for localization classpath jar). I do not plan to address launch nice-ness in this JIRA. > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch, YARN-1972.2.patch > > > This work item represents the Java side changes required to implement a > secure windows container executor, based on the YARN-1063 changes on > native/winutils side. > Necessary changes include leveraging the winutils task createas to launch the > container process as the required user and a secure localizer (launch > localization as a separate process running as the container user). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated YARN-1972: --- Attachment: YARN-1972.2.patch Patch.2, rebased to current trunk, with review feedback: - DCE and WCE no longer create user file cache, this is done solely by the localizer initDirs. DCE Test modified to reflect this. DCE.createUserCacheDirs renamed to createUserAppCacheDirs accordingly - namenodeGroup -> nodeManagerGroup - removed appLocalizationCounter, use locId instead (container ID) as the winutils "jobName" for the localizer runas launch > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch, YARN-1972.2.patch > > > This work item represents the Java side changes required to implement a > secure windows container executor, based on the YARN-1063 changes on > native/winutils side. > Necessary changes include leveraging the winutils task createas to launch the > container process as the required user and a secure localizer (launch > localization as a separate process running as the container user). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018625#comment-14018625 ] Remus Rusanu commented on YARN-1972: Fix is trivial, WCE startLocalizer should use locId for the winutils jobName, not appId. > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch > > > This work item represents the Java side changes required to implement a > secure windows container executor, based on the YARN-1063 changes on > native/winutils side. > Necessary changes include leveraging the winutils task createas to launch the > container process as the required user and a secure localizer (launch > localization as a separate process running as the container user). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2124) ProportionalCapacityPreemptionPolicy cannot work because it's initialized before scheduler initialized
[ https://issues.apache.org/jira/browse/YARN-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018593#comment-14018593 ] Hadoop QA commented on YARN-2124: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12648441/YARN-2124.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3914//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3914//console This message is automatically generated. > ProportionalCapacityPreemptionPolicy cannot work because it's initialized > before scheduler initialized > -- > > Key: YARN-2124 > URL: https://issues.apache.org/jira/browse/YARN-2124 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 3.0.0 >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-2124.patch > > > When I play with scheduler with preemption, I found > ProportionalCapacityPreemptionPolicy cannot work. NPE will be raised when RM > start > {code} > 2014-06-05 11:01:33,201 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw > an Exception. > java.lang.NullPointerException > at > org.apache.hadoop.yarn.util.resource.Resources.greaterThan(Resources.java:225) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.computeIdealResourceDistribution(ProportionalCapacityPreemptionPolicy.java:302) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.recursivelyComputeIdealAssignment(ProportionalCapacityPreemptionPolicy.java:261) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:198) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:174) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82) > at java.lang.Thread.run(Thread.java:744) > {code} > This is caused by ProportionalCapacityPreemptionPolicy needs > ResourceCalculator from CapacityScheduler. 
But > ProportionalCapacityPreemptionPolicy get initialized before CapacityScheduler > initialized. So ResourceCalculator will set to null in > ProportionalCapacityPreemptionPolicy. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2125) ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled
[ https://issues.apache.org/jira/browse/YARN-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018592#comment-14018592 ] Hadoop QA commented on YARN-2125: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12648444/YARN-2125.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3913//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3913//console This message is automatically generated. > ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled > --- > > Key: YARN-2125 > URL: https://issues.apache.org/jira/browse/YARN-2125 > Project: Hadoop YARN > Issue Type: Task > Components: resourcemanager, scheduler >Affects Versions: 3.0.0 >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Minor > Attachments: YARN-2125.patch > > > Currently, logToCSV() will be output using LOG.info() in > ProportionalCapacityPreemptionPolicy. Which will generate non-human-readable > texts in resource manager's log every several seconds, like > {code} > ... > 2014-06-05 15:57:07,603 INFO > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: > QUEUESTATE: 1401955027603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, > 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, > 3072, 2, 3072, 2, 0, 0, 0, 0 > 2014-06-05 15:57:10,603 INFO > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: > QUEUESTATE: 1401955030603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, > 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, > 3072, 2, 3072, 2, 0, 0, 0, 0 > ... > {code} > It's better to make it output when debug enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018583#comment-14018583 ] Remus Rusanu commented on YARN-1972: Tracked this down to {code}LocalizerRunner.run(){code}: {code} exec.startLocalizer(nmPrivateCTokensPath, localizationServerAddress, context.getUser(), ConverterUtils.toString( context.getContainerId(). getApplicationAttemptId().getApplicationId()), {code} Notice the use of application id, not attempt id when launching the localizer. I will change this to attempt id to eliminate the possibility of duplicates. > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch > > > This work item represents the Java side changes required to implement a > secure windows container executor, based on the YARN-1063 changes on > native/winutils side. > Necessary changes include leveraging the winutils task createas to launch the > container process as the required user and a secure localizer (launch > localization as a separate process running as the container user). -- This message was sent by Atlassian JIRA (v6.2#6252)
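For clarity, the difference between the identifier passed today and the one proposed above can be shown side by side. This is an illustrative sketch only; the class and method names are hypothetical, and the actual call site is LocalizerRunner.run() as quoted above.
{code}
import org.apache.hadoop.yarn.api.records.ContainerId;

// Hypothetical helper names, for illustration only.
public class LocalizerIdSketch {
  // Today: the application id string -- identical for every container of the
  // same application localizing on this node, hence the duplicates.
  static String currentLocalizerId(ContainerId containerId) {
    return containerId.getApplicationAttemptId().getApplicationId().toString();
  }

  // Proposed in the comment above: the application attempt id string, which
  // narrows the collisions relative to the plain application id.
  static String proposedLocalizerId(ContainerId containerId) {
    return containerId.getApplicationAttemptId().toString();
  }
}
{code}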
[jira] [Commented] (YARN-2124) ProportionalCapacityPreemptionPolicy cannot work because it's initialized before scheduler initialized
[ https://issues.apache.org/jira/browse/YARN-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018573#comment-14018573 ] Wangda Tan commented on YARN-2124: -- And I just verified that this error does not occur in 2.4.0. > ProportionalCapacityPreemptionPolicy cannot work because it's initialized > before scheduler initialized > -- > > Key: YARN-2124 > URL: https://issues.apache.org/jira/browse/YARN-2124 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 3.0.0 >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-2124.patch > > > When I play with scheduler with preemption, I found > ProportionalCapacityPreemptionPolicy cannot work. NPE will be raised when RM > start > {code} > 2014-06-05 11:01:33,201 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw > an Exception. > java.lang.NullPointerException > at > org.apache.hadoop.yarn.util.resource.Resources.greaterThan(Resources.java:225) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.computeIdealResourceDistribution(ProportionalCapacityPreemptionPolicy.java:302) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.recursivelyComputeIdealAssignment(ProportionalCapacityPreemptionPolicy.java:261) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:198) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:174) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82) > at java.lang.Thread.run(Thread.java:744) > {code} > This is caused by ProportionalCapacityPreemptionPolicy needs > ResourceCalculator from CapacityScheduler. But > ProportionalCapacityPreemptionPolicy get initialized before CapacityScheduler > initialized. So ResourceCalculator will set to null in > ProportionalCapacityPreemptionPolicy. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2125) ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled
[ https://issues.apache.org/jira/browse/YARN-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2125: - Attachment: YARN-2125.patch Attached a simple patch to fix this. > ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled > --- > > Key: YARN-2125 > URL: https://issues.apache.org/jira/browse/YARN-2125 > Project: Hadoop YARN > Issue Type: Task > Components: resourcemanager, scheduler >Affects Versions: 3.0.0 >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Minor > Attachments: YARN-2125.patch > > > Currently, logToCSV() will be output using LOG.info() in > ProportionalCapacityPreemptionPolicy. Which will generate non-human-readable > texts in resource manager's log every several seconds, like > {code} > ... > 2014-06-05 15:57:07,603 INFO > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: > QUEUESTATE: 1401955027603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, > 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, > 3072, 2, 3072, 2, 0, 0, 0, 0 > 2014-06-05 15:57:10,603 INFO > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: > QUEUESTATE: 1401955030603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, > 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, > 3072, 2, 3072, 2, 0, 0, 0, 0 > ... > {code} > It's better to make it output when debug enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
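The attached patch is not reproduced here, but the change this issue describes is the usual guard pattern: build and emit the CSV line only when debug logging is on. A minimal sketch, assuming the existing logToCSV(...) helper and the class's commons-logging LOG field; the argument passed to logToCSV is illustrative.
{code}
// Sketch of the guarded logging: the CSV line is neither built nor emitted
// unless debug logging is enabled for ProportionalCapacityPreemptionPolicy.
if (LOG.isDebugEnabled()) {
  LOG.debug("QUEUESTATE: " + logToCSV(queueNames));
}
{code}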
[jira] [Created] (YARN-2125) ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled
Wangda Tan created YARN-2125: Summary: ProportionalCapacityPreemptionPolicy should only log CSV when debug enabled Key: YARN-2125 URL: https://issues.apache.org/jira/browse/YARN-2125 Project: Hadoop YARN Issue Type: Task Components: resourcemanager, scheduler Affects Versions: 3.0.0 Reporter: Wangda Tan Assignee: Wangda Tan Priority: Minor Attachments: YARN-2125.patch Currently, the output of logToCSV() is written with LOG.info() in ProportionalCapacityPreemptionPolicy, which generates non-human-readable text in the resource manager's log every few seconds, like {code} ... 2014-06-05 15:57:07,603 INFO org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: QUEUESTATE: 1401955027603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0 2014-06-05 15:57:10,603 INFO org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy: QUEUESTATE: 1401955030603, a1, 4096, 3, 2048, 2, 4096, 3, 4096, 3, 0, 0, 0, 0, b1, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0, b2, 3072, 2, 1024, 1, 3072, 2, 3072, 2, 0, 0, 0, 0 ... {code} It would be better to emit this output only when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2124) ProportionalCapacityPreemptionPolicy cannot work because it's initialized before scheduler initialized
[ https://issues.apache.org/jira/browse/YARN-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2124: - Attachment: YARN-2124.patch Attached a patch to solve this problem. It moves ProportionalCapacityPreemptionPolicy.init(...) from RMActiveService.init() to SchedulerMonitor.serviceInit(...). SchedulerMonitor is always added after the Scheduler, so ProportionalCapacityPreemptionPolicy is now initialized only after the Scheduler has been initialized. Also added a test for ProportionalCapacityPreemptionPolicy to guard against regressions in the future. > ProportionalCapacityPreemptionPolicy cannot work because it's initialized > before scheduler initialized > -- > > Key: YARN-2124 > URL: https://issues.apache.org/jira/browse/YARN-2124 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 3.0.0 >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-2124.patch > > > When I play with scheduler with preemption, I found > ProportionalCapacityPreemptionPolicy cannot work. NPE will be raised when RM > start > {code} > 2014-06-05 11:01:33,201 ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw > an Exception. > java.lang.NullPointerException > at > org.apache.hadoop.yarn.util.resource.Resources.greaterThan(Resources.java:225) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.computeIdealResourceDistribution(ProportionalCapacityPreemptionPolicy.java:302) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.recursivelyComputeIdealAssignment(ProportionalCapacityPreemptionPolicy.java:261) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:198) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:174) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72) > at > org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82) > at java.lang.Thread.run(Thread.java:744) > {code} > This is caused by ProportionalCapacityPreemptionPolicy needs > ResourceCalculator from CapacityScheduler. But > ProportionalCapacityPreemptionPolicy get initialized before CapacityScheduler > initialized. So ResourceCalculator will set to null in > ProportionalCapacityPreemptionPolicy. -- This message was sent by Atlassian JIRA (v6.2#6252)
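A rough sketch of the ordering fix described above; the constructor shape and the exact parameters of SchedulingEditPolicy.init(...) are approximations, not a verbatim excerpt of the patch.
{code}
// Sketch only: illustrates initializing the edit policy from the monitor's
// serviceInit instead of from RMActiveService.init().
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.AbstractService;
import org.apache.hadoop.yarn.server.resourcemanager.RMContext;
import org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingEditPolicy;

public class SchedulingMonitorSketch extends AbstractService {
  private final SchedulingEditPolicy scheduleEditPolicy;
  private final RMContext rmContext;

  public SchedulingMonitorSketch(RMContext rmContext, SchedulingEditPolicy policy) {
    super("SchedulingMonitor (" + policy.getPolicyName() + ")");
    this.rmContext = rmContext;
    this.scheduleEditPolicy = policy;
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    // Services are inited in the order they were added to the RM's active
    // services, and the monitor is added after the scheduler, so by the time
    // this runs the scheduler (and its ResourceCalculator) already exists.
    scheduleEditPolicy.init(conf, rmContext, rmContext.getScheduler());
    super.serviceInit(conf);
  }
}
{code}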
[jira] [Created] (YARN-2124) ProportionalCapacityPreemptionPolicy cannot work because it's initialized before scheduler initialized
Wangda Tan created YARN-2124: Summary: ProportionalCapacityPreemptionPolicy cannot work because it's initialized before scheduler initialized Key: YARN-2124 URL: https://issues.apache.org/jira/browse/YARN-2124 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 3.0.0 Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical While experimenting with the scheduler with preemption enabled, I found that ProportionalCapacityPreemptionPolicy cannot work: an NPE is raised when the RM starts {code} 2014-06-05 11:01:33,201 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw an Exception. java.lang.NullPointerException at org.apache.hadoop.yarn.util.resource.Resources.greaterThan(Resources.java:225) at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.computeIdealResourceDistribution(ProportionalCapacityPreemptionPolicy.java:302) at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.recursivelyComputeIdealAssignment(ProportionalCapacityPreemptionPolicy.java:261) at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:198) at org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:174) at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72) at org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82) at java.lang.Thread.run(Thread.java:744) {code} This is caused by ProportionalCapacityPreemptionPolicy needing the ResourceCalculator from CapacityScheduler: ProportionalCapacityPreemptionPolicy is initialized before CapacityScheduler, so its ResourceCalculator is still null. -- This message was sent by Atlassian JIRA (v6.2#6252)
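To make the cause of the NPE above concrete: Resources.greaterThan delegates the comparison to the ResourceCalculator it is given, so calling it while the policy still holds a null calculator fails exactly at the frame shown in the trace. A minimal repro sketch under that assumption (the resource values are arbitrary):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class NullCalculatorRepro {
  public static void main(String[] args) {
    // What ProportionalCapacityPreemptionPolicy ends up holding when it is
    // initialized before the CapacityScheduler: a null ResourceCalculator.
    ResourceCalculator rc = null;
    Resource cluster = Resources.createResource(8192, 8);
    // greaterThan(rc, cluster, lhs, rhs) delegates to rc.compare(...), so a
    // null rc throws NullPointerException inside Resources.greaterThan.
    Resources.greaterThan(rc, cluster,
        Resources.createResource(4096, 4), Resources.createResource(2048, 2));
  }
}
{code}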
[jira] [Commented] (YARN-1972) Implement secure Windows Container Executor
[ https://issues.apache.org/jira/browse/YARN-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018556#comment-14018556 ] Remus Rusanu commented on YARN-1972: [~vinodkv] About the uniqueness of appid for localizer: it is not unique when multiple tasks are being localized for the same job on the same node. Simply running pi with 100 splits on a 2-node cluster results in many duplicate errors. For task localization it should be the task_id, I believe. > Implement secure Windows Container Executor > --- > > Key: YARN-1972 > URL: https://issues.apache.org/jira/browse/YARN-1972 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Labels: security, windows > Attachments: YARN-1972.1.patch > > > This work item represents the Java side changes required to implement a > secure windows container executor, based on the YARN-1063 changes on > native/winutils side. > Necessary changes include leveraging the winutils task createas to launch the > container process as the required user and a secure localizer (launch > localization as a separate process running as the container user). -- This message was sent by Atlassian JIRA (v6.2#6252)