[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-12-30 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787286#comment-15787286
 ] 

Xianyin Xin commented on YARN-4090:
---

Hi [~zsl2007], sorry, I have moved to another project and don't have enough
time. I've changed this to unassigned; anyone who wants to take it over is welcome. :)

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, 
> YARN-4090.004.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-12-30 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Assignee: (was: Xianyin Xin)

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, sampling1.jpg, 
> sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-11-27 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700910#comment-15700910
 ] 

Xianyin Xin commented on YARN-4090:
---

Hi [~zhengchenyu], I've moved to another project, so I don't have enough time to
handle this problem. Judging from your convincing analysis, I believe you already
have a patch, right? Would you mind taking it over? :)

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, sampling1.jpg, 
> sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-11-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646181#comment-15646181
 ] 

Xianyin Xin commented on YARN-4090:
---

Sorry for the delay, [~gsaha]. I've attached a new patch based on the latest
trunk.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, sampling1.jpg, 
> sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-11-07 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: YARN-4090.003.patch

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, YARN-4090.003.patch, sampling1.jpg, 
> sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-11-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646171#comment-15646171
 ] 

Xianyin Xin commented on YARN-4090:
---

Sorry for the delay, [~He Tianyi], and for not examining the patch's behavior on
2.6.0. Have you found the cause of the periodic assignment behavior?

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.






[jira] [Commented] (YARN-1558) After apps are moved across queues, store new queue info in the RM state store

2016-09-12 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485925#comment-15485925
 ] 

Xianyin Xin commented on YARN-1558:
---

Is there any update on this?

> After apps are moved across queues, store new queue info in the RM state store
> --
>
> Key: YARN-1558
> URL: https://issues.apache.org/jira/browse/YARN-1558
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Sandy Ryza
>
> The result of moving an app to a new queue should persist across RM restarts. 
>  This will require updating the ApplicationSubmissionContext, the single 
> source of truth upon state recovery, with the new queue info.
> There will be a brief window after the move completes before the move is 
> stored.  If the RM dies during this window, the recovered RM will include the 
> old queue info.  Schedulers should be resilient to this situation.
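
A minimal sketch of this flow, assuming a hypothetical persistence helper
({{ApplicationSubmissionContext#setQueue}} is a real YARN API; {{persistToStateStore}} is not):

{code:title=MoveAppPersistence.java}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

final class MoveAppPersistence {
  // After a cross-queue move succeeds, record the new queue in the
  // ApplicationSubmissionContext (the source of truth on recovery), then persist.
  static void recordMove(ApplicationSubmissionContext asc, String targetQueue) {
    asc.setQueue(targetQueue);   // what a recovered RM will see
    persistToStateStore(asc);    // hypothetical helper, not a real API
  }

  private static void persistToStateStore(ApplicationSubmissionContext asc) {
    // Placeholder for the RM state-store update. If the RM dies before this
    // completes, recovery still sees the old queue, which is exactly the brief
    // window the description says schedulers must tolerate.
  }
}
{code}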






[jira] [Commented] (YARN-5479) FairScheduler: Scheduling performance improvement

2016-08-15 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422002#comment-15422002
 ] 

Xianyin Xin commented on YARN-5479:
---

[~He Tianyi], I hope YARN-4090 can provide some useful information; there, the
resource usage that previously had to be read under a lock is snapshotted before
sorting, which improved performance greatly.
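
For illustration, here is a minimal sketch of that snapshot-before-sort idea
(class and field names are made up; this is not the YARN-4090 patch itself):
copy the value the comparator needs once, outside any lock, and sort the copies.

{code:title=SnapshotSortSketch.java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SnapshotSortSketch {

  // Immutable snapshot of the single value the comparator needs,
  // captured once instead of being re-read under a lock per comparison.
  static final class QueueSnapshot {
    final Object queue;        // stands in for an FSQueue reference
    final long usedMemoryMB;   // copied up front

    QueueSnapshot(Object queue, long usedMemoryMB) {
      this.queue = queue;
      this.usedMemoryMB = usedMemoryMB;
    }
  }

  // Sorting the snapshots keeps the sort cheap: no locking and no
  // recomputation of resource usage inside compare().
  static List<QueueSnapshot> sortBySnapshot(List<QueueSnapshot> children) {
    List<QueueSnapshot> copy = new ArrayList<>(children);
    copy.sort(Comparator.comparingLong((QueueSnapshot s) -> s.usedMemoryMB));
    return copy;
  }
}
{code}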

> FairScheduler: Scheduling performance improvement
> -
>
> Key: YARN-5479
> URL: https://issues.apache.org/jira/browse/YARN-5479
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: He Tianyi
>Assignee: He Tianyi
>
> Currently the ResourceManager uses a single thread to handle async events for 
> scheduling. As the number of nodes grows, more events need to be processed in 
> time in FairScheduler. Also, an increased number of applications & queues slows 
> down the processing of each single event. 
> There are two cases in which slow processing of nodeUpdate events is problematic:
> A. Global throughput is lower than the number of nodes per round of heartbeats. 
> This keeps resources from being allocated because of the inefficiency.
> B. Global throughput meets the need, but in some rounds the events of some 
> nodes cannot be processed before the next heartbeat. This hurts the handling 
> of burst requests (i.e. a newly submitted MapReduce application cannot get all 
> of its tasks launched promptly even when enough resources are available).
> Pretty sure some people will eventually hit this problem once a single 
> cluster is scaled to several thousand nodes (even with {{assignmultiple}} 
> enabled).
> This issue proposes several optimizations to the performance of the 
> FairScheduler {{nodeUpdate}} method. To be specific:
> A. Trading off fairness for efficiency, queue & app sorting can be skipped 
> (or should this be called 'delayed sorting'?). We can either start another 
> dedicated thread to do the sorting & updating, or actually perform sorting 
> only after the current result has been used several times (say, sort once in 
> every 100 calls); a sketch of this idea appears after this description.
> B. Performing calculations on {{Resource}} instances is expensive, since at 
> least two objects ({{ResourceImpl}} and its proto builder) are created each time 
> (using 'immutable' APIs). The overhead can be eliminated with a 
> lightweight implementation of Resource that does not instantiate a builder 
> until necessary, because most instances are used as intermediate results in 
> the scheduler instead of being exchanged via IPC. Also, {{createResource}} 
> uses reflection, which can be replaced by a plain {{new}} (for scheduler 
> usage only). Furthermore, perhaps we could 'intern' resources to avoid 
> allocation.
> C. Other minor changes: for example, move the {{updateRootMetrics}} call to 
> {{update}}, making root queue metrics eventually consistent (which may 
> satisfy most needs), or introduce counters to {{getResourceUsage}} 
> and change resources incrementally instead of recalculating each time.
> With A and B, I was looking at a 4x improvement in a cluster with 2K nodes.
> Suggestions? Opinions?
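
As a rough illustration of the "delayed sorting" idea in point A above (names
are illustrative; this is not FairScheduler code), the previous ordering can be
reused for a fixed number of calls before re-sorting:

{code:title=DelayedSorter.java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DelayedSorter<T> {
  private final int resortEvery;            // e.g. re-sort once per 100 calls
  private final Comparator<T> comparator;
  private List<T> cachedOrder = new ArrayList<>();
  private int callsSinceSort = Integer.MAX_VALUE;  // force a sort on first use

  DelayedSorter(int resortEvery, Comparator<T> comparator) {
    this.resortEvery = resortEvery;
    this.comparator = comparator;
  }

  // Returns a possibly slightly stale ordering, trading strict fairness
  // for far fewer full sorts per scheduling round.
  synchronized List<T> sortedView(List<T> items) {
    if (callsSinceSort >= resortEvery || cachedOrder.size() != items.size()) {
      cachedOrder = new ArrayList<>(items);
      cachedOrder.sort(comparator);
      callsSinceSort = 0;
    }
    callsSinceSort++;
    return cachedOrder;
  }
}
{code}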






[jira] [Commented] (YARN-5310) AM restart failed because of the expired HDFS delegation tokens

2016-08-02 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403481#comment-15403481
 ] 

Xianyin Xin commented on YARN-5310:
---

Thanks, [~aw]. Do we have any good ideas for this problem, then?

> AM restart failed because of the expired HDFS delegation tokens
> ---
>
> Key: YARN-5310
> URL: https://issues.apache.org/jira/browse/YARN-5310
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
>
> For a long running AM, it would get failed when restart because the token in 
> ApplicationSubmissionContext expires. We should update it when we get a new 
> delegation token on behalf of the user.






[jira] [Updated] (YARN-5305) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token III

2016-07-13 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5305:
--
Description: 
Different from YARN-5098 and YARN-5302, this problem happens when the AM submits a 
startContainer request with a new HDFS token (say, tokenB) that is not managed 
by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, 
the other is the one renewed by the RM (tokenA). If tokenB is selected when 
connecting to HDFS and tokenB has expired, an exception occurs.

Supplementary: this problem happens because the AM didn't use the service name 
as the token alias in the credentials, so two tokens for the same service can 
co-exist in one Credentials object. The TokenSelector simply selects the first 
matching token; it doesn't check whether the token is still valid.

  was:Different with YARN-5098 and YARN-5302, this problem happens when AM 
submits a startContainer request with a new HDFS token (say, tokenB) which is 
not managed by YARN, so two tokens exist in the credentials of the user on NM, 
one is tokenB, the other is the one renewed on RM (tokenA). If tokenB is 
selected when connect to HDFS and tokenB expires, exception happens.


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token III
> ---
>
> Key: YARN-5305
> URL: https://issues.apache.org/jira/browse/YARN-5305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different with YARN-5098 and YARN-5302, this problem happens when AM submits 
> a startContainer request with a new HDFS token (say, tokenB) which is not 
> managed by YARN, so two tokens exist in the credentials of the user on NM, 
> one is tokenB, the other is the one renewed on RM (tokenA). If tokenB is 
> selected when connect to HDFS and tokenB expires, exception happens.
> Supplementary: this problem happen due to that AM didn't use the service name 
> as the token alias in credentials, so two tokens for the same service can 
> co-exist in one credentials. TokenSelector can only select the first matched 
> token, it doesn't care if the token is valid or not.
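
A small illustration of the supplementary point above (the service string in
the comment is just an example value): adding a token to a {{Credentials}}
object under its service name means a newer token for the same service replaces
the older one instead of co-existing with it.

{code:title=TokenAliasExample.java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

final class TokenAliasExample {
  // Add (or replace) a token under its service name rather than an arbitrary
  // alias, so two tokens for the same HDFS service cannot co-exist.
  static void addByService(Credentials creds, Token<? extends TokenIdentifier> token) {
    Text service = token.getService();  // e.g. "ha-hdfs:mycluster" (example only)
    creds.addToken(service, token);     // same alias => the older token is overwritten
  }
}
{code}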






[jira] [Updated] (YARN-5367) HDFS delegation tokens in ApplicationSubmissionContext should be added to systemCrednetials

2016-07-12 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5367:
--
Attachment: YARN-5367.001.patch

> HDFS delegation tokens in ApplicationSubmissionContext should be added to 
> systemCrednetials
> ---
>
> Key: YARN-5367
> URL: https://issues.apache.org/jira/browse/YARN-5367
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5367.001.patch
>
>
> App log aggregation may failed because of the below flow:
> 0) suppose the token.max-lifetime is 7 days and renew interval is 1 day;
> 1) start a long running job, like sparkJDBC, of which the AM acts as a 
> service. When submitting the job, HDFS token A in 
> ApplicationSubmissionContext will be added to DelegationTokenRenewer, but not 
> added to systemCredentials;
> 2) after 1 day, submit a spark query. After received the query, AM will 
> request containers and start tasks. When start the containers, a new HDFS 
> token B is used;
> 3) after 1 day, kill the job, when doing log aggregation, exception occurs 
> which show token B is not in the HDFS token cache so the connecting to HDFS 
> fails;
> We should add token A to systemCredentials to make sure token A can be 
> delivered to NMs in time.






[jira] [Created] (YARN-5367) HDFS delegation tokens in ApplicationSubmissionContext should be added to systemCrednetials

2016-07-12 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-5367:
-

 Summary: HDFS delegation tokens in ApplicationSubmissionContext 
should be added to systemCrednetials
 Key: YARN-5367
 URL: https://issues.apache.org/jira/browse/YARN-5367
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Xianyin Xin
Assignee: Xianyin Xin


App log aggregation may fail because of the following flow:
0) suppose the token.max-lifetime is 7 days and the renew interval is 1 day;
1) start a long-running job, like sparkJDBC, whose AM acts as a service. 
When the job is submitted, HDFS token A in the ApplicationSubmissionContext is 
added to the DelegationTokenRenewer, but not to systemCredentials;
2) after 1 day, submit a Spark query. After receiving the query, the AM requests 
containers and starts tasks. When starting the containers, a new HDFS token B is 
used;
3) after 1 day, kill the job. During log aggregation an exception occurs 
showing that token B is not in the HDFS token cache, so the connection to HDFS 
fails.

We should add token A to systemCredentials to make sure token A can be 
delivered to NMs in time.
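
A hedged sketch of the proposal (the {{SYSTEM_CREDENTIALS}} map below is an
illustrative stand-in for the RM's internal per-application system credentials,
not an actual RM field): deserialize the tokens carried by the submission and
republish them so they can be shipped to the NMs.

{code:title=SystemCredentialsSketch.java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.io.DataInputByteBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

final class SystemCredentialsSketch {
  // Illustrative stand-in for the RM's per-app system credentials.
  static final Map<ApplicationId, ByteBuffer> SYSTEM_CREDENTIALS =
      new ConcurrentHashMap<>();

  static void publishSubmissionTokens(ApplicationId appId,
      ApplicationSubmissionContext asc) throws IOException {
    ByteBuffer tokens = asc.getAMContainerSpec().getTokens();
    if (tokens == null) {
      return;  // nothing was submitted with the app
    }

    // Deserialize the credentials that came with the submission (these hold token A).
    ByteBuffer dup = tokens.duplicate();
    dup.rewind();
    DataInputByteBuffer dibb = new DataInputByteBuffer();
    dibb.reset(dup);
    Credentials creds = new Credentials();
    creds.readTokenStorageStream(dibb);

    // Re-serialize and expose them so NMs can receive them via heartbeats.
    DataOutputBuffer dob = new DataOutputBuffer();
    creds.writeTokenStorageToStream(dob);
    SYSTEM_CREDENTIALS.put(appId, ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
  }
}
{code}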






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-06 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365461#comment-15365461
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks, [~varun_saxena]. I will upload a new patch as soon as possible.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch, 
> YARN-5302.003.patch, YARN-5302.004.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-05 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363625#comment-15363625
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks, [~jianhe]. I'm also in favor of the second approach. I adopted the first
one only because it changes little in the workflow: it just updates the state
store when the NM receives a new token. We also don't need to worry about the
efficiency of the operation, because updates are infrequent (only when the token
changes, which typically happens on the order of days).
[~Naganarasimha], what's your opinion?

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch, 
> YARN-5302.003.patch, YARN-5302.004.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-05 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362137#comment-15362137
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks, [~Naganarasimha]. *We simply always manage our own tokens for
localization and log-aggregation for long-running applications / services* is
OK, but not enough if we want to support recovery of long-running jobs. Of
course we can limit our token management to those two scopes because of
security concerns, but the *cost* is that we lose the ability to recover
long-running jobs (YARN-5302, YARN-5310) and may also see log aggregation
failures for such jobs (YARN-5305).
However, all of this depends on how we *define* the behavior of YARN on behalf
of users, that is, what YARN can and cannot do when a token expires and whether
we can accept the failures caused by token expiration.


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch, 
> YARN-5302.003.patch, YARN-5302.004.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5175) Simplify the delegation token renewal management for YARN localization and log-aggregation

2016-07-04 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361936#comment-15361936
 ] 

Xianyin Xin commented on YARN-5175:
---

Hi [~jianhe], what's the progress on this? Is there a design or any thoughts yet?

> Simplify the delegation token renewal management for YARN localization and 
> log-aggregation
> --
>
> Key: YARN-5175
> URL: https://issues.apache.org/jira/browse/YARN-5175
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>
> Increasingly, this DelegationTokenRenewer class for renewing expiring token 
> for localization and log-aggregation is getting complicated. We could have 
> done it at a per-user level.  copying comments from vinod in YARN-5098:
> bq. Overall, I think we can simplify this code if we simply always manage our 
> own tokens for localization and log-aggregation for long-running applications 
> / services. Today, it's too complicated: for the first day, we use the user's 
> token T, second day we get a new token T' but share it for all the apps 
> originally sharing T, after RM restart we use a new token T'' which is 
> different for each of the apps originally sharing T. We can simplify this by 
> always managing it ourselves and managing them per-user!






[jira] [Created] (YARN-5310) AM restart failed because of the expired HDFS delegation tokens

2016-07-04 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-5310:
-

 Summary: AM restart failed because of the expired HDFS delegation 
tokens
 Key: YARN-5310
 URL: https://issues.apache.org/jira/browse/YARN-5310
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Xianyin Xin
Assignee: Xianyin Xin


A long-running AM can fail on restart because the token in the 
ApplicationSubmissionContext has expired. We should update it when we obtain a 
new delegation token on behalf of the user.
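
A rough sketch of that idea, under the assumption that the RM can rewrite the
submission context when it renews tokens (illustrative, not the actual RM code
path): merge the fresh token into the credentials stored in the
ApplicationSubmissionContext so a restarted AM launches with a valid token.

{code:title=RefreshSubmissionTokenSketch.java}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.io.DataInputByteBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

final class RefreshSubmissionTokenSketch {
  static void replaceToken(ApplicationSubmissionContext asc,
      Token<? extends TokenIdentifier> freshToken) throws IOException {
    ContainerLaunchContext clc = asc.getAMContainerSpec();

    // Load whatever credentials the submission currently carries.
    Credentials creds = new Credentials();
    ByteBuffer old = clc.getTokens();
    if (old != null) {
      ByteBuffer dup = old.duplicate();
      dup.rewind();
      DataInputByteBuffer dibb = new DataInputByteBuffer();
      dibb.reset(dup);
      creds.readTokenStorageStream(dibb);
    }

    // Adding under the same service alias overwrites the expired token.
    creds.addToken(freshToken.getService(), freshToken);

    DataOutputBuffer dob = new DataOutputBuffer();
    creds.writeTokenStorageToStream(dob);
    clc.setTokens(ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
  }
}
{code}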






[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-04 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Attachment: YARN-5302.004.patch

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch, 
> YARN-5302.003.patch, YARN-5302.004.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-04 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Attachment: YARN-5302.003.patch

Uploaded a new patch to fix the checkstyle issues. The javadoc warning is not
introduced by this patch.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch, 
> YARN-5302.003.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-04 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Attachment: YARN-5032.002.patch

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch, YARN-5032.002.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5305) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token III

2016-07-03 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360808#comment-15360808
 ] 

Xianyin Xin commented on YARN-5305:
---

Thanks, [~Naganarasimha]. YARN-5175 may solve this, but its solution hasn't been
finalized yet. Can we keep this open and track YARN-5175, and close this once
YARN-5175's finalized solution covers it?

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token III
> ---
>
> Key: YARN-5305
> URL: https://issues.apache.org/jira/browse/YARN-5305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different with YARN-5098 and YARN-5302, this problem happens when AM submits 
> a startContainer request with a new HDFS token (say, tokenB) which is not 
> managed by YARN, so two tokens exist in the credentials of the user on NM, 
> one is tokenB, the other is the one renewed on RM (tokenA). If tokenB is 
> selected when connect to HDFS and tokenB expires, exception happens.






[jira] [Comment Edited] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-03 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360805#comment-15360805
 ] 

Xianyin Xin edited comment on YARN-5302 at 7/4/16 3:51 AM:
---

Thanks, [~Naganarasimha]. YARN-2704 provides the fundamental ability to renew an
HDFS delegation token; however, it doesn't cover all the token-caused failures,
such as this JIRA and YARN-5305. In this JIRA, the exception happens in the
recovery stage, where the token read from the NMStateStore has already expired.

I believe the newly requested HDFS token should be persisted to the NMStateStore,
or, if we don't want to do that, we should fall back to the {{systemCredentials}}
when the original tokens in the NMStateStore have expired.


was (Author: xinxianyin):
Thanks [~Naganarasimha]. YARN-2704 gives fundamental ability to renew a HDFS 
delegation token, however, it didn't cover all the token-caused failures, like 
this jira and YARN-5035. In this jira, the exception happens in recovery stage, 
where the token is read from NMStateStore and it has been expired.

I believe the new requested HDFS token should be persisted to NMStateStore, or, 
if we don't want to do so, we should use try the {{systemCredentials}} when the 
original tokens in NMStateStore expires.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-03 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360805#comment-15360805
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks [~Naganarasimha]. YARN-2704 gives fundamental ability to renew a HDFS 
delegation token, however, it didn't cover all the token-caused failures, like 
this jira and YARN-5035. In this jira, the exception happens in recovery stage, 
where the token is read from NMStateStore and it has been expired.

I believe the new requested HDFS token should be persisted to NMStateStore, or, 
if we don't want to do so, we should use try the {{systemCredentials}} when the 
original tokens in NMStateStore expires.
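
A minimal sketch of that fallback option, assuming a simple max-lifetime check
(the helper and the expiry test are illustrative; real renewal-based expiry
differs): prefer the recovered credentials, but fall back to the RM-delivered
system credentials when every recovered delegation token is past its maximum
lifetime.

{code:title=RecoveredCredentialsChooser.java}
import java.io.IOException;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier;

final class RecoveredCredentialsChooser {
  static Credentials choose(Credentials fromStateStore, Credentials systemCredentials)
      throws IOException {
    for (Token<? extends TokenIdentifier> t : fromStateStore.getAllTokens()) {
      TokenIdentifier id = t.decodeIdentifier();
      if (id instanceof AbstractDelegationTokenIdentifier
          && ((AbstractDelegationTokenIdentifier) id).getMaxDate()
              > System.currentTimeMillis()) {
        // At least one recovered token is still within its max lifetime.
        return fromStateStore;
      }
    }
    // Everything recovered has expired; use the credentials delivered by the RM.
    return systemCredentials;
  }
}
{code}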

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358718#comment-15358718
 ] 

Xianyin Xin commented on YARN-5302:
---

Hi [~varun_saxena], initAppAggregator uses the credentials read from the
NMStateStore, so it has nothing to do with the new token sent from the RM.

{quote}
Do we not update the token in NM state store when it changes ?
{quote}

I just uploaded a patch that adopts this approach, but maybe there are other,
better solutions.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Attachment: YARN-5032.001.patch

Uploaded a preview patch that writes the new token to the NMStateStore.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-5032.001.patch
>
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Assigned] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin reassigned YARN-5302:
-

Assignee: Xianyin Xin

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Created] (YARN-5305) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token III

2016-07-01 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-5305:
-

 Summary: Yarn Application log Aggreagation fails due to NM can not 
get correct HDFS delegation token III
 Key: YARN-5305
 URL: https://issues.apache.org/jira/browse/YARN-5305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Xianyin Xin


Different from YARN-5098 and YARN-5302, this problem happens when the AM submits a 
startContainer request with a new HDFS token (say, tokenB) that is not managed 
by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, 
the other is the one renewed by the RM (tokenA). If tokenB is selected when 
connecting to HDFS and tokenB has expired, an exception occurs.






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358518#comment-15358518
 ] 

Xianyin Xin commented on YARN-5302:
---

I think we have two options. One is to persist the new token to the
ContainerLaunchContext in the NMStateStore; the other is to make
{{initAppAggregator()}} asynchronous so that it can wait for the RM's new token
once using the original token fails.
[~jianhe], do you have any ideas?

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Description: 
Different from YARN-5098, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When the app aggregators are 
initialized, an exception happens because of the expired tokens. The app is a 
long-running service.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, String> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}

  was:
Different with YARN-5098, this happens at NM side. When NM recovers, 
credentials are read from NMStateStore. When initialize app aggregators, 
exception happens because of the overdue tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, String> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens. The app is a long running 
> service.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}






[jira] [Commented] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358491#comment-15358491
 ] 

Xianyin Xin commented on YARN-5302:
---

Thanks, Varun. Maybe you are referring to YARN-4783. But from the discussion on
YARN-4783, it seems that in that case the RM had canceled the token, so it is not
secure to continue maintaining an HDFS delegation token for the app. In this
case, the app is still running, but the RM has requested a new HDFS token.
Because this exception happens during NM recovery, the RM's new token hasn't
been passed to the NM yet; the old token is read from the state store and causes
the exception.

Sorry for the insufficient information.

> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different with YARN-5098, this happens at NM side. When NM recovers, 
> credentials are read from NMStateStore. When initialize app aggregators, 
> exception happens because of the overdue tokens.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-07-01 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-5302:
--
Description: 
Different from YARN-5098, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception is thrown because of the expired tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, String> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}

  was:
Different with YARN-5089, this happens at NM side. When NM recovers, 
credentials are read from NMStateStore. When initialize app aggregators, 
exception happens because of the overdue tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}


> Yarn Application log Aggreagation fails due to NM can not get correct HDFS 
> delegation token II
> --
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Xianyin Xin
>
> Different from YARN-5098, this happens on the NM side. When the NM recovers, 
> credentials are read from the NMStateStore. When initializing app aggregators, 
> an exception is thrown because of the expired tokens.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>   Credentials credentials, ContainerLogsRetentionPolicy 
> logRetentionPolicy,
>   Map<ApplicationAccessType, String> appAcls,
>   LogAggregationContext logAggregationContext) {
> // Get user's FileSystem credentials
> final UserGroupInformation userUgi =
> UserGroupInformation.createRemoteUser(user);
> if (credentials != null) {
>   userUgi.addCredentials(credentials);
> }
>...
> try {
>   // Create the app dir
>   createAppDir(user, appId, userUgi);
> } catch (Exception e) {
>   appLogAggregator.disableLogAggregation();
>   if (!(e instanceof YarnRuntimeException)) {
> appDirException = new YarnRuntimeException(e);
>   } else {
> appDirException = (YarnRuntimeException)e;
>   }
>   appLogAggregators.remove(appId);
>   closeFileSystems(userUgi);
>   throw appDirException;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5302) Yarn Application log Aggreagation fails due to NM can not get correct HDFS delegation token II

2016-06-30 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-5302:
-

 Summary: Yarn Application log Aggreagation fails due to NM can not 
get correct HDFS delegation token II
 Key: YARN-5302
 URL: https://issues.apache.org/jira/browse/YARN-5302
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Xianyin Xin


Different from YARN-5089, this happens on the NM side. When the NM recovers, 
credentials are read from the NMStateStore. When initializing app aggregators, 
an exception is thrown because of the expired tokens.

{code:title=LogAggregationService.java}
  protected void initAppAggregator(final ApplicationId appId, String user,
  Credentials credentials, ContainerLogsRetentionPolicy logRetentionPolicy,
  Map<ApplicationAccessType, String> appAcls,
  LogAggregationContext logAggregationContext) {

// Get user's FileSystem credentials
final UserGroupInformation userUgi =
UserGroupInformation.createRemoteUser(user);
if (credentials != null) {
  userUgi.addCredentials(credentials);
}

   ...

try {
  // Create the app dir
  createAppDir(user, appId, userUgi);
} catch (Exception e) {
  appLogAggregator.disableLogAggregation();
  if (!(e instanceof YarnRuntimeException)) {
appDirException = new YarnRuntimeException(e);
  } else {
appDirException = (YarnRuntimeException)e;
  }
  appLogAggregators.remove(appId);
  closeFileSystems(userUgi);
  throw appDirException;
}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5241) FairScheduler repeat container completed

2016-06-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327315#comment-15327315
 ] 

Xianyin Xin edited comment on YARN-5241 at 6/13/16 1:06 PM:


Thank you for reporting this, [~chenfolin]. So if a container occasionally appears 
both as completed in the nodeHeartbeatRequest and in the releaseList of the 
AllocateRequest, this problem can happen, right?
About your patch, could you please re-format it according to the Hadoop code 
conventions mentioned here [http://wiki.apache.org/hadoop/HowToContribute]? If 
you can provide a patch based on trunk, that would be better. :)
cc [~kasha].


was (Author: xinxianyin):
Thank you for reporting this, [~chenfolin]. So if the containers occasionally 
exist in completed state of nodeHeartbeatRequest and in releaseList of 
AllocateRequest, this problem would happen, right?
About your patch, could you please re-format it according to the hadoop code 
conventions mentioned here [http://wiki.apache.org/hadoop/HowToContribute]? 
cc [~kasha].

> FairScheduler repeat container completed
> 
>
> Key: YARN-5241
> URL: https://issues.apache.org/jira/browse/YARN-5241
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2
>Reporter: ChenFolin
> Attachments: YARN-5241-001.patch, repeatContainerCompleted.log
>
>
> The NodeManager heartbeat event NODE_UPDATE and the ApplicationMaster allocate 
> operation may cause a repeated container-completed event, which can lead to 
> incorrect behavior.
> Node releaseContainer can prevent a repeated release operation,
> like:
> public synchronized void releaseContainer(Container container) {
>   if (!isValidContainer(container.getId())) {
>     LOG.error("Invalid container released " + container);
>     return;
>   }
> FSAppAttempt containerCompleted did not prevent a repeated container-completed 
> operation.
> Detailed logs are in the attached file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5241) FairScheduler repeat container completed

2016-06-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327315#comment-15327315
 ] 

Xianyin Xin commented on YARN-5241:
---

Thank you for reporting this, [~chenfolin]. So if a container occasionally appears 
both as completed in the nodeHeartbeatRequest and in the releaseList of the 
AllocateRequest, this problem can happen, right?
About your patch, could you please re-format it according to the Hadoop code 
conventions mentioned here [http://wiki.apache.org/hadoop/HowToContribute]? 
cc [~kasha].
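
For illustration only, a guard of the kind {{releaseContainer()}} uses could make 
the completion path idempotent (simplified names below, not the actual 
FSAppAttempt code):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch of an idempotent completion path: a container is
// processed only if it is still tracked, so a duplicate completion event
// (e.g. from NODE_UPDATE plus an AM release) becomes a no-op instead of a
// second release that corrupts the accounting.
public class CompletionGuardSketch {
  private final Map<String, Object> liveContainers = new ConcurrentHashMap<>();

  public boolean containerCompleted(String containerId) {
    Object removed = liveContainers.remove(containerId);
    if (removed == null) {
      // Already completed once; ignore the duplicate event.
      return false;
    }
    // ... release resources and update queue metrics exactly once here ...
    return true;
  }
}
{code}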

> FairScheduler repeat container completed
> 
>
> Key: YARN-5241
> URL: https://issues.apache.org/jira/browse/YARN-5241
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2
>Reporter: ChenFolin
> Attachments: YARN-5241-001.patch, repeatContainerCompleted.log
>
>
> The NodeManager heartbeat event NODE_UPDATE and the ApplicationMaster allocate 
> operation may cause a repeated container-completed event, which can lead to 
> incorrect behavior.
> Node releaseContainer can prevent a repeated release operation,
> like:
> public synchronized void releaseContainer(Container container) {
>   if (!isValidContainer(container.getId())) {
>     LOG.error("Invalid container released " + container);
>     return;
>   }
> FSAppAttempt containerCompleted did not prevent a repeated container-completed 
> operation.
> Detailed logs are in the attached file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5188) FairScheduler performance bug

2016-06-01 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309341#comment-15309341
 ] 

Xianyin Xin commented on YARN-5188:
---

hi [~chenfolin], thanks for reporting this.
{quote}
1: application sort before assign container at FSLeafQueue. TreeSet is not the 
best, Why not keep orderly ? and then we can use binary search to help keep 
orderly when a application's resource usage has changed.
{quote}
Can you please explain further and analyze the performance improvement after 
adopting the new approach?
{quote}
2: queue sort and assignContainerPreCheck will lead to compute all leafqueue 
resource usage ,Why can we store the leafqueue usage at memory and update it 
when assign container op release container happen?
{quote}
Can you verify whether this is the same as YARN-4090? If not, what's the 
difference?

Thanks.
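
On point 1, a minimal sketch of what I understand you are proposing (keep the app 
list ordered and re-position a single app by binary search when its usage changes, 
instead of a full {{Collections.sort()}} per scheduling round; names here are 
illustrative, not from your patch):
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Keep the list sorted incrementally: when one element's sort key changes,
// remove it and re-insert it at the position found by binary search
// (O(log n) comparisons plus one shift) instead of re-sorting the whole
// list (O(n log n)) on every container assignment.
public class OrderedApps<T> {
  private final List<T> apps = new ArrayList<>();
  private final Comparator<T> comparator;

  public OrderedApps(Comparator<T> comparator) {
    this.comparator = comparator;
  }

  public void add(T app) {
    usageChanged(app);
  }

  public void usageChanged(T app) {
    apps.remove(app);
    int pos = Collections.binarySearch(apps, app, comparator);
    if (pos < 0) {
      pos = -pos - 1;   // binarySearch returns (-(insertion point) - 1)
    }
    apps.add(pos, app);
  }

  public List<T> view() {
    return Collections.unmodifiableList(apps);
  }
}
{code}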

> FairScheduler performance bug
> -
>
> Key: YARN-5188
> URL: https://issues.apache.org/jira/browse/YARN-5188
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0, 2.8.0
>Reporter: ChenFolin
> Attachments: YARN-5188.patch
>
>
>  My Hadoop cluster has recently encountered a performance problem. Details 
> follow.
> There are two point which can cause this performance issue.
> 1: application sort before assigning a container at FSLeafQueue. TreeSet is not 
> the best choice. Why not keep the list ordered, and then use binary search to 
> keep it ordered when an application's resource usage has changed?
> 2: queue sort and assignContainerPreCheck lead to computing all leaf queue 
> resource usage. Why can't we store the leaf queue usage in memory and update it 
> when an assign-container or release-container operation happens?
>
>    The efficiency of assigning containers in the ResourceManager may fall 
> when the number of running and pending applications grows. And the fact is the 
> cluster has too many PendingMB or PendingVcore, and the cluster's 
> current utilization rate may be below 20%.
>    I checked the ResourceManager logs and found that every assign-container 
> op may cost 5 ~ 10 ms, but just 0 ~ 1 ms at usual times.
>  
>I use TestFairScheduler to reproduce the scene:
>  
>    Just one queue: root.default
>  10240 apps.
>  
>assign container avg time:  6753.9 us ( 6.7539 ms)  
>  apps sort time (FSLeafQueue : Collections.sort(runnableApps, 
> comparator); ): 4657.01 us ( 4.657 ms )
>  compute LeafQueue Resource usage : 905.171 us ( 0.905171 ms )
>  
>  When there is just root.default, one assign-container op contains: ( one apps 
> sort op ) + 2 * ( compute leafqueue usage op )
>    According to the above situation, I think the assign-container op has 
> a performance problem. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-05-23 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: YARN-4090.002.patch

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, YARN-4090.002.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-05-22 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295948#comment-15295948
 ] 

Xianyin Xin commented on YARN-4090:
---

Sorry for the delay [~yufeigu]. 
{quote}
do you mean "bring down" or "decrease" when you said "write
down" in this comment? 
{quote}
Yes, I mean decrease; I changed the comment in the new patch.
{quote}
there are two spaces between "synchronized" and "void" in public synchronized 
void incResourceUsage(Resourc
{quote}
Fixed.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-04-28 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261837#comment-15261837
 ] 

Xianyin Xin commented on YARN-4090:
---

It seems the UT failures with JDK 1.8.0_92 and JDK 1.7.0_95 are not triggered by 
the patch.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-04-27 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261546#comment-15261546
 ] 

Xianyin Xin commented on YARN-4090:
---

Sorry for the delay, [~kasha] [~yufeigu]. I just uploaded a patch which fixes the 
three test failures above. The two failures in {{TestFairSchedulerPreemption}} 
happened because {{decResourceUsage}} was called twice when processing preemption 
(in both {{addPreemption()}} and {{containerCompleted()}}), and the failure in 
{{TestAppRunnability}} happened because we missed updating the queue's resource 
usage when moving an app.
Thanks [~yufeigu] for your info.
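
For context, a rough sketch of the kind of bookkeeping being discussed (not the 
actual patch or FSQueue code; the {{incResourceUsage}}/{{decResourceUsage}} names 
follow the comments above): each queue caches its usage and deltas are propagated 
up the parent chain, so {{getResourceUsage()}} no longer walks all children, and 
calling the dec path twice for one container (as in the preemption case above) 
corrupts the cached value.
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Sketch of incremental queue usage tracking: allocation/release updates the
// cached usage of the queue and all its ancestors, instead of recomputing
// the usage from the children on every read.
public class QueueUsageSketch {
  private final QueueUsageSketch parent;
  private final Resource usage = Resources.createResource(0, 0);

  public QueueUsageSketch(QueueUsageSketch parent) {
    this.parent = parent;
  }

  public synchronized void incResourceUsage(Resource delta) {
    Resources.addTo(usage, delta);
    if (parent != null) {
      parent.incResourceUsage(delta);
    }
  }

  public synchronized void decResourceUsage(Resource delta) {
    Resources.subtractFrom(usage, delta);
    if (parent != null) {
      parent.decResourceUsage(delta);
    }
  }

  public synchronized Resource getResourceUsage() {
    return usage;
  }
}
{code}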

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-04-27 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: YARN-4090.001.patch

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> YARN-4090.001.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-04-26 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257960#comment-15257960
 ] 

Xianyin Xin commented on YARN-4090:
---

Thanks, [~kasha], I will upload a new patch as soon as possible.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups

2016-03-10 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190517#comment-15190517
 ] 

Xianyin Xin commented on YARN-1297:
---

Hi [~asuresh], could you please update the patch based on the latest code?

> Miscellaneous Fair Scheduler speedups
> -
>
> Key: YARN-1297
> URL: https://issues.apache.org/jira/browse/YARN-1297
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Sandy Ryza
>Assignee: Arun Suresh
> Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, 
> YARN-1297.4.patch, YARN-1297.4.patch, YARN-1297.patch, YARN-1297.patch
>
>
> I ran the Fair Scheduler's core scheduling loop through a profiler tool and 
> identified a bunch of minimally invasive changes that can shave off a few 
> milliseconds.
> The main one is demoting a couple INFO log messages to DEBUG, which brought 
> my benchmark down from 16000 ms to 6000.
> A few others (which had way less of an impact) were
> * Most of the time in comparisons was being spent in Math.signum.  I switched 
> this to direct ifs and elses and it halved the percent of time spent in 
> comparisons.
> * I removed some unnecessary instantiations of Resource objects
> * I made it so that queues' usage wasn't calculated from the applications up 
> each time getResourceUsage was called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-03-10 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190514#comment-15190514
 ] 

Xianyin Xin commented on YARN-4090:
---

Let's move to YARN-1297 to continue the discussion. 

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2016-03-10 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190506#comment-15190506
 ] 

Xianyin Xin commented on YARN-4090:
---

Sorry [~kasha], I didn't notice that; thanks for the reminder. Yes, this jira can 
be seen as part of YARN-1297, so I'm closing it as a duplicate.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-03-06 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182566#comment-15182566
 ] 

Xianyin Xin commented on YARN-4752:
---

hi [~kasha], can we consider decoupling scheduling and preemption, which share the 
same {{getResourceUsage()}}, as it blocks YARN-4120 and YARN-4090, which aim to 
improve the scheduling throughput? As discussed in YARN-4120, see, e.g., 
https://issues.apache.org/jira/browse/YARN-4120?focusedCommentId=14733235=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14733235.

Thanks.

> [Umbrella] FairScheduler: Improve preemption
> 
>
> Key: YARN-4752
> URL: https://issues.apache.org/jira/browse/YARN-4752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>
> A number of issues have been reported with respect to preemption in 
> FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free 
> resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is 
> over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a 
> comprehensive solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4414) Nodemanager connection errors are retried at multiple levels

2016-01-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15088804#comment-15088804
 ] 

Xianyin Xin commented on YARN-4414:
---

Hi [~lichangleo], do we also need to revisit the two-layer retries in {{RMProxy}}? 
IIUC, the proxy layer will retry for up to 15 min with a retry interval of 30 sec, 
and in the background the RM proxy calculates the max number of retries from those 
two values. Each IPC-layer retry costs more than 1 sec and is attempted 10 times 
by default, so the actual total wait time is 15 min + (15 / 0.5) * 10 * (more than 
1 sec), which is much more than 15 min.
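
A quick back-of-the-envelope check of the numbers above (the constants are the 
defaults I am assuming, not values read from the code):
{code}
// Rough worst-case wait estimate for the two-layer retry:
// proxy layer: wait up to 15 min, retry every 30 s => ~30 proxy attempts;
// each proxy attempt pays for ~10 IPC-level retries of >1 s each.
public class RetryWaitEstimate {
  public static void main(String[] args) {
    double proxyWaitMin = 15.0;      // assumed proxy-layer max wait (15 min)
    double proxyIntervalMin = 0.5;   // assumed retry interval (30 s)
    int ipcRetries = 10;             // assumed IPC-layer retries per attempt
    double ipcRetryCostSec = 1.0;    // "more than 1 sec" per IPC retry

    double proxyAttempts = proxyWaitMin / proxyIntervalMin;                    // 30
    double extraMin = proxyAttempts * ipcRetries * ipcRetryCostSec / 60.0;     // +5 min
    System.out.println("Total wait >= " + (proxyWaitMin + extraMin) + " min"); // ~20 min
  }
}
{code}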

> Nodemanager connection errors are retried at multiple levels
> 
>
> Key: YARN-4414
> URL: https://issues.apache.org/jira/browse/YARN-4414
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Jason Lowe
>Assignee: Chang Li
> Attachments: YARN-4414.1.2.patch, YARN-4414.1.2.patch, 
> YARN-4414.1.3.patch, YARN-4414.1.patch, YARN-4414.2.patch
>
>
> This is related to YARN-3238.  Ran into more scenarios where connection 
> errors are being retried at multiple levels, like NoRouteToHostException.  
> The fix for YARN-3238 was too specific, and I think we need a more general 
> solution to catch a wider array of connection errors that can occur to avoid 
> retrying them both at the RPC layer and at the NM proxy layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling

2015-12-29 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15074755#comment-15074755
 ] 

Xianyin Xin commented on YARN-3870:
---

+1 for adding a unique id for a resource request, but I would suggest we 
consider this kind of problem in a more systematic way, considering YARN-314, 
YARN-1042, YARN-371, YARN-4485 and this.

Like my comment in YARN-314, a natural way for the scheduler to work is like a 
factory: it receives orders and prepares for them. Once we accept that working 
philosophy, we'll find it natural and necessary for a resource order to have the 
following dimensions:
1. order id, which identifies an order and can expire or have a time limit;
2. priority;
3. a collection of request units, each of which specifies a kind of resource 
request and has a coordinate of ;
4. relaxLocality;
5. canBeDecomposed, or ifGangScheduling;
6. ...
The scheduler schedules based on the order form, and should not swallow any 
information passed from the app; a rough sketch of such an order form is below.
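
All names in the sketch are hypothetical; nothing here exists in YARN today, it 
only illustrates what a raw, un-expanded request could carry to the scheduler:
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical "order form" carrying the raw, un-expanded request to the
// scheduler, with the dimensions listed above.
public class ResourceOrder {
  long orderId;                 // identifies the order; may expire or have a time limit
  int priority;
  List<RequestUnit> units = new ArrayList<>();  // each unit is one kind of resource request
  boolean relaxLocality;
  boolean canBeDecomposed;      // whether the order may be decomposed (vs. gang scheduling)

  // One kind of resource request inside the order.
  public static class RequestUnit {
    String resourceName;        // host / rack / "*"
    int numContainers;
    // capability, constraints, ... omitted from this sketch
  }
}
{code}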

Any thoughts?

> Providing raw container request information for fine scheduling
> ---
>
> Key: YARN-3870
> URL: https://issues.apache.org/jira/browse/YARN-3870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications, capacityscheduler, fairscheduler, 
> resourcemanager, scheduler, yarn
>Reporter: Lei Guo
>
> Currently, when AM sends container requests to RM and scheduler, it expands 
> individual container requests into host/rack/any format. For instance, if I 
> am asking for container request with preference "host1, host2, host3", 
> assuming all are in the same rack rack1, instead of sending one raw container 
> request to RM/Scheduler with raw preference list, it basically expand it to 
> become 5 different objects with host1, host2, host3, rack1 and any in there. 
> When scheduler receives information, it basically already lost the raw 
> request. This is ok for single container request, but it will cause trouble 
> when dealing with multiple container requests from the same application. 
> Consider this case:
> 6 hosts, two racks:
> rack1 (host1, host2, host3) rack2 (host4, host5, host6)
> When application requests two containers with different data locality 
> preference:
> c1: host1, host2, host4
> c2: host2, host3, host5
> This will end up with following container request list when client sending 
> request to RM/Scheduler:
> host1: 1 instance
> host2: 2 instances
> host3: 1 instance
> host4: 1 instance
> host5: 1 instance
> rack1: 2 instances
> rack2: 2 instances
> any: 2 instances
> Fundamentally, it is hard for scheduler to make a right judgement without 
> knowing the raw container request. The situation will get worse when dealing 
> with affinity and anti-affinity or even gang scheduling etc.
> We need some way to provide raw container request information for fine 
> scheduling purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4477) FairScheduler: encounter infinite loop in attemptScheduling

2015-12-17 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15063538#comment-15063538
 ] 

Xianyin Xin commented on YARN-4477:
---

cc [~asuresh].

> FairScheduler: encounter infinite loop in attemptScheduling
> ---
>
> Key: YARN-4477
> URL: https://issues.apache.org/jira/browse/YARN-4477
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Tao Jie
>
> This problem is introduced by YARN-4270, which adds a limitation on reservations.
> In FSAppAttempt.reserve():
> {code}
> if (!reservationExceedsThreshold(node, type)) {
>   LOG.info("Making reservation: node=" + node.getNodeName() +
>   " app_id=" + getApplicationId());
>   if (!alreadyReserved) {
> getMetrics().reserveResource(getUser(), container.getResource());
> RMContainer rmContainer =
> super.reserve(node, priority, null, container);
> node.reserveResource(this, priority, rmContainer);
> setReservation(node);
>   } else {
> RMContainer rmContainer = node.getReservedContainer();
> super.reserve(node, priority, rmContainer, container);
> node.reserveResource(this, priority, rmContainer);
> setReservation(node);
>   }
> }
> {code}
> If the reservation exceeds the threshold, the current node will not set a reservation.
> But in attemptScheduling in FairSheduler:
> {code}
>   while (node.getReservedContainer() == null) {
> boolean assignedContainer = false;
> if (!queueMgr.getRootQueue().assignContainer(node).equals(
> Resources.none())) {
>   assignedContainers++;
>   assignedContainer = true;
>   
> }
> 
> if (!assignedContainer) { break; }
> if (!assignMultiple) { break; }
> if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; }
>   }
> {code}
> assignContainer(node) still returns FairScheduler.CONTAINER_RESERVED, which is
> not equal to Resources.none().
> As a result, if multiple assign is enabled and maxAssign is unlimited, this 
> while loop would never break.
> I suppose that assignContainer(node) should return Resources.none() rather than 
> CONTAINER_RESERVED when the attempt doesn't take the reservation because of 
> the limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4198) CapacityScheduler locking / synchronization improvements

2015-12-16 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059730#comment-15059730
 ] 

Xianyin Xin commented on YARN-4198:
---

Hi [~curino], does this improvement only consider the locking / synchronization 
related to the ReservationSystem? What's the relationship of this jira with 
YARN-3091 and its sub-jiras? 

> CapacityScheduler locking / synchronization improvements
> 
>
> Key: YARN-4198
> URL: https://issues.apache.org/jira/browse/YARN-4198
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Alexey Tumanov
>
> In the context of YARN-4193 (which stresses the RM/CS performance) we found 
> several performance problems with  in the locking/synchronization of the 
> CapacityScheduler, as well as inconsistencies that do not normally surface 
> (incorrect locking-order of queues protected by CS locks etc). This JIRA 
> proposes several refactoring that improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned

2015-12-14 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057134#comment-15057134
 ] 

Xianyin Xin commented on YARN-4415:
---

Hi [~leftnoteasy], thanks for your comments.
{quote}
You can see that there're different pros and cons to choose default values of 
the two options. Frankly I don't have strong preference for all these choices. 
But since we have decided default values since 2.6, I would suggest don't 
change the default values.
{quote}
I understand and respect your choice. The pros and cons are just the two sides of 
a coin, and we must choose one. But I still find it strange that the access-labels 
are "\*" while in fact we can't access them; in this case "\*" means nothing 
except that it is a symbol, or an abbreviation for all labels. (What I mean is 
that it contradicts intuition when one sees "\*"; I think Naga has the same 
feeling.) You can argue that the access-labels and max-capacities are two 
different things, and that if we want to use them we must set the two separately 
and explicitly. If we finally choose that way of working, I will reserve my 
opinion. At last, thanks again. :)

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png, 
> capacity-scheduler.xml, screenshot-1.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create a exclusive partition *xxx* and map a NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map a NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue for *sharedPartition*
> For both issues the cause is the same: default max capacity and abs max capacity 
> are set to zero %



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned

2015-12-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046147#comment-15046147
 ] 

Xianyin Xin commented on YARN-4415:
---

Sorry [~Naganarasimha], I missed that; thanks for your correction. For the case 
of labels "*", child queues should have access to all labels, and the max 
capacity should default to 100 if the admin didn't specify the accessible-labels 
list for the child queue (in which case the child queue should inherit from its 
parent).

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png, screenshot-1.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create a exclusive partition *xxx* and map a NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map a NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue for *sharedPartition*
> For both issues the cause is the same: default max capacity and abs max capacity 
> are set to zero %



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned

2015-12-06 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044498#comment-15044498
 ] 

Xianyin Xin commented on YARN-4415:
---

Sorry for the late reply, [~Naganarasimha]. 
I don't know whether I understand correctly, so please correct me if I'm wrong. 
There are two cases: 1) we have set the access-labels for a queue in the xml, and 
2) we didn't set the access-labels for a queue. For case 1), the access-labels 
and the configured capacities (0 for capacity and 100 for max by default) are 
imported; for case 2), the access-labels of the queue are inherited from its 
parent, but the capacities for those labels are 0, since 
{{setupConfigurableCapacities()}} only considers the access-labels configured in 
the xml.
{code}
this.accessibleLabels =
csContext.getConfiguration().getAccessibleNodeLabels(getQueuePath());
this.defaultLabelExpression = csContext.getConfiguration()
.getDefaultNodeLabelExpression(getQueuePath());

// inherit from parent if labels not set
if (this.accessibleLabels == null && parent != null) {
  this.accessibleLabels = parent.getAccessibleNodeLabels();
}

// inherit from parent if labels not set
if (this.defaultLabelExpression == null && parent != null
&& this.accessibleLabels.containsAll(parent.getAccessibleNodeLabels())) 
{
  this.defaultLabelExpression = parent.getDefaultNodeLabelExpression();
}

// After we setup labels, we can setup capacities
setupConfigurableCapacities();
{code}

This would cause confusion because the access-labels inherited from the parent 
have 0 max capacity. If that is the case, I agree that the inherited 
access-labels should have 100 max capacity by default.

But for the two scenarios in the description, I feel the final result is 
reasonable, because you didn't set the access-labels for the queue and its parent 
doesn't have the access-labels either, so the label is not explicitly accessible 
by the queue. However, the info that the web UI shows is wrong if the above 
analysis is right. I think the cause is the following statement in 
{{QueueCapacitiesInfo.java}},

{code}
if (maxCapacity < CapacitySchedulerQueueInfo.EPSILON || maxCapacity > 1f)
  maxCapacity = 1f;
{code}
where it sets {{maxCapacity}} to 1 for the case {{maxCapacity == 0}}, which is 
just case 2) above.
cc [~leftnoteasy].

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png, screenshot-1.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create a exclusive partition *xxx* and map a NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map a NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue for *sharedPartition*
> For both issues the cause is the same: default max capacity and abs max capacity 
> are set to zero %



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4403) (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating period

2015-12-02 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037185#comment-15037185
 ] 

Xianyin Xin commented on YARN-4403:
---

And I will provide a new patch for YARN-4177 once this is in.

> (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating 
> period
> 
>
> Key: YARN-4403
> URL: https://issues.apache.org/jira/browse/YARN-4403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: YARN-4403.patch
>
>
> Currently, (AM/NM/Container)LivelinessMonitor use current system time to 
> calculate a duration of expire which could be broken by settimeofday. We 
> should use Time.monotonicNow() instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4403) (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating period

2015-12-02 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037284#comment-15037284
 ] 

Xianyin Xin commented on YARN-4403:
---

Thanks, [~sunilg].

> (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating 
> period
> 
>
> Key: YARN-4403
> URL: https://issues.apache.org/jira/browse/YARN-4403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: YARN-4403.patch
>
>
> Currently, (AM/NM/Container)LivelinessMonitor use current system time to 
> calculate a duration of expire which could be broken by settimeofday. We 
> should use Time.monotonicNow() instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4403) (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating period

2015-12-02 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037180#comment-15037180
 ] 

Xianyin Xin commented on YARN-4403:
---

Hi [~djp], this is a good suggestion, and YARN-4177 has some discussion on this, 
so I linked it.

> (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating 
> period
> 
>
> Key: YARN-4403
> URL: https://issues.apache.org/jira/browse/YARN-4403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: YARN-4403.patch
>
>
> Currently, (AM/NM/Container)LivelinessMonitor use current system time to 
> calculate a duration of expire which could be broken by settimeofday. We 
> should use Time.monotonicNow() instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3639) It takes too long time for RM to recover all apps if the original active RM and NN go down at the same time.

2015-10-26 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin resolved YARN-3639.
---
Resolution: Fixed

This has been resolved by YARN-4041, so closing it.

> It takes too long time for RM to recover all apps if the original active RM 
> and NN go down at the same time.
> 
>
> Key: YARN-3639
> URL: https://issues.apache.org/jira/browse/YARN-3639
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Xianyin Xin
> Attachments: YARN-3639-recovery_log_1_app.txt
>
>
> If the active RM and NN go down at the same time, the new RM will take a long 
> time to recover all apps. After analysis, we found the root cause is renewing 
> HDFS tokens in the recovery process. The HDFS client created by the renewer 
> first tries to connect to the original NN, which times out after 10~20s, and 
> then the client tries to connect to the new NN. The entire recovery costs 
> 15*#apps seconds according to our test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4189) Capacity Scheduler : Improve location preference waiting mechanism

2015-09-21 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901736#comment-14901736
 ] 

Xianyin Xin commented on YARN-4189:
---

[~leftnoteasy], convincing analysis. It's fine if X << Y and X is close to the 
heartbeat interval; so, should we limit X to prevent users from setting it freely?

> Capacity Scheduler : Improve location preference waiting mechanism
> --
>
> Key: YARN-4189
> URL: https://issues.apache.org/jira/browse/YARN-4189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4189 design v1.pdf
>
>
> There're some issues with current Capacity Scheduler implementation of delay 
> scheduling:
> *1) Waiting time to allocate each container highly depends on cluster 
> availability*
> Currently, app can only increase missed-opportunity when a node has available 
> resource AND it gets traversed by a scheduler. There’re lots of possibilities 
> that an app doesn’t get traversed by a scheduler, for example:
> A cluster has 2 racks (rack1/2), each rack has 40 nodes. 
> Node-locality-delay=40. An application prefers rack1. 
> Node-heartbeat-interval=1s.
> Assume there are 2 nodes available on rack1, delay to allocate one container 
> = 40 sec.
> If there are 20 nodes available on rack1, delay of allocating one container = 
> 2 sec.
> *2) It could violate scheduling policies (Fifo/Priority/Fair)*
> Assume a cluster is highly utilized, an app (app1) has higher priority, it 
> wants locality. And there’s another app (app2) has lower priority, but it 
> doesn’t care about locality. When node heartbeats with available resource, 
> app1 decides to wait, so app2 gets the available slot. This should be 
> considered as a bug that we need to fix.
> The same problem could happen when we use FIFO/Fair queue policies.
> Another problem similar to this is related to preemption: when preemption 
> policy preempts some resources from queue-A for queue-B (queue-A is 
> over-satisfied and queue-B is under-satisfied). But queue-B is waiting for 
> the node-locality-delay so queue-A will get resources back. In next round, 
> preemption policy could preempt this resources again from queue-A.
> This JIRA is target to solve these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-21 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900450#comment-14900450
 ] 

Xianyin Xin commented on YARN-4177:
---

Hi [~ste...@apache.org], thanks for your comment. I've read your post and did 
some investigations on this.
{quote}
1.Inconsistent across cores, hence non-monotonic on reads, especially reads 
likely to trigger thread suspend/resume (anything with sleep(), wait(), IO, 
accessing synchronized data under load).
{quote}
This was once a bug on some old OSs, but it seems not to be a problem on Linux 
newer than 2.6 or Windows newer than XP SP2, if I understand your comment 
correctly. See 
http://stackoverflow.com/questions/510462/is-system-nanotime-completely-useless,
 and the referenced 
https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks.
{quote}
2.Not actually monotonic.
{quote}
Can you explain in detail? As a reference, there's some discussion of 
clock_gettime, on which nanoTime depends, in 
http://stackoverflow.com/questions/4943733/is-clock-monotonic-process-or-thread-specific?rq=1,
 especially in the second answer, which has 4 upvotes.
{quote}
3.Achieving a consistency by querying heavyweight counters with possible longer 
function execution time and lower granularity than the wall clock.
That is: modern NUMA, multi-socket servers are essentially multiple computers 
wired together, and we have a term for that: distributed system
{quote}
You mean achieving a consistent time across nodes in a cluster? I think the 
monotonic time we plan to offer should be limited to being node-local; it's hard 
to make it cluster-wide. 
{quote}
I've known for a long time that CPU frequency could change its rate
{quote}
I remember that Linux newer than 2.6.18 takes some measures to overcome this 
problem. 
http://stackoverflow.com/questions/510462/is-system-nanotime-completely-useless#comment40382219_510940
 has a little discussion on this.
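
As a side note on what YARN-4177 is after, the difference is simply where the 
duration comes from; a minimal, self-contained illustration ({{Time.monotonicNow()}} 
is the existing hadoop.util helper backed by {{System.nanoTime()}}):
{code}
import org.apache.hadoop.util.Time;

// Durations should be measured with a monotonic source; the wall clock can
// jump backwards or forwards under settimeofday/NTP corrections and then the
// computed interval is wrong (possibly negative).
public class DurationExample {
  public static void main(String[] args) throws InterruptedException {
    long start = Time.monotonicNow();                 // monotonic, ms granularity
    Thread.sleep(100);
    long elapsed = Time.monotonicNow() - start;       // always >= 0

    long wallStart = System.currentTimeMillis();
    Thread.sleep(100);
    long wallElapsed = System.currentTimeMillis() - wallStart;  // wrong if the clock is reset in between

    System.out.println("monotonic=" + elapsed + "ms wall=" + wallElapsed + "ms");
  }
}
{code}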

> yarn.util.Clock should not be used to time a duration or time interval
> --
>
> Key: YARN-4177
> URL: https://issues.apache.org/jira/browse/YARN-4177
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4177.001.patch, YARN-4177.002.patch
>
>
> There're many places uses Clock to time intervals, which is dangerous as 
> commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
> hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
> MonotonicClock in yarn.util considering the consistency of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4190) missing container information in FairScheduler preemption log.

2015-09-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876762#comment-14876762
 ] 

Xianyin Xin commented on YARN-4190:
---

Hi [~zxu], will we consider fixing the issue in YARN-4134 or YARN-3405, or just 
wait for the new preemption logic in YARN-2154? It blocks YARN-4090 and 
YARN-4120, which want to decouple scheduling and preemption and get better 
performance.

> missing container information in FairScheduler preemption log.
> --
>
> Key: YARN-4190
> URL: https://issues.apache.org/jira/browse/YARN-4190
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Trivial
>
> Add container information in FairScheduler preemption log to help debug. 
> Currently the following log doesn't have container information
> {code}
> LOG.info("Preempting container (prio=" + 
> container.getContainer().getPriority() +
> "res=" + container.getContainer().getResource() +
> ") from queue " + queue.getName());
> {code}
> So it will be very difficult to debug preemption related issue for 
> FairScheduler.
> Even the container information is printed in the following code
> {code}
> LOG.info("Killing container" + container +
> " (after waiting for premption for " +
> (getClock().getTime() - time) + "ms)");
> {code}
> But we can't match these two logs based on the container ID.
> It will be very useful to add container information in the first log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-18 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4177:
--
Attachment: YARN-4177.002.patch

Uploaded patch ver2.

> yarn.util.Clock should not be used to time a duration or time interval
> --
>
> Key: YARN-4177
> URL: https://issues.apache.org/jira/browse/YARN-4177
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xianyin Xin
> Attachments: YARN-4177.001.patch, YARN-4177.002.patch
>
>
> There're many places uses Clock to time intervals, which is dangerous as 
> commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
> hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
> MonotonicClock in yarn.util considering the consistency of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4189) Capacity Scheduler : Improve location preference waiting mechanism

2015-09-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876871#comment-14876871
 ] 

Xianyin Xin commented on YARN-4189:
---

Hi [~leftnoteasy], I just went through the doc. Please correct me if I am wrong. 
Can a container that is marked as ALLOCATING_WAITING be occupied by other 
requests? I'm afraid ALLOCATING_WAITING would reduce the utilization of the 
cluster. In a cluster with many nodes and many jobs, it's hard to satisfy most 
jobs with their preferred allocations, especially in the app-oriented allocation 
mechanism (which maps newly available resources to appropriate apps). A customer 
once asked us "why could we get 100% locality in MR1 but only up to 60%~70% after 
making various optimizations?". So we can guess that quite a percentage of 
resource requests in a cluster are not satisfied with their allocations; thus 
many containers would go through the ALLOCATING_WAITING phase, which keeps a lot 
of resources idle for a period of time.

> Capacity Scheduler : Improve location preference waiting mechanism
> --
>
> Key: YARN-4189
> URL: https://issues.apache.org/jira/browse/YARN-4189
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4189 design v1.pdf
>
>
> There're some issues with current Capacity Scheduler implementation of delay 
> scheduling:
> *1) Waiting time to allocate each container highly depends on cluster 
> availability*
> Currently, app can only increase missed-opportunity when a node has available 
> resource AND it gets traversed by a scheduler. There’re lots of possibilities 
> that an app doesn’t get traversed by a scheduler, for example:
> A cluster has 2 racks (rack1/2), each rack has 40 nodes. 
> Node-locality-delay=40. An application prefers rack1. 
> Node-heartbeat-interval=1s.
> Assume there are 2 nodes available on rack1, delay to allocate one container 
> = 40 sec.
> If there are 20 nodes available on rack1, delay of allocating one container = 
> 2 sec.
> *2) It could violate scheduling policies (Fifo/Priority/Fair)*
> Assume a cluster is highly utilized, an app (app1) has higher priority, it 
> wants locality. And there’s another app (app2) has lower priority, but it 
> doesn’t care about locality. When node heartbeats with available resource, 
> app1 decides to wait, so app2 gets the available slot. This should be 
> considered as a bug that we need to fix.
> The same problem could happen when we use FIFO/Fair queue policies.
> Another problem similar to this is related to preemption: when preemption 
> policy preempts some resources from queue-A for queue-B (queue-A is 
> over-satisfied and queue-B is under-satisfied). But queue-B is waiting for 
> the node-locality-delay so queue-A will get resources back. In next round, 
> preemption policy could preempt this resources again from queue-A.
> This JIRA is target to solve these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-18 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin reassigned YARN-4177:
-

Assignee: Xianyin Xin

> yarn.util.Clock should not be used to time a duration or time interval
> --
>
> Key: YARN-4177
> URL: https://issues.apache.org/jira/browse/YARN-4177
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4177.001.patch, YARN-4177.002.patch
>
>
> There're many places uses Clock to time intervals, which is dangerous as 
> commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
> hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
> MonotonicClock in yarn.util considering the consistency of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-17 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-4177:
-

 Summary: yarn.util.Clock should not be used to time a duration or 
time interval
 Key: YARN-4177
 URL: https://issues.apache.org/jira/browse/YARN-4177
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xianyin Xin


There're many places uses Clock to time intervals, which is dangerous as 
commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
MonotonicClock in yarn.util considering the consistency of code.
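For illustration, such a clock could be as small as the following sketch (assuming 
it implements the existing {{yarn.util.Clock}} interface and delegates to a 
monotonic source such as {{System.nanoTime()}}; this is only the general shape, 
not necessarily what the attached patches do):
{code}
package org.apache.hadoop.yarn.util;

/**
 * A Clock whose getTime() is monotonic and therefore safe for measuring
 * durations and intervals. Its value must not be interpreted as wall-clock time.
 */
public class MonotonicClock implements Clock {
  @Override
  public long getTime() {
    // System.nanoTime() is monotonic; convert nanoseconds to milliseconds.
    return System.nanoTime() / 1000000L;
  }
}
{code}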



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-17 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4177:
--
Attachment: YARN-4177.001.patch

Provide a MonotonicClock.

> yarn.util.Clock should not be used to time a duration or time interval
> --
>
> Key: YARN-4177
> URL: https://issues.apache.org/jira/browse/YARN-4177
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xianyin Xin
> Attachments: YARN-4177.001.patch
>
>
> There're many places uses Clock to time intervals, which is dangerous as 
> commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
> hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
> MonotonicClock in yarn.util considering the consistency of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4177) yarn.util.Clock should not be used to time a duration or time interval

2015-09-17 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802911#comment-14802911
 ] 

Xianyin Xin commented on YARN-4177:
---

In the following steps I will try to fix several important places where the 
clock is misused.

> yarn.util.Clock should not be used to time a duration or time interval
> --
>
> Key: YARN-4177
> URL: https://issues.apache.org/jira/browse/YARN-4177
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xianyin Xin
> Attachments: YARN-4177.001.patch
>
>
> There're many places uses Clock to time intervals, which is dangerous as 
> commented by [~ste...@apache.org] in HADOOP-12409. Instead, we should use 
> hadoop.util.Timer#monotonicNow() to get monotonic time. Or we could provide a 
> MonotonicClock in yarn.util considering the consistency of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742795#comment-14742795
 ] 

Xianyin Xin commented on YARN-4134:
---

Just found that this issue duplicates YARN-3405, but the two have different 
solutions. Since this issue would be addressed in YARN-2154, link them together.

> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4134.001.patch, YARN-4134.002.patch, 
> YARN-4134.003.patch
>
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742793#comment-14742793
 ] 

Xianyin Xin commented on YARN-4120:
---

Hi [~asuresh], thanks for your comment. I've gone through YARN-2154, and I believe 
it is a nice solution for the problems of the current preemption logic. But I 
think the current patch of YARN-2154 could not solve the issue raised in this jira 
(please correct me if I have misunderstood YARN-2154). We should distinguish 
{{usage}} and {{usage - preemption}} in {{getResourceUsage}}, because 
{{getResourceUsage}} is used both by the preemption logic and the resource 
allocation logic. Of course we can consider this in the new implementation in 
YARN-2154 and solve them together.

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742812#comment-14742812
 ] 

Xianyin Xin commented on YARN-4120:
---

Hi [~kasha], [~asuresh], [~ashwinshankar77], now both the preemption logic and 
the resource allocation logic use {{comparator}} to sort the {{Schedulables}}. I 
think we have to introduce a different comparator to separate {{usage}} and 
{{usage - preemption}}, just as the patch in YARN-4134 does. There is also some 
discussion on changing {{Comparator.compare()}} in YARN-3453. I think that for a 
collection of comparables, we can use different comparators to compare different 
attributes for different purposes. Any thoughts?
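For illustration only, the idea could look roughly like the fragment below (the 
helper {{getNetResourceUsage()}} is an assumption for the sketch, not existing 
API, and a real patch may rank on more than memory):
{code}
// Hypothetical sketch: a second comparator used only by the preemption path,
// ranking Schedulables by how far their *net* usage (gross usage minus
// resources already marked for preemption) exceeds their fair share.
Comparator<Schedulable> preemptionComparator = new Comparator<Schedulable>() {
  @Override
  public int compare(Schedulable s1, Schedulable s2) {
    long over1 = getNetResourceUsage(s1).getMemory() - s1.getFairShare().getMemory();
    long over2 = getNetResourceUsage(s2).getMemory() - s2.getFairShare().getMemory();
    return Long.compare(over2, over1); // most over fair share sorts first
  }
};
{code}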

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2154) FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2015-09-13 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742797#comment-14742797
 ] 

Xianyin Xin commented on YARN-2154:
---

The new logic would solve the issues raised in YARN-3405 and YARN-4134; link 
them for tracking.

> FairScheduler: Improve preemption to preempt only those containers that would 
> satisfy the incoming request
> --
>
> Key: YARN-2154
> URL: https://issues.apache.org/jira/browse/YARN-2154
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Arun Suresh
>Priority: Critical
> Attachments: YARN-2154.1.patch
>
>
> Today, FairScheduler uses a spray-gun approach to preemption. Instead, it 
> should only preempt resources that would satisfy the incoming request. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-10 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4134:
--
Attachment: YARN-4134.002.patch

Remove testing remnant.

> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4134.001.patch, YARN-4134.002.patch
>
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-10 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4134:
--
Attachment: YARN-4134.001.patch

Upload a patch for preview.

> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4134.001.patch
>
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-10 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4134:
--
Attachment: YARN-4134.003.patch

A tiny fix.

> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4134.001.patch, YARN-4134.002.patch, 
> YARN-4134.003.patch
>
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-10 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739974#comment-14739974
 ] 

Xianyin Xin commented on YARN-4120:
---

Link to YARN-4134; the two can be solved together.

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-09 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4134:
--
Description: 
Now FairScheudler uses a choose-a-candidate method to select a container from 
leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
{code}
readLock.lock();
try {
  for (FSQueue queue : childQueues) {
if (candidateQueue == null ||
comparator.compare(queue, candidateQueue) > 0) {
  candidateQueue = queue;
}
  }
} finally {
  readLock.unlock();
}

// Let the selected queue choose which of its container to preempt
if (candidateQueue != null) {
  toBePreempted = candidateQueue.preemptContainer();
}
{code}
a candidate child queue is selected. However, if the queue's usage isn't over 
it's fairshare, preemption will not happen:
{code}
if (!preemptContainerPreCheck()) {
  return toBePreempted;
}
{code}
 A scenario:
{code}
root
   /\
  queue1   queue2
   /\
  queue2.3, (  queue2.4  )
{code}
suppose there're 8 containers, and queues at any level have the same weight. 
queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
fairshare. Now we submit an app in queue2.4 with 4 containers needs, it should 
preempt 2 from queue2.3, but the candidate-containers selection procedure will 
stop at queue1, so none of the containers will be preempted.

  was:
Now FairScheudler uses a choose-a-candidate method to select a container from 
leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
{code}
readLock.lock();
try {
  for (FSQueue queue : childQueues) {
if (candidateQueue == null ||
comparator.compare(queue, candidateQueue) > 0) {
  candidateQueue = queue;
}
  }
} finally {
  readLock.unlock();
}

// Let the selected queue choose which of its container to preempt
if (candidateQueue != null) {
  toBePreempted = candidateQueue.preemptContainer();
}
{code}
a candidate child queue is selected. However, if the queue's usage isn't over 
it's fairshare, preemption will not happen:
{code}
if (!preemptContainerPreCheck()) {
  return toBePreempted;
}
{code}
 A scenario:
{code}
root
   /\
  queue1   queue2
  /\
queue1.3, (  queue1.4  )
{code}
suppose there're 8 containers, and queues at any level have the same weight. 
queue1.3 takes 4 and queue2 takes 4, so both queue1 and queue2 are at their 
fairshare. Now we submit an app in queue1.4 with 4 containers needs, it should 
preempt 2 from queue1.3, but the candidate-containers selection procedure will 
stop at level that all of the child queues are not over their fairshare, and 
none of the containers will be preempted.


> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-09 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin reassigned YARN-4134:
-

Assignee: Xianyin Xin

> FairScheduler preemption stops at queue level that all child queues are not 
> over their fairshare
> 
>
> Key: YARN-4134
> URL: https://issues.apache.org/jira/browse/YARN-4134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
>
> Now FairScheudler uses a choose-a-candidate method to select a container from 
> leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
> {code}
> readLock.lock();
> try {
>   for (FSQueue queue : childQueues) {
> if (candidateQueue == null ||
> comparator.compare(queue, candidateQueue) > 0) {
>   candidateQueue = queue;
> }
>   }
> } finally {
>   readLock.unlock();
> }
> // Let the selected queue choose which of its container to preempt
> if (candidateQueue != null) {
>   toBePreempted = candidateQueue.preemptContainer();
> }
> {code}
> a candidate child queue is selected. However, if the queue's usage isn't over 
> it's fairshare, preemption will not happen:
> {code}
> if (!preemptContainerPreCheck()) {
>   return toBePreempted;
> }
> {code}
>  A scenario:
> {code}
> root
>/\
>   queue1   queue2
>/\
>   queue2.3, (  queue2.4  )
> {code}
> suppose there're 8 containers, and queues at any level have the same weight. 
> queue1 takes 4 and queue2.3 takes 4, so both queue1 and queue2 are at their 
> fairshare. Now we submit an app in queue2.4 with 4 containers needs, it 
> should preempt 2 from queue2.3, but the candidate-containers selection 
> procedure will stop at queue1, so none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-08 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735952#comment-14735952
 ] 

Xianyin Xin commented on YARN-4120:
---

Hi [~kasha], there's another issue in the current preemption logic; it's in 
{{FSParentQueue.java}} and {{FSLeafQueue.java}},
{code}
  public RMContainer preemptContainer() {
RMContainer toBePreempted = null;

// Find the childQueue which is most over fair share
FSQueue candidateQueue = null;
Comparator comparator = policy.getComparator();

readLock.lock();
try {
  for (FSQueue queue : childQueues) {
if (candidateQueue == null ||
comparator.compare(queue, candidateQueue) > 0) {
  candidateQueue = queue;
}
  }
} finally {
  readLock.unlock();
}

// Let the selected queue choose which of its container to preempt
if (candidateQueue != null) {
  toBePreempted = candidateQueue.preemptContainer();
}
return toBePreempted;
  }
{code}
{code}
  public RMContainer preemptContainer() {
RMContainer toBePreempted = null;

// If this queue is not over its fair share, reject
if (!preemptContainerPreCheck()) {
  return toBePreempted;
}
{code}
If the queue hierarchy is like the one in the *Description*, suppose queue1 and 
queue2 have the same weight, and the cluster has 8 containers, 4 occupied by 
queue1.1 and 4 occupied by queue2. If a new app is added in queue1.2, 2 
containers should be preempted from queue1.1. However, according to the above 
code, queue1 and queue2 are both at their fair share, so the preemption will not 
happen.

So if all of the child queues at some level are at their fair share, preemption 
will not happen even though there is a resource deficit in some leaf queues.

I think we have to drop this logic in this case. As a candidate, we can compute 
an ideal preemption distribution by traversing the queues, as sketched below. 
Any thoughts?
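Purely as an illustration of that candidate approach (not a working patch; 
{{computeDeficit()}}, {{computeSurplus()}}, {{preemptFrom()}}, {{calc}}, 
{{clusterResource}} and {{allLeafQueues}} are assumed placeholders), the 
traversal could look like:
{code}
// Sketch: compute the total deficit over all leaf queues, then preempt from
// leaves that are over their fair share -- regardless of whether their parents
// happen to sit exactly at fair share -- until the deficit is covered.
Resource totalDeficit = Resources.createResource(0);
for (FSLeafQueue leaf : allLeafQueues) {
  // deficit = how far the leaf's usage is below min(demand, fairShare)
  Resources.addTo(totalDeficit, computeDeficit(leaf));
}
for (FSLeafQueue leaf : allLeafQueues) {
  Resource surplus = computeSurplus(leaf);   // usage above fair share, if any
  Resource toTake = Resources.min(calc, clusterResource, surplus, totalDeficit);
  preemptFrom(leaf, toTake);
  Resources.subtractFrom(totalDeficit, toTake);
}
{code}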

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.

2015-09-08 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735987#comment-14735987
 ] 

Xianyin Xin commented on YARN-4133:
---

Of course we can also address these problems one by one in different JIRAs. If 
you prefer that, just ignore the above comment.

> Containers to be preempted leaks in FairScheduler preemption logic.
> ---
>
> Key: YARN-4133
> URL: https://issues.apache.org/jira/browse/YARN-4133
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.1
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-4133.000.patch
>
>
> Containers to be preempted leaks in FairScheduler preemption logic. It may 
> cause missing preemption due to containers in {{warnedContainers}} wrongly 
> removed. The problem is in {{preemptResources}}:
> There are two issues which can cause containers  wrongly removed from 
> {{warnedContainers}}:
> Firstly missing the container state {{RMContainerState.ACQUIRED}} in the 
> condition check:
> {code}
> (container.getState() == RMContainerState.RUNNING ||
>   container.getState() == RMContainerState.ALLOCATED)
> {code}
> Secondly if  {{isResourceGreaterThanNone(toPreempt)}} return false, we 
> shouldn't remove container from {{warnedContainers}}. We should only remove 
> container from {{warnedContainers}}, if container is not in state 
> {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and 
> {{RMContainerState.ACQUIRED}}.
> {code}
>   if ((container.getState() == RMContainerState.RUNNING ||
>   container.getState() == RMContainerState.ALLOCATED) &&
>   isResourceGreaterThanNone(toPreempt)) {
> warnOrKillContainer(container);
> Resources.subtractFrom(toPreempt, 
> container.getContainer().getResource());
>   } else {
> warnedIter.remove();
>   }
> {code}
> Also once the containers in {{warnedContainers}} are wrongly removed, it will 
> never be preempted. Because these containers are already in 
> {{FSAppAttempt#preemptionMap}} and {{FSAppAttempt#preemptContainer}} won't 
> return the containers in {{FSAppAttempt#preemptionMap}}.
> {code}
>   public RMContainer preemptContainer() {
> if (LOG.isDebugEnabled()) {
>   LOG.debug("App " + getName() + " is going to preempt a running " +
>   "container");
> }
> RMContainer toBePreempted = null;
> for (RMContainer container : getLiveContainers()) {
>   if (!getPreemptionContainers().contains(container) &&
>   (toBePreempted == null ||
>   comparator.compare(toBePreempted, container) > 0)) {
> toBePreempted = container;
>   }
> }
> return toBePreempted;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.

2015-09-08 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735959#comment-14735959
 ] 

Xianyin Xin commented on YARN-4133:
---

Hi [~zxu], it seems the current preemption logic has many problems. I just 
described one in 
[https://issues.apache.org/jira/browse/YARN-4120?focusedCommentId=14735952=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14735952].
 I think a refactoring of the logic is needed; what do you think?

> Containers to be preempted leaks in FairScheduler preemption logic.
> ---
>
> Key: YARN-4133
> URL: https://issues.apache.org/jira/browse/YARN-4133
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.1
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-4133.000.patch
>
>
> Containers to be preempted leaks in FairScheduler preemption logic. It may 
> cause missing preemption due to containers in {{warnedContainers}} wrongly 
> removed. The problem is in {{preemptResources}}:
> There are two issues which can cause containers  wrongly removed from 
> {{warnedContainers}}:
> Firstly missing the container state {{RMContainerState.ACQUIRED}} in the 
> condition check:
> {code}
> (container.getState() == RMContainerState.RUNNING ||
>   container.getState() == RMContainerState.ALLOCATED)
> {code}
> Secondly if  {{isResourceGreaterThanNone(toPreempt)}} return false, we 
> shouldn't remove container from {{warnedContainers}}, We should only remove 
> container from {{warnedContainers}}, if container is not in state 
> {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and 
> {{RMContainerState.ACQUIRED}}.
> {code}
>   if ((container.getState() == RMContainerState.RUNNING ||
>   container.getState() == RMContainerState.ALLOCATED) &&
>   isResourceGreaterThanNone(toPreempt)) {
> warnOrKillContainer(container);
> Resources.subtractFrom(toPreempt, 
> container.getContainer().getResource());
>   } else {
> warnedIter.remove();
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4134) FairScheduler preemption stops at queue level that all child queues are not over their fairshare

2015-09-08 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-4134:
-

 Summary: FairScheduler preemption stops at queue level that all 
child queues are not over their fairshare
 Key: YARN-4134
 URL: https://issues.apache.org/jira/browse/YARN-4134
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Xianyin Xin


Now FairScheudler uses a choose-a-candidate method to select a container from 
leaf queues that to be preempted, in {{FSParentQueue.preemptContainer()}},
{code}
readLock.lock();
try {
  for (FSQueue queue : childQueues) {
if (candidateQueue == null ||
comparator.compare(queue, candidateQueue) > 0) {
  candidateQueue = queue;
}
  }
} finally {
  readLock.unlock();
}

// Let the selected queue choose which of its container to preempt
if (candidateQueue != null) {
  toBePreempted = candidateQueue.preemptContainer();
}
{code}
a candidate child queue is selected. However, if the queue's usage isn't over 
it's fairshare, preemption will not happen:
{code}
if (!preemptContainerPreCheck()) {
  return toBePreempted;
}
{code}
 A scenario:
{code}
root
   /\
  queue1   queue2
  /\
queue1.3, (  queue1.4  )
{code}
suppose there're 8 containers, and queues at any level have the same weight. 
queue1.3 takes 4 and queue2 takes 4, so both queue1 and queue2 are at their 
fairshare. Now we submit an app in queue1.4 with 4 containers needs, it should 
preempt 2 from queue1.3, but the candidate-containers selection procedure will 
stop at level that all of the child queues are not over their fairshare, and 
none of the containers will be preempted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-09-08 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736068#comment-14736068
 ] 

Xianyin Xin commented on YARN-4090:
---

Hi [~leftnoteasy], [~kasha], would you please take a look? Since this change is 
related to preemption, I link it with YARN-4120.

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-08 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14736130#comment-14736130
 ] 

Xianyin Xin commented on YARN-4120:
---

Created YARN-4134 to track it.

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-06 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14733235#comment-14733235
 ] 

Xianyin Xin commented on YARN-4120:
---

Thanks [~kasha]. How about distinguishing getResourceUsage() (the current gross 
resource usage) and getNetResourceUsage() (the current gross resource usage 
minus preempted resources)? The latter would be used for preemption-related 
calculations and the former for everything else.
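Roughly, the split could look like this (just a sketch of the proposal that 
reuses the code already quoted in the description; not an attached patch):
{code}
// In FSAppAttempt (illustrative only):
public Resource getResourceUsage() {
  // Gross usage: what the app actually holds right now; used by the
  // resource allocation logic and queue aggregation.
  return getCurrentConsumption();
}

public Resource getNetResourceUsage() {
  // Gross usage minus containers already marked for preemption; used only
  // by preemption-related calculations.
  return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
}
{code}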

> FSAppAttempt.getResourceUsage() should not take preemptedResource into account
> --
>
> Key: YARN-4120
> URL: https://issues.apache.org/jira/browse/YARN-4120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Xianyin Xin
>
> When compute resource usage for Schedulables, the following code is envolved,
> {{FSAppAttempt.getResourceUsage}},
> {code}
> public Resource getResourceUsage() {
>   return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
> }
> {code}
> and this value is aggregated to FSLeafQueues and FSParentQueues. In my 
> opinion, taking {{preemptedResource}} into account here is not reasonable, 
> there are two main reasons,
> # it is something in future, i.e., even though these resources are marked as 
> preempted, it is currently used by app, and these resources will be 
> subtracted from {{currentCosumption}} once the preemption is finished. it's 
> not reasonable to make arrange for it ahead of time. 
> # there's another problem here, consider following case,
> {code}
> root
>/\
>   queue1   queue2
>   /\
> queue1.3, queue1.4
> {code}
> suppose queue1.3 need resource and it can preempt resources from queue1.4, 
> the preemption happens in the interior of queue1. But when compute resource 
> usage of queue1, {{queue1.resourceUsage = it's_current_resource_usage - 
> preemption}} according to the current code, which is unfair to queue2 when 
> doing resource allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4120) FSAppAttempt.getResourceUsage() should not take preemptedResource into account

2015-09-06 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-4120:
-

 Summary: FSAppAttempt.getResourceUsage() should not take 
preemptedResource into account
 Key: YARN-4120
 URL: https://issues.apache.org/jira/browse/YARN-4120
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Xianyin Xin


When compute resource usage for Schedulables, the following code is envolved,
{{FSAppAttempt.getResourceUsage}},
{code}
public Resource getResourceUsage() {
  return Resources.subtract(getCurrentConsumption(), getPreemptedResources());
}
{code}
and this value is aggregated to FSLeafQueues and FSParentQueues. In my opinion, 
taking {{preemptedResource}} into account here is not reasonable, there are two 
main reasons,
# it is something in future, i.e., even though these resources are marked as 
preempted, it is currently used by app, and these resources will be subtracted 
from {{currentCosumption}} once the preemption is finished. it's not reasonable 
to make arrange for it ahead of time. 
# there's another problem here, consider following case,
{code}
root
   /\
  queue1   queue2
  /\
queue1.3, queue1.4
{code}
suppose queue1.3 need resource and it can preempt resources from queue1.4, the 
preemption happens in the interior of queue1. But when compute resource usage 
of queue1, {{queue1.resourceUsage = it's_current_resource_usage - preemption}} 
according to the current code, which is unfair to queue2 when doing resource 
allocating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-09-02 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: YARN-4090-preview.patch

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-preview.patch, sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-09-02 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: YARN-4090-TestResult.pdf

A simple fix and the corresponding test results are submitted. The results show 
how expensive the method used by the original comparator.compare() is, namely a 
recursive method that collects the resource usage of the two queues, together 
with a time-consuming FSAppAttempt.getResourceUsage().
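The general direction of the fix (a sketch of the idea only; the attached 
preview patch may differ in details) is to keep an aggregated usage in each 
{{FSParentQueue}} and update it incrementally on allocate/release, so that 
{{comparator.compare()}} reads a cached value instead of recursing over the 
whole subtree:
{code}
// Illustrative sketch in FSParentQueue: O(1) getResourceUsage() via a cached
// aggregate that children bubble up on every allocation and release.
private final Resource cachedUsage = Resources.createResource(0);

void incUsage(Resource delta) {          // called when a child allocates
  Resources.addTo(cachedUsage, delta);
}

void decUsage(Resource delta) {          // called when a child releases
  Resources.subtractFrom(cachedUsage, delta);
}

@Override
public Resource getResourceUsage() {
  return cachedUsage;                    // no recursion, no per-call clone
}
{code}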

> Make Collections.sort() more efficient in FSParentQueue.java
> 
>
> Key: YARN-4090
> URL: https://issues.apache.org/jira/browse/YARN-4090
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Xianyin Xin
>Assignee: Xianyin Xin
> Attachments: YARN-4090-TestResult.pdf, YARN-4090-preview.patch, 
> sampling1.jpg, sampling2.jpg
>
>
> Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-08-28 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin reassigned YARN-4090:
-

Assignee: Xianyin Xin

 Make Collections.sort() more efficient in FSParentQueue.java
 

 Key: YARN-4090
 URL: https://issues.apache.org/jira/browse/YARN-4090
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin
Assignee: Xianyin Xin
 Attachments: sampling1.jpg, sampling2.jpg


 Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-08-27 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Summary: Make Collections.sort() more efficient in FSParentQueue.java  
(was: Make Collections.sort() more efficient in FSParent.java)

 Make Collections.sort() more efficient in FSParentQueue.java
 

 Key: YARN-4090
 URL: https://issues.apache.org/jira/browse/YARN-4090
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin

 Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-08-27 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14716785#comment-14716785
 ] 

Xianyin Xin commented on YARN-4090:
---

We should also pay attention to the ReadLock.lock() and unlock() calls in the 
first image, which cost a lot of time.

 Make Collections.sort() more efficient in FSParentQueue.java
 

 Key: YARN-4090
 URL: https://issues.apache.org/jira/browse/YARN-4090
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin
 Attachments: sampling1.jpg, sampling2.jpg


 Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4090) Make Collections.sort() more efficient in FSParent.java

2015-08-27 Thread Xianyin Xin (JIRA)
Xianyin Xin created YARN-4090:
-

 Summary: Make Collections.sort() more efficient in FSParent.java
 Key: YARN-4090
 URL: https://issues.apache.org/jira/browse/YARN-4090
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin


Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java

2015-08-27 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-4090:
--
Attachment: sampling2.jpg
sampling1.jpg

I constructed a queue hierarchy with 3 levels,
   root
   child1   child2   child3
   child1.child1~10, child2.child1~15, child3.child1~15
so the number of leaf queues is 40, with a total of 1000 apps running randomly on 
the leaf queues. The sampling results show that about 2/3 of the CPU time of 
FSParentQueue.assignContainers() was spent in Collections.sort(). Within 
Collections.sort(), about 40% was spent in 
SchedulerApplicationAttempt.getCurrentConsumption() and about 36% in 
Resources.subtract(). The former cost arises because 
FSParentQueue.getResourceUsage() recurses over its children, while for the 
latter, the clone() in subtract() takes much CPU time.
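On the subtract() point: Resources.subtract(a, b) clones its left-hand argument 
on every call, which is what shows up in the sampling. Where the caller already 
owns a scratch value, the in-place variant avoids the per-comparison allocation 
(a sketch, assuming the caller controls the lifetime of the scratch copy):
{code}
// Resources.subtract(a, b) is effectively subtractFrom(clone(a), b); the clone
// is the expensive part inside a tight comparison loop.
Resource scratch = Resources.clone(queueUsage);   // one copy up front
Resources.subtractFrom(scratch, preempted);       // mutates scratch, no clone
{code}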

 Make Collections.sort() more efficient in FSParentQueue.java
 

 Key: YARN-4090
 URL: https://issues.apache.org/jira/browse/YARN-4090
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin
 Attachments: sampling1.jpg, sampling2.jpg


 Collections.sort() consumes too much time in a scheduling round.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be need for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701291#comment-14701291
 ] 

Xianyin Xin commented on YARN-3652:
---

A simple introduction to the preview patch: SchedulerMetrics focuses on metrics 
related to the scheduler's performance. The following metrics are considered:

num of waiting events in the scheduler dispatch queue;
num of each kind of event in the scheduler dispatch queue;

event handling rate;
node update handling rate;

event adding rate;
node update adding rate;

statistical info on the num of waiting events;
statistical info on the num of waiting node update events;

container allocation rate;

scheduling method exec rate, i.e., num of scheduling tries per second;

app allocation call duration;
nodeUpdate call duration;
scheduling call duration;

These metrics give rich information about the scheduler's performance, which can 
be used to diagnose scheduler anomalies.
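For illustration, a few of these could be exposed through the usual metrics2 
annotations, in the same style as the existing QueueMetrics (the class and field 
names below are only a sketch, not the attached preview patch):
{code}
@Metrics(context = "yarn")
public class SchedulerMetrics {
  @Metric("Number of events waiting in the scheduler dispatcher queue")
  MutableGaugeInt pendingSchedulerEvents;

  @Metric("Rate and duration of nodeUpdate handling")
  MutableRate nodeUpdateCall;

  @Metric("Total containers allocated (rate derivable per interval)")
  MutableCounterLong containersAllocated;
}
{code}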

 A SchedulerMetrics may be need for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be need for evaluating 
 the scheduler's performance. The performance indexes includes #events waiting 
 for being handled by scheduler, the throughput, the scheduling delay and/or 
 other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be need for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701267#comment-14701267
 ] 

Xianyin Xin commented on YARN-3652:
---

In the patch I used functions from HADOOP-12338.

 A SchedulerMetrics may be need for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be need for evaluating 
 the scheduler's performance. The performance indexes includes #events waiting 
 for being handled by scheduler, the throughput, the scheduling delay and/or 
 other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3652) A SchedulerMetrics may be need for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701296#comment-14701296
 ] 

Xianyin Xin commented on YARN-3652:
---

Hi [~sunilg], [~vvasudev], would you please have a look? Any comments are 
welcome.

 A SchedulerMetrics may be need for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be need for evaluating 
 the scheduler's performance. The performance indexes includes #events waiting 
 for being handled by scheduler, the throughput, the scheduling delay and/or 
 other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3652) A SchedulerMetrics may be need for evaluating the scheduler's performance

2015-08-18 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-3652:
--
Attachment: YARN-3652-preview.patch

A preview patch submitted.

 A SchedulerMetrics may be need for evaluating the scheduler's performance
 -

 Key: YARN-3652
 URL: https://issues.apache.org/jira/browse/YARN-3652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager, scheduler
Reporter: Xianyin Xin
 Attachments: YARN-3652-preview.patch


 As discussed in YARN-3630, a {{SchedulerMetrics}} may be need for evaluating 
 the scheduler's performance. The performance indexes includes #events waiting 
 for being handled by scheduler, the throughput, the scheduling delay and/or 
 other indicators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent

2015-08-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662744#comment-14662744
 ] 

Xianyin Xin commented on YARN-4002:
---

+1 for the proposal. [~leftnoteasy], I think YARN-3091 should also be carried 
out; it has been left there for a long time.

 make ResourceTrackerService.nodeHeartbeat more concurrent
 -

 Key: YARN-4002
 URL: https://issues.apache.org/jira/browse/YARN-4002
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Critical

 We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By 
 design the method ResourceTrackerService.nodeHeartbeat should be concurrent 
 enough to scale for large clusters.
 But we have a BIG lock in NodesListManager.isValidNode which I think is 
 unnecessary.
 First, the fields includes and excludes of HostsFileReader are only 
 updated on refresh nodes.  All RPC threads handling node heartbeats are 
 only readers.  So an RWLock could be used to allow concurrent access by RPC 
 threads.
 Second, since the fields includes and excludes of HostsFileReader are 
 always updated by reference assignment, which is atomic in Java, the reader 
 side lock could just be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3931) default-node-label-expression doesn’t apply when an application is submitted by RM rest api

2015-07-16 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630659#comment-14630659
 ] 

Xianyin Xin commented on YARN-3931:
---

This reminds me of an earlier trouble I have met. Hi [~Naganarasimha], can we 
consider removing the  node label expression in the code? It does not seem to 
make sense to set a node label as . For a node label expression, it should be 
some_label or null. 

Just an unrigorous thought; what do you think?

 default-node-label-expression doesn’t apply when an application is submitted 
 by RM rest api
 ---

 Key: YARN-3931
 URL: https://issues.apache.org/jira/browse/YARN-3931
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
 Environment: hadoop-2.6.0
Reporter: kyungwan nam
Assignee: kyungwan nam
 Attachments: YARN-3931.001.patch


 * 
 yarn.scheduler.capacity.queue-path.default-node-label-expression=large_disk
 * submit an application using rest api without “app-node-label-expression”, 
 “am-container-node-label-expression”
 * RM doesn’t allocate containers to the hosts associated with large_disk node 
 label



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3922) Introduce adaptive heartbeat between RM and NM

2015-07-14 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627374#comment-14627374
 ] 

Xianyin Xin commented on YARN-3922:
---

Maybe we should consider renaming the feature (adaptive heartbeat) of YARN-3630 
or this one; otherwise it could lead to misunderstanding. Any thoughts?

  Introduce adaptive heartbeat between RM and NM
 ---

 Key: YARN-3922
 URL: https://issues.apache.org/jira/browse/YARN-3922
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Xiaodi Ke

 Currently, the communication between RM and NM is based on a pull-based 
 heartbeat protocol. Along with the NM heartbeat, it updates the status of 
 containers (e.g. FINISHED containers). This also updates the RM’s view of 
 available resources and triggers scheduling. How frequently the NM sends the 
 heartbeat impacts the task throughput and latency of the YARN scheduler. 
 Although the heartbeat interval can be configured in yarn-site.xml, 
 configuring it too short will increase the load on the RM and bring 
 unnecessary overhead. 
 We propose the adaptive heartbeat between RM and NM to achieve a balance 
 between updating NM’s info promptly and minimizing the overhead of extra 
 heartbeats. With adaptive heartbeat, NM still honors the current heartbeat 
 interval and sends the heartbeat regularly. However, a heartbeat will be 
 triggered as soon as any container status changes. Also, a minimum 
 interval can be configured to prevent the NM from sending heartbeats too 
 frequently.
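
Purely as an illustration of the idea above, and not an actual patch, here is 
a hedged sketch of an NM-side loop that keeps the regular heartbeat interval 
but can be woken early when a container status changes, throttled by a minimum 
interval; all class, method, and field names below are made up:

{code:java}
// Hypothetical sketch of an adaptive heartbeat loop: the regular interval
// still applies, notifyStatusChange() can wake the loop early, and
// minIntervalMs prevents heartbeats from being sent too frequently.
public class AdaptiveHeartbeatLoop implements Runnable {
  private final long regularIntervalMs;
  private final long minIntervalMs;
  private final Object trigger = new Object();
  private boolean statusChanged = false;

  public AdaptiveHeartbeatLoop(long regularIntervalMs, long minIntervalMs) {
    this.regularIntervalMs = regularIntervalMs;
    this.minIntervalMs = minIntervalMs;
  }

  // Called by the container monitor when any container status changes.
  public void notifyStatusChange() {
    synchronized (trigger) {
      statusChanged = true;
      trigger.notifyAll();
    }
  }

  @Override
  public void run() {
    long lastHeartbeat = 0L;
    while (!Thread.currentThread().isInterrupted()) {
      try {
        synchronized (trigger) {
          if (!statusChanged) {
            trigger.wait(regularIntervalMs);  // regular cadence or early wake-up
          }
          statusChanged = false;
        }
        // Enforce the minimum interval between two heartbeats.
        long sinceLast = System.currentTimeMillis() - lastHeartbeat;
        if (sinceLast < minIntervalMs) {
          Thread.sleep(minIntervalMs - sinceLast);
        }
        sendHeartbeat();
        lastHeartbeat = System.currentTimeMillis();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // exit the loop
      }
    }
  }

  private void sendHeartbeat() {
    // Placeholder: build the node status and call the RM heartbeat RPC here.
  }
}
{code}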



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3885) ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level

2015-07-14 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627404#comment-14627404
 ] 

Xianyin Xin commented on YARN-3885:
---

Agree, [~leftnoteasy]. Suppose we have a case:

        parent_of_A
        /         \
       A       A's brother B

and A has 9 extra and 10 preemptable, but B has 9 extra and 0 preemptable (we 
can set the queue unpreemptable). Then parent_of_A has 18 extra, but how much 
preemptable should it offer? It should be 9, so we'd better limit preemptable 
to be no more than extra for queues at any level.

Please correct me if I'm wrong.
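
As a quick sanity check of the arithmetic above, a toy snippet (illustration 
only, not taken from any patch):

{code:java}
// Toy check of the example: each queue's preemptable offer is capped by
// its own extra at every level of the hierarchy.
public class PreemptableCapExample {
  public static void main(String[] args) {
    long extraA = 9, preemptableA = 10;   // queue A
    long extraB = 9, preemptableB = 0;    // A's brother B
    long extraParent = extraA + extraB;   // 18

    long offerA = Math.min(extraA, preemptableA);               // 9
    long offerB = Math.min(extraB, preemptableB);               // 0
    long offerParent = Math.min(extraParent, offerA + offerB);  // 9
    System.out.println("parent_of_A should offer " + offerParent);
  }
}
{code}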

 ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 
 level
 --

 Key: YARN-3885
 URL: https://issues.apache.org/jira/browse/YARN-3885
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.8.0
Reporter: Ajith S
Assignee: Ajith S
Priority: Blocker
 Attachments: YARN-3885.02.patch, YARN-3885.03.patch, 
 YARN-3885.04.patch, YARN-3885.patch


 when the preemption policy is 
 {{ProportionalCapacityPreemptionPolicy.cloneQueues}}, this piece of code to 
 calculate {{untoucable}} doesn't consider all the children; it considers only 
 the immediate children.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3885) ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level

2015-07-05 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14614194#comment-14614194
 ] 

Xianyin Xin commented on YARN-3885:
---

Hi Ajith, CHILDRENPREEMPTABLE may be over-counted. Suppose parent queue root 
has 50 extra containers, child queue root.A has 20 preemptable, and root.B has 
40 preemptable; then root has 60 preemptable by calculation. This should be 
limited to 50.

An alternative solution is to let PREEMPTABLE of a non-leaf queue record the 
sum of its sub-queues' PREEMPTABLEs (i.e. PREEMPTABLE = 
min(CHILDRENPREEMPTABLE, extra)), because it is currently non-zero only for 
leaf queues; thus we can omit CHILDRENPREEMPTABLE in TempQueuePerPartition. 
It's strange that all the non-leaf queues have 0 PREEMPTABLE.

Def PREEMPTABLE:
for leaf queues, the preemptable resources
for non-leaf queues, min(sum of children's PREEMPTABLEs, extra) (this is 
what CHILDRENPREEMPTABLE does in the patch)

Just a suggestion.
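
A rough sketch of that definition as a bottom-up computation over a toy queue 
tree; this is illustration only, and the class and field names are made up 
rather than the actual TempQueuePerPartition:

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative only: PREEMPTABLE is the queue's own preemptable resources
// for a leaf, and min(sum of children's PREEMPTABLEs, extra) for a non-leaf.
class ToyQueue {
  final long extra;             // resources above the guaranteed capacity
  final long leafPreemptable;   // only meaningful for leaf queues
  final List<ToyQueue> children = new ArrayList<ToyQueue>();

  ToyQueue(long extra, long leafPreemptable) {
    this.extra = extra;
    this.leafPreemptable = leafPreemptable;
  }

  long preemptable() {
    if (children.isEmpty()) {
      return leafPreemptable;
    }
    long sum = 0;
    for (ToyQueue child : children) {
      sum += child.preemptable();
    }
    return Math.min(sum, extra);   // cap by the non-leaf queue's own extra
  }
}
{code}

With the numbers from the example above (root has 50 extra, root.A offers 20, 
root.B offers 40), this returns min(60, 50) = 50 for root.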

 ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 
 level
 --

 Key: YARN-3885
 URL: https://issues.apache.org/jira/browse/YARN-3885
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.8.0
Reporter: Ajith S
Priority: Critical
 Attachments: YARN-3885.patch


 when the preemption policy is 
 {{ProportionalCapacityPreemptionPolicy.cloneQueues}}, this piece of code to 
 calculate {{untoucable}} is wrong as it doesn't consider all the children; it 
 considers only the immediate children.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3409) Add constraint node labels

2015-06-26 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603869#comment-14603869
 ] 

Xianyin Xin commented on YARN-3409:
---

Thanks for the comments, [~grey]. IMO, topology may be hard to handle with 
node labels, as node labels describe the attributes of a node while topology 
is an attribute of the whole cluster. You remind me that YARN-1042 may not be 
as simple as I thought.

And looking forward to your design doc, [~leftnoteasy].

 Add constraint node labels
 --

 Key: YARN-3409
 URL: https://issues.apache.org/jira/browse/YARN-3409
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, capacityscheduler, client
Reporter: Wangda Tan
Assignee: Wangda Tan

 Specifying only one label for each node (IAW, partitioning a cluster) is a way 
 to determine how the resources of a special set of nodes can be shared by a 
 group of entities (like teams, departments, etc.). Partitions of a cluster 
 have the following characteristics:
 - The cluster is divided into several disjoint sub-clusters.
 - ACL/priority can apply to a partition (only the market team has priority to 
 use the partition).
 - Percentages of capacity can apply to a partition (the market team has a 40% 
 minimum capacity and the dev team has a 60% minimum capacity of the partition).
 Constraints are orthogonal to partitions; they describe attributes of a node's 
 hardware/software just for affinity. Some examples of constraints:
 - glibc version
 - JDK version
 - Type of CPU (x86_64/i686)
 - Type of OS (windows, linux, etc.)
 With this, an application will be able to ask for resources that have 
 (glibc.version = 2.20 && JDK.version = 8u20 && x86_64).
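
As a rough illustration of how such a request could be matched against a node 
(not from any patch; the attribute names and the simple equality-only 
expression model are assumptions):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a node advertises attributes, a request lists required
// values, and matching is a plain AND of equality checks.
public class ConstraintMatchExample {
  static boolean matches(Map<String, String> nodeAttributes,
                         Map<String, String> required) {
    for (Map.Entry<String, String> e : required.entrySet()) {
      if (!e.getValue().equals(nodeAttributes.get(e.getKey()))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> node = new HashMap<String, String>();
    node.put("glibc.version", "2.20");
    node.put("JDK.version", "8u20");
    node.put("arch", "x86_64");

    Map<String, String> request = new HashMap<String, String>();
    request.put("glibc.version", "2.20");
    request.put("JDK.version", "8u20");
    request.put("arch", "x86_64");

    System.out.println(matches(node, request));   // true
  }
}
{code}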



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

