[jira] [Resolved] (YARN-667) Data persisted in RM should be versioned
[ https://issues.apache.org/jira/browse/YARN-667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-667.
-----------------------------
Resolution: Duplicate

Data persisted in RM should be versioned
----------------------------------------
Key: YARN-667
URL: https://issues.apache.org/jira/browse/YARN-667
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Junping Du

Includes data persisted for RM restart, the NodeManager directory structure, and the Aggregated Log Format.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-667) Data persisted in RM should be versioned
[ https://issues.apache.org/jira/browse/YARN-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099907#comment-14099907 ] Junping Du commented on YARN-667:
---------------------------------
Agreed. Let's address them in separate JIRAs when they are needed in the future. As for versioning the RMState, it looks like YARN-1239 already addresses it, so I'm closing this as a duplicate.
[jira] [Updated] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-1198:
------------------------------
Attachment: YARN-1198.6.patch

Fix for findbugs findings.

Capacity Scheduler headroom calculation does not work as expected
-----------------------------------------------------------------
Key: YARN-1198
URL: https://issues.apache.org/jira/browse/YARN-1198
Project: Hadoop YARN
Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Craig Welch
Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch

Today the headroom calculation (for the app) takes place only when:
* a new node is added to or removed from the cluster, or
* a new container is assigned to the application.

However, there are potentially a lot of situations that are not considered in this calculation:
* If a container finishes, the headroom for that application changes and should be reported to the AM accordingly.
* If a single user has submitted multiple applications (app1 and app2) to the same queue, then:
** If one of app1's containers finishes, not only app1's but also app2's AM should be notified of the change in headroom.
** Similarly, if a container is assigned to either app1 or app2, both AMs should be notified of their headroom.
** To simplify the whole communication process, it is ideal to keep headroom per user per LeafQueue, so that everyone gets the same picture (apps belonging to the same user, submitted to the same queue).
* If a new user submits an application to the queue, then all applications submitted by all users in that queue should be notified of the headroom change.
* Also, today headroom is an absolute number (I think it should be normalized, but that would not be backward compatible).
* Also, when an admin refreshes the queue, the headroom has to be updated.

These are all potential bugs in the headroom calculation.
[jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099927#comment-14099927 ] Hadoop QA commented on YARN-1198:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662360/YARN-1198.6.patch against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
  org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4650//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4650//console

This message is automatically generated.
[jira] [Updated] (YARN-2411) [Capacity Scheduler] support simple user and group mappings to queues
[ https://issues.apache.org/jira/browse/YARN-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ram Venkatesh updated YARN-2411:
--------------------------------
Attachment: YARN-2411.4.patch

[~wangda.tan] thank you for your comments. I agree, it is better to check and reject the mapping upfront if it refers to a non-existent or non-leaf queue. Uploading a patch with this change.

[Capacity Scheduler] support simple user and group mappings to queues
---------------------------------------------------------------------
Key: YARN-2411
URL: https://issues.apache.org/jira/browse/YARN-2411
Project: Hadoop YARN
Issue Type: Improvement
Components: capacityscheduler
Reporter: Ram Venkatesh
Assignee: Ram Venkatesh
Attachments: YARN-2411-2.patch, YARN-2411.1.patch, YARN-2411.3.patch, YARN-2411.4.patch

YARN-2257 has a proposal to extend and share the queue placement rules of the fair scheduler and the capacity scheduler. This is a good long-term solution for streamlining queue placement in both schedulers, but it has core infra work that has to happen first and might require changes to current features in all schedulers, along with corresponding configuration changes, if any.

I would like to propose a change with a smaller scope in the capacity scheduler that addresses the core use cases: implicitly mapping jobs that have the default queue or no queue specified to specific queues, based on the submitting user and user groups. It will be useful in a number of real-world scenarios and can be migrated over to the unified scheme when YARN-2257 becomes available.

The proposal is to add two new configuration options:

yarn.scheduler.capacity.queue-mappings-override.enable
A boolean that controls whether user-specified queues can be overridden by the mapping; the default is false.

and

yarn.scheduler.capacity.queue-mappings
A string that specifies a list of mappings in the following format (the default is empty, which is the same as no mapping):

map_specifier:source_attribute:queue_name[,map_specifier:source_attribute:queue_name]*
map_specifier    := user (u) | group (g)
source_attribute := user | group | %user
queue_name       := the name of the mapped queue | %user | %primary_group

The mappings are evaluated left to right, and the first valid mapping is used. If the mapped queue does not exist, or the current user does not have permission to submit jobs to the mapped queue, the submission fails.

Example usages:
1. user1 is mapped to queue1, group1 is mapped to queue2: u:user1:queue1,g:group1:queue2
2. To map users to queues with the same name as the user: u:%user:%user

I am happy to volunteer to take this up.
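The two proposed properties could be wired up in capacity-scheduler.xml roughly as follows. This is a sketch based on the proposal above; the user, group, and queue names are made up for illustration:

```xml
<!-- Illustrative capacity-scheduler.xml fragment for the proposed
     queue-mapping options. Queue/user/group names are hypothetical. -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <!-- user1 goes to queue1, members of group1 go to queue2,
       and everyone else to a queue named after themselves.
       Evaluated left to right; the first valid mapping wins. -->
  <value>u:user1:queue1,g:group1:queue2,u:%user:%user</value>
</property>
<property>
  <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
  <!-- false (the proposed default): a queue named explicitly at
       submission time is not overridden by the mapping -->
  <value>false</value>
</property>
```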
[jira] [Commented] (YARN-2033) Investigate merging generic-history into the Timeline Store
[ https://issues.apache.org/jira/browse/YARN-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099974#comment-14099974 ] Junping Du commented on YARN-2033:
----------------------------------
[~zjshen], thanks for the comments above, which sound good to me. I just went through your latest patch; a couple of comments:
{code}
+  public static final String RM_METRICS_PUBLISHER_MULTI_THREADED_DISPATCHER_POOL_SIZE =
+      RM_PREFIX + "metrics-publisher.multi-threaded-dispatcher.pool-size";
+  public static final int DEFAULT_RM_METRICS_PUBLISHER_MULTI_THREADED_DISPATCHER_POOL_SIZE =
+      10;
{code}
The name of the config looks too long. Maybe we can rename it to something shorter, e.g. RM_PREFIX + "metrics-publisher.dispatcher.pool-size"?
{code}
-  optional string diagnostics = 5 [default = "N/A"];
-  optional YarnApplicationAttemptStateProto yarn_application_attempt_state = 6;
-  optional ContainerIdProto am_container_id = 7;
+  optional string original_tracking_url = 5;
+  optional string diagnostics = 6 [default = "N/A"];
+  optional YarnApplicationAttemptStateProto yarn_application_attempt_state = 7;
+  optional ContainerIdProto am_container_id = 8;
{code}
We shouldn't insert a new field, as doing so changes the numbers of the existing fields. In PB, encoded messages include only the field type and number, which are mapped to field names when decoding. Thus, changing the field numbers here breaks compatibility, which is unnecessary. Adding original_tracking_url with field number 8 should be fine.
{code}
-    if (conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
-        YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
-      historyServiceEnabled = true;
+    if (conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) == null
+        && conf.getBoolean(YarnConfiguration.RM_METRICS_PUBLISHER_ENABLED,
+            YarnConfiguration.DEFAULT_RM_METRICS_PUBLISHER_ENABLED)
+        || conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) != null
+        && conf.getBoolean(YarnConfiguration.APPLICATION_HISTORY_ENABLED,
+            YarnConfiguration.DEFAULT_APPLICATION_HISTORY_ENABLED)) {
+      yarnMetricsEnabled = true;
{code}
If the user's config is slightly wrong (let's assume YarnConfiguration.APPLICATION_HISTORY_STORE != null and YarnConfiguration.RM_METRICS_PUBLISHER_ENABLED = true), then here we disable yarnMetricsEnabled silently, which makes troubleshooting a little harder. I suggest logging a warn message when such a wrong configuration occurs. It would be better to move the logical operations inside the if() into a separate method and log the error for the wrong configuration.
{code}
+  <property>
+    <description>The setting that controls whether yarn metrics is published on
+    the timeline server or not by RM.</description>
+    <name>yarn.resourcemanager.metrics-publisher.enabled</name>
+    <value>false</value>
+  </property>
{code}
Indentation should be 2 white spaces instead of a tab.
In ApplicationHistoryManagerOnTimelineStore.java,
{code}
    } catch (YarnException e) {
+      throw new IOException(e);
+    }
{code}
This kind of exception translation seems unnecessary to me. We can remove it and let the YarnException be thrown here. If we decide to throw IOException only (please see my comments later), we can extend the block to cover more code that could throw YarnException and translate it to IOException. The convertToApplicationReport method seems a little too sophisticated in creating the ApplicationReport. Another option, which should be better, is to wrap it in a Builder pattern (please refer to MiniDFSCluster).
The same comments apply to convertToApplicationAttemptReport and convertToContainerReport. These are only optional comments; see if you want to address them here or in a separate JIRA in the future.
{code}
+  public ApplicationAttemptReport getApplicationAttempt(
+      ApplicationAttemptId appAttemptId) throws YarnException, IOException {
+    getApplication(appAttemptId.getApplicationId(), ApplicationReportField.NONE);
+    TimelineEntity entity = null;
...
{code}
Why do we need getApplication(appAttemptId.getApplicationId(), ApplicationReportField.NONE) here? IMO, the only work it does is to check whether the applicationId is valid, but we have a check on appAttemptId later, so we may consider removing it if it is unnecessary. In addition, maybe ApplicationReportField.NONE is not useful?
{code}
-    new Path(conf.get(YarnConfiguration.FS_APPLICATION_HISTORY_STORE_URI));
+    new Path(conf.get(YarnConfiguration.FS_APPLICATION_HISTORY_STORE_URI,
+        conf.get("hadoop.tmp.dir") + "/yarn/timeline/generic-history"));
{code}
We should replace "hadoop.tmp.dir" and "/yarn/timeline/generic-history" with constant strings in YarnConfiguration. BTW, maybe "hadoop.tmp.dir" is not necessary?
In ApplicationContext.java,
{code}
   * @return {@link ApplicationReport} for the ApplicationId.
+  * @throws YarnException
   * @throws IOException
   */
  @Public
  @Unstable
- ApplicationReport
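The protobuf field-numbering point raised above can be sketched as follows. The field names and types come from the quoted diff; the message name is an assumption for illustration:

```proto
// Hypothetical message based on the quoted diff. Existing fields keep
// their original numbers; the new field is appended with the next free
// number, so previously encoded messages still decode correctly.
message ApplicationAttemptHistoryDataProto {
  optional string diagnostics = 5 [default = "N/A"];
  optional YarnApplicationAttemptStateProto yarn_application_attempt_state = 6;
  optional ContainerIdProto am_container_id = 7;
  optional string original_tracking_url = 8;  // new field, appended at the end
}
```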
[jira] [Commented] (YARN-2411) [Capacity Scheduler] support simple user and group mappings to queues
[ https://issues.apache.org/jira/browse/YARN-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099976#comment-14099976 ] Hadoop QA commented on YARN-2411:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662363/YARN-2411.4.patch against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
  org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector
  org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4651//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4651//console

This message is automatically generated.
[jira] [Commented] (YARN-2411) [Capacity Scheduler] support simple user and group mappings to queues
[ https://issues.apache.org/jira/browse/YARN-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100029#comment-14100029 ] Jian He commented on YARN-2411:
-------------------------------
The test failures should be unrelated to the patch. Resubmitting the same patch with one more assertion on RMApp.getQueue in the test.
[jira] [Updated] (YARN-2411) [Capacity Scheduler] support simple user and group mappings to queues
[ https://issues.apache.org/jira/browse/YARN-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2411:
--------------------------
Attachment: YARN-2411.5.patch
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100134#comment-14100134 ] Alejandro Abdelnur commented on YARN-2424:
------------------------------------------
Please refer to the YARN-1253 comments; it was stated there that the old behavior had security issues.

LCE should support non-cgroups, non-secure mode
-----------------------------------------------
Key: YARN-2424
URL: https://issues.apache.org/jira/browse/YARN-2424
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.4.1
Reporter: Allen Wittenauer
Priority: Blocker
Labels: regression
Attachments: YARN-2424.patch

After YARN-1253, LCE no longer works for non-secure, non-cgroup scenarios. This is a fairly serious regression, as turning on LCE prior to turning on full-blown security is a fairly standard procedure.
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100153#comment-14100153 ] Allen Wittenauer commented on YARN-2424:
----------------------------------------
This fix is all about ease of use and operability. I can certainly understand the desire to run cgroups without needing local users. But transitioning to security is not a binary process for most users (or, at least, it doesn't have to be...). The problem with the current code base is that someone moving to secure mode now has to either enable cgroups (which, as pointed out in YARN-1253, is irrelevant for security) or cut everything over at once. Enabling LCE prior to enabling security allows for a two-step transition and eases problem determination when doing the security upgrade: is that user missing from the system, or is Kerberos failing? Clearly the issues stemming from the former can be sorted out without security. This makes the operations side of the house much easier.

It's also worth pointing out that one of the key benefits of running tasks as the user who submitted them is that it makes troubleshooting much easier. When one hops on a node, it is evident which user's tasks one is looking at, even if those tasks aren't validated as that user. This is especially important in heavy multi-tenant scenarios.

But, again, the fix in YARN-1253 caused a regression. LCE without security was supported prior to Hadoop 2.3 and was definitely used by people. This change still sets the default to be LCE with either one user or security, but folks who want the prior behavior can now flip a flag and get it.
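As a sketch, the two-step transition described above might look like the following yarn-site.xml fragment. The executor class name is standard; the nonsecure-mode property name is an assumption based on this discussion and the attached patch, not confirmed against it:

```xml
<!-- Illustrative yarn-site.xml fragment: enable the
     LinuxContainerExecutor before Kerberos is turned on.
     The nonsecure-mode property name below is an assumption. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- hypothetical flag: when disabled, containers run as the
       submitting user even without Kerberos, restoring the
       pre-2.3 behavior discussed above -->
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>
```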
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100158#comment-14100158 ] Alejandro Abdelnur commented on YARN-2424:
------------------------------------------
Please go over Todd's comment on the security issues of sudoing as a user without secure auth; you definitely don't want to do that in a multi-tenant cluster. BTW, fixing a security bug is not a regression.
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100163#comment-14100163 ] Allen Wittenauer commented on YARN-2424:
----------------------------------------
I don't think you understood what I wrote.
[jira] [Updated] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-1198:
------------------------------
Attachment: YARN-1198.7.patch

Can't repro the test failure; uploading a (better formatted) patch to trigger a new build.
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100171#comment-14100171 ] Hadoop QA commented on YARN-2424: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662313/YARN-2424.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4654//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4654//console This message is automatically generated. LCE should support non-cgroups, non-secure mode --- Key: YARN-2424 URL: https://issues.apache.org/jira/browse/YARN-2424 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.4.1 Reporter: Allen Wittenauer Priority: Blocker Labels: regression Attachments: YARN-2424.patch After YARN-1253, LCE no longer works for non-secure, non-cgroup scenarios. 
This is a fairly serious regression, as turning on LCE prior to turning on full-blown security is a fairly standard procedure. -- This message was sent by Atlassian JIRA (v6.2#6252)
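For context, a minimal yarn-site.xml fragment for the non-secure LCE setup this issue describes. The property names are the standard ones from Hadoop 2.x (the nonsecure-mode.local-user knob was introduced by YARN-1253); the local-user value is a placeholder:

```xml
<!-- Enable the LinuxContainerExecutor on the NodeManager. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<!-- After YARN-1253, in non-secure mode all containers run as this
     single local user instead of the submitting user. -->
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
  <value>nobody</value> <!-- placeholder -->
</property>
```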
[jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100190#comment-14100190 ] Hadoop QA commented on YARN-1198: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662403/YARN-1198.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4655//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4655//console This message is automatically generated. 
Capacity Scheduler headroom calculation does not work as expected - Key: YARN-1198 URL: https://issues.apache.org/jira/browse/YARN-1198 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Craig Welch Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch Today headroom calculation (for the app) takes place only when * New node is added/removed from the cluster * New container is getting assigned to the application. However there are potentially lot of situations which are not considered for this calculation * If a container finishes then headroom for that application will change and should be notified to the AM accordingly. * If a single user has submitted multiple applications (app1 and app2) to the same queue then ** If app1's container finishes then not only app1's but also app2's AM should be notified about the change in headroom. ** Similarly if a container is assigned to any applications app1/app2 then both AM should be notified about their headroom. ** To simplify the whole communication process it is ideal to keep headroom per User per LeafQueue so that everyone gets the same picture (apps belonging to same user and submitted in same queue). * If a new user submits an application to the queue then all applications submitted by all users in that queue should be notified of the headroom change. * Also today headroom is an absolute number ( I think it should be normalized but then this is going to be not backward compatible..) * Also when admin user refreshes queue headroom has to be updated. These all are the potential bugs in headroom calculations -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
[ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100204#comment-14100204 ] Chen He commented on YARN-1198: --- Thank you for the update, [~cwelch]. -- This message was sent by Atlassian JIRA (v6.2#6252)
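A minimal sketch of the per-user, per-leaf-queue headroom idea discussed in this issue: compute headroom once per (user, queue) so every AM belonging to that user sees the same value. The method name, MB-only units, and the exact limit formula are illustrative assumptions, not the CapacityScheduler's actual code:

```java
public class HeadroomSketch {
    // Headroom for one user in one leaf queue: how much more that user
    // may still allocate, bounded both by the user limit and by what the
    // queue itself has left. Simplified to MB; real headroom is a Resource.
    static long userHeadroomMb(long userLimitMb, long userConsumedMb,
                               long queueAvailableMb) {
        long remainingUserShare = userLimitMb - userConsumedMb;
        return Math.max(0, Math.min(remainingUserShare, queueAvailableMb));
    }
}
```

Because the value depends only on (user, queue) state, it changes whenever any container of that user starts or finishes, which is why every AM of the user would need to be re-notified in those cases.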
[jira] [Commented] (YARN-415) Capture memory utilization at the app-level for chargeback
[ https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100253#comment-14100253 ] Jian He commented on YARN-415: -- bq. an attempt can be in the complete state before all of its containers are finished CapacityScheduler#doneApplicationAttempt (FairScheduler#removeApplicationAttempt) synchronously finishes all the live containers, so I think all containers should be guaranteed to finish before the attempt does. bq. charging the running containers to the current app until the containers finish will be seamless to the end user. Particularly in work-preserving AM restart, the current AM is actually the one managing the previous attempt's running containers. Running containers in the scheduler are already transferred to the current AM, so running-container metrics are transferred as well. I think it would be confusing if finished containers were still charged back against the previous dead attempt. Btw, YARN-1809 will add the attempt web page, where we could also show attempt-specific metrics. Regarding the problem of metrics persistence: agreed that it doesn't solve the problem for running apps in general. Maybe we can have the state store changes in a separate jira and discuss more there, so that we can get this in first. 
Capture memory utilization at the app-level for chargeback -- Key: YARN-415 URL: https://issues.apache.org/jira/browse/YARN-415 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Andrey Klochkov Attachments: YARN-415--n10.patch, YARN-415--n2.patch, YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, YARN-415.201406262136.txt, YARN-415.201407042037.txt, YARN-415.201407071542.txt, YARN-415.201407171553.txt, YARN-415.201407172144.txt, YARN-415.201407232237.txt, YARN-415.201407242148.txt, YARN-415.201407281816.txt, YARN-415.201408062232.txt, YARN-415.201408080204.txt, YARN-415.201408092006.txt, YARN-415.201408132109.txt, YARN-415.201408150030.txt, YARN-415.patch For the purpose of chargeback, I'd like to be able to compute the cost of an application in terms of cluster resource usage. To start out, I'd like to get the memory utilization of an application. The unit should be MB-seconds or something similar and, from a chargeback perspective, the memory amount should be the memory reserved for the application, as even if the app didn't use all that memory, no one else was able to use it. (reserved ram for container 1 * lifetime of container 1) + (reserved ram for container 2 * lifetime of container 2) + ... + (reserved ram for container n * lifetime of container n) It'd be nice to have this at the app level instead of the job level because: 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't appear on the job history server). 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm). This new metric should be available both through the RM UI and RM Web Services REST API. -- This message was sent by Atlassian JIRA (v6.2#6252)
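The chargeback formula in the issue description, (reserved ram for container 1 * lifetime of container 1) + ... + (reserved ram for container n * lifetime of container n), can be sketched as follows. The class and method names are illustrative, not actual ResourceManager code:

```java
public class ChargebackSketch {
    // MB-seconds cost of an application: the sum over all of its
    // containers of (reserved memory in MB * container lifetime in
    // seconds). Reserved memory is charged even if unused, since no
    // one else could use it while it was reserved.
    static long memoryMbSeconds(long[] reservedMb, long[] lifetimeSeconds) {
        long total = 0;
        for (int i = 0; i < reservedMb.length; i++) {
            total += reservedMb[i] * lifetimeSeconds[i];
        }
        return total;
    }
}
```

For example, two containers reserving 1024 MB for 60 s and 2048 MB for 30 s each contribute 61440 MB-seconds, for a total of 122880.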
[jira] [Commented] (YARN-2411) [Capacity Scheduler] support simple user and group mappings to queues
[ https://issues.apache.org/jira/browse/YARN-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100278#comment-14100278 ] Hadoop QA commented on YARN-2411: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662375/YARN-2411.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4657//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4657//console This message is automatically generated. 
[Capacity Scheduler] support simple user and group mappings to queues - Key: YARN-2411 URL: https://issues.apache.org/jira/browse/YARN-2411 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Reporter: Ram Venkatesh Assignee: Ram Venkatesh Attachments: YARN-2411-2.patch, YARN-2411.1.patch, YARN-2411.3.patch, YARN-2411.4.patch, YARN-2411.5.patch YARN-2257 has a proposal to extend and share the queue placement rules for the fair scheduler and the capacity scheduler. This is a good long-term solution to streamline queue placement for both schedulers, but it has core infra work that has to happen first and might require changes to current features in all schedulers, along with corresponding configuration changes, if any. I would like to propose a change with a smaller scope in the capacity scheduler that addresses the core use cases for implicitly mapping jobs that have the default queue or no queue specified to specific queues, based on the submitting user and user groups. It will be useful in a number of real-world scenarios and can be migrated over to the unified scheme when YARN-2257 becomes available. The proposal is to add two new configuration options: yarn.scheduler.capacity.queue-mappings-override.enable A boolean that controls whether user-specified queues can be overridden by the mapping; default is false. and, yarn.scheduler.capacity.queue-mappings A string that specifies a list of mappings in the following format (default is empty, which is the same as no mapping): map_specifier:source_attribute:queue_name[,map_specifier:source_attribute:queue_name]* map_specifier := user (u) | group (g) source_attribute := user | group | %user queue_name := the name of the mapped queue | %user | %primary_group The mappings will be evaluated left to right, and the first valid mapping will be used. If the mapped queue does not exist, or the current user does not have permission to submit jobs to the mapped queue, the submission will fail. Example usages: 1. To map user1 to queue1 and group1 to queue2: u:user1:queue1,g:group1:queue2 2. To map users to queues with the same name as the user: u:%user:%user I am happy to volunteer to take this up. -- This message was sent by Atlassian JIRA (v6.2#6252)
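The left-to-right, first-match evaluation proposed above can be sketched like this. It is an illustration of the rule syntax only, not the actual CapacityScheduler implementation, and omits the override flag and queue-existence/ACL checks:

```java
public class QueueMappingSketch {
    // Evaluates mappings of the form (u|g):source:queue, left to right;
    // the first rule matching the submitter wins.
    static String mapQueue(String mappings, String user, String primaryGroup,
                           String requestedQueue) {
        if (mappings == null || mappings.isEmpty()) {
            return requestedQueue; // empty mapping list means no mapping
        }
        for (String rule : mappings.split(",")) {
            String[] parts = rule.split(":");
            String specifier = parts[0], source = parts[1], queue = parts[2];
            boolean matches =
                ("u".equals(specifier)
                    && ("%user".equals(source) || source.equals(user)))
             || ("g".equals(specifier) && source.equals(primaryGroup));
            if (matches) {
                // Placeholders resolve per submitter.
                return queue.replace("%user", user)
                            .replace("%primary_group", primaryGroup);
            }
        }
        return requestedQueue; // no rule matched
    }
}
```

With the example mapping u:user1:queue1,g:group1:queue2, user1 lands in queue1, any member of group1 lands in queue2, and u:%user:%user sends each user to a queue named after them.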
[jira] [Commented] (YARN-2424) LCE should support non-cgroups, non-secure mode
[ https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14100302#comment-14100302 ] Alejandro Abdelnur commented on YARN-2424: -- I think I did; if I'm reading correctly, you are stating that it is better for troubleshooting, especially in multi-tenant scenarios: bq. It's also worth pointing out that one of the key benefits of running tasks as the user who submitted them is that it makes troubleshooting much easier. When one hops on a node, it is evident as to which user's tasks one is looking at it, even if those tasks aren't validated as that user. This is especially important in heavy multi-tenant scenarios. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2077) JobImpl#makeUberDecision doesn't log that Uber mode is disabled because of too much CPUs
[ https://issues.apache.org/jira/browse/YARN-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-2077: - Affects Version/s: 2.5.0 JobImpl#makeUberDecision doesn't log that Uber mode is disabled because of too much CPUs Key: YARN-2077 URL: https://issues.apache.org/jira/browse/YARN-2077 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.4.0, 2.5.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Priority: Trivial Attachments: YARN-2077.1.patch JobImpl#makeUberDecision usually logs why a job cannot be launched in Uber mode (e.g. too much RAM). For CPUs, however, no reason is currently logged. We should log one when too many CPUs are requested. -- This message was sent by Atlassian JIRA (v6.2#6252)
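A hedged sketch of the requested fix: log a reason for each resource that blocks uberization, so CPU is reported the same way RAM already is. The method signature, messages, and thresholds are illustrative assumptions, not the actual JobImpl code:

```java
public class UberDecisionSketch {
    // Returns true if the job fits within the uber limits; logs a
    // reason for every resource that exceeds its limit, including CPU.
    static boolean makeUberDecision(long requestMb, int requestVcores,
                                    long maxMb, int maxVcores) {
        boolean smallMemory = requestMb <= maxMb;
        boolean smallCpu = requestVcores <= maxVcores;
        if (!smallMemory) {
            System.out.println("Not uberizing job: too much RAM requested");
        }
        if (!smallCpu) {
            System.out.println("Not uberizing job: too many CPUs requested");
        }
        return smallMemory && smallCpu;
    }
}
```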
[jira] [Updated] (YARN-1919) Log yarn.resourcemanager.cluster-id is required for HA instead of throwing NPE
[ https://issues.apache.org/jira/browse/YARN-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-1919: - Affects Version/s: 2.5.0 Log yarn.resourcemanager.cluster-id is required for HA instead of throwing NPE -- Key: YARN-1919 URL: https://issues.apache.org/jira/browse/YARN-1919 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0, 2.4.0, 2.5.0 Reporter: Devaraj K Assignee: Tsuyoshi OZAWA Priority: Minor Attachments: YARN-1919.1.patch {code:xml} 2014-04-09 16:14:16,392 WARN org.apache.hadoop.service.AbstractService: When stopping the service org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService : java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceStop(EmbeddedElectorService.java:108) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:122) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:232) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1038) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
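The NPE in the stack trace above stems from enabling RM HA without supplying a cluster id. A minimal yarn-site.xml fragment that provides it (the property name is the standard YARN HA setting; the value is a placeholder):

```xml
<!-- Required when yarn.resourcemanager.ha.enabled is true; without it,
     EmbeddedElectorService fails during init as shown above. -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>rm-cluster-1</value> <!-- placeholder id -->
</property>
```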