[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656417#comment-15656417
 ] 

Arun Suresh commented on YARN-4597:
---

Appreciate the review [~kkaranasos],

1.
bq. The Container has two new methods (sendLaunchEvent and sendKillEvent), 
which are public and are not following..
sendKillEvent is used by the Scheduler (which is in another package) to kill a 
container. Since this patch introduces an external entity that launches and 
kills a container, viz. the Scheduler, I feel it is apt to keep both as public 
methods. I prefer it to 'dispatcher.getEventHandler().handle..'. 
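For illustration, a minimal, self-contained sketch of the two call styles being compared; the types and signatures below are simplified stand-ins, not the actual NM classes or the patch code.
{code}
// Simplified stand-ins for the container/dispatcher machinery, for illustration only.
import java.util.function.Consumer;

class KillEvent {
  final String containerId;
  final String diagnostics;
  KillEvent(String containerId, String diagnostics) {
    this.containerId = containerId;
    this.diagnostics = diagnostics;
  }
}

class SketchContainer {
  private final String id;
  private final Consumer<KillEvent> dispatcher; // stands in for dispatcher.getEventHandler()

  SketchContainer(String id, Consumer<KillEvent> dispatcher) {
    this.id = id;
    this.dispatcher = dispatcher;
  }

  // Style kept in the patch: a public method the scheduler can call directly;
  // internally it still just posts an event on the dispatcher.
  public void sendKillEvent(String diagnostics) {
    dispatcher.accept(new KillEvent(id, diagnostics));
  }
}

class SchedulerSketch {
  public static void main(String[] args) {
    Consumer<KillEvent> dispatcher =
        e -> System.out.println("kill " + e.containerId + ": " + e.diagnostics);
    SketchContainer container = new SketchContainer("container_1", dispatcher);

    // Option 1: call the public helper on the container.
    container.sendKillEvent("killed by the container scheduler");

    // Option 2: post the event through the dispatcher directly.
    dispatcher.accept(new KillEvent("container_1", "killed by the container scheduler"));
  }
}
{code}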

2.
The Container needs to be added to the {{nodeUpdateQueue}} if the container is 
to move from the ACQUIRED to the RUNNING state (this is a state transition all 
containers should go through). Regarding the {{launchedContainers}}, let's have 
both Opportunistic and Guaranteed containers flow through a common code path, 
and introduce specific behaviors in subsequent patches as and when required.

3.
bq. In the OpportunisticContainerAllocatorAMService we are now calling the 
SchedulerNode::allocate, and then we do not update the used resources but we do 
update some other counters, which leads to inconsistencies.
Hmmm... I do see that numContainers is not decremented correctly on release. 
Thanks... but it looks like it would mostly just impact reporting / UI, nothing 
functional (will update the patch). Can you specify which other counters you 
mean? As I mentioned on the previous patch, let's run all containers through as 
much of the common code path as possible before we add new counters etc.

4.
bq. Maybe as part of a different JIRA, we should at some point extend the 
container.metrics in the ContainerImpl to keep track of the scheduled/queued 
containers.
Yup.. +1 to that.

The rest of your comments make sense... will update the patch.

bq. let's stress-test the code in a cluster before committing to make sure 
everything is good
It has been tested on a 3-node cluster with MR Pi jobs (using opportunistic 
containers) and I didn't hit any major issues. We can always open follow-up 
JIRAs for specific performance-related issues as and when we find them. 
Besides, stress-testing is not really a precondition to committing a patch.


> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: YARN-5792-YARN-5355.03.patch

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch, 
> YARN-5792-YARN-5355.02.patch, YARN-5792-YARN-5355.03.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.






[jira] [Commented] (YARN-5765) LinuxContainerExecutor creates appcache and its subdirectories with wrong group owner.

2016-11-10 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656333#comment-15656333
 ] 

Naganarasimha G R commented on YARN-5765:
-

Thanks [~haibochen] & [~miklos.szeg...@cloudera.com] for the insightful 
comments.
There are two other places, apart from launch_container_as_user, where mkdirs 
is used:
{code}
main
RUN_AS_USER_INITIALIZE_CONTAINER
mount_cgroup
mkdirs
create_validate_dir
MOUNT_CGROUPS
initialize_app
mkdirs
create_validate_dir
{code}

IIUC, setting the umask only before change_effective_user would not be ideal, 
as it would be required in the other places too.
What I want to understand is: what impact would it have if we always set it? We 
never run the container-executor.c binary with the root user (refer to set_user 
-> check_user), so would it be sufficient to reset the umask after mkdir?

bq. This means that by removing chmod this change does not apply to cases 
anymore, when the default ACL is too restrictive. Could this be an issue, or do 
we rely on the admin to set the default ACL correctly?
Good question... something to think about! I am not sure we will be able to 
handle it. One more question: if we reset the umask after mkdir, will the 
container logs that get created still be accessible to the NM, given the 
restrictive rights? Would it be ideal to set a default ACL for the created 
folders and then reset the umask, so that files created by the user under these 
directories have the right permissions?


> LinuxContainerExecutor creates appcache and its subdirectories with wrong 
> group owner.
> --
>
> Key: YARN-5765
> URL: https://issues.apache.org/jira/browse/YARN-5765
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-5765.001.patch
>
>
> LinuxContainerExecutor creates usercache/\{userId\}/appcache/\{appId\} with 
> wrong group owner, causing Log aggregation and ShuffleHandler to fail because 
> node manager process does not have permission to read the files under the 
> directory.
> This can be easily reproduced by enabling LCE and submitting an MR example 
> job as a user that does not belong to the same group that the NM process 
> belongs to.






[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656248#comment-15656248
 ] 

Hadoop QA commented on YARN-5600:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 538 unchanged - 21 fixed = 538 total (was 559) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 34s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838500/YARN-5600.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 99fb6d6b2e4f 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8848a8a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/13868/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13868/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 

[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.008.patch

Updated the patch based on the previous request.

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: Add-Druid-in-YARN-Timeline-Service.pdf

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose to add Druid as a storage backend in the YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/event/metric data, analyze utilization reports across various dimensions 
> online, and display the trends of allocated/used resources for the cluster by 
> joining and aggregating data. This helps us manage and optimize the cluster 
> by tracking resource utilization.
> To achieve this goal we have switched to Druid as the storage instead of 
> HBase, and have achieved sub-second OLAP performance in our production 
> environment for a few months.
> h3. Analysis
> Currently the YARN Timeline Service only supports aggregating metrics at a) 
> the flow level, via FlowRunCoprocessor, and b) the application level, via 
> AppLevelTimelineCollector; offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. The YARN Timeline Service uses Apache HBase as its primary 
> storage backend, and as we all know, HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as analyzing utilization reports 
> across various dimensions (queue, flow, user, application, CPU, memory) 
> online by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions, 
> display the trends of allocated/used resources for the cluster, and allow 
> arbitrary exploration of data, we propose to add Druid storage and implement 
> DruidWriter/DruidReader in the YARN Timeline Service.






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: (was: Add-Druid-in-YARN-Timeline-Service.pdf)

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
>
> h3. Introduction
> I propose to add Druid as a storage backend in the YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/event/metric data, analyze utilization reports across various dimensions 
> online, and display the trends of allocated/used resources for the cluster by 
> joining and aggregating data. This helps us manage and optimize the cluster 
> by tracking resource utilization.
> To achieve this goal we have switched to Druid as the storage instead of 
> HBase, and have achieved sub-second OLAP performance in our production 
> environment for a few months.
> h3. Analysis
> Currently the YARN Timeline Service only supports aggregating metrics at a) 
> the flow level, via FlowRunCoprocessor, and b) the application level, via 
> AppLevelTimelineCollector; offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. The YARN Timeline Service uses Apache HBase as its primary 
> storage backend, and as we all know, HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as analyzing utilization reports 
> across various dimensions (queue, flow, user, application, CPU, memory) 
> online by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions, 
> display the trends of allocated/used resources for the cluster, and allow 
> arbitrary exploration of data, we propose to add Druid storage and implement 
> DruidWriter/DruidReader in the YARN Timeline Service.






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: Add-Druid-in-YARN-Timeline-Service.pdf

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose to add Druid as a storage backend in the YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/event/metric data, analyze utilization reports across various dimensions 
> online, and display the trends of allocated/used resources for the cluster by 
> joining and aggregating data. This helps us manage and optimize the cluster 
> by tracking resource utilization.
> To achieve this goal we have switched to Druid as the storage instead of 
> HBase, and have achieved sub-second OLAP performance in our production 
> environment for a few months.
> h3. Analysis
> Currently the YARN Timeline Service only supports aggregating metrics at a) 
> the flow level, via FlowRunCoprocessor, and b) the application level, via 
> AppLevelTimelineCollector; offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. The YARN Timeline Service uses Apache HBase as its primary 
> storage backend, and as we all know, HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as analyzing utilization reports 
> across various dimensions (queue, flow, user, application, CPU, memory) 
> online by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions, 
> display the trends of allocated/used resources for the cluster, and allow 
> arbitrary exploration of data, we propose to add Druid storage and implement 
> DruidWriter/DruidReader in the YARN Timeline Service.






[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle

2016-11-10 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655975#comment-15655975
 ] 

Konstantinos Karanasos commented on YARN-4597:
--

Thanks for working on this, [~asuresh]! I am sending some first comments. I 
have not yet looked at the {{ContainerScheduler}} -- I will do that tomorrow.

- The {{Container}} has two new methods ({{sendLaunchEvent}} and 
{{sendKillEvent}}), which are public and do not follow the design of the rest 
of the code, which keeps such methods private and calls them through 
transitions in the {{ContainerImpl}}. Let's try to use the existing design if 
possible.

- In {{RMNodeImpl}}:
-- Instead of using the {{launchedContainers}} for both the launched and the 
queued, we might want to split it in two: one for the launched and one for the 
queued containers.
-- I think we should not add opportunistic containers to the 
{{launchedContainers}}. If we do, they will be added to the 
{{newlyLaunchedContainers}}, then to the {{nodeUpdateQueue}}, and, if I am not 
wrong, they will be propagated to the schedulers for the guaranteed containers, 
which will create problems. I have to look at it a bit more, but my hunch is 
that we should avoid doing it. Even if it does not affect the resource 
accounting, I don't see any advantage to adding them.

- In the {{OpportunisticContainerAllocatorAMService}} we are now calling the 
{{SchedulerNode::allocate}}, and then we do not update the used resources, but 
we do update some other counters, which leads to inconsistencies. For example, 
when releasing a container, I think at the moment we are not calling the 
release of the {{SchedulerNode}}, which means that the container count will 
become inconsistent.
-- Instead, I suggest adding some counters for opportunistic containers at the 
{{SchedulerNode}}, both for the number of containers and for the resources 
used. In this case, we need to make sure that those resources are released too 
(see the sketch after these comments).

- Maybe as part of a different JIRA, we should at some point extend the 
{{container.metrics}} in the {{ContainerImpl}} to keep track of the 
scheduled/queued containers.
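
To make the {{SchedulerNode}} counter suggestion concrete, here is a rough, 
self-contained sketch of what per-node opportunistic accounting could look 
like; this is a stand-alone illustration using the {{Resources}} utility, not 
the actual {{SchedulerNode}} API.
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Hypothetical per-node tracker for opportunistic containers; in a real patch
// these would presumably become fields on SchedulerNode.
class OpportunisticUsageSketch {
  private int numOpportunisticContainers = 0;
  private final Resource opportunisticResourceUsed = Resources.createResource(0, 0);

  // Called when an opportunistic container is allocated on this node.
  synchronized void allocate(Resource allocated) {
    numOpportunisticContainers++;
    Resources.addTo(opportunisticResourceUsed, allocated);
  }

  // Must also be called on release, so the counters never drift.
  synchronized void release(Resource released) {
    numOpportunisticContainers--;
    Resources.subtractFrom(opportunisticResourceUsed, released);
  }
}
{code}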

h6. Nits:
- There seem to be two redundant parameters in {{YarnConfiguration}} at the 
moment: {{NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH}} and 
{{NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH}}. If I am not missing 
something, we should keep only one of the two.
- {{yarn-default.xml}}: numbed -> number (in a comment)
- {{TestNodeManagerResync}}: I think it is better to use one of the existing 
methods for waiting to get to the RUNNING state.
- In {{Container}}/{{ContainerImpl}} and all the associated classes, I would 
suggest renaming {{isMarkedToKill}} to {{isMarkedForKilling}}. I know it is 
minor, but it is more self-explanatory.

I will send more comments once I check the {{ContainerScheduler}}. 
Also, let's stress-test the code in a cluster before committing to make sure 
everything is good. I can help with that.


> Add SCHEDULE to NM container lifecycle
> --
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.






[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler

2016-11-10 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655868#comment-15655868
 ] 

Jonathan Hung commented on YARN-5761:
-

This looks good, a few comments:
# We can probably put the {{static QueueHook}} class in 
CapacitySchedulerQueueManager.
# Can we set the YarnAuthorizationProvider in the CSQueueManager instead of in 
CS, since at least for now this is a queue-level component?
# Can we move the queue label logic to CSQueueManager as well? i.e., the 
{{labelManager.reinitializeQueueLabels(getQueueToLabels())}} calls in 
CS#initializeQueues and CS#reinitializeQueues could move to the respective 
methods in CSQueueManager. We could pass the labelManager to CSQueueManager in 
the constructor, and make getQueueToLabels a method of CSQueueManager (see the 
sketch after this list).
# There are calls to {{CS#getQueue}} in some places inside CS, and some calls 
to queueManager.getQueue; perhaps we should make them all consistent (we can 
use CS#getQueue since it is a wrapper around queueManager.getQueue).
# Two other methods, {{getDefaultPriorityForQueue}} and 
{{getAndCheckLeafQueue}}, could maybe be moved to CSQueueManager as well. I'm 
not so sure about getDefaultPriorityForQueue: since CSQueueManager handles 
queue configuration, does it make sense to have queue attribute accessors such 
as getDefaultPriorityForQueue in CSQueueManager too?
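
If it helps, here is a rough sketch of comment 3, with the label manager 
injected via the constructor; the types below are simplified stand-ins for the 
real RM classes, and the actual signatures may differ.
{code}
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Stand-in for RMNodeLabelsManager, reduced to the single call referenced above.
interface LabelManagerSketch {
  void reinitializeQueueLabels(Map<String, Set<String>> queueToLabels);
}

class QueueManagerSketch {
  private final LabelManagerSketch labelManager; // passed in via the constructor

  QueueManagerSketch(LabelManagerSketch labelManager) {
    this.labelManager = labelManager;
  }

  // Moved out of CapacityScheduler: map each queue to its accessible labels.
  Map<String, Set<String>> getQueueToLabels() {
    return Collections.singletonMap("root.default", Collections.singleton("*"));
  }

  // CS#initializeQueues / CS#reinitializeQueues would delegate here.
  void initializeQueues() {
    labelManager.reinitializeQueueLabels(getQueueToLabels());
  }
}
{code}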



> Separate QueueManager from Scheduler
> 
>
> Key: YARN-5761
> URL: https://issues.apache.org/jira/browse/YARN-5761
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: oct16-medium
> Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch, 
> YARN-5761.2.patch, YARN-5761.3.patch
>
>
> Currently, the scheduler code does both queue management and scheduling work. 
> We'd better separate the queue manager out of the scheduler logic; that way, 
> it would be much easier and safer to extend.






[jira] [Commented] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655861#comment-15655861
 ] 

Hadoop QA commented on YARN-4752:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
44s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  5m 
19s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 
53s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 53s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 49s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 11 new + 183 unchanged - 137 fixed = 194 total (was 320) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 916 unchanged - 9 fixed = 916 total (was 925) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 
18s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-4752 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838473/yarn-4752.2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4d71dc1e74cc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 93eeb13 |
| Default Java | 1.8.0_101 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/13866/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt
 |
| findbugs | v3.0.0 |
| 

[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655828#comment-15655828
 ] 

Miklos Szegedi commented on YARN-5600:
--

Thank you for reviewing the patch [~wangda], and also thank you for looking 
into the build issue!
I think it is a good idea to have the value in an environment variable. Since 
this is a debug feature, I would like the change to be as small and simple as 
possible.
When talking about environment variables, are you referring to this pattern?
{code}
String containerImageName = container.getLaunchContext().getEnvironment()
    .get(YarnConfiguration.NM_DOCKER_CONTAINER_EXECUTOR_IMAGE_NAME);
{code}
I will write a new patch that incorporates your suggestions, including a 
yarn.nodemanager.delete.max-debug-delay-sec value to limit the maximum wait.
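
As a rough sketch of how the per-application value and the cluster-wide cap 
could fit together (the env var name and the helper below are placeholders for 
illustration, not necessarily the names the patch will use):
{code}
import java.util.Map;

// Illustration only: combine a per-application debug delay taken from the
// container launch environment with a cluster-wide cap from yarn-site.xml.
class DebugDelaySketch {
  // Placeholder names; the real patch may use different keys.
  static final String DEBUG_DELAY_ENV = "YARN_CONTAINER_DEBUG_DELAY_SEC";
  static final String MAX_DELAY_CONF = "yarn.nodemanager.delete.max-debug-delay-sec";

  static long effectiveDelaySec(Map<String, String> containerEnv, long maxDelaySec) {
    String requested = containerEnv.get(DEBUG_DELAY_ENV);
    long requestedSec = (requested == null) ? 0L : Long.parseLong(requested);
    // Never keep launch artifacts longer than the admin-configured maximum.
    return Math.min(requestedSec, maxDelaySec);
  }
}
{code}
In the NM, the environment map would come from 
{{container.getLaunchContext().getEnvironment()}} as in the snippet above, and 
the cap from the configuration with a large default.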

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.






[jira] [Commented] (YARN-5819) Verify fairshare and minshare preemption

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655814#comment-15655814
 ] 

Hadoop QA commented on YARN-5819:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
40s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
23s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} YARN-4752 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 19s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 30 unchanged - 0 fixed = 31 total (was 30) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 50s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5819 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838471/yarn-5819.YARN-4752.5.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4d6e115b41bc 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-4752 / 2140674 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13865/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/13865/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13865/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 

[jira] [Commented] (YARN-5634) Simplify initialization/use of RouterPolicy via a RouterPolicyFacade

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655793#comment-15655793
 ] 

Hadoop QA commented on YARN-5634:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
24s{color} | {color:green} YARN-2915 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  5m 
20s{color} | {color:red} hadoop-yarn in YARN-2915 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
10s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
44s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} YARN-2915 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  4m 
26s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  4m 26s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 49s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 227 unchanged - 0 fixed = 229 total (was 227) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 10s{color} 
| {color:red} hadoop-yarn-server-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
46s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 45m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5634 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838472/YARN-5634-YARN-2915.03.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8586c13d77b0 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-2915 / c3a5672 |
| Default Java | 1.8.0_111 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/13867/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt
 |
| findbugs | v3.0.0 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/13867/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn.txt
 |
| 

[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Tan, Wangda (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655703#comment-15655703
 ] 

Tan, Wangda commented on YARN-5600:
---

This is a very useful feature without any doubt. Thanks 
[~miklos.szeg...@cloudera.com] for working on this JIRA, and thanks to 
[~Naganarasimha] / [~templedf] for reviewing the patch.

Apologies for my very late review; I only looked at the API of the patch. Have 
you considered the other approach, where we turn on debug-delay-sec by passing 
a pre-defined environment variable? The biggest benefit is that we don't need 
to update most applications to use this feature; for example, MR/Spark already 
support specifying environment variables. Making changes to all major 
applications to use this feature sounds like a big task.

As an example, LinuxDockerContainerExecutor uses this approach of specifying 
configuration by passing an env var.

In addition, it would be better to have a global max-debug-delay-sec in 
yarn-site (which could be MAX_INT by default); considering disk space and 
security, we may not want applications to occupy disk space beyond some 
specified time.

+ [~djp]

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.






[jira] [Updated] (YARN-5634) Simplify initialization/use of RouterPolicy via a RouterPolicyFacade

2016-11-10 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino updated YARN-5634:
---
Attachment: YARN-5634-YARN-2915.03.patch

> Simplify initialization/use of RouterPolicy via a RouterPolicyFacade 
> -
>
> Key: YARN-5634
> URL: https://issues.apache.org/jira/browse/YARN-5634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: YARN-2915
>Reporter: Carlo Curino
>Assignee: Carlo Curino
>  Labels: oct16-medium
> Attachments: YARN-5634-YARN-2915.01.patch, 
> YARN-5634-YARN-2915.02.patch, YARN-5634-YARN-2915.03.patch
>
>
> The current set of policies requires some machinery to (re)initialize based 
> on changes in the SubClusterPolicyConfiguration. This JIRA tracks the effort 
> to hide much of that behind a simple RouterPolicyFacade, making the lifecycle 
> and usage of the policies easier for consumers.






[jira] [Updated] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-11-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4752:
---
Attachment: yarn-4752.2.patch

> [Umbrella] FairScheduler: Improve preemption
> 
>
> Key: YARN-4752
> URL: https://issues.apache.org/jira/browse/YARN-4752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
> Attachments: YARN-4752.FairSchedulerPreemptionOverhaul.pdf, 
> yarn-4752-1.patch, yarn-4752.2.patch
>
>
> A number of issues have been reported with respect to preemption in 
> FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free 
> resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is 
> over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a 
> comprehensive solution.






[jira] [Updated] (YARN-5819) Verify fairshare and minshare preemption

2016-11-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5819:
---
Attachment: yarn-5819.YARN-4752.5.patch

Rebased YARN-4752 on trunk. Updating this patch to catch up with the rebase. 

> Verify fairshare and minshare preemption
> 
>
> Key: YARN-5819
> URL: https://issues.apache.org/jira/browse/YARN-5819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5819.YARN-4752.1.patch, 
> yarn-5819.YARN-4752.2.patch, yarn-5819.YARN-4752.3.patch, 
> yarn-5819.YARN-4752.4.patch, yarn-5819.YARN-4752.5.patch
>
>
> JIRA to track the unit test(s) verifying both fairshare and minshare 
> preemption. The tests should verify:
> # preemption within a single leaf queue
> # preemption between sibling leaf queues
> # preemption between non-sibling leaf queues
> # {{allowPreemption = false}} should disallow preemption from a queue






[jira] [Commented] (YARN-3053) [Security] Review and implement security in ATS v.2

2016-11-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655625#comment-15655625
 ] 

Sangjin Lee commented on YARN-3053:
---

Thanks [~varun_saxena] for putting together the proposal! It's a great start. 
Sorry it took me a while to get to this.

I have a couple of quick questions (maybe more to follow):
- How do other NMs (that are running the containers) authenticate? I don’t 
think they can do a real authentication. Then how would they get the delegation 
token for the app? To solve this, would we be able to allow YARN daemons to 
access and look up the DTs from RM?
- How would each option handle the case of AM failures (and subsequent 
relaunching of app attempts and/or the timeline collector on another node)? It 
wasn’t very clear to me…


> [Security] Review and implement security in ATS v.2
> ---
>
> Key: YARN-3053
> URL: https://issues.apache.org/jira/browse/YARN-3053
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>  Labels: YARN-5355
> Attachments: ATSv2Authentication(draft).pdf
>
>
> Per design in YARN-2928, we want to evaluate and review the system for 
> security, and ensure proper security in the system.
> This includes proper authentication, token management, access control, and 
> any other relevant security aspects.






[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655581#comment-15655581
 ] 

Hadoop QA commented on YARN-5600:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  5m 
22s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  4m  
8s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  4m  8s{color} | 
{color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  4m  8s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 577 unchanged - 7 fixed = 577 total (was 584) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
30s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838452/YARN-5600.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 0805289b3944 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 93eeb13 |
| Default Java | 1.8.0_101 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/13864/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt
 |
| findbugs | v3.0.0 |
| compile | 

[jira] [Commented] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1561#comment-1561
 ] 

Sangjin Lee commented on YARN-5792:
---

At a high level, I'd like to discuss options.

I see that you're mostly using the (inverted) start time as the id prefix. 
Would it be better to simply use the id instead whenever possible? One big 
benefit of using the id is that it is very portable. When creating entities and 
updating them, the id is almost always available. All we require is uniqueness 
within the app and the entity type, and it seems to me that the id is a 
superior alternative to the start time.

What do you think?
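
To make the trade-off concrete, here is a minimal sketch of the two prefix 
choices being discussed (illustrative only, not the patch code; the variable 
names are made up):
{code}
// Option A: inverted start time, so that newer entities get smaller prefixes
// and sort first when the storage scans id prefixes in ascending order.
long entityStartTime = System.currentTimeMillis();
long invertedStartTimePrefix = Long.MAX_VALUE - entityStartTime;

// Option B: the entity's own numeric id (e.g. an attempt/container sequence
// number). It is available whenever the entity is created or updated and only
// needs to be unique within the app and the entity type.
long entityNumericId = 42L;
long idPrefix = entityNumericId;
{code}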

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch, 
> YARN-5792-YARN-5355.02.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655540#comment-15655540
 ] 

Sangjin Lee commented on YARN-5792:
---

Patch v.2 fails compilation:
{noformat}
[INFO] -
[INFO] -
[ERROR] COMPILATION ERROR :
[INFO] -
[ERROR] 
/Users/sjlee/git/hadoop-ats/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java:[246,12]
 startTime has private access in 
org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.RecoveredContainerState
[INFO] 1 error
{noformat}

Could you please take a look? Thanks!
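
For reference, one possible way to address it (a sketch only, assuming the 
patch keeps the new field in {{RecoveredContainerState}}; the next patch may 
of course fix it differently):
{code}
// In NMStateStoreService.RecoveredContainerState: expose the new field via an
// accessor instead of letting NMLeveldbStateStoreService touch it directly.
public long getStartTime() {
  return startTime;
}

// In NMLeveldbStateStoreService: read it through the accessor.
// ('rcs' stands for the RecoveredContainerState being built; name assumed.)
long containerStartTime = rcs.getStartTime();
{code}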

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch, 
> YARN-5792-YARN-5355.02.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5044) Add peak memory usage counter for each task

2016-11-10 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-5044:
---
Assignee: (was: Yufei Gu)

> Add peak memory usage counter for each task
> ---
>
> Key: YARN-5044
> URL: https://issues.apache.org/jira/browse/YARN-5044
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Yufei Gu
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.
> This JIRA covers the same feature as MAPREDUCE-4710. I filed this new YARN 
> JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it more or 
> less assumes a branch-1 architecture and should be closed at this point.
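
As a rough illustration of what the new counters would capture (the counter 
names come from the description above; the sampling hook is hypothetical, not 
actual MapReduce task code):
{code}
// Sketch: track peak usage across the periodic memory samples a task already
// takes for PHYSICAL_MEMORY_BYTES / VIRTUAL_MEMORY_BYTES.
class PeakMemoryTracker {
  private long physicalPeak;
  private long virtualPeak;

  // called on every sample of current usage
  void onSample(long physicalNow, long virtualNow) {
    physicalPeak = Math.max(physicalPeak, physicalNow);
    virtualPeak = Math.max(virtualPeak, virtualNow);
  }

  // values that would be published as PHYSICAL_MEMORY_BYTES_MAX and
  // VIRTUAL_MEMORY_BYTES_MAX at the end of the task
  long getPhysicalPeak() { return physicalPeak; }
  long getVirtualPeak() { return virtualPeak; }
}
{code}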



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4715) Add support to read resource types from a config file

2016-11-10 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4715:
--
Fix Version/s: YARN-3926

Setting the missed fix-version to the branch-name.

> Add support to read resource types from a config file
> -
>
> Key: YARN-4715
> URL: https://issues.apache.org/jira/browse/YARN-4715
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-4715-YARN-3926.001.patch, 
> YARN-4715-YARN-3926.002.patch, YARN-4715-YARN-3926.003.patch, 
> YARN-4715-YARN-3926.004.patch, YARN-4715-YARN-3926.005.patch
>
>
> This ticket is to add support to allow the RM to read the resource types to 
> be used for scheduling from a config file. I'll file follow up tickets to add 
> similar support in the NM as well as to handle the RM-NM handshake protocol 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5864) Capacity Scheduler preemption for fragmented cluster

2016-11-10 Thread Tan, Wangda (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655514#comment-15655514
 ] 

Tan, Wangda commented on YARN-5864:
---

Thanks [~curino] for sharing these insightful suggestions.

The problem you mentioned is very real: we have put a lot of effort into 
adding features for various resource constraints (such as limits, node 
partitions, priority, etc.) but paid less attention to making the semantics 
easy and consistent.

I also agree that we need to spend some time thinking about what semantics the 
YARN scheduler should have. For example, the minimum guarantee of CS is that a 
queue should get at least its configured capacity, but a picky app could leave 
an under-utilized queue waiting forever for the resource. And, as you mentioned 
above, a non-preemptable queue can invalidate the configured capacity as well.

However, I would argue that the scheduler cannot run perfectly without 
violating some of the constraints. It is not just a set of formulas that we 
define and hand to a solver to optimize; it involves a lot of human emotion and 
preference. For example, a user may not understand, or be glad to accept, why a 
picky request cannot be allocated even when the queue/cluster has available 
capacity. And it may not be acceptable in a production cluster that a 
long-running service for realtime queries cannot be launched because we don't 
want to kill some less-important batch jobs. My point is: if we can have these 
rules defined in the doc, and users can tell from the UI/log what happened, we 
can add them.

To improve this, I think your suggestion (1) will be more helpful and 
achievable in the short term. We can definitely remove some parameters; for 
example, the existing user-limit definition is not good enough, and 
user-limit-factor can prevent a queue from ever fully utilizing its capacity. 
And we can better define these semantics in the doc and UI.

(2) looks beautiful, but it may not solve the root problem directly: the first 
priority is to make our users happy to accept the behavior, not to solve it 
beautifully in mathematics. For example, for the problem I put in the 
description of this JIRA, I don't think (2) can get the allocation without 
harming other applications. And from an implementation perspective, I'm not 
sure how a solver-based solution can handle both fast allocation (we want to 
allocate within milliseconds for interactive queries) and good placement (such 
as gang scheduling with other constraints like anti-affinity). It seems to me 
that with option (2) we would sacrifice low latency to get better placement 
quality.

bq. This opens up many abuses, one that comes to mind ...
Actually, this feature will only be used in a pretty controlled environment: 
important long-running services run in a separate queue, and the admin/user 
agrees that they can preempt other batch jobs to get new containers. ACLs will 
be set to prevent normal users from running in these queues; all apps running 
in the queue should be trusted apps such as YARN native services (Slider), 
Spark, etc. And we can also make sure these apps do their best to respect 
other apps.
Please advise if you think we can improve the semantics of this feature.

Thanks,

> Capacity Scheduler preemption for fragmented cluster 
> -
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch
>
>
> YARN-4390 added preemption for reserved containers. However, we found one 
> case where a large container cannot be allocated even though all queues are 
> under their limits.
> For example, we have:
> {code}
> Two queues, a and b, capacity 50:50 
> Two nodes: n1 and n2, each of them has 50 resources 
> Now queue-a uses 10 on n1 and 10 on n2
> queue-b asks for one single container with resource=45. 
> {code} 
> The container could be reserved on either host, but no preemption will 
> happen because all queues are under their limits (each node has only 40 
> free, so the 45-resource request can never be satisfied without preempting 
> from queue-a). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4829) Add support for binary units

2016-11-10 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4829:
--
Fix Version/s: YARN-3926

Setting the missed fix-version to the branch-name.

> Add support for binary units
> 
>
> Key: YARN-4829
> URL: https://issues.apache.org/jira/browse/YARN-4829
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-4829-YARN-3926.001.patch, 
> YARN-4829-YARN-3926.002.patch, YARN-4829-YARN-3926.003.patch, 
> YARN-4829-YARN-3926.004.patch
>
>
> The units conversion util should have support for binary units.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4830) Add support for resource types in the nodemanager

2016-11-10 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4830:
--
Fix Version/s: YARN-3926

Setting the missed fix-version to the branch-name.

> Add support for resource types in the nodemanager
> -
>
> Key: YARN-4830
> URL: https://issues.apache.org/jira/browse/YARN-4830
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-4830-YARN-3926.001.patch, 
> YARN-4830-YARN-3926.002.patch, YARN-4830-YARN-3926.003.patch, 
> YARN-4830-YARN-3926.004.patch
>
>
> The RM has support for multiple resource types. The same should be added for 
> the NMs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4218) Metric for resource*time that was preempted

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655459#comment-15655459
 ] 

Hudson commented on YARN-4218:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10815 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10815/])
YARN-4218. Metric for resource*time that was preempted. Contributed by (epayne: 
rev 93eeb13164707d0e3556c2bf737bd2ee09a335c6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationResourceUsageReportPBImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TimelineServiceV1Publisher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppMetrics.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppBlock.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisherForV2.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/ApplicationAttemptStateDataPBImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TimelineServiceV2Publisher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesApps.java
* (edit) 

[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.007.patch

Resubmitting the patch.

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5765) LinuxContainerExecutor creates appcache and its subdirectories with wrong group owner.

2016-11-10 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655375#comment-15655375
 ] 

Miklos Szegedi commented on YARN-5765:
--

Thank you, [~Naganarasimha] for the patch and [~haibochen] for the review. If I 
understand it correctly, this is the flow of calls.
{code}
launch_container_as_user
  fork
create_local_dirs
  create_log_dirs
mkdir
  change_effective_user
  create_container_directories
mkdirs
  create_validate_dir
{code}
I think we cannot change the umask before change_effective_user(), and 
changing it in mkdirs() or create_validate_dir() may add side effects for 
other callers of mkdirs() in the future, as [~haibochen] mentioned. What I 
would do is set the umask at the beginning of create_container_directories(), 
right at the comment below:
{code}
// create dirs as 0750
umask(0027);
{code}
I would also reset it to the previous value before the function returns.
Just a side note: this is what the Linux man page says about mkdir(): "in the 
absence of a default ACL, the mode of the created directory is (mode & ~umask 
& 0777)".
This means that by removing the chmod, this change no longer covers the case 
where the default ACL is too restrictive. Could this be an issue, or do we 
rely on the admin to set the default ACL correctly?

> LinuxContainerExecutor creates appcache and its subdirectories with wrong 
> group owner.
> --
>
> Key: YARN-5765
> URL: https://issues.apache.org/jira/browse/YARN-5765
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Haibo Chen
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-5765.001.patch
>
>
> LinuxContainerExecutor creates usercache/\{userId\}/appcache/\{appId\} with 
> the wrong group owner, causing log aggregation and the ShuffleHandler to 
> fail because the node manager process does not have permission to read the 
> files under the directory.
> This can be easily reproduced by enabling LCE and submitting an MR example 
> job as a user that does not belong to the same group that the NM process 
> belongs to. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655347#comment-15655347
 ] 

Hudson commented on YARN-5834:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10814 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10814/])
YARN-5834. TestNodeStatusUpdater.testNMRMConnectionConf compares (kasha: rev 
3a98419532687e4362ffc26abbc1264232820db7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java


> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Trivial
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}
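
Based on the description, the corrected check would presumably compare against 
the NM-to-RM wait limit that the test actually configured (sketch only):
{code}
boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
    (duration < (nmRmConnectionWaitMs + delta));
if (!waitTimeValid) {
  // throw exception if NM doesn't retry long enough
  throw new Exception("NM should have tried re-connecting to RM during " +
      "period of at least " + nmRmConnectionWaitMs + " ms, but " +
      "stopped retrying within " + (nmRmConnectionWaitMs + delta) + " ms: " + e, e);
}
{code}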



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5834:
---
Fix Version/s: 3.0.0-alpha2

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Trivial
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2016-11-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655310#comment-15655310
 ] 

Jian He commented on YARN-5825:
---

Looks good overall.
It looks like this newly added method is not used, so it can be removed:
{code}

  public ReentrantReadWriteLock.WriteLock getWriteLock() {
return writeLock;
  }
{code}

> ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of 
> synchronized block
> --
>
> Key: YARN-5825
> URL: https://issues.apache.org/jira/browse/YARN-5825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5825.0001.patch
>
>
> Currently in PCPP, {{synchronized (curQueue)}} is used in various places. 
> Such instances could be replaced with a read lock. Thank you [~jianhe] for 
> pointing out the same as comment 
> [here|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15626578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15626578]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655304#comment-15655304
 ] 

Karthik Kambatla commented on YARN-5834:


+1. Checking this in..

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Trivial
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5834:
---
Priority: Trivial  (was: Minor)

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Trivial
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4218) Metric for resource*time that was preempted

2016-11-10 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655289#comment-15655289
 ] 

Eric Payne commented on YARN-4218:
--

+1

Thanks [~lichangleo] for the patches and the work done on this JIRA. I will 
commit this.

> Metric for resource*time that was preempted
> ---
>
> Key: YARN-4218
> URL: https://issues.apache.org/jira/browse/YARN-4218
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4218-branch-2.003.patch, YARN-4218.006.patch, 
> YARN-4218.2.patch, YARN-4218.2.patch, YARN-4218.2.patch, YARN-4218.2.patch, 
> YARN-4218.3.patch, YARN-4218.4.patch, YARN-4218.5.patch, 
> YARN-4218.branch-2.2.patch, YARN-4218.branch-2.patch, YARN-4218.patch, 
> YARN-4218.trunk.2.patch, YARN-4218.trunk.3.patch, YARN-4218.trunk.patch, 
> YARN-4218.wip.patch, screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> After YARN-415 we have the ability to track the resource*time footprint of a 
> job, and the preemption metrics show how many containers were preempted for a 
> job. However, we don't have a metric showing the resource*time footprint cost of 
> preemption. In other words, we know how many containers were preempted but we 
> don't have a good measure of how much work was lost as a result of preemption.
> We should add this metric so we can analyze how much work preemption is 
> costing on a grid and better track which jobs were heavily impacted by it. A 
> job that has 100 containers preempted that only lasted a minute each and were 
> very small is going to be less impacted than a job that only lost a single 
> container but that container was huge and had been running for 3 days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5819) Verify fairshare and minshare preemption

2016-11-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655248#comment-15655248
 ] 

Karthik Kambatla commented on YARN-5819:


Since updating {{Resource}} is not atomic, it seemed safer to do 
reads/writes/updates protected by a lock.
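
A minimal sketch of that approach (illustrative only, not the actual patch 
code; it assumes the {{Resources}} helper from 
{{org.apache.hadoop.yarn.util.resource}}):
{code}
// Guard every read/write of the tracker with the same lock, and hand out a
// copy so callers never observe a partially updated Resource.
private final Resource preemptedResources = Resources.createResource(0, 0);

synchronized void addPreemptedResource(Resource delta) {
  Resources.addTo(preemptedResources, delta);
}

synchronized Resource getPreemptedResources() {
  return Resources.clone(preemptedResources);
}
{code}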

> Verify fairshare and minshare preemption
> 
>
> Key: YARN-5819
> URL: https://issues.apache.org/jira/browse/YARN-5819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5819.YARN-4752.1.patch, 
> yarn-5819.YARN-4752.2.patch, yarn-5819.YARN-4752.3.patch, 
> yarn-5819.YARN-4752.4.patch
>
>
> JIRA to track the unit test(s) verifying both fairshare and minshare 
> preemption. The tests should verify:
> # preemption within a single leaf queue
> # preemption between sibling leaf queues
> # preemption between non-sibling leaf queues
> # {{allowPreemption = false}} should disallow preemption from a queue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: YARN-5792-YARN-5355.02.patch

Fixed checkstyle and javadoc issues

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch, 
> YARN-5792-YARN-5355.02.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5819) Verify fairshare and minshare preemption

2016-11-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655170#comment-15655170
 ] 

Daniel Templeton commented on YARN-5819:


Do you need to synchronize {{getPreemptedResources()}}?  Doesn't look like it 
helps to me.  Maybe synchronize and return a copy?

> Verify fairshare and minshare preemption
> 
>
> Key: YARN-5819
> URL: https://issues.apache.org/jira/browse/YARN-5819
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5819.YARN-4752.1.patch, 
> yarn-5819.YARN-4752.2.patch, yarn-5819.YARN-4752.3.patch, 
> yarn-5819.YARN-4752.4.patch
>
>
> JIRA to track the unit test(s) verifying both fairshare and minshare 
> preemption. The tests should verify:
> # preemption within a single leaf queue
> # preemption between sibling leaf queues
> # preemption between non-sibling leaf queues
> # {{allowPreemption = false}} should disallow preemption from a queue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4206) Add life time value in Application report and web UI

2016-11-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655155#comment-15655155
 ] 

Jian He commented on YARN-4206:
---

The remaining time can be inferred from the absolute timeout value, right? If 
so, I don't think we need an additional API in ApplicationClientProtocol to 
get it.
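
For example, on the client side (a sketch, assuming the application report 
exposes the absolute timeout as epoch milliseconds under some field such as 
{{absoluteTimeoutMs}}):
{code}
// Remaining lifetime derived from the absolute timeout, so no extra
// ApplicationClientProtocol call is needed.
long remainingMs = Math.max(0L, absoluteTimeoutMs - System.currentTimeMillis());
{code}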

> Add life time value in Application report and web UI
> 
>
> Key: YARN-4206
> URL: https://issues.apache.org/jira/browse/YARN-4206
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: nijel
>Assignee: nijel
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: (was: YARN-5792-YARN-5355.02.patch)

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: YARN-5792-YARN-5355.02.patch

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch, 
> YARN-5792-YARN-5355.02.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655096#comment-15655096
 ] 

Varun Saxena commented on YARN-5792:


Tests pass locally. Let's see what comes up when the build is invoked again.

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655013#comment-15655013
 ] 

Hadoop QA commented on YARN-5792:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
36s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
14s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
44s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
35s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
5s{color} | {color:green} YARN-5355 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} YARN-5355 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 55s{color} | {color:orange} root: The patch generated 35 new + 1286 
unchanged - 21 fixed = 1321 total (was 1307) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
26s{color} | {color:red} 
hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core 
generated 9 new + 2496 unchanged - 0 fixed = 2505 total (was 2496) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 40m 
53s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 34s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
34s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
56s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 38s{color} 
| {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}282m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShell |
|   | hadoop.mapred.TestMRTimelineEventHandling |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5792 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654874#comment-15654874
 ] 

Miklos Szegedi commented on YARN-5600:
--

The build error seems to be unrelated to the change:
Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.1:exec (npm 
install) on project hadoop-yarn-ui

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3955) Support for priority ACLs in CapacityScheduler

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654866#comment-15654866
 ] 

Hadoop QA commented on YARN-3955:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
38s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  6m  
0s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  4m 
44s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  4m 44s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 52s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 26 new + 365 unchanged - 1 fixed = 391 total (was 366) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
35s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
56s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 18s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.PriorityACLConfiguration.createACLStringPerPriority(HashMap,
 Map) invokes inefficient new String() constructor  At 
PriorityACLConfiguration.java:String() constructor  At 
PriorityACLConfiguration.java:[line 137] |
|  |  Call to StringBuilder.equals(String) in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.PriorityACLConfiguration.createACLStringForPriority(Map,
 Priority, String, PriorityACLConfiguration$PriorityACLConfig)  At 
PriorityACLConfiguration.java:Priority, String, 
PriorityACLConfiguration$PriorityACLConfig)  At 
PriorityACLConfiguration.java:[line 223] |
| Failed junit 

[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654842#comment-15654842
 ] 

Hadoop QA commented on YARN-5600:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  5m 
17s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 
42s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  3m 42s{color} | 
{color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 42s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 577 unchanged - 7 fixed = 577 total (was 584) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
24s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
45s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838405/YARN-5600.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 7c38b041e57b 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 89354f0 |
| Default Java | 1.8.0_111 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/13862/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt
 |
| findbugs | v3.0.0 |
| compile | 

[jira] [Commented] (YARN-5634) Simplify initialization/use of RouterPolicy via a RouterPolicyFacade

2016-11-10 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654839#comment-15654839
 ] 

Carlo Curino commented on YARN-5634:


[~subru] thanks for the prompt feedback.  I addressed most of your points, and 
discuss the rest below.

{{YarnConfiguration}}
* I think having initialization params for the policy will be useful (while the 
current choice of default does not strictly need params, I don't like hardcoding 
a null or empty buffer there, as a change of default should be limited to changes 
in YarnConfiguration).

{{RouterPolicyFacade}}
* I am explicitly avoiding Charset.defaultCharset(), as it can depend on the 
OS/VM configuration, and since serialization and deserialization happen on 
separate machines I want to avoid misaligned defaults, which could mean having to 
redeploy code to fix bugs on a live cluster (for example, if we go from a 
UniformRandomRouterPolicy to a WeightedRandomRouterPolicy and then realize the 
VMs have different default charsets); see the sketch below.
* I don't think the *if* rewrite proposed matches the semantics we need. We need 
to initialize in both cases: when a queue has not been cached before, and when 
the cached copy is different. Am I missing something?
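For illustration, a minimal sketch (not the patch's code) of pinning an explicit charset so that policy params serialized on one machine deserialize identically on another, regardless of the JVM's default charset:
{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public final class PolicyParamCodec {
  // Always use an explicit charset; Charset.defaultCharset() can differ per OS/JVM.
  public static ByteBuffer serialize(String params) {
    return ByteBuffer.wrap(params.getBytes(StandardCharsets.UTF_8));
  }

  public static String deserialize(ByteBuffer buf) {
    ByteBuffer copy = buf.duplicate();        // do not disturb the caller's position
    byte[] bytes = new byte[copy.remaining()];
    copy.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
{code}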

{{TestFederationPolicyFacade}}
 * I am not sure I follow what you are proposing (as it is minor I will ask you 
offline).


> Simplify initialization/use of RouterPolicy via a RouterPolicyFacade 
> -
>
> Key: YARN-5634
> URL: https://issues.apache.org/jira/browse/YARN-5634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: YARN-2915
>Reporter: Carlo Curino
>Assignee: Carlo Curino
>  Labels: oct16-medium
> Attachments: YARN-5634-YARN-2915.01.patch, 
> YARN-5634-YARN-2915.02.patch
>
>
> The current set of policies require some machinery to (re)initialize based on 
> changes in the SubClusterPolicyConfiguration. This JIRA tracks the effort to 
> hide much of that behind a simple RouterPolicyFacade, making lifecycle and 
> usage of the policies easier to consumers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5694) ZKRMStateStore should always start its verification thread to prevent accidental state store corruption

2016-11-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654697#comment-15654697
 ] 

Jian He edited comment on YARN-5694 at 11/10/16 6:09 PM:
-

bq. If we agree that it's bad to have two RMs accidentally sharing the same 
state store, 
In non-HA mode there is currently no protection in the ZKStore preventing two RMs 
from sharing the same store; all of the ACL-setting code is only used in HA mode. 
So, with the current patch, I doubt the verifyThread will ever get a 
NoAuthException unless the user changes the ACLs manually, which means the 
handling code in this patch will not be triggered with the default settings. Maybe 
I'm wrong; you may want to try it on a real cluster. Also, I think setting ACLs 
for the RM is not a required step when deploying a non-HA cluster, so forcing it 
to be set is a behavior change.

bq. why would you not want to catch the issue as early as possible?
My point is, first, whether this code will even work, as discussed above; and 
second, if there is no functional difference, why start a thread that pings ZK 
continuously every few seconds? Of course, I might be missing something; please 
clarify if so.

Also, is the use case mainly about two clusters sharing the same zk-store with 
the same path? IMHO, that is not a primary use case to solve; if the user 
mis-configured it, it is the user's fault. Many other things can go wrong in the 
same way, e.g. two clusters configuring the same path for anything on HDFS.

If the use case is about two RMs sharing the same zk-path in the same cluster in 
non-HA mode, then the invalid RM will not take on workload in the first place: 
clients and NMs will not switch to that RM if HA is not configured properly.


was (Author: jianhe):
bq. If we agree that it's bad to have two RMs accidentally sharing the same 
state store, 
If it's in non-HA mode, currently there's no protection in the ZKStore 
preventing two RMs from sharing the same store. All the ACLs setting related 
code is only used in HA mode. Essentially, with current patch, I doubt it will 
get NoAuthException in the verifyThread, without making user change the ACLs 
manually. So the handling code in this patch will not be triggered with default 
setting. Maybe I'm wrong, you may try on a real cluster..

bq. why would you not want to catch the issue as early as possible?
My point is that first,will this code work as mentioned above. second, if 
there's no difference in terms of functionality, why do I need to start a 
thread pinging the zk continuously every few seconds.  Of course, I might miss 
something, you may clarify more...

Also, is the use-case mainly about two clusters sharing the same zk-store with 
the same path ?  IMHO, this is not a primary use-case to solve, if user 
mis-configured, it's user's fault. There are many other places that can go 
wrong.  e.g. if two clusters configure the same path for anything on HDFS.

If the use-case is about two RMs sharing the same zk-path in the same cluster 
with non-HA mode. I think in non-HA mode, the invalid RM will not take workload 
in the first place, clients, NMs will not switch to that RM if HA is not 
configured properly. 

> ZKRMStateStore should always start its verification thread to prevent 
> accidental state store corruption
> ---
>
> Key: YARN-5694
> URL: https://issues.apache.org/jira/browse/YARN-5694
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
>  Labels: oct16-medium
> Attachments: YARN-5694.001.patch, YARN-5694.002.patch, 
> YARN-5694.003.patch, YARN-5694.004.patch, YARN-5694.004.patch, 
> YARN-5694.005.patch, YARN-5694.006.patch, YARN-5694.007.patch, 
> YARN-5694.branch-2.7.001.patch, YARN-5694.branch-2.7.002.patch
>
>
> There are two cases.  In branch-2.7, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is always started, even when 
> using embedded or Curator failover.  In branch-2.8, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is only started when HA is 
> disabled, which makes no sense.  Based on the JIRA that introduced that 
> change (YARN-4559), I believe the intent was to start it only when embedded 
> failover is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5694) ZKRMStateStore should always start its verification thread to prevent accidental state store corruption

2016-11-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654697#comment-15654697
 ] 

Jian He commented on YARN-5694:
---

bq. If we agree that it's bad to have two RMs accidentally sharing the same 
state store, 
In non-HA mode there is currently no protection in the ZKStore preventing two RMs 
from sharing the same store; all of the ACL-setting code is only used in HA mode. 
So, with the current patch, I doubt the verifyThread will ever get a 
NoAuthException unless the user changes the ACLs manually, which means the 
handling code in this patch will not be triggered with the default settings. Maybe 
I'm wrong; you may want to try it on a real cluster.

bq. why would you not want to catch the issue as early as possible?
My point is, first, whether this code will even work, as discussed above; and 
second, if there is no functional difference, why start a thread that pings ZK 
continuously every few seconds? Of course, I might be missing something; please 
clarify if so.

Also, is the use case mainly about two clusters sharing the same zk-store with 
the same path? IMHO, that is not a primary use case to solve; if the user 
mis-configured it, it is the user's fault. Many other things can go wrong in the 
same way, e.g. two clusters configuring the same path for anything on HDFS.

If the use case is about two RMs sharing the same zk-path in the same cluster in 
non-HA mode, then the invalid RM will not take on workload in the first place: 
clients and NMs will not switch to that RM if HA is not configured properly.

> ZKRMStateStore should always start its verification thread to prevent 
> accidental state store corruption
> ---
>
> Key: YARN-5694
> URL: https://issues.apache.org/jira/browse/YARN-5694
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
>  Labels: oct16-medium
> Attachments: YARN-5694.001.patch, YARN-5694.002.patch, 
> YARN-5694.003.patch, YARN-5694.004.patch, YARN-5694.004.patch, 
> YARN-5694.005.patch, YARN-5694.006.patch, YARN-5694.007.patch, 
> YARN-5694.branch-2.7.001.patch, YARN-5694.branch-2.7.002.patch
>
>
> There are two cases.  In branch-2.7, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is always started, even when 
> using embedded or Curator failover.  In branch-2.8, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is only started when HA is 
> disabled, which makes no sense.  Based on the JIRA that introduced that 
> change (YARN-4559), I believe the intent was to start it only when embedded 
> failover is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654682#comment-15654682
 ] 

Miklos Szegedi commented on YARN-5834:
--

+1 non-binding. Thank you, [~lichangleo]! The change looks good to me.

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Minor
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}
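
(Assuming the intended fix is simply to use nmRmConnectionWaitMs in both places, the corrected check would look roughly like the fragment below; the variable names come from the test quoted above, and this is a sketch rather than the attached patch.)
{code}
long duration = System.currentTimeMillis() - waitStartTime;
boolean waitTimeValid = (duration >= nmRmConnectionWaitMs)
    && (duration < (nmRmConnectionWaitMs + delta));
if (!waitTimeValid) {
  // Throw if the NM did not keep retrying for at least the NM-RM connect wait time.
  throw new Exception("NM should have tried re-connecting to RM during period of at least "
      + nmRmConnectionWaitMs + " ms, but stopped retrying within "
      + (nmRmConnectionWaitMs + delta) + " ms: " + e, e);
}
{code}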



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-10 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.006.patch

Fixing checkstyle issue

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5865) Retrospect updateApplicationPriority api to handle state store exception in align with YARN-5611

2016-11-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5865:
--
Issue Type: Sub-task  (was: Bug)
Parent: YARN-1963

> Retrospect updateApplicationPriority api to handle state store exception in 
> align with YARN-5611
> 
>
> Key: YARN-5865
> URL: https://issues.apache.org/jira/browse/YARN-5865
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5865.0001.patch
>
>
> Post YARN-5611, revisit dynamic update of application priority logic with 
> respect to state store error handling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3955) Support for priority ACLs in CapacityScheduler

2016-11-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3955:
--
Attachment: YARN-3955.0001.patch

Thanks [~jianhe] for the comments.

bq.I think readLock is not needed, the field itself is not changing
{{priorityACLs}} could be changed during reinitialize; do we need to consider 
that case? We may also add REST-based support for adding ACLs at runtime later. If 
reinitialize alone is fine without locking, I could remove the lock now and add it 
back when the REST work is taken up. Thoughts?
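
For what it's worth, a minimal self-contained sketch (class and field names are made up; this is not the patch) of the alternative being weighed: rebuilding an immutable ACL map during reinitialize and publishing it through a volatile reference, so readers need no lock at all.
{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class PriorityAclStore {
  // Readers see either the old map or the new one, never a partially built map,
  // so submission-time ACL checks need no read lock.
  private volatile Map<Integer, String> aclsByPriority = Collections.emptyMap();

  // Called from reinitialize(): build the new map completely, then swap the reference.
  void reload(Map<Integer, String> fresh) {
    aclsByPriority = Collections.unmodifiableMap(new HashMap<>(fresh));
  }

  String aclFor(int priority) {
    return aclsByPriority.get(priority);
  }
}
{code}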

> Support for priority ACLs in CapacityScheduler
> --
>
> Key: YARN-3955
> URL: https://issues.apache.org/jira/browse/YARN-3955
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: ApplicationPriority-ACL.pdf, 
> ApplicationPriority-ACLs-v2.pdf, YARN-3955.0001.patch, YARN-3955.v0.patch, 
> YARN-3955.v1.patch, YARN-3955.wip1.patch
>
>
> Support will be added for User-level access permission to use different 
> application-priorities. This is to avoid situations where all users try 
> running max priority in the cluster and thus degrading the value of 
> priorities.
> Access Control Lists can be set per priority level within each queue. Below 
> is an example configuration that can be added in capacity scheduler 
> configuration
> file for each Queue level.
> yarn.scheduler.capacity.root...acl=user1,user2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654471#comment-15654471
 ] 

Hadoop QA commented on YARN-5825:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 114 unchanged - 0 fixed = 117 total (was 114) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 20s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyIntraQueue
 |
|   | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicy
 |
|   | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyForReservedContainers
 |
|   | 
hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyForNodePartitions
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5825 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838367/YARN-5825.0001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4537a22cc49e 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / ca68f9c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13859/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654405#comment-15654405
 ] 

Jason Lowe commented on YARN-5867:
--

I'm curious how the top-level local directory was deleted in the first place.  
It sounds like an incorrect setup, like tmpwatch or something was coming along 
and blowing away NM directories.  Arbitrary removal of NM directories while it 
is running is going to cause container failures at a minimum.

I'm somewhat torn on this.  Part of me thinks it would be best to treat this 
case like a bad disk, since something _clearly_ is wrong when top-level 
directories go missing out of the blue.  Either admins set up something wrong on 
the cluster or the filesystem is having difficulty persisting data.  Both are 
bad.  Someone should really look into it; otherwise, if we keep silently fixing 
it up after the fact, we just move the issue to debugging mysteriously failing 
containers.  However, I can see the benefit of not forcing an admin to intervene, 
as the node can hobble along automatically (with degraded performance due to 
reruns of mysteriously crashing containers).

If we do go with solution 1, we need to log an error when we detect it.
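
For illustration, a rough sketch of what a solution-1 style repair with loud logging might look like, using plain java.nio; the class and method names here are hypothetical and are not the NM's code.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class LocalDirRepair {
  private static final Logger LOG = LoggerFactory.getLogger(LocalDirRepair.class);
  // 755, set explicitly so the process umask cannot change it.
  private static final Set<PosixFilePermission> EXPECTED =
      PosixFilePermissions.fromString("rwxr-xr-x");

  static void repairLocalDir(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      LOG.error("NM local dir {} disappeared at runtime; recreating it", dir);
      Files.createDirectories(dir);
    }
    if (!Files.getPosixFilePermissions(dir).equals(EXPECTED)) {
      LOG.error("NM local dir {} has wrong permissions; resetting to 755", dir);
      Files.setPosixFilePermissions(dir, EXPECTED);
    }
  }
}
{code}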

> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} 
> to run (simulation using delete)
> # Now check the permission of {{nmlocal dir}} will be *700*
> *Root Cause*
> {{DirectoryCollection#testDirs}} checks as following
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which cause a new Random directory to be create in {{localdir}} using
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
> nmlocal dir to be created with wrong permission. *700*
> Few application fail to container launch due to permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5651) Changes to NMStateStore to persist reinitialization and rollback state

2016-11-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654347#comment-15654347
 ] 

Arun Suresh commented on YARN-5651:
---

[~jianhe], Wondering what the right approach for this is.

Currently, in the normal container startup flow, if the NM recovery happens 
*after* the container start request comes in (the RecoveredContainerStatus == 
REQUESTED) but *before* the container is launched (at which point 
RecoveredContainerStatus == LAUNCHED), the container is just reported back as 
killed. If the Container has been launched and the container is active, then 
the ContainerImpl's internal state is regenerated using the 
StartContainerRequest.

I was thinking, similarly, that when a re-initialization request (re-init / 
restart or rollback) arrives for a container, we just mark it in the stateStore as 
RecoveredContainerStatus == RE_INITIALIZING.
If the NM restarts and recovers before the container has finished 
re-initializing, we report the container as killed.
If the container has completed the relaunch, I propose that we:
# replace the ContainerImpl's internal state (launchContext, ResourceSet, etc.), 
which we already do now, and
# replace the StartContainerRequest object stored in the db with a new 
StartContainerRequest created from the ContainerImpl's internal state.

This way, there is no real need to store the ReInitializeContainerRequest object 
anywhere. Thoughts?
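
To make the proposed recovery behaviour concrete, a tiny self-contained model of just the decision; the enum constant RE_INITIALIZING and the class names are illustrative placeholders, not the existing NM state-store API.
{code}
// Toy model of the recovery decision described above; names are illustrative.
enum RecoveredStatus { REQUESTED, LAUNCHED, RE_INITIALIZING, COMPLETED }

final class RecoveryDecision {
  /** True if the recovered container should simply be reported back as killed. */
  static boolean reportAsKilled(RecoveredStatus status) {
    switch (status) {
      case REQUESTED:          // start request stored, but container never launched
      case RE_INITIALIZING:    // NM went down before re-initialization finished
        return true;
      default:
        return false;          // LAUNCHED containers rebuild state from the stored request
    }
  }
}
{code}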

> Changes to NMStateStore to persist reinitialization and rollback state
> --
>
> Key: YARN-5651
> URL: https://issues.apache.org/jira/browse/YARN-5651
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654321#comment-15654321
 ] 

Varun Saxena commented on YARN-5792:


The patch does the following.

# Uses inverse of container start time to publish id prefix for container 
entities. As container start time is not stored in NM state store, added 
support to add it as well.
# Uses inverse of task start time to publish id prefix for task entities. 
# Uses inverse of task attempt start time to publish id prefix for task attempt 
entities. 
# Uses inverse of DS container start time to publish id prefix for distributed 
shell container entities. 
# Uses inverse of attempt start time to publish id prefix for app attempt 
entities.  We can potentially use  inverse of attempt id bit of 
ApplicationAttemptId as well here. Also app registered time can be used.
# Uses inverse of DS attempt start time to publish id prefix for DS attempt 
entities.  We can potentially use  inverse of attempt id bit of 
ApplicationAttemptId as well here.
# Uses inverse of id bit of job id to publish id prefix for job entities.  We 
can potentially use job start time here.

For the last three points we still need to reach a consensus on what to use (a 
small sketch of the start-time inversion follows below).
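
As a small illustration of the inversion used above, assuming "inverse" here means the usual Long.MAX_VALUE minus the start time so that newer entities get smaller prefixes and sort first (a sketch, not the patch):
{code}
final class IdPrefixUtil {
  // Newer start times map to smaller prefixes, so ascending prefix order lists newest first.
  static long invertedIdPrefix(long startTimeMillis) {
    return Long.MAX_VALUE - startTimeMillis;
  }
}
{code}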

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654312#comment-15654312
 ] 

Bibin A Chundatt commented on YARN-5867:


cc/ [~jlowe] and [~vvasudev] . Could you please share your thoughts too?

> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} 
> to run (simulation using delete)
> # Now check the permission of {{nmlocal dir}} will be *700*
> *Root Cause*
> {{DirectoryCollection#testDirs}} checks as following
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which cause a new Random directory to be create in {{localdir}} using
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
> nmlocal dir to be created with wrong permission. *700*
> Few application fail to container launch due to permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: YARN-5792-YARN-5355.01.patch

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5694) ZKRMStateStore should always start its verification thread to prevent accidental state store corruption

2016-11-10 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654259#comment-15654259
 ] 

Daniel Templeton commented on YARN-5694:


Yes, but if the RM isn't in HA mode, the fencing is quietly ignored, which is 
also something I should address in the next version of this patch.

The reason to have the thread always run is so that we react earlier.  If we 
agree that it's bad to have two RMs accidentally sharing the same state store, 
why would you not want to catch the issue as early as possible?
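
For context, a minimal sketch of what such an always-on verification thread conceptually does; this is not the patch, and the Curator client, fencing path, token, and the onLostOwnership callback are assumptions made only for illustration.
{code}
import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;

class ActiveStatusVerifier {
  private final ScheduledExecutorService exec =
      Executors.newSingleThreadScheduledExecutor();

  void start(CuratorFramework zk, String fencingPath, byte[] myToken,
      Runnable onLostOwnership) {
    exec.scheduleWithFixedDelay(() -> {
      try {
        byte[] current = zk.getData().forPath(fencingPath);
        if (!Arrays.equals(current, myToken)) {
          onLostOwnership.run();   // another RM owns the store: stop using it
        }
      } catch (Exception e) {
        onLostOwnership.run();     // NoAuth or connection loss: assume we may not own it
      }
    }, 0, 5, TimeUnit.SECONDS);
  }
}
{code}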

> ZKRMStateStore should always start its verification thread to prevent 
> accidental state store corruption
> ---
>
> Key: YARN-5694
> URL: https://issues.apache.org/jira/browse/YARN-5694
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
>  Labels: oct16-medium
> Attachments: YARN-5694.001.patch, YARN-5694.002.patch, 
> YARN-5694.003.patch, YARN-5694.004.patch, YARN-5694.004.patch, 
> YARN-5694.005.patch, YARN-5694.006.patch, YARN-5694.007.patch, 
> YARN-5694.branch-2.7.001.patch, YARN-5694.branch-2.7.002.patch
>
>
> There are two cases.  In branch-2.7, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is always started, even when 
> using embedded or Curator failover.  In branch-2.8, the 
> {{ZKRMStateStore.VerifyActiveStatusThread}} is only started when HA is 
> disabled, which makes no sense.  Based on the JIRA that introduced that 
> change (YARN-4559), I believe the intent was to start it only when embedded 
> failover is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653928#comment-15653928
 ] 

Bibin A Chundatt edited comment on YARN-5867 at 11/10/16 2:28 PM:
--

[~naganarasimha...@apache.org]
Not related to appcache. This JIRA is about the permission of the configured nmlocaldir.
{quote}
User with which NM is run ?
{quote}


The umask of the user that the NM is started as is 077.


was (Author: bibinchundatt):
[~naganarasimha...@apache.org]
Not related to appcache.. This i for root directory. configured nmlocaldir.
{quote}
User with which NM is run ?
{quote}
NM started user umask is 077

> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} 
> to run (simulation using delete)
> # Now check the permission of {{nmlocal dir}} will be *700*
> *Root Cause*
> {{DirectoryCollection#testDirs}} checks as following
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which cause a new Random directory to be create in {{localdir}} using
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
> nmlocal dir to be created with wrong permission. *700*
> Few application fail to container launch due to permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2016-11-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654156#comment-15654156
 ] 

Sunil G edited comment on YARN-5825 at 11/10/16 2:19 PM:
-

Attaching an initial version of the patch.

In PCPP#cloneQueues we work with the abstract CSQueue object, so I had to add 
{{getReadLock}} as an API on the CSQueue interface in order to take the lock. I 
agree this is not the cleanest approach, but the alternative is not clean either 
(a self-contained usage sketch follows below).

We would need to add the code below to PCPP if we want to keep {{getReadLock}} 
out of the CSQueue interface
{code}
  private ReentrantReadWriteLock.ReadLock getQueueReadLock(CSQueue curQueue) {
    if (curQueue instanceof ParentQueue) {
      return ((ParentQueue) curQueue).getReadLock();
    } else if (curQueue instanceof LeafQueue) {
      return ((LeafQueue) curQueue).getReadLock();
    }
    return null;
  }
{code}

[~leftnoteasy] [~jianhe] thoughts?
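
For completeness, a self-contained sketch of the usage pattern being discussed, i.e. replacing synchronized (curQueue) with a read lock held only while the queue is copied; the Queue class here is an illustrative stand-in for CSQueue, not real scheduler code.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class QueueSnapshotter {
  // Illustrative stand-in for a CSQueue; the scheduler would hold the write lock while mutating.
  static class Queue {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    long usedCapacity;
    ReentrantReadWriteLock.ReadLock getReadLock() { return lock.readLock(); }
  }

  // What PCPP#cloneQueues would do instead of synchronized (curQueue).
  static long snapshotUsed(Queue curQueue) {
    ReentrantReadWriteLock.ReadLock readLock = curQueue.getReadLock();
    readLock.lock();
    try {
      return curQueue.usedCapacity;   // read a consistent view under the read lock
    } finally {
      readLock.unlock();
    }
  }
}
{code}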


was (Author: sunilg):
Attaching an initial version of patch.

In PCPP#cloneQueues, we use CSQueue abstract object. So I had to add 
{{getReadLock}} as an api in CSQueue interface to do lock. I agree this is not 
so clean way, however the alternative way is also not clean.

We might need to add below code in PCPP if we want to remove {{getReadQueue}} 
from CSQueue interface
{code}
  private ReentrantReadWriteLock.ReadLock getQueueReadLock(CSQueue curQueue) {
if (curQueue instanceof ParentQueue) {
  return ((ParentQueue) curQueue).getReadLock();
} else if (curQueue instanceof LeafQueue) {
  return ((LeafQueue) curQueue).getReadLock();
}
return null;
  }
{code}

[~leftnoteasy] thoughts?

> ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of 
> synchronized block
> --
>
> Key: YARN-5825
> URL: https://issues.apache.org/jira/browse/YARN-5825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5825.0001.patch
>
>
> Currently in PCPP, {{synchronized (curQueue)}} is used in various places. 
> Such instances could be replaced with a read lock. Thank you [~jianhe] for 
> pointing out the same as comment 
> [here|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15626578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15626578]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2016-11-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5825:
--
Attachment: YARN-5825.0001.patch

Attaching an initial version of the patch.

In PCPP#cloneQueues we work with the abstract CSQueue object, so I had to add 
{{getReadLock}} as an API on the CSQueue interface in order to take the lock. I 
agree this is not the cleanest approach, but the alternative is not clean either.

We would need to add the code below to PCPP if we want to keep {{getReadLock}} 
out of the CSQueue interface
{code}
  private ReentrantReadWriteLock.ReadLock getQueueReadLock(CSQueue curQueue) {
    if (curQueue instanceof ParentQueue) {
      return ((ParentQueue) curQueue).getReadLock();
    } else if (curQueue instanceof LeafQueue) {
      return ((LeafQueue) curQueue).getReadLock();
    }
    return null;
  }
{code}

[~leftnoteasy] thoughts?

> ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of 
> synchronized block
> --
>
> Key: YARN-5825
> URL: https://issues.apache.org/jira/browse/YARN-5825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5825.0001.patch
>
>
> Currently in PCPP, {{synchronized (curQueue)}} is used in various places. 
> Such instances could be replaced with a read lock. Thank you [~jianhe] for 
> pointing out the same as comment 
> [here|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15626578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15626578]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: (was: YARN-5792-YARN-5355.01.patch)

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5792) adopt the id prefix for YARN, MR, and DS entities

2016-11-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-5792:
---
Attachment: YARN-5792-YARN-5355.01.patch

> adopt the id prefix for YARN, MR, and DS entities
> -
>
> Key: YARN-5792
> URL: https://issues.apache.org/jira/browse/YARN-5792
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-5792-YARN-5355.01.patch
>
>
> We introduced the entity id prefix to support flexible entity sorting 
> (YARN-5715). We should adopt the id prefix for YARN entities, MR entities, 
> and DS entities to take advantage of the id prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5865) Retrospect updateApplicationPriority api to handle state store exception in align with YARN-5611

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654119#comment-15654119
 ] 

Hadoop QA commented on YARN-5865:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 41m  
5s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5865 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838358/YARN-5865.0001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 38af88531a3c 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / ca68f9c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13858/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13858/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Retrospect updateApplicationPriority api to handle state store exception in 
> align with YARN-5611
> 
>
> Key: YARN-5865
> URL: https://issues.apache.org/jira/browse/YARN-5865
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: 

[jira] [Updated] (YARN-5865) Retrospect updateApplicationPriority api to handle state store exception in align with YARN-5611

2016-11-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5865:
--
Attachment: YARN-5865.0001.patch

Updating an initial version of the patch. cc/[~rohithsharma] and [~jianhe]

> Retrospect updateApplicationPriority api to handle state store exception in 
> align with YARN-5611
> 
>
> Key: YARN-5865
> URL: https://issues.apache.org/jira/browse/YARN-5865
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5865.0001.patch
>
>
> Post YARN-5611, revisit dynamic update of application priority logic with 
> respect to state store error handling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653928#comment-15653928
 ] 

Bibin A Chundatt commented on YARN-5867:


[~naganarasimha...@apache.org]
Not related to appcache. This is about the root directory, i.e. the configured nmlocaldir.
{quote}
User with which NM is run ?
{quote}
The umask of the user that the NM is started as is 077.

> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} 
> to run (simulation using delete)
> # Now check the permission of {{nmlocal dir}} will be *700*
> *Root Cause*
> {{DirectoryCollection#testDirs}} checks as following
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which cause a new Random directory to be create in {{localdir}} using
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
> nmlocal dir to be created with wrong permission. *700*
> Few application fail to container launch due to permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-5867:
---
Description: 
Steps to reproduce
===
# Set umask to 077 for user
# Start nodemanager with nmlocal dir configured
nmlocal dir permission is *755* 

{{LocalDirsHandlerService#serviceInit}}

{code} 
FsPermission perm = new FsPermission((short)0755);
boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
{code}
# After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} to 
run (simulation using delete)
# Now check the permission of {{nmlocal dir}} will be *700*

*Root Cause*

{{DirectoryCollection#testDirs}} checks as following

{code}
// create a random dir to make sure fs isn't in read-only mode
verifyDirUsingMkdir(testDir);
{code}

which causes a new random directory to be created in {{localdir}} via
{{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, recreating the 
nmlocal dir with the wrong permission, *700*.

A few applications then fail at container launch with permission denied.

  was:
Steps to reproduce
===
# Set umask to 077 for user
# Start nodemanager with nmlocal dir configured
nmlocal dir permission is *755* 

{{LocalDirsHandlerService#serviceInit}}

{code} 
FsPermission perm = new FsPermission((short)0755);
boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
{code}
# After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} to 
run (simulation using delete)
# Now check the permission of {{nmlocal dir}} will be *750*

*Root Cause*

{{DirectoryCollection#testDirs}} checks as following

{code}
// create a random dir to make sure fs isn't in read-only mode
verifyDirUsingMkdir(testDir);
{code}

which cause a new Random directory to be create in {{localdir}} using
{{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
nmlocal dir to be created with wrong permission. *750*

Few application fail to container launch due to permission denied.


> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After  startup delete the nmlocal dir and wait for {{MonitoringTimerTask}} 
> to run (simulation using delete)
> # Now check the permission of {{nmlocal dir}} will be *700*
> *Root Cause*
> {{DirectoryCollection#testDirs}} checks as following
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which cause a new Random directory to be create in {{localdir}} using
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}} causing the 
> nmlocal dir to be created with wrong permission. *700*
> Few application fail to container launch due to permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-11-10 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653916#comment-15653916
 ] 

Sunil G commented on YARN-5545:
---

+1

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-medium
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, YARN-5545.0005.patch, YARN-5545.0006.patch, 
> YARN-5545.0007.patch, YARN-5545.0008.patch, YARN-5545.004.patch, 
> capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
>   ... 25 more
> {noformat}
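For context on the rejection above: the CapacityScheduler derives a queue's application limit from its configured capacity, so {{root.default.capacity=0}} yields a limit of 0 in the default partition even though the queue holds 50% of {{labelx}}. A back-of-envelope sketch (the 10000 default for {{yarn.scheduler.capacity.maximum-applications}} is the stock value; treating the limit as a simple product is my approximation, not the exact scheduler code):

{code}
// Hypothetical illustration, not CapacityScheduler source.
int maximumSystemApplications = 10000;   // yarn.scheduler.capacity.maximum-applications
float defaultPartitionCapacity = 0.0f;   // yarn.scheduler.capacity.root.default.capacity = 0
int maxApplications = (int) (maximumSystemApplications * defaultPartitionCapacity);
// maxApplications == 0, so even the first submission is rejected with the
// AccessControlException shown in the stack trace above.
{code}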



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653904#comment-15653904
 ] 

Naganarasimha G R commented on YARN-5867:
-

I think this is related to YARN-5765 and YARN-5287, which also deal with restricted 
rights for the user. I am not sure which user you are referring to here: the user 
the NM is run as?

> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
> # Now check the permission of {{nmlocal dir}}: it will be *750*
> *Root Cause*
> {{DirectoryCollection#testDirs}} does the following check
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which causes a new random directory to be created inside {{localdir}} via
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
> nmlocal dir is re-created with the wrong permission, *750*.
> A few applications then fail container launch with permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-5867:
---
Description: 
Steps to reproduce
===
# Set umask to 077 for user
# Start nodemanager with nmlocal dir configured
nmlocal dir permission is *755* 

{{LocalDirsHandlerService#serviceInit}}

{code} 
FsPermission perm = new FsPermission((short)0755);
boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
{code}
# After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
# Now check the permission of {{nmlocal dir}}: it will be *750*

*Root Cause*

{{DirectoryCollection#testDirs}} does the following check

{code}
// create a random dir to make sure fs isn't in read-only mode
verifyDirUsingMkdir(testDir);
{code}

which causes a new random directory to be created inside {{localdir}} via
{{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
nmlocal dir is re-created with the wrong permission, *750*.

A few applications then fail container launch with permission denied.

  was:
Steps to reproduce
===
# Set umask to 027 for user
# Start nodemanager with nmlocal dir configured
nmlocal dir permission is *755* 

{{LocalDirsHandlerService#serviceInit}}

{code} 
FsPermission perm = new FsPermission((short)0755);
boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
{code}
# After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
# Now check the permission of {{nmlocal dir}}: it will be *750*

*Root Cause*

{{DirectoryCollection#testDirs}} does the following check

{code}
// create a random dir to make sure fs isn't in read-only mode
verifyDirUsingMkdir(testDir);
{code}

which causes a new random directory to be created inside {{localdir}} via
{{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
nmlocal dir is re-created with the wrong permission, *750*.

A few applications then fail container launch with permission denied.


> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 077 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
> # Now check the permission of {{nmlocal dir}}: it will be *750*
> *Root Cause*
> {{DirectoryCollection#testDirs}} does the following check
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which causes a new random directory to be created inside {{localdir}} via
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
> nmlocal dir is re-created with the wrong permission, *750*.
> A few applications then fail container launch with permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-11-10 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653872#comment-15653872
 ] 

Naganarasimha G R commented on YARN-5545:
-

Thanks [~bibinchundatt]. +1, the latest patch looks good to me; if there are no 
further comments, I will commit it later today.

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-medium
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, YARN-5545.0005.patch, YARN-5545.0006.patch, 
> YARN-5545.0007.patch, YARN-5545.0008.patch, YARN-5545.004.patch, 
> capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
>   ... 25 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653868#comment-15653868
 ] 

Bibin A Chundatt commented on YARN-5867:


*Solution*
# Before {{testDirs()}} checks all dirs, we can check for and re-create any missing localdir with *0755* permission
# Create the random test dir only if the localdir already exists, so that a missing local dir is reported as bad.

In my opinion we should go with *Solution 1*, since it makes the NM auto-recoverable. Thoughts? A rough sketch of that approach is below.
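A rough sketch of what Solution 1 could look like (my own illustration against the {{FileContext}} API, not an actual patch): before {{testDirs()}} runs, re-create any missing local dir with the explicit *0755* permission, mirroring what {{createNonExistentDirs}} does at startup, so the dir is never re-created implicitly under the process umask.

{code}
// Sketch only; assumes localDirs is the configured list of NM local dirs.
// Imports: org.apache.hadoop.fs.FileContext, org.apache.hadoop.fs.Path,
// org.apache.hadoop.fs.permission.FsPermission (exception handling omitted).
FileContext localFs = FileContext.getLocalFSFileContext();
FsPermission perm = new FsPermission((short) 0755);
for (String dir : localDirs) {
  Path p = new Path(dir);
  if (!localFs.util().exists(p)) {
    // Re-create with the intended permission instead of letting
    // DiskChecker#mkdirsWithExistsCheck create it under the process umask.
    localFs.mkdir(p, perm, true);
  }
}
{code}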


> DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir
> ---
>
> Key: YARN-5867
> URL: https://issues.apache.org/jira/browse/YARN-5867
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Steps to reproduce
> ===
> # Set umask to 027 for user
> # Start nodemanager with nmlocal dir configured
> nmlocal dir permission is *755* 
> {{LocalDirsHandlerService#serviceInit}}
> {code} 
> FsPermission perm = new FsPermission((short)0755);
> boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
> createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
> {code}
> # After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
> # Now check the permission of {{nmlocal dir}}: it will be *750*
> *Root Cause*
> {{DirectoryCollection#testDirs}} does the following check
> {code}
> // create a random dir to make sure fs isn't in read-only mode
> verifyDirUsingMkdir(testDir);
> {code}
> which causes a new random directory to be created inside {{localdir}} via
> {{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
> nmlocal dir is re-created with the wrong permission, *750*.
> A few applications then fail container launch with permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5867) DirectoryCollection#checkDirs can cause incorrect permission of nmlocal dir

2016-11-10 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-5867:
--

 Summary: DirectoryCollection#checkDirs can cause incorrect 
permission of nmlocal dir
 Key: YARN-5867
 URL: https://issues.apache.org/jira/browse/YARN-5867
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt


Steps to reproduce
===
# Set umask to 027 for user
# Start nodemanager with nmlocal dir configured
nmlocal dir permission is *755* 

{{LocalDirsHandlerService#serviceInit}}

{code} 
FsPermission perm = new FsPermission((short)0755);
boolean createSucceeded = localDirs.createNonExistentDirs(localFs, perm);
createSucceeded &= logDirs.createNonExistentDirs(localFs, perm);
{code}
# After startup, delete the nmlocal dir and wait for {{MonitoringTimerTask}} to run (simulated using delete)
# Now check the permission of {{nmlocal dir}}: it will be *750*

*Root Cause*

{{DirectoryCollection#testDirs}} does the following check

{code}
// create a random dir to make sure fs isn't in read-only mode
verifyDirUsingMkdir(testDir);
{code}

which causes a new random directory to be created inside {{localdir}} via
{{DiskChecker.checkDir(dir)}} -> {{!mkdirsWithExistsCheck(dir)}}, so the deleted
nmlocal dir is re-created with the wrong permission, *750*.

A few applications then fail container launch with permission denied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4218) Metric for resource*time that was preempted

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653752#comment-15653752
 ] 

Hadoop QA commented on YARN-4218:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
12s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
53s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
29s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
42s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
56s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
8s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 10 new + 922 unchanged - 10 fixed = 932 total (was 932) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
20s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_101
 with JDK v1.8.0_101 generated 1 new + 924 unchanged - 0 fixed = 925 total (was 
924) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-yarn-api in the patch passed with JDK 
v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} 

[jira] [Commented] (YARN-4218) Metric for resource*time that was preempted

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653631#comment-15653631
 ] 

Hadoop QA commented on YARN-4218:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
50s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  1s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 10 new + 915 unchanged - 10 fixed = 925 total (was 925) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
31s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 924 unchanged - 0 fixed = 925 total (was 924) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
35s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
22s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 40m 
39s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
12s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-4218 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838311/YARN-4218.006.patch |
| 

[jira] [Commented] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653530#comment-15653530
 ] 

Hadoop QA commented on YARN-5834:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} branch-2 passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed 
with JDK v1.7.0_111. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:b59b8b7 |
| JIRA Issue | YARN-5834 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838319/YARN-5834-branch-2.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 87018bab2daf 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2 / f7b2542 |
| Default Java | 1.7.0_111 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111 |
| findbugs | v3.0.0 |
| JDK v1.7.0_111  Test 

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653440#comment-15653440
 ] 

Hadoop QA commented on YARN-5545:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 42m 
39s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e809691 |
| JIRA Issue | YARN-5545 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838310/YARN-5545.0008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3e8ae235e1f0 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c8bc7a8 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13854/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13854/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: oct16-medium
> 

[jira] [Commented] (YARN-5453) FairScheduler#update may skip update demand resource of child queue/app if current demand reached maxResource

2016-11-10 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653433#comment-15653433
 ] 

sandflee commented on YARN-5453:


Thanks [~kasha] for the review and commit!

> FairScheduler#update may skip update demand resource of child queue/app if 
> current demand reached maxResource
> -
>
> Key: YARN-5453
> URL: https://issues.apache.org/jira/browse/YARN-5453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-easy
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5453.01.patch, YARN-5453.02.patch, 
> YARN-5453.03.patch, YARN-5453.04.patch, YARN-5453.05.patch
>
>
> {code}
>   demand = Resources.createResource(0);
>   for (FSQueue childQueue : childQueues) {
> childQueue.updateDemand();
> Resource toAdd = childQueue.getDemand();
> demand = Resources.add(demand, toAdd);
> demand = Resources.componentwiseMin(demand, maxRes);
> if (Resources.equals(demand, maxRes)) {
>   break;
> }
>   }
> {code}
> if one single queue's demand resource exceeds maxRes, the other queues' demand 
> resources will not be updated.
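One possible shape for the fix, sketched against the loop quoted above (an illustration only, not necessarily what the committed patch does): keep calling {{updateDemand()}} on every child and only cap the aggregate the parent reports.

{code}
// Sketch: every child still gets its demand updated; only the parent's
// aggregated demand is capped at maxRes.
demand = Resources.createResource(0);
for (FSQueue childQueue : childQueues) {
  childQueue.updateDemand();
  if (!Resources.equals(demand, maxRes)) {
    demand = Resources.add(demand, childQueue.getDemand());
    demand = Resources.componentwiseMin(demand, maxRes);
  }
}
{code}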



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653391#comment-15653391
 ] 

Chang Li commented on YARN-5834:


Thanks for reporting. Yes, it's meant to be nmRmConnectionWaitMs. Providing a 
branch-2 patch since this test does not exist in trunk.
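For reference, a sketch of what the corrected check would look like (variable names are taken from the test snippet quoted below; whether the branch-2 patch is written exactly this way is an assumption):

{code}
// Both bounds compare against the NM->RM connection wait time that the test
// actually configures via NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS.
long duration = System.currentTimeMillis() - waitStartTime;
boolean waitTimeValid = (duration >= nmRmConnectionWaitMs)
    && (duration < (nmRmConnectionWaitMs + delta));
if (!waitTimeValid) {
  // throw exception if NM doesn't retry long enough
  throw new Exception("NM should have tried re-connecting to RM during period of at least "
      + nmRmConnectionWaitMs + " ms, but stopped retrying within "
      + (nmRmConnectionWaitMs + delta) + " ms: " + e, e);
}
{code}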

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Minor
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-5834:
---
Attachment: YARN-5834-branch-2.001.patch

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Minor
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5834) TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time to the incorrect value

2016-11-10 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li reassigned YARN-5834:
--

Assignee: Chang Li

> TestNodeStatusUpdater.testNMRMConnectionConf compares nodemanager wait time 
> to the incorrect value
> --
>
> Key: YARN-5834
> URL: https://issues.apache.org/jira/browse/YARN-5834
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Chang Li
>Priority: Minor
> Attachments: YARN-5834-branch-2.001.patch
>
>
> The function is TestNodeStatusUpdater#testNMRMConnectionConf()
> I believe the connectionWaitMs references below were meant to be 
> nmRmConnectionWaitMs.
> {code}
> conf.setLong(YarnConfiguration.NM_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> nmRmConnectionWaitMs);
> conf.setLong(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
> connectionWaitMs);
> ...
>   long t = System.currentTimeMillis();
>   long duration = t - waitStartTime;
>   boolean waitTimeValid = (duration >= nmRmConnectionWaitMs) &&
>   (duration < (*connectionWaitMs* + delta));
>   if(!waitTimeValid) {
> // throw exception if NM doesn't retry long enough
> throw new Exception("NM should have tried re-connecting to RM during 
> " +
>   "period of at least " + *connectionWaitMs* + " ms, but " +
>   "stopped retrying within " + (*connectionWaitMs* + delta) +
>   " ms: " + e, e);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5453) FairScheduler#update may skip update demand resource of child queue/app if current demand reached maxResource

2016-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653367#comment-15653367
 ] 

Hudson commented on YARN-5453:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10811 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10811/])
YARN-5453. FairScheduler#update may skip update demand resource of child 
(kasha: rev 86ac1ad9fd65c7dd12278372b369de38dc4616db)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> FairScheduler#update may skip update demand resource of child queue/app if 
> current demand reached maxResource
> -
>
> Key: YARN-5453
> URL: https://issues.apache.org/jira/browse/YARN-5453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-easy
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5453.01.patch, YARN-5453.02.patch, 
> YARN-5453.03.patch, YARN-5453.04.patch, YARN-5453.05.patch
>
>
> {code}
>   demand = Resources.createResource(0);
>   for (FSQueue childQueue : childQueues) {
> childQueue.updateDemand();
> Resource toAdd = childQueue.getDemand();
> demand = Resources.add(demand, toAdd);
> demand = Resources.componentwiseMin(demand, maxRes);
> if (Resources.equals(demand, maxRes)) {
>   break;
> }
>   }
> {code}
> if one single queue's demand resource exceeds maxRes, the other queues' demand 
> resources will not be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org