[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2019-02-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770017#comment-16770017
 ] 

Prabhu Joseph commented on YARN-8132:
-

Hi [~haibochen], can you review the patch for this jira when you get time?
Below is the analysis.

When the job is finished and has succeeded, the current attempt exists and 
provides the right FinalApplicationStatus.
If it has failed or been killed, the current attempt does not exist, so the 
FinalApplicationStatus is derived from the current state, which is not a 
final state (finished / failed / killed), and therefore becomes UNDEFINED. 
The fix derives the FinalApplicationStatus from the final state the 
application will transition to.

Final state Finished maps to FinalApplicationStatus SUCCEEDED or FAILED.
Final states Failed / Killed map to the same FinalApplicationStatus values.
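
For illustration, a minimal sketch of that mapping (the class and method 
names here are assumed for the example, not taken from the patch):
{code:java}
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppState;

final class FinalStatusSketch {
  // Derive the final status from the final RMAppState the application will
  // transition to, rather than from the current (non-final) state, so
  // KILLED/FAILED apps stop reporting UNDEFINED.
  static FinalApplicationStatus fromFinalState(RMAppState finalState) {
    switch (finalState) {
      case FINISHED:
        // Here the current attempt exists and supplies the real value,
        // SUCCEEDED or FAILED; SUCCEEDED is only this sketch's placeholder.
        return FinalApplicationStatus.SUCCEEDED;
      case FAILED:
        return FinalApplicationStatus.FAILED;
      case KILLED:
        return FinalApplicationStatus.KILLED;
      default:
        return FinalApplicationStatus.UNDEFINED;
    }
  }
}
{code}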

> Final Status of applications shown as UNDEFINED in ATS app queries
> --
>
> Key: YARN-8132
> URL: https://issues.apache.org/jira/browse/YARN-8132
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineservice
>Reporter: Charan Hebri
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-8132-001.patch, YARN-8132-002.patch
>
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A 
> sample request/response with the INFO field for an application:
> {noformat}
> 2018-04-09 13:10:02,126 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1693)) - Received URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user 
> hrt_qa
> 2018-04-09 13:10:02,156 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1716)) - Processed URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 
> ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
> "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
> "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application 
> application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
> "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
> "YARN_APPLICATION_NAME": "Sleep job",
> "YARN_APPLICATION_USER": "hrt_qa",
> "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
> "FROM_ID": 
> "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
> "UID": "yarn-cluster!application_1523259757659_0003",
> "YARN_APPLICATION_VIEW_ACLS": " ",
> "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
> "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>   "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog 
> -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr "
> ],
> "YARN_APPLICATION_QUEUE": "default",
> "YARN_APPLICATION_TYPE": "MAPREDUCE",
> "YARN_APPLICATION_PRIORITY": 0,
> "YARN_APPLICATION_LATEST_APP_ATTEMPT": 
> "appattempt_1523259757659_0003_01",
> "YARN_APPLICATION_TAGS": [
>   "timeline_flow_name_tag:test_flow"
> ],
> "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This is different from what the Resource Manager reports. For KILLED 
> applications the final status is KILLED and for FAILED applications it is 
> FAILED. This behavior is seen in ATSv2 as well as in older versions of ATS. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2019-02-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-8132:

Attachment: YARN-8132-002.patch

> Final Status of applications shown as UNDEFINED in ATS app queries
> --
>
> Key: YARN-8132
> URL: https://issues.apache.org/jira/browse/YARN-8132
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineservice
>Reporter: Charan Hebri
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-8132-001.patch, YARN-8132-002.patch
>
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A 
> sample request/response with the INFO field for an application:
> {noformat}
> 2018-04-09 13:10:02,126 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1693)) - Received URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user 
> hrt_qa
> 2018-04-09 13:10:02,156 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1716)) - Processed URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 
> ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
> "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
> "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application 
> application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
> "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
> "YARN_APPLICATION_NAME": "Sleep job",
> "YARN_APPLICATION_USER": "hrt_qa",
> "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
> "FROM_ID": 
> "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
> "UID": "yarn-cluster!application_1523259757659_0003",
> "YARN_APPLICATION_VIEW_ACLS": " ",
> "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
> "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>   "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog 
> -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr "
> ],
> "YARN_APPLICATION_QUEUE": "default",
> "YARN_APPLICATION_TYPE": "MAPREDUCE",
> "YARN_APPLICATION_PRIORITY": 0,
> "YARN_APPLICATION_LATEST_APP_ATTEMPT": 
> "appattempt_1523259757659_0003_01",
> "YARN_APPLICATION_TAGS": [
>   "timeline_flow_name_tag:test_flow"
> ],
> "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This is different from what the Resource Manager reports. For KILLED 
> applications the final status is KILLED and for FAILED applications it is 
> FAILED. This behavior is seen in ATSv2 as well as in older versions of ATS. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7021) TestResourceUtils to be moved to hadoop-yarn-api package

2019-02-15 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin reassigned YARN-7021:
-

Assignee: Zhaohui Xin

> TestResourceUtils to be moved to hadoop-yarn-api package
> 
>
> Key: YARN-7021
> URL: https://issues.apache.org/jira/browse/YARN-7021
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: YARN-3926
>Reporter: Sunil Govindan
>Assignee: Zhaohui Xin
>Priority: Major
>
> The ResourceUtils class is now in yarn-api. It would be better to move its 
> test class there as well; however, these tests use a lot of resources and rely 
> on ConfigurationProvider, which is available only in yarn-common. Hence, 
> investigate and improve the tests for the ResourceUtils class.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6971) Clean up different ways to create resources

2019-02-15 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin reassigned YARN-6971:
-

Assignee: (was: Zhaohui Xin)

> Clean up different ways to create resources
> ---
>
> Key: YARN-6971
> URL: https://issues.apache.org/jira/browse/YARN-6971
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Yufei Gu
>Priority: Minor
>  Labels: newbie
>
> There are several ways to create a {{resource}} object, e.g., 
> BuilderUtils.newResource() and Resources.createResource(). These methods not 
> only cause confusion but also performance issues; for example, 
> BuilderUtils.newResource() is significantly slower than 
> Resources.createResource(). 
> We could merge them somehow, and replace most calls to 
> BuilderUtils.newResource() with Resources.createResource().
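>
> For illustration, a minimal sketch of the intended replacement (the wrapper 
> class and values are made up for the example):
> {code:java}
> import org.apache.hadoop.yarn.api.records.Resource;
> import org.apache.hadoop.yarn.server.utils.BuilderUtils;
> import org.apache.hadoop.yarn.util.resource.Resources;
>
> public class ResourceCreationSketch {
>   public static void main(String[] args) {
>     // Existing call site: the record-factory based helper (slower).
>     Resource before = BuilderUtils.newResource(1024, 1);
>     // Proposed replacement: the lightweight helper (faster).
>     Resource after = Resources.createResource(1024, 1);
>     System.out.println(before.equals(after)); // same resource value
>   }
> }
> {code}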



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6971) Clean up different ways to create resources

2019-02-15 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin reassigned YARN-6971:
-

Assignee: Zhaohui Xin

> Clean up different ways to create resources
> ---
>
> Key: YARN-6971
> URL: https://issues.apache.org/jira/browse/YARN-6971
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Yufei Gu
>Assignee: Zhaohui Xin
>Priority: Minor
>  Labels: newbie
>
> There are several ways to create a {{resource}} object, e.g., 
> BuilderUtils.newResource() and Resources.createResource(). These methods not 
> only cause confusion but also performance issues; for example, 
> BuilderUtils.newResource() is significantly slower than 
> Resources.createResource(). 
> We could merge them somehow, and replace most calls to 
> BuilderUtils.newResource() with Resources.createResource().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7518) Node manager should allow resource units to be lower cased

2019-02-15 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin reassigned YARN-7518:
-

Assignee: Zhaohui Xin

> Node manager should allow resource units to be lower cased
> --
>
> Key: YARN-7518
> URL: https://issues.apache.org/jira/browse/YARN-7518
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-beta1, 3.1.0
>Reporter: Daniel Templeton
>Assignee: Zhaohui Xin
>Priority: Major
>
> When we check resource units, we should ignore case.
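>
> A minimal sketch of the idea (the helper name is assumed, not the actual 
> patch):
> {code:java}
> final class UnitsCheckSketch {
>   // Compare resource units case-insensitively, so e.g. "Mi" and "mi" are
>   // accepted as the same unit during validation.
>   static boolean unitsMatch(String configured, String requested) {
>     return configured != null && configured.equalsIgnoreCase(requested);
>   }
> }
> {code}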



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6611) ResourceTypes should be renamed

2019-02-15 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin reassigned YARN-6611:
-

Assignee: Zhaohui Xin

> ResourceTypes should be renamed
> ---
>
> Key: YARN-6611
> URL: https://issues.apache.org/jira/browse/YARN-6611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: YARN-3926
>Reporter: Daniel Templeton
>Assignee: Zhaohui Xin
>Priority: Major
>
> {{ResourceTypes}} is too close to the unrelated {{ResourceType}} class.  
> Maybe {{ResourceClass}} would be better?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8927) Support trust top-level image like "centos" when "library" is configured in "docker.trusted.registries"

2019-02-15 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769904#comment-16769904
 ] 

Zhankun Tang commented on YARN-8927:


[~eyang], thanks for the review!

[~ebadger], I see. That makes sense to me. Thanks!

> Support trust top-level image like "centos" when "library" is configured in 
> "docker.trusted.registries"
> ---
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
> Fix For: 3.3.0
>
> Attachments: YARN-8927-trunk.001.patch, YARN-8927-trunk.002.patch
>
>
> There are some missing cases that we need to catch when handling 
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works if we run DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu" 
> or "ubuntu[:tagName]" fails.
> The error message is like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769684#comment-16769684
 ] 

Hadoop QA commented on YARN-8132:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 35s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 148 unchanged - 0 fixed = 149 total (was 148) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 42s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}148m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8132 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958916/YARN-8132-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 829db1819c70 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d10444e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/23421/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/23421/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-9296) [Timeline Server] FinalStatus is displayed wrong for killed and failed applications

2019-02-15 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769545#comment-16769545
 ] 

Prabhu Joseph commented on YARN-9296:
-

[~Nallasivan] Thanks for reporting. This looks like a duplicate of YARN-8132. 

> [Timeline Server] FinalStatus is displayed wrong for killed and failed 
> applications
> ---
>
> Key: YARN-9296
> URL: https://issues.apache.org/jira/browse/YARN-9296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Prabhu Joseph
>Priority: Minor
>
> In Timeline Server (1.5), the FinalStatus of applications that are killed or 
> failed is displayed as UNDEFINED in both the GUI and the REST API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2019-02-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-8132:

Attachment: YARN-8132-001.patch

> Final Status of applications shown as UNDEFINED in ATS app queries
> --
>
> Key: YARN-8132
> URL: https://issues.apache.org/jira/browse/YARN-8132
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineservice
>Reporter: Charan Hebri
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-8132-001.patch
>
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A 
> sample request/response with the INFO field for an application:
> {noformat}
> 2018-04-09 13:10:02,126 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1693)) - Received URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user 
> hrt_qa
> 2018-04-09 13:10:02,156 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1716)) - Processed URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 
> ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
> "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
> "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application 
> application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
> "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
> "YARN_APPLICATION_NAME": "Sleep job",
> "YARN_APPLICATION_USER": "hrt_qa",
> "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
> "FROM_ID": 
> "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
> "UID": "yarn-cluster!application_1523259757659_0003",
> "YARN_APPLICATION_VIEW_ACLS": " ",
> "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
> "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>   "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog 
> -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr "
> ],
> "YARN_APPLICATION_QUEUE": "default",
> "YARN_APPLICATION_TYPE": "MAPREDUCE",
> "YARN_APPLICATION_PRIORITY": 0,
> "YARN_APPLICATION_LATEST_APP_ATTEMPT": 
> "appattempt_1523259757659_0003_01",
> "YARN_APPLICATION_TAGS": [
>   "timeline_flow_name_tag:test_flow"
> ],
> "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This is different from what the Resource Manager reports. For KILLED 
> applications the final status is KILLED and for FAILED applications it is 
> FAILED. This behavior is seen in ATSv2 as well as in older versions of ATS. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8927) Support trust top-level image like "centos" when "library" is configured in "docker.trusted.registries"

2019-02-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769483#comment-16769483
 ] 

Hudson commented on YARN-8927:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15977 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15977/])
YARN-8927. Added support for top level Dockerhub images to trusted (eyang: rev 
7c1b561e334f32cc0b5011fc52c47e0758fd47a9)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c


> Support trust top-level image like "centos" when "library" is configured in 
> "docker.trusted.registries"
> ---
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
> Fix For: 3.3.0
>
> Attachments: YARN-8927-trunk.001.patch, YARN-8927-trunk.002.patch
>
>
> There are some missing cases that we need to catch when handling 
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works if we run DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu" 
> or "ubuntu[:tagName]" fails.
> The error message is like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8927) Support trust top-level image like "centos" when "library" is configured in "docker.trusted.registries"

2019-02-15 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769473#comment-16769473
 ] 

Eric Yang edited comment on YARN-8927 at 2/15/19 4:20 PM:
--

Thank you [~tangzhankun] for the patch.
Thank you [~ebadger] for the review.

I committed this to trunk.


was (Author: eyang):
Thank you [~tangzhankun] for the patch.
Thank you [~ebadger] for the review.

> Support trust top-level image like "centos" when "library" is configured in 
> "docker.trusted.registries"
> ---
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
> Fix For: 3.3.0
>
> Attachments: YARN-8927-trunk.001.patch, YARN-8927-trunk.002.patch
>
>
> There are some missing cases that we need to catch when handling 
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works if we run DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu" 
> or "ubuntu[:tagName]" fails.
> The error message is like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769455#comment-16769455
 ] 

Hadoop QA commented on YARN-9266:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
55s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 55s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 54 unchanged - 94 fixed = 55 total (was 148) 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
33s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  3m 
25s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 58s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958904/YARN-9266-006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8697e7047f9b 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e0fe3d1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/23420/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 

[jira] [Commented] (YARN-8927) Support trust top-level image like "centos" when "library" is configured in "docker.trusted.registries"

2019-02-15 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769435#comment-16769435
 ] 

Eric Yang commented on YARN-8927:
-

[~tangzhankun] A tag cannot contain '/'.  I was referring to the "docker tag" 
command including '/' in the repository name.  Valid usages are:
{code}docker tag centos:latest private/centos:latest{code}

or

{code}docker tag tensorflow/tensorflow:latest tensorflow:latest{code}

If an admin runs the second command, the tensorflow image becomes trusted when 
the "library" keyword is given.

> Support trust top-level image like "centos" when "library" is configured in 
> "docker.trusted.registries"
> ---
>
> Key: YARN-8927
> URL: https://issues.apache.org/jira/browse/YARN-8927
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8927-trunk.001.patch, YARN-8927-trunk.002.patch
>
>
> There are some missing cases that we need to catch when handling 
> "docker.trusted.registries".
> The container-executor.cfg configuration is as follows:
> {code:java}
> docker.trusted.registries=tangzhankun,ubuntu,centos{code}
> It works if we run DistributedShell with "tangzhankun/tensorflow":
> {code:java}
> "yarn ... -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=tangzhankun/tensorflow
> {code}
> But running a DistributedShell job with "centos", "centos[:tagName]", "ubuntu" 
> or "ubuntu[:tagName]" fails.
> The error message is like:
> {code:java}
> "image: centos is not trusted"
> {code}
> We need to handle the above cases better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9266:
---
Attachment: YARN-9266-006.patch

> Various fixes are needed in IntelFpgaOpenclPlugin
> -
>
> Key: YARN-9266
> URL: https://issues.apache.org/jira/browse/YARN-9266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9266-001.patch, YARN-9266-002.patch, 
> YARN-9266-003.patch, YARN-9266-004.patch, YARN-9266-005.patch, 
> YARN-9266-006.patch
>
>
> Problems identified in this class:
>  * {{InnerShellExecutor}} ignores the timeout parameter
>  * {{configureIP()}} uses printStackTrace() instead of logging
>  * {{configureIP()}} does not log the output of aocl if the exit code != 0
>  * {{parseDiagnoseInfo()}} is too heavyweight – it should be in its own class 
> for better testability
>  * {{downloadIP()}} uses {{contains()}} for the file name check – this can 
> really surprise users in some cases (e.g. you want to use hello.aocx but 
> hello2.aocx also matches; see the sketch after this list)
>  * the method name {{downloadIP()}} is misleading – it actually tries to find 
> the file. Everything is downloaded (localized) at this point.
>  * {{@VisibleForTesting}} methods should be package private
>  * {{aliasMap}} is not needed - store the acl number in the {{FpgaDevice}} 
> class
>  * checkstyle fixes
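>
> For the {{contains()}} file name check above, a minimal sketch of the 
> exact-match idea (the helper name is assumed, not the actual patch):
> {code:java}
> import java.io.File;
>
> final class IpFileMatcherSketch {
>   // Match the .aocx file by exact name instead of contains(), so hello.aocx
>   // no longer matches hello2.aocx.
>   static File findIpFile(File dir, String fileName) {
>     File candidate = new File(dir, fileName);
>     return candidate.isFile() ? candidate : null;
>   }
> }
> {code}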



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769392#comment-16769392
 ] 

Peter Bacsko commented on YARN-9266:


Test failure is unrelated, see https://issues.apache.org/jira/browse/YARN-7145

Will address the remaining checkstyle problem.

> Various fixes are needed in IntelFpgaOpenclPlugin
> -
>
> Key: YARN-9266
> URL: https://issues.apache.org/jira/browse/YARN-9266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9266-001.patch, YARN-9266-002.patch, 
> YARN-9266-003.patch, YARN-9266-004.patch, YARN-9266-005.patch
>
>
> Problems identified in this class:
>  * {{InnerShellExecutor}} ignores the timeout parameter
>  * {{configureIP()}} uses printStackTrace() instead of logging
>  * {{configureIP()}} does not log the output of aocl if the exit code != 0
>  * {{parseDiagnoseInfo()}} is too heavyweight – it should be in its own class 
> for better testability
>  * {{downloadIP()}} uses {{contains()}} for the file name check – this can 
> really surprise users in some cases (e.g. you want to use hello.aocx but 
> hello2.aocx also matches)
>  * the method name {{downloadIP()}} is misleading – it actually tries to find 
> the file. Everything is downloaded (localized) at this point.
>  * {{@VisibleForTesting}} methods should be package private
>  * {{aliasMap}} is not needed - store the acl number in the {{FpgaDevice}} 
> class
>  * checkstyle fixes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769383#comment-16769383
 ] 

Hadoop QA commented on YARN-9266:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 20s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 54 unchanged - 94 fixed = 55 total (was 148) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 35s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9266 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958890/YARN-9266-005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux deaa2b634a56 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9385ec4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/23419/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit | 

[jira] [Assigned] (YARN-9296) [Timeline Server] FinalStatus is displayed wrong for killed and failed applications

2019-02-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-9296:
---

Assignee: Prabhu Joseph

> [Timeline Server] FinalStatus is displayed wrong for killed and failed 
> applications
> ---
>
> Key: YARN-9296
> URL: https://issues.apache.org/jira/browse/YARN-9296
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Prabhu Joseph
>Priority: Minor
>
> In Timeline Server (1.5), the FinalStatus of applications that are killed or 
> failed is displayed as UNDEFINED in both the GUI and the REST API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown

2019-02-15 Thread Antal Bálint Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769357#comment-16769357
 ] 

Antal Bálint Steinbach commented on YARN-9235:
--

Hi [~pbacsko], [~sunilg],

Yeah, that would be great, but unfortunately it is not so easy. This class is 
not written to be testable.

_GpuDiscoverer.getInstance().getGpuDeviceInformation()_ will throw an exception 
before we reach the code we would like to test. [~snemeth] and I have some 
patches available to address this issue. I would not repeat the same change 
(adding GpuDiscoverer as a dependency of the class) in a third patch for this.

YARN-9217, for example, has a test that covers this method. Unfortunately, this 
issue is blocked by [~snemeth]'s other pending commits (YARN-9118, YARN-9213), 
because they conflict badly.

I would recommend either submitting those first and then I merge my issues, or 
submitting this without a test and I resolve the problem in the other 
mentioned issues.

> If linux container executor is not set for a GPU cluster 
> GpuResourceHandlerImpl is not initialized and NPE is thrown
> 
>
> Key: YARN-9235
> URL: https://issues.apache.org/jira/browse/YARN-9235
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Major
> Attachments: YARN-9235.001.patch
>
>
> If the GPU plugin is enabled for the NodeManager, it is possible to run jobs 
> with GPUs.
> However, if LinuxContainerExecutor is not configured, an NPE is thrown when 
> calling 
> {code:java}
> GpuResourcePlugin.getNMResourceInfo{code}
> Also, there are no warnings in the log if GPU is misconfigured like this. 
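>
> A minimal sketch of a possible guard (the field and helper names below are 
> placeholders, not the actual patch):
> {code:java}
> // Warn and return an empty payload instead of throwing an NPE when the GPU
> // resource handler was never initialized because LinuxContainerExecutor is
> // not configured.
> public synchronized NMResourceInfo getNMResourceInfo() {
>   if (gpuResourceHandler == null) {
>     LOG.warn("GPU plugin is enabled but LinuxContainerExecutor is not"
>         + " configured; GPU resources were never initialized.");
>     return new NMResourceInfo();
>   }
>   return createGpuResourceInfo(gpuResourceHandler);
> }
> {code}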



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries

2019-02-15 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-8132:
---

Assignee: Prabhu Joseph

> Final Status of applications shown as UNDEFINED in ATS app queries
> --
>
> Key: YARN-8132
> URL: https://issues.apache.org/jira/browse/YARN-8132
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineservice
>Reporter: Charan Hebri
>Assignee: Prabhu Joseph
>Priority: Major
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A 
> sample request/response with the INFO field for an application:
> {noformat}
> 2018-04-09 13:10:02,126 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1693)) - Received URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user 
> hrt_qa
> 2018-04-09 13:10:02,156 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1716)) - Processed URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 
> ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
> "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
> "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application 
> application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
> "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
> "YARN_APPLICATION_NAME": "Sleep job",
> "YARN_APPLICATION_USER": "hrt_qa",
> "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
> "FROM_ID": 
> "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
> "UID": "yarn-cluster!application_1523259757659_0003",
> "YARN_APPLICATION_VIEW_ACLS": " ",
> "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
> "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>   "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog 
> -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr "
> ],
> "YARN_APPLICATION_QUEUE": "default",
> "YARN_APPLICATION_TYPE": "MAPREDUCE",
> "YARN_APPLICATION_PRIORITY": 0,
> "YARN_APPLICATION_LATEST_APP_ATTEMPT": 
> "appattempt_1523259757659_0003_01",
> "YARN_APPLICATION_TAGS": [
>   "timeline_flow_name_tag:test_flow"
> ],
> "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This is different from what the Resource Manager reports. For KILLED 
> applications the final status is KILLED and for FAILED applications it is 
> FAILED. This behavior is seen in ATSv2 as well as in older versions of ATS. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769346#comment-16769346
 ] 

Hadoop QA commented on YARN-9233:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 305 unchanged - 1 fixed = 306 total (was 306) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 26s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9233 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958866/YARN-9233-003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c1d3dac7ee2c 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9385ec4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/23417/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 

[jira] [Updated] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9266:
---
Attachment: YARN-9266-005.patch

> Various fixes are needed in IntelFpgaOpenclPlugin
> -
>
> Key: YARN-9266
> URL: https://issues.apache.org/jira/browse/YARN-9266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9266-001.patch, YARN-9266-002.patch, 
> YARN-9266-003.patch, YARN-9266-004.patch, YARN-9266-005.patch
>
>
> Problems identified in this class:
>  * {{InnerShellExecutor}} ignores the timeout parameter
>  * {{configureIP()}} uses printStackTrace() instead of logging
>  * {{configureIP()}} does not log the output of aocl if the exit code != 0
>  * {{parseDiagnoseInfo()}} is too heavyweight – it should be in its own class 
> for better testability
>  * {{downloadIP()}} uses {{contains()}} for the file name check – this can 
> really surprise users in some cases (e.g. you want to use hello.aocx but 
> hello2.aocx also matches)
>  * the method name {{downloadIP()}} is misleading – it actually tries to find 
> the file. Everything is downloaded (localized) at this point.
>  * {{@VisibleForTesting}} methods should be package private
>  * {{aliasMap}} is not needed - store the acl number in the {{FpgaDevice}} 
> class
>  * checkstyle fixes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9286) [Timeline Server] Sorting based on FinalStatus throws pop-up message

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769281#comment-16769281
 ] 

Bilwa S T commented on YARN-9286:
-

!image-2019-02-15-18-16-21-804.png!

> [Timeline Server] Sorting based on FinalStatus throws pop-up message
> 
>
> Key: YARN-9286
> URL: https://issues.apache.org/jira/browse/YARN-9286
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-9286-001.patch, image-2019-02-15-18-16-21-804.png
>
>
> In the Timeline Server GUI, sorting the details based on FinalStatus displays a 
> pop-up window, and any further operation that involves refreshing the page 
> results in the same pop-up window.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9286) [Timeline Server] Sorting based on FinalStatus throws pop-up message

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9286:

Attachment: image-2019-02-15-18-16-21-804.png

> [Timeline Server] Sorting based on FinalStatus throws pop-up message
> 
>
> Key: YARN-9286
> URL: https://issues.apache.org/jira/browse/YARN-9286
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-9286-001.patch, image-2019-02-15-18-16-21-804.png
>
>
> In the Timeline Server GUI, sorting the details based on FinalStatus displays a 
> pop-up window, and any further operation that involves refreshing the page 
> results in the same pop-up window.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9286) [Timeline Server] Sorting based on FinalStatus throws pop-up message

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769286#comment-16769286
 ] 

Hadoop QA commented on YARN-9286:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-9286 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9286 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23418/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [Timeline Server] Sorting based on FinalStatus throws pop-up message
> 
>
> Key: YARN-9286
> URL: https://issues.apache.org/jira/browse/YARN-9286
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-9286-001.patch, image-2019-02-15-18-16-21-804.png
>
>
> In the Timeline Server GUI, sorting the details based on FinalStatus displays a 
> pop-up window, and any further operation that involves refreshing the page 
> results in the same pop-up window.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9286) [Timeline Server] Sorting based on FinalStatus throws pop-up message

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769281#comment-16769281
 ] 

Bilwa S T edited comment on YARN-9286 at 2/15/19 12:48 PM:
---

!image-2019-02-15-18-16-21-804.png!

Proof attached. No pop-up appears when the page is refreshed or when FinalStatus is clicked.


was (Author: bilwast):
!image-2019-02-15-18-16-21-804.png!

> [Timeline Server] Sorting based on FinalStatus throws pop-up message
> 
>
> Key: YARN-9286
> URL: https://issues.apache.org/jira/browse/YARN-9286
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Reporter: Nallasivan
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-9286-001.patch, image-2019-02-15-18-16-21-804.png
>
>
> In the Timeline Server GUI, sorting the details based on FinalStatus displays a 
> pop-up window, and any further operation that involves refreshing the page 
> results in the same pop-up window.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9309) Improvise graphs in SLS as values displayed in graph are overlapping

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769275#comment-16769275
 ] 

Bilwa S T commented on YARN-9309:
-

cc [~bibinchundatt]

> Improvise graphs in SLS as values displayed in graph are overlapping
> 
>
> Key: YARN-9309
> URL: https://issues.apache.org/jira/browse/YARN-9309
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9309-001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9309) Improvise graphs in SLS as values displayed in graph are overlapping

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9309:

Attachment: YARN-9309-001.patch

> Improvise graphs in SLS as values displayed in graph are overlapping
> 
>
> Key: YARN-9309
> URL: https://issues.apache.org/jira/browse/YARN-9309
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9309-001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769209#comment-16769209
 ] 

Bilwa S T commented on YARN-9233:
-

Thanks [~rohithsharma] for reviewing!
{quote}This is a Spark AM issue. I skip the event unless the container was 
either acquired by the AM or is the master container. I have attached a patch 
for the same. Please review.
{quote}
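
A minimal, self-contained sketch of that guard (illustrative only, not the actual YARN-9233 patch; the Container class, the acquiredByAM flag, and masterContainerId below are stand-ins for the RM's internal types):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the guard described above, not the real patch.
class JustFinishedContainersSketch {

  static class Container {
    final String id;
    final boolean acquiredByAM; // true once the AM has pulled this allocation
    Container(String id, boolean acquiredByAM) {
      this.id = id;
      this.acquiredByAM = acquiredByAM;
    }
  }

  private final String masterContainerId = "container_001";
  private final List<Container> justFinishedContainers = new ArrayList<>();

  // Report a finished container to the AM only if the AM acquired it, or if
  // it is the AM's own master container; otherwise skip the event so the AM
  // never sees (and never re-requests resources for) a container it never
  // acquired.
  void onContainerFinished(Container c) {
    if (c.acquiredByAM || c.id.equals(masterContainerId)) {
      justFinishedContainers.add(c);
    }
  }
}
{code}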

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9233:

Attachment: YARN-9233-003.patch

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch, 
> YARN-9233-003.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769198#comment-16769198
 ] 

Adam Antal commented on YARN-9283:
--

Thanks [~ajisakaa] for the commit, and [~snemeth] for the review.

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769167#comment-16769167
 ] 

Bilwa S T edited comment on YARN-9233 at 2/15/19 10:22 AM:
---

Thanks [~rohithsharma] for reviewing
{quote} This is Spark AM issue. I think skipping event is a better 
solution. I have attached a patch for it. Please review
{quote}
 

 


was (Author: bilwast):
Thanks [~rohithsharma] for reviewing
{quote} \{quota}This is Spark AM issue. I think skipping event is a better 
solution. I have attached a patch for it. Please review
{quote}
 

 

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9233:

Attachment: (was: YARN-9233-003.patch)

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769167#comment-16769167
 ] 

Bilwa S T commented on YARN-9233:
-

Thanks [~rohithsharma] for reviewing!

This is a Spark AM issue. I think skipping the event is a better solution. I 
have attached a patch for it. Please review.

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9233:

Comment: was deleted

(was: Thanks [~rohithsharma] for reviewing
{quote}This is Spark AM issue. I think skipping event is a better solution. I 
have attached a patch for it. Please review
{quote}
 

 )

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9233:

Attachment: YARN-9233-003.patch

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch, 
> YARN-9233-003.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769167#comment-16769167
 ] 

Bilwa S T edited comment on YARN-9233 at 2/15/19 10:23 AM:
---

Thanks [~rohithsharma] for reviewing
{quote}This is Spark AM issue. I think skipping event is a better solution. I 
have attached a patch for it. Please review
{quote}
 

 


was (Author: bilwast):
Thanks [~rohithsharma] for reviewing
{quote} This is Spark AM issue. I think skipping event is a better 
solution. I have attached a patch for it. Please review
{quote}
 

 

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-02-15 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769167#comment-16769167
 ] 

Bilwa S T edited comment on YARN-9233 at 2/15/19 10:22 AM:
---

Thanks [~rohithsharma] for reviewing
{quote} \{quota}This is Spark AM issue. I think skipping event is a better 
solution. I have attached a patch for it. Please review
{quote}
 

 


was (Author: bilwast):
Thanks [~rohithsharma] for reviewing

    This is Spark AM issue. I think skipping event is a better 
solution. I have attached a patch for it. Please review

 

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9233-001.patch, YARN-9233-002.patch
>
>
> After the RM kills an allocated (Allocated state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM does not consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. Therefore 
> a container that was never obtained by the AM and was killed by the RM is 
> also returned through the AM heartbeat. The AM then re-applies for more 
> resources than needed, which can eventually cause the number of containers to 
> exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-4327) RM can not renew TIMELINE_DELEGATION_TOKEN in secure clusters

2019-02-15 Thread Yeliang Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769138#comment-16769138
 ] 

Yeliang Cang edited comment on YARN-4327 at 2/15/19 9:58 AM:
-

[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!
In the TimelineServer log, there are messages like these:
{code}
2019-02-15 17:22:00,276 WARN 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler: 
'Authorization' does not start with 'Negotiate' :  
administrator:SdzV4wj1ufh3+X1PgIQXj7ld9gc=
2019-02-15 17:22:00,303 WARN 
org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
Authentication exception: GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos credentails)
{code}
[~zjshen], could you help me with this question?
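
The first WARN above indicates the client sent an Authorization header that is not SPNEGO ("Negotiate"), so the Kerberos handler rejected it. As a quick sanity check, a request made through hadoop-auth's AuthenticatedURL should carry a "Negotiate" header automatically; a minimal sketch, assuming a valid Kerberos TGT (kinit) and a placeholder timeline server address:

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;

// Minimal SPNEGO client sketch; the host and port below are placeholders.
public class TimelineSpnegoCheck {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://timeline.example.com:8188/ws/v1/timeline");
    AuthenticatedURL.Token token = new AuthenticatedURL.Token();
    HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}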



was (Author: cyl):
[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!
[~zjshen], could you help me with this question?


> RM can not renew  TIMELINE_DELEGATION_TOKEN in secure clusters
> --
>
> Key: YARN-4327
> URL: https://issues.apache.org/jira/browse/YARN-4327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, security, timelineserver
>Affects Versions: 2.7.1
> Environment: hadoop 2.7.1; hdfs, yarn, mrhistoryserver, and ATS all use 
> kerberos security.
> conf like this:
> <property>
>   <name>hadoop.security.authorization</name>
>   <value>true</value>
>   <description>Is service-level authorization enabled?</description>
> </property>
> <property>
>   <name>hadoop.security.authentication</name>
>   <value>kerberos</value>
>   <description>Possible values are simple (no authentication), and kerberos</description>
> </property>
>Reporter: zhangshilong
>Priority: Major
>
> bin hadoop 2.7.1
> ATS conf like this:
> <property>
>   <name>yarn.timeline-service.http-authentication.type</name>
>   <value>simple</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
>   <value>HTTP/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.principal</name>
>   <value>xxx/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.best-effort</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.timeline-service.enabled</name>
>   <value>true</value>
> </property>
> I'd like to allow everyone to access ATS over HTTP, as with the RM and HDFS.
> The client can submit a job to the RM and add a TIMELINE_DELEGATION_TOKEN to 
> the AM context, but the RM cannot renew the TIMELINE_DELEGATION_TOKEN, which 
> causes the application to fail.
> RM logs:
> 2015-11-03 11:58:38,191 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
> Service: 10.12.38.4:8188, Ident: (owner=yarn-test, renewer=yarn-test, 
> realUser=, issueDate=1446523118046, maxDate=1447127918046, sequenceNumber=9, 
> masterKeyId=2)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:439)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:847)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:828)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: HTTP status [500], message [Null user]
> at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:169)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:287)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:212)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:414)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:396)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:378)
> at java.security.AccessController.doPrivileged(Native Method)
> at 

[jira] [Commented] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769159#comment-16769159
 ] 

Adam Antal commented on YARN-9266:
--

Thanks for the response, [~pbacsko]. I agree with your points.

> Various fixes are needed in IntelFpgaOpenclPlugin
> -
>
> Key: YARN-9266
> URL: https://issues.apache.org/jira/browse/YARN-9266
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9266-001.patch, YARN-9266-002.patch, 
> YARN-9266-003.patch, YARN-9266-004.patch
>
>
> Problems identified in this class:
>  * {{InnerShellExecutor}} ignores the timeout parameter
>  * {{configureIP()}} uses printStackTrace() instead of logging
>  * {{configureIP()}} does not log the output of aocl if the exit code != 0
>  * {{parseDiagnoseInfo()}} is too heavyweight – it should be in its own class 
> for better testability
>  * {{downloadIP()}} uses {{contains()}} for the file name check – this can really 
> surprise users in some cases (e.g. you want to use hello.aocx but hello2.aocx 
> also matches)
>  * method name {{downloadIP()}} is misleading – it actually tries to find 
> the file. Everything is downloaded (localized) at this point.
>  * {{@VisibleForTesting}} methods should be package private
>  * {{aliasMap}} is not needed - store the acl number in the {{FpgaDevice}} 
> class
>  * checkstyle fixes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769158#comment-16769158
 ] 

Szilard Nemeth commented on YARN-9213:
--

Hi [~sunilg]!
Now the Jenkins result looks good as well. :)

> RM Web UI v1 does not show custom resource allocations for containers page
> --
>
> Key: YARN-9213
> URL: https://issues.apache.org/jira/browse/YARN-9213
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: Screen Shot 2019-02-08 at 21.16.37-before.png, Screen 
> Shot 2019-02-09 at 9.55.16-after.png, YARN-9213.001.patch, 
> YARN-9213.002.patch, YARN-9213.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7824) [UI2] Yarn Component Instance page should include link to container logs

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769144#comment-16769144
 ] 

Hadoop QA commented on YARN-7824:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
10s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
32m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 35s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:63396be |
| JIRA Issue | YARN-7824 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958837/YARN-7824-branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux a22957942d36 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / b4dc62a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 416 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23416/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Yarn Component Instance page should include link to container logs
> 
>
> Key: YARN-7824
> URL: https://issues.apache.org/jira/browse/YARN-7824
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Yesha Vora
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7824-branch-3.2.001.patch, YARN-7824.001.patch
>
>
> Steps:
> 1) Launch the Httpd example
> 2) Visit the component instance page for httpd-proxy-0
> This page has information regarding the httpd-proxy-0 component.
> This page should also include a link to the container logs for this component.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-9283:

Labels: newbie  (was: newbie newbie++)

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie
> Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 2.9.3, 3.1.3
>
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769154#comment-16769154
 ] 

Hudson commented on YARN-9283:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15973 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15973/])
YARN-9283. Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has 
(aajisaka: rev 9385ec45d75109a2e6565faa10527cc56637bf5f)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java


> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769142#comment-16769142
 ] 

Hadoop QA commented on YARN-9213:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
35s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9213 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958836/YARN-9213.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 55a535776cc7 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 75e15cc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23415/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23415/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RM Web UI v1 does not show custom resource allocations for containers 

[jira] [Comment Edited] (YARN-4327) RM can not renew TIMELINE_DELEGATION_TOKEN in secure clusters

2019-02-15 Thread Yeliang Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769138#comment-16769138
 ] 

Yeliang Cang edited comment on YARN-4327 at 2/15/19 9:44 AM:
-

[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!
[~zjshen], could you help me with this question?



was (Author: cyl):
[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!
[~zjshen], could you take a look at this problem?

> RM can not renew  TIMELINE_DELEGATION_TOKEN in secure clusters
> --
>
> Key: YARN-4327
> URL: https://issues.apache.org/jira/browse/YARN-4327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, security, timelineserver
>Affects Versions: 2.7.1
> Environment: hadoop 2.7.1; hdfs, yarn, mrhistoryserver, and ATS all use 
> kerberos security.
> conf like this:
> <property>
>   <name>hadoop.security.authorization</name>
>   <value>true</value>
>   <description>Is service-level authorization enabled?</description>
> </property>
> <property>
>   <name>hadoop.security.authentication</name>
>   <value>kerberos</value>
>   <description>Possible values are simple (no authentication), and kerberos</description>
> </property>
>Reporter: zhangshilong
>Priority: Major
>
> bin hadoop 2.7.1
> ATS conf like this:
> <property>
>   <name>yarn.timeline-service.http-authentication.type</name>
>   <value>simple</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
>   <value>HTTP/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.principal</name>
>   <value>xxx/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.best-effort</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.timeline-service.enabled</name>
>   <value>true</value>
> </property>
> I'd like to allow everyone to access ATS over HTTP, as with the RM and HDFS.
> The client can submit a job to the RM and add a TIMELINE_DELEGATION_TOKEN to 
> the AM context, but the RM cannot renew the TIMELINE_DELEGATION_TOKEN, which 
> causes the application to fail.
> RM logs:
> 2015-11-03 11:58:38,191 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
> Service: 10.12.38.4:8188, Ident: (owner=yarn-test, renewer=yarn-test, 
> realUser=, issueDate=1446523118046, maxDate=1447127918046, sequenceNumber=9, 
> masterKeyId=2)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:439)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:847)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:828)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: HTTP status [500], message [Null user]
> at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:169)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:287)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:212)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:414)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:396)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:378)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:451)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:183)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:466)

[jira] [Comment Edited] (YARN-4327) RM can not renew TIMELINE_DELEGATION_TOKEN in secure clusters

2019-02-15 Thread Yeliang Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769138#comment-16769138
 ] 

Yeliang Cang edited comment on YARN-4327 at 2/15/19 9:42 AM:
-

[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!
[~zjshen], could you take a look at this problem?


was (Author: cyl):
[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!

> RM can not renew  TIMELINE_DELEGATION_TOKEN in secure clusters
> --
>
> Key: YARN-4327
> URL: https://issues.apache.org/jira/browse/YARN-4327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, security, timelineserver
>Affects Versions: 2.7.1
> Environment: hadoop 2.7.1; hdfs, yarn, mrhistoryserver, and ATS all use 
> kerberos security.
> conf like this:
> <property>
>   <name>hadoop.security.authorization</name>
>   <value>true</value>
>   <description>Is service-level authorization enabled?</description>
> </property>
> <property>
>   <name>hadoop.security.authentication</name>
>   <value>kerberos</value>
>   <description>Possible values are simple (no authentication), and kerberos</description>
> </property>
>Reporter: zhangshilong
>Priority: Major
>
> bin hadoop 2.7.1
> ATS conf like this:
> <property>
>   <name>yarn.timeline-service.http-authentication.type</name>
>   <value>simple</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
>   <value>HTTP/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.principal</name>
>   <value>xxx/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.best-effort</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.timeline-service.enabled</name>
>   <value>true</value>
> </property>
> I'd like to allow everyone to access ATS over HTTP, as with the RM and HDFS.
> The client can submit a job to the RM and add a TIMELINE_DELEGATION_TOKEN to 
> the AM context, but the RM cannot renew the TIMELINE_DELEGATION_TOKEN, which 
> causes the application to fail.
> RM logs:
> 2015-11-03 11:58:38,191 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
> Service: 10.12.38.4:8188, Ident: (owner=yarn-test, renewer=yarn-test, 
> realUser=, issueDate=1446523118046, maxDate=1447127918046, sequenceNumber=9, 
> masterKeyId=2)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:439)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:847)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:828)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: HTTP status [500], message [Null user]
> at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:169)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:287)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:212)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:414)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:396)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:378)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:451)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:183)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:466)
> at 
> 

[jira] [Commented] (YARN-4327) RM can not renew TIMELINE_DELEGATION_TOKEN in secure clusters

2019-02-15 Thread Yeliang Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769138#comment-16769138
 ] 

Yeliang Cang commented on YARN-4327:


[~basha.sh...@gmail.com], [~linou518], [~zsl2007], hi guys! Have you solved 
this problem? I have run into the same issue!

> RM can not renew  TIMELINE_DELEGATION_TOKEN in secure clusters
> --
>
> Key: YARN-4327
> URL: https://issues.apache.org/jira/browse/YARN-4327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, security, timelineserver
>Affects Versions: 2.7.1
> Environment: hadoop 2.7.1; hdfs, yarn, mrhistoryserver, and ATS all use 
> kerberos security.
> conf like this:
> <property>
>   <name>hadoop.security.authorization</name>
>   <value>true</value>
>   <description>Is service-level authorization enabled?</description>
> </property>
> <property>
>   <name>hadoop.security.authentication</name>
>   <value>kerberos</value>
>   <description>Possible values are simple (no authentication), and kerberos</description>
> </property>
>Reporter: zhangshilong
>Priority: Major
>
> bin hadoop 2.7.1
> ATS conf like this:
> <property>
>   <name>yarn.timeline-service.http-authentication.type</name>
>   <value>simple</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.principal</name>
>   <value>HTTP/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.principal</name>
>   <value>xxx/_h...@xxx.com</value>
> </property>
> <property>
>   <name>yarn.timeline-service.keytab</name>
>   <value>/etc/hadoop/keytabs/xxx.keytab</value>
> </property>
> <property>
>   <name>yarn.timeline-service.best-effort</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.timeline-service.enabled</name>
>   <value>true</value>
> </property>
> I'd like to allow everyone to access ATS over HTTP, as with the RM and HDFS.
> The client can submit a job to the RM and add a TIMELINE_DELEGATION_TOKEN to 
> the AM context, but the RM cannot renew the TIMELINE_DELEGATION_TOKEN, which 
> causes the application to fail.
> RM logs:
> 2015-11-03 11:58:38,191 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer:
>  Unable to add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
> Service: 10.12.38.4:8188, Ident: (owner=yarn-test, renewer=yarn-test, 
> realUser=, issueDate=1446523118046, maxDate=1447127918046, sequenceNumber=9, 
> masterKeyId=2)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:439)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:847)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:828)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: HTTP status [500], message [Null user]
> at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:169)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:287)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:212)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:414)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:396)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$3.run(TimelineClientImpl.java:378)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:451)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:183)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:466)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.renewDelegationToken(TimelineClientImpl.java:400)
> at 
> org.apache.hadoop.yarn.security.client.TimelineDelegationTokenIdentifier$Renewer.renew(TimelineDelegationTokenIdentifier.java:81)
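For context, a minimal sketch of the renew path the stack trace walks through 
(the class below is illustrative; Token#renew and the Renewer dispatch are the 
real Hadoop API, everything else is an assumption):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineTokenRenewSketch {
  // Token#renew looks up the renewer registered for the token kind, which for
  // TIMELINE_DELEGATION_TOKEN is TimelineDelegationTokenIdentifier.Renewer;
  // that renewer issues an HTTP renewDelegationToken request against ATS.
  public static long renewTimelineToken(Token<?> timelineToken)
      throws Exception {
    Configuration conf = new YarnConfiguration();
    // With yarn.timeline-service.http-authentication.type=simple, the renew
    // call presumably reaches the server without an authenticated principal,
    // which matches the HTTP 500 "Null user" response logged above.
    return timelineToken.renew(conf);
  }
}
{code}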

[jira] [Updated] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-9283:

Component/s: (was: yarn)
 documentation
 Issue Type: Bug  (was: Improvement)

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769130#comment-16769130
 ] 

Hadoop QA commented on YARN-9283:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
34s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9283 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958831/YARN-9283.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f6bd5c26085d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 75e15cc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23414/testReport/ |
| Max. process+thread count | 423 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23414/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |

[jira] [Commented] (YARN-9294) Potential race condition in setting GPU cgroups & execute command in the selected cgroup

2019-02-15 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769118#comment-16769118
 ] 

Zhankun Tang commented on YARN-9294:


[~oliverhuh...@gmail.com], do you mean that the current way of updating the 
cgroups in container-executor ("fprintf" actually) is not working in RHEL 7?

> Potential race condition in setting GPU cgroups & execute command in the 
> selected cgroup
> 
>
> Key: YARN-9294
> URL: https://issues.apache.org/jira/browse/YARN-9294
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Keqiu Hu
>Assignee: Keqiu Hu
>Priority: Critical
>
> Environment is latest branch-2 head
> OS: RHEL 7.4
> *Observation*
> Out of ~10 container allocations with GPU requirement, at least 1 of the 
> allocated containers would lose GPU isolation. Even if I asked for 1 GPU, I 
> could still have visibility to all GPUs on the same machine when running 
> nvidia-smi.
> The funny thing is that even though I have visibility to all GPUs at the 
> moment container-executor runs (say ordinals 0,1,2,3), cgroups jails the 
> process's access down to only the single requested GPU after some time. 
> The underlying process trying to access the GPUs takes the initial 
> information as the source of truth and tries to access physical GPU 0, which 
> is not really available to the process. This results in a 
> [CUDA_ERROR_INVALID_DEVICE: invalid device ordinal] error.
> Validated the container-executor commands are correct:
> {code:java}
> PrivilegedOperationExecutor command: 
> [/export/apps/hadoop/nodemanager/latest/bin/container-executor, --module-gpu, 
> --container_id, container_e22_1549663278916_0249_01_01, --excluded_gpus, 
> 0,1,2,3]
> PrivilegedOperationExecutor command: 
> [/export/apps/hadoop/nodemanager/latest/bin/container-executor, khu, khu, 0, 
> application_1549663278916_0249, 
> /grid/a/tmp/yarn/nmPrivate/container_e22_1549663278916_0249_01_01.tokens, 
> /grid/a/tmp/yarn, /grid/a/tmp/userlogs, 
> /export/apps/jdk/JDK-1_8_0_172/jre/bin/java, -classpath, ..., -Xmx256m, 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer,
>  khu, application_1549663278916_0249, 
> container_e22_1549663278916_0249_01_01, ltx1-hcl7552.grid.linkedin.com, 
> 8040, /grid/a/tmp/yarn]
> {code}
> So this is most likely a race condition between these two operations? 
> cc [~jhung]
> Another potential theory is that the cgroups creation for the container 
> actually failed but the error was swallowed silently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769100#comment-16769100
 ] 

Akira Ajisaka commented on YARN-9283:
-

+1, thanks [~adam.antal].

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7824) [UI2] Yarn Component Instance page should include link to container logs

2019-02-15 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769086#comment-16769086
 ] 

Akhil PB commented on YARN-7824:


Attached branch-3.2 patch (works for branch-3.1 too).
cc [~sunilg]

> [UI2] Yarn Component Instance page should include link to container logs
> 
>
> Key: YARN-7824
> URL: https://issues.apache.org/jira/browse/YARN-7824
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Yesha Vora
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7824-branch-3.2.001.patch, YARN-7824.001.patch
>
>
> Steps:
> 1) Launch Httpd example
> 2) Visit component Instance page for httpd-proxy-0
> This page has information regarding the httpd-proxy-0 component.
> This page should also include a link to container logs for this component.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7824) [UI2] Yarn Component Instance page should include link to container logs

2019-02-15 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-7824:
---
Attachment: YARN-7824-branch-3.2.001.patch

> [UI2] Yarn Component Instance page should include link to container logs
> 
>
> Key: YARN-7824
> URL: https://issues.apache.org/jira/browse/YARN-7824
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Yesha Vora
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7824-branch-3.2.001.patch, YARN-7824.001.patch
>
>
> Steps:
> 1) Launch Httpd example
> 2) Visit component Instance page for httpd-proxy-0
> This page has information regarding the httpd-proxy-0 component.
> This page should also include a link to container logs for this component.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9213:
-
Attachment: YARN-9213.003.patch

> RM Web UI v1 does not show custom resource allocations for containers page
> --
>
> Key: YARN-9213
> URL: https://issues.apache.org/jira/browse/YARN-9213
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: Screen Shot 2019-02-08 at 21.16.37-before.png, Screen 
> Shot 2019-02-09 at 9.55.16-after.png, YARN-9213.001.patch, 
> YARN-9213.002.patch, YARN-9213.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769085#comment-16769085
 ] 

Szilard Nemeth commented on YARN-9213:
--

Sure [~sunilg]!
I added a new patch that fixes checkstyle issues.

> RM Web UI v1 does not show custom resource allocations for containers page
> --
>
> Key: YARN-9213
> URL: https://issues.apache.org/jira/browse/YARN-9213
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: Screen Shot 2019-02-08 at 21.16.37-before.png, Screen 
> Shot 2019-02-09 at 9.55.16-after.png, YARN-9213.001.patch, 
> YARN-9213.002.patch, YARN-9213.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769073#comment-16769073
 ] 

Sunil Govindan commented on YARN-9213:
--

Thanks [~snemeth]

I think these checkstyle issues should be fixed. For the switch case, the 
alignment from the IDE doesn't match Jenkins' expectation. Could you please 
correct the same.
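For illustration, a hedged sketch of the switch/case layout that typically 
satisfies checkstyle's Indentation rule with 2-space offsets (the exact rule 
configuration and the names below are assumptions; many IDE defaults instead 
align "case" with "switch", which then fails the check):

{code:java}
public class SwitchIndentSketch {
  enum ResourceKind { MEMORY, VCORES, OTHER }

  // Case labels indented one level under "switch", bodies one level further.
  static long toValue(ResourceKind kind, long mem, long cores, long other) {
    switch (kind) {
      case MEMORY:
        return mem;
      case VCORES:
        return cores;
      default:
        return other;
    }
  }
}
{code}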

> RM Web UI v1 does not show custom resource allocations for containers page
> --
>
> Key: YARN-9213
> URL: https://issues.apache.org/jira/browse/YARN-9213
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: Screen Shot 2019-02-08 at 21.16.37-before.png, Screen 
> Shot 2019-02-09 at 9.55.16-after.png, YARN-9213.001.patch, YARN-9213.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9213) RM Web UI v1 does not show custom resource allocations for containers page

2019-02-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769066#comment-16769066
 ] 

Hadoop QA commented on YARN-9213:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 16s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common: 
The patch generated 11 new + 32 unchanged - 0 fixed = 43 total (was 32) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
53s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9213 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12958824/YARN-9213.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ddd344a9e231 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5cb67cf |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/23413/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23413/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 

[jira] [Commented] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769064#comment-16769064
 ] 

Adam Antal commented on YARN-9283:
--

Good idea, included in [^YARN-9283.001.patch].

The javadoc compiles, and I checked that the generated html of 
{{LinuxContainerExecutor#addSchedPriorityCommand}} displays 
YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY as expected.
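For reference, a sketch of what the corrected javadoc plausibly looks like 
(illustrative wording; only the constant and the property name are taken from 
the issue description):

{code:java}
/**
 * Adds the command to adjust the container's scheduling priority, as
 * configured by
 * {@link YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY}
 * ("yarn.nodemanager.container-executor.os.sched.priority.adjustment").
 */
{code}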

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9283) Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong property name as reference

2019-02-15 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9283:
-
Attachment: YARN-9283.001.patch

> Javadoc of LinuxContainerExecutor#addSchedPriorityCommand has a wrong 
> property name as reference
> 
>
> Key: YARN-9283
> URL: https://issues.apache.org/jira/browse/YARN-9283
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Szilard Nemeth
>Assignee: Adam Antal
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-9283.000.patch, YARN-9283.001.patch
>
>
> The javadoc of LinuxContainerExecutor#addSchedPriorityCommand tries to refer 
> to the property 
> org.apache.hadoop.yarn.conf.YarnConfiguration#NM_CONTAINER_EXECUTOR_SCHED_PRIORITY
> which has the value: 
> "yarn.nodemanager.container-executor.os.sched.priority.adjustment" but the 
> javadoc contains the value: 
> "yarn.nodemanager.container-executer.os.sched.prioity" which is incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9294) Potential race condition in setting GPU cgroups & execute command in the selected cgroup

2019-02-15 Thread Keqiu Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769049#comment-16769049
 ] 

Keqiu Hu commented on YARN-9294:


After more debugging, we found the race condition is not caused by flakiness 
in creating the cgroup and launching the job in the cgroup slice, but by an 
incompatibility with RHEL 7. Would love to hear if anyone in the community has 
experienced the same issue with RHEL 7. Basically, the existing logic of 
{{mkdir container_123 && echo taskId > container_123/tasks}} doesn't work 
anymore: the OS does some cleanup such that if the process is not registered 
in {{/sys/fs/cgroup/systemd/}}, the taskId will be removed from 
{{container_123/tasks}}.

There are a couple of ways to fix the issue. One is to use a RHEL 7-specific 
cgroups CLI such as {{systemd-run --unit=hu --slice=hadoop nohup 
/root/echo.sh}} to start the container executor, but this won't be compatible 
with other operating systems. Still trying to figure out if there is a way to 
make it work for most OSes.
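As an illustration, here is a minimal sketch of the attach pattern described 
above (a hypothetical Java rendering; the real container-executor does this in 
C via fprintf, and the paths and pid handling below are assumptions):

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class CgroupAttachSketch {
  // Hypothetical cgroup-v1 attach: create the per-container group, then
  // write the task id into its "tasks" file.
  public static void attach(long pid) throws Exception {
    Path group = Paths.get("/sys/fs/cgroup/devices/hadoop-yarn/container_123");
    Files.createDirectories(group);
    // On RHEL 7, systemd may later migrate the task back out of this group
    // unless the process is also registered under /sys/fs/cgroup/systemd/,
    // which is the incompatibility reported above.
    Files.write(group.resolve("tasks"),
        (pid + "\n").getBytes(StandardCharsets.US_ASCII),
        StandardOpenOption.WRITE);
  }
}
{code}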

> Potential race condition in setting GPU cgroups & execute command in the 
> selected cgroup
> 
>
> Key: YARN-9294
> URL: https://issues.apache.org/jira/browse/YARN-9294
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.0
>Reporter: Keqiu Hu
>Assignee: Keqiu Hu
>Priority: Critical
>
> Environment is latest branch-2 head
> OS: RHEL 7.4
> *Observation*
> Out of ~10 container allocations with GPU requirement, at least 1 of the 
> allocated containers would lose GPU isolation. Even if I asked for 1 GPU, I 
> could still have visibility to all GPUs on the same machine when running 
> nvidia-smi.
> The funny thing is that even though I have visibility to all GPUs at the 
> moment container-executor runs (say ordinals 0,1,2,3), cgroups jails the 
> process's access down to only the single requested GPU after some time. 
> The underlying process trying to access the GPUs takes the initial 
> information as the source of truth and tries to access physical GPU 0, which 
> is not really available to the process. This results in a 
> [CUDA_ERROR_INVALID_DEVICE: invalid device ordinal] error.
> Validated the container-executor commands are correct:
> {code:java}
> PrivilegedOperationExecutor command: 
> [/export/apps/hadoop/nodemanager/latest/bin/container-executor, --module-gpu, 
> --container_id, container_e22_1549663278916_0249_01_01, --excluded_gpus, 
> 0,1,2,3]
> PrivilegedOperationExecutor command: 
> [/export/apps/hadoop/nodemanager/latest/bin/container-executor, khu, khu, 0, 
> application_1549663278916_0249, 
> /grid/a/tmp/yarn/nmPrivate/container_e22_1549663278916_0249_01_01.tokens, 
> /grid/a/tmp/yarn, /grid/a/tmp/userlogs, 
> /export/apps/jdk/JDK-1_8_0_172/jre/bin/java, -classpath, ..., -Xmx256m, 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer,
>  khu, application_1549663278916_0249, 
> container_e22_1549663278916_0249_01_01, ltx1-hcl7552.grid.linkedin.com, 
> 8040, /grid/a/tmp/yarn]
> {code}
> So this is most likely a race condition between these two operations? 
> cc [~jhung]
> Another potential theory is that the cgroups creation for the container 
> actually failed but the error was swallowed silently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org