[jira] [Updated] (YARN-9365) fix wrong command in TimelineServiceV2.md

2019-03-07 Thread runlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runlin updated YARN-9365:
-
Description: 
In TimelineServiceV2.md (line 255), the step to create the timeline service schema does not work.

 
{noformat}
Finally, run the schema creator tool to create the necessary tables:

bin/hadoop 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create{noformat}
 

should be

 
{noformat}
The schema creation can be run on the hbase cluster which is going to store the 
timeline
service tables. The schema creator tool requires both the timelineservice-hbase 
as well
as the hbase-server jars. Hence, during schema creation, you need to ensure 
that the
hbase classpath contains the yarn-timelineservice-hbase jar.

On the hbase cluster, you can get it from hdfs since we placed it there for the
coprocessor in the step above.

```
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
 /.
```

Next, add it to the hbase classpath as follows:

```
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
```

Finally, run the schema creator tool to create the necessary tables:

```
bin/hbase 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create
```{noformat}
 

  was:
In TimelineServiceV2.md (line 255), the step to create the timeline service schema does not work.

 
{noformat}
Finally, run the schema creator tool to create the necessary tables:

bin/hadoop 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create{noformat}
 

should be

 
{noformat}
The schema creation can be run on the hbase cluster which is going to store the 
timeline
service tables. The schema creator tool requires both the timelineservice-hbase 
as well
as the hbase-server jars. Hence, during schema creation, you need to ensure 
that the
hbase classpath contains the yarn-timelineservice-hbase jar.

On the hbase cluster, you can get it from hdfs since we placed it there for the
coprocessor in the step above.

```
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
 /.
```

Next, add it to the hbase classpath as follows:

```
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
```

Finally, run the schema creator tool to create the necessary tables:

```
bin/hbase 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create
```{noformat}
 


> fix wrong command in TimelineServiceV2.md 
> --
>
> Key: YARN-9365
> URL: https://issues.apache.org/jira/browse/YARN-9365
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.1.1
>Reporter: runlin
>Priority: Major
> Attachments: YARN-9365.patch
>
>
> In TimelineServiceV2.md (line 255), the step to create the timeline service schema
> does not work.
>  
> {noformat}
> Finally, run the schema creator tool to create the necessary tables:
> bin/hadoop 
> org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
> -create{noformat}
>  
> should be
>  
> {noformat}
> The schema creation can be run on the hbase cluster which is going to store 
> the timeline
> service tables. The schema creator tool requires both the 
> timelineservice-hbase as well
> as the hbase-server jars. Hence, during schema creation, you need to ensure 
> that the
> hbase classpath contains the yarn-timelineservice-hbase jar.

[jira] [Updated] (YARN-9365) fix wrong command in TimelineServiceV2.md

2019-03-07 Thread runlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runlin updated YARN-9365:
-
Attachment: YARN-9365.patch

> fix wrong command in TimelineServiceV2.md 
> --
>
> Key: YARN-9365
> URL: https://issues.apache.org/jira/browse/YARN-9365
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.1.1
>Reporter: runlin
>Priority: Major
> Attachments: YARN-9365.patch
>
>
> In TimelineServiceV2.md (line 255), the step to create the timeline service schema
> does not work.
>  
> {noformat}
> Finally, run the schema creator tool to create the necessary tables:
> bin/hadoop 
> org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
> -create{noformat}
>  
> should be
>  
> {noformat}
> The schema creation can be run on the hbase cluster which is going to store 
> the timeline
> service tables. The schema creator tool requires both the 
> timelineservice-hbase as well
> as the hbase-server jars. Hence, during schema creation, you need to ensure 
> that the
> hbase classpath contains the yarn-timelineservice-hbase jar.
> On the hbase cluster, you can get it from hdfs since we placed it there for 
> the
> coprocessor in the step above.
> ```
>hadoop fs -get 
> /hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
>hadoop fs -get 
> /hbase/coprocessor/hadoop-yarn-server-timelineservice-${project.version}.jar
>hadoop fs -get 
> /hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
>  /.
> ```
> Next, add it to the hbase classpath as follows:
> ```
>export 
> HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
>export 
> HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-${project.version}.jar
>export 
> HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
> ```
> Finally, run the schema creator tool to create the necessary tables:
> ```
> bin/hbase 
> org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
> -create
> ```{noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9365) fix wrong command in TimelineServiceV2.md

2019-03-07 Thread runlin (JIRA)
runlin created YARN-9365:


 Summary: fix wrong command in TimelineServiceV2.md 
 Key: YARN-9365
 URL: https://issues.apache.org/jira/browse/YARN-9365
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 3.1.1, 3.2.0
Reporter: runlin


In TimelineServiceV2.md (line 255), the step to create the timeline service schema does not work.

 
{noformat}
Finally, run the schema creator tool to create the necessary tables:

bin/hadoop 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create{noformat}
 

should be

 
{noformat}
The schema creation can be run on the hbase cluster which is going to store the 
timeline
service tables. The schema creator tool requires both the timelineservice-hbase 
as well
as the hbase-server jars. Hence, during schema creation, you need to ensure 
that the
hbase classpath contains the yarn-timelineservice-hbase jar.

On the hbase cluster, you can get it from hdfs since we placed it there for the
coprocessor in the step above.

```
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-${project.version}.jar
   hadoop fs -get 
/hbase/coprocessor/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
 /.
```

Next, add it to the hbase classpath as follows:

```
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-client-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-${project.version}.jar
   export 
HBASE_CLASSPATH=$HBASE_CLASSPATH:/home/yarn/hadoop-current/share/hadoop/yarn/timelineservice/hadoop-yarn-server-timelineservice-hbase-common-${project.version}.jar
```

Finally, run the schema creator tool to create the necessary tables:

```
bin/hbase 
org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
-create
```{noformat}
 






[jira] [Created] (YARN-9364) Remove commons-logging dependency from remaining hadoop-yarn

2019-03-07 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9364:
---

 Summary: Remove commons-logging dependency from remaining 
hadoop-yarn
 Key: YARN-9364
 URL: https://issues.apache.org/jira/browse/YARN-9364
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


YARN-6712 removes the usage of the commons-logging dependency, so the dependency itself can now be removed from the remaining hadoop-yarn modules.
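
For reference, a minimal sketch of the kind of logger migration that makes the dependency removable (illustrative only; the class name is hypothetical and this is not taken from the YARN-6712 patches):

{code:java}
// Before (commons-logging):
// import org.apache.commons.logging.Log;
// import org.apache.commons.logging.LogFactory;
// private static final Log LOG = LogFactory.getLog(MyYarnService.class);

// After (SLF4J): once no class references commons-logging, the Maven
// dependency can be dropped from the remaining hadoop-yarn modules.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MyYarnService {
  private static final Logger LOG =
      LoggerFactory.getLogger(MyYarnService.class);

  public void start() {
    LOG.info("Service starting");
  }
}
{code}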






[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls

2019-03-07 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787605#comment-16787605
 ] 

Vrushali C commented on YARN-9335:
--

As discussed in the community call today, we can tackle the solution for async 
writes in this jira. We can have another jira to handle the situation with sync 
writes. 

> [atsv2] Restrict the number of elements held in NM timeline collector when 
> backend is unreachable for async calls
> -
>
> Key: YARN-9335
> URL: https://issues.apache.org/jira/browse/YARN-9335
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
>
> For ATSv2, if the backend is unreachable, the amount of data held in the
> timeline collector's memory increases significantly. This is not good for the
> NM memory. 
> Filing this jira to set a limit on how much should be retained by the
> timeline collector in memory in case the backend is not reachable.






[jira] [Commented] (YARN-8676) Incorrect progress index in old yarn UI

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787602#comment-16787602
 ] 

Hadoop QA commented on YARN-8676:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-8676 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8676 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935970/YARN-8676.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23663/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Incorrect progress index in old yarn UI
> ---
>
> Key: YARN-8676
> URL: https://issues.apache.org/jira/browse/YARN-8676
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yeliang Cang
>Assignee: Yeliang Cang
>Priority: Critical
> Attachments: YARN-8676.001.patch
>
>
> The index used for parseHadoopProgress is wrong in 
> WebPageUtils#getAppsTableColumnDefs
> {code:java}
> if (isFairSchedulerPage) {
>  sb.append("[15]");
> } else if (isResourceManager) {
>  sb.append("[17]");
> } else {
>  sb.append("[9]");
> }
> {code}
> should be
> {code:java}
> if (isFairSchedulerPage) {
>  sb.append("[16]");
> } else if (isResourceManager) {
>  sb.append("[18]");
> } else {
>  sb.append("[11]");
> }
> {code}






[jira] [Commented] (YARN-9343) Replace isDebugEnabled with SLF4J parameterized log messages

2019-03-07 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787610#comment-16787610
 ] 

Prabhu Joseph commented on YARN-9343:
-

Thanks [~wilfreds], I have reported YARN-9363 to address the remaining code. Can we commit this one?

> Replace isDebugEnabled with SLF4J parameterized log messages
> 
>
> Key: YARN-9343
> URL: https://issues.apache.org/jira/browse/YARN-9343
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9343-001.patch, YARN-9343-002.patch, 
> YARN-9343-003.patch
>
>
> Replace isDebugEnabled with SLF4J parameterized log messages. 
> https://www.slf4j.org/faq.html






[jira] [Updated] (YARN-9363) Replace isDebugEnabled with SLF4J parameterized log messages for remaining code

2019-03-07 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9363:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-6712

> Replace isDebugEnabled with SLF4J parameterized log messages for remaining 
> code
> ---
>
> Key: YARN-9363
> URL: https://issues.apache.org/jira/browse/YARN-9363
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
>
> Follow-up of YARN-9343 to address the review comments below.
> There are still 200+ LOG.isDebugEnabled() calls in the code. Two things:
> There are a lot of simple one-parameter calls which could easily be converted 
> to unguarded calls, examples:
> NvidiaDockerV1CommandPlugin.java
> FSParentQueue.java
> Application.java
> Some of the LOG.debug calls inside those guards have not been changed to 
> parameterized calls yet. 
> cc [~wilfreds]






[jira] [Created] (YARN-9363) Replace isDebugEnabled with SLF4J parameterized log messages for remaining code

2019-03-07 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9363:
---

 Summary: Replace isDebugEnabled with SLF4J parameterized log 
messages for remaining code
 Key: YARN-9363
 URL: https://issues.apache.org/jira/browse/YARN-9363
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


Follow-up of YARN-9343 to address the review comments below.

There are still 200+ LOG.isDebugEnabled() calls in the code. Two things:
There are a lot of simple one-parameter calls which could easily be converted 
to unguarded calls, examples:
NvidiaDockerV1CommandPlugin.java
FSParentQueue.java
Application.java
Some of the LOG.debug calls inside those guards have not been changed to 
parameterized calls yet. 

cc [~wilfreds]
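
As a reference for the kind of conversion being requested, a minimal before/after sketch (illustrative only; the class and message are hypothetical, not taken from the patches):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DebugLoggingExample {
  private static final Logger LOG =
      LoggerFactory.getLogger(DebugLoggingExample.class);

  void handle(String containerId) {
    // Before: the guard is needed because string concatenation is
    // evaluated even when DEBUG is disabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Processing container " + containerId);
    }

    // After: SLF4J parameterized message -- the message is only formatted
    // when DEBUG is enabled, so simple one-parameter calls can be unguarded.
    LOG.debug("Processing container {}", containerId);
  }
}
{code}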






[jira] [Commented] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-03-07 Thread Anuhan Torgonshar (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787600#comment-16787600
 ] 

Anuhan Torgonshar commented on YARN-9349:
-

Hi, [~Prabhu Joseph], I uploaded the patch file, could you give this issue a 
review? Thanks!

> When doTransition() method occurs exception, the log level practices are 
> inconsistent
> -
>
> Key: YARN-9349
> URL: https://issues.apache.org/jira/browse/YARN-9349
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0, 2.8.5
>Reporter: Anuhan Torgonshar
>Priority: Major
>  Labels: easyfix
> Attachments: YARN-9349.trunk.patch
>
>
> There are *inconsistent* log level practices when code catches 
> *_InvalidStateTransitionException_* for _*doTransition()*_ method.
> {code:java}
> **WARN level**
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\application\ApplicationImpl.java
>   log statement line number: 482
>   log level:warn
> **/
> try {
>// queue event requesting init of the same app
>newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\localizer\LocalizedResource.java
>   log statement line number: 200
>   log level:warn
> **/
> try {
>newState = this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\container\ContainerImpl.java
>   log statement line number: 1156
>   log level:warn
> **/
> try {
> newState =
> stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
> LOG.warn("Can't handle this event at current state: Current: ["
> + oldState + "], eventType: [" + event.getType() + "]", e);
> }
> **ERROR level*
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\RMAppAttemptImpl.java
> log statement line number:878
> log level: error
> **/
> try {
>/* keep the master in sync with the state machine */
>this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("App attempt: " + appAttemptID
>+ " can't handle this event at current state", e);
>onInvalidTranstion(event.getType(), oldState);
> }
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmnode\RMNodeImpl.java
> log statement line number:623
> log level: error
> **/
> try {
>stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("Can't handle this event at current state", e);
>LOG.error("Invalid event " + event.getType() + 
>" on Node " + this.nodeId);
> }
>  
> //There are 8 similar code snippets with ERROR log level.
> {code}
> After looking over the whole project, I found that there are 8 similar code 
> snippets that assign the ERROR level when doTransition() throws 
> *InvalidStateTransitionException*, and just 3 places that choose the 
> WARN level in the same situation. Therefore, I think these 3 log statements 
> should be assigned the ERROR level to stay consistent with the other code snippets.
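
For clarity, a fragment in the same style as the snippets above showing the consistent ERROR-level form the reporter proposes for the three WARN-level sites (a sketch only, not the attached patch):

{code:java}
try {
  newState = stateMachine.doTransition(event.getType(), event);
} catch (InvalidStateTransitionException e) {
  // Proposed: log at ERROR here as well, matching the 8 existing
  // ERROR-level handlers elsewhere in the project.
  LOG.error("Can't handle this event at current state: Current: ["
      + oldState + "], eventType: [" + event.getType() + "]", e);
}
{code}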






[jira] [Commented] (YARN-8676) Incorrect progress index in old yarn UI

2019-03-07 Thread Rakesh Shah (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787598#comment-16787598
 ] 

Rakesh Shah commented on YARN-8676:
---

Hi [~Cyl]

Can you elaborate on how to reproduce this?

> Incorrect progress index in old yarn UI
> ---
>
> Key: YARN-8676
> URL: https://issues.apache.org/jira/browse/YARN-8676
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yeliang Cang
>Assignee: Yeliang Cang
>Priority: Critical
> Attachments: YARN-8676.001.patch
>
>
> The index used for parseHadoopProgress is wrong in 
> WebPageUtils#getAppsTableColumnDefs
> {code:java}
> if (isFairSchedulerPage) {
>  sb.append("[15]");
> } else if (isResourceManager) {
>  sb.append("[17]");
> } else {
>  sb.append("[9]");
> }
> {code}
> should be
> {code:java}
> if (isFairSchedulerPage) {
>  sb.append("[16]");
> } else if (isResourceManager) {
>  sb.append("[18]");
> } else {
>  sb.append("[11]");
> }
> {code}






[jira] [Updated] (YARN-8499) ATS v2 should handle connection issues in general for all storages

2019-03-07 Thread Vrushali C (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-8499:
-
Labels: atsv2  (was: )

> ATS v2 should handle connection issues in general for all storages
> --
>
> Key: YARN-8499
> URL: https://issues.apache.org/jira/browse/YARN-8499
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sunil Govindan
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: atsv2
> Attachments: YARN-8499-001.patch, YARN-8499-002.patch, 
> YARN-8499-003.patch, YARN-8499-004.patch
>
>
> Post YARN-8302, Hbase connection issues are handled in ATSv2. However this 
> could be made general by introducing an api in storage interface and 
> implementing in each of the storage as per the store semantics.
>  
> cc [~rohithsharma] [~vinodkv] [~vrushalic]






[jira] [Updated] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable for async calls

2019-03-07 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated YARN-9335:

Summary: [atsv2] Restrict the number of elements held in NM timeline 
collector when backend is unreachable for async calls  (was: [atsv2] Restrict 
the number of elements held in NM timeline collector when backend is 
unreachable)

> [atsv2] Restrict the number of elements held in NM timeline collector when 
> backend is unreachable for async calls
> -
>
> Key: YARN-9335
> URL: https://issues.apache.org/jira/browse/YARN-9335
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
>
> For ATSv2, if the backend is unreachable, the amount of data held in the
> timeline collector's memory increases significantly. This is not good for the
> NM memory. 
> Filing this jira to set a limit on how much should be retained by the
> timeline collector in memory in case the backend is not reachable.






[jira] [Commented] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2

2019-03-07 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787595#comment-16787595
 ] 

Vrushali C commented on YARN-8549:
--

Thanks for the branch-2 patch Prabha! I will commit it now. 

> Adding a NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549-branch-2.001.patch, YARN-8549.v1.patch, 
> YARN-8549.v2.patch, YARN-8549.v4.patch, YARN-8549.v5.patch
>
>
> Stub implementation for TimeLineReader and TimeLineWriter classes. 
> These are useful for functional testing of writer and reader path for ATSv2
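
To illustrate the idea of a no-op plugin described above, a schematic sketch; the interface here is hypothetical and simplified, and the real TimelineWriter/TimelineReader APIs in hadoop-yarn-server-timelineservice have more methods and different signatures:

{code:java}
import java.io.IOException;

// Hypothetical, simplified interface used only to show the shape of a
// no-op ("null object") plugin; not the real TimelineWriter API.
interface SimpleTimelineWriter {
  void write(String entityId, String entityJson) throws IOException;
  void flush() throws IOException;
}

/** Accepts every call and silently discards the data. */
class NoOpTimelineWriterSketch implements SimpleTimelineWriter {
  @Override
  public void write(String entityId, String entityJson) {
    // Intentionally empty: lets the write path be exercised end to end
    // without any backend storage.
  }

  @Override
  public void flush() {
    // Nothing is buffered, so there is nothing to flush.
  }
}
{code}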






[jira] [Updated] (YARN-8302) ATS v2 should handle HBase connection issue properly

2019-03-07 Thread Vrushali C (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-8302:
-
Labels: atsv2  (was: )

> ATS v2 should handle HBase connection issue properly
> 
>
> Key: YARN-8302
> URL: https://issues.apache.org/jira/browse/YARN-8302
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>  Labels: atsv2
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8302.1.patch, YARN-8302.2.patch, YARN-8302.3.patch
>
>
> The ATS v2 call times out with the below error when it can't connect to the 
> HBase instance.
> {code}
> bash-4.2$ curl -i -k -s -1  -H 'Content-Type: application/json'  -H 'Accept: 
> application/json' --max-time 5   --negotiate -u : 
> 'https://xxx:8199/ws/v2/timeline/apps/application_1526357251888_0022/entities/YARN_CONTAINER?fields=ALL&_=1526425686092'
> curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
> {code}
> {code:title=ATS log}
> 2018-05-15 23:10:03,623 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=7, 
> retries=7, started=8165 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:13,651 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=8, 
> retries=8, started=18192 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:23,730 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=9, 
> retries=9, started=28272 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1
> 2018-05-15 23:10:33,788 INFO  client.RpcRetryingCallerImpl 
> (RpcRetryingCallerImpl.java:callWithRetries(134)) - Call exception, tries=10, 
> retries=10, started=38330 ms ago, cancelled=false, msg=Call to xxx/xxx:17020 
> failed on connection exception: 
> org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
>  Connection refused: xxx/xxx:17020, details=row 
> 'prod.timelineservice.app_flow,
> ,99' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, 
> hostname=xxx,17020,1526348294182, seqNum=-1{code}
> There are two issues here:
> 1) Check why ATS can't connect to HBase.
> 2) In case of a connection error, the ATS call should not time out; it should 
> fail with a proper error.
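
One way to keep an unreachable HBase from hanging the REST call for minutes, as in the log above, is to bound the HBase client's retries and timeouts. A minimal sketch follows; the configuration keys are standard HBase client settings, but the values and how the actual YARN-8302 patches handle this are assumptions:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BoundedHBaseClientConf {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    // Fail fast instead of retrying for minutes while the caller times out.
    conf.setInt("hbase.client.retries.number", 3);
    conf.setLong("hbase.client.pause", 500);               // ms between retries
    conf.setInt("hbase.rpc.timeout", 10000);               // ms per RPC
    conf.setInt("hbase.client.operation.timeout", 30000);  // ms per operation
    return conf;
  }
}
{code}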






[jira] [Commented] (YARN-8589) ATS TimelineACLsManager checkAccess is slow

2019-03-07 Thread Rakesh Shah (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787592#comment-16787592
 ] 

Rakesh Shah commented on YARN-8589:
---

No, I have not tried with the Hive query, [~Prabhu Joseph].

> ATS TimelineACLsManager checkAccess is slow
> ---
>
> Key: YARN-8589
> URL: https://issues.apache.org/jira/browse/YARN-8589
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> The ATS REST API is very slow when there are more than 100,000 (1 lakh) entries if 
> yarn.acl.enable is set to true, as TimelineACLsManager has to check access for 
> every entry. We can't disable yarn.acl.enable because all the YARN ACLs use the 
> same config. We could have a separate config to provide read access to the ATS 
> entries.
> {code}
> curl  http://:8188/ws/v1/timeline/HIVE_QUERY_ID
> {code}
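
A rough sketch of the "separate config for read access" idea; the configuration key and the simplified owner check are hypothetical, since the jira only proposes such a config without naming it:

{code:java}
import org.apache.hadoop.conf.Configuration;

class TimelineReadAccessCheck {
  // Hypothetical key -- the jira proposes "a separate config" but does not name one.
  static final String READ_ALLOW_ALL_KEY = "yarn.timeline-service.read-allow-all";

  private final boolean allowAllReads;

  TimelineReadAccessCheck(Configuration conf) {
    this.allowAllReads = conf.getBoolean(READ_ALLOW_ALL_KEY, false);
  }

  boolean checkReadAccess(String callerUser, String entityOwner) {
    if (allowAllReads) {
      return true;                           // skip per-entity ACL evaluation
    }
    return callerUser.equals(entityOwner);   // simplified stand-in for the ACL check
  }
}
{code}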






[jira] [Updated] (YARN-9344) FS should not reserve when container capability is bigger than node total resource

2019-03-07 Thread Zhaohui Xin (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhaohui Xin updated YARN-9344:
--
Attachment: YARN-9344.003.patch

> FS should not reserve when container capability is bigger than node total 
> resource
> --
>
> Key: YARN-9344
> URL: https://issues.apache.org/jira/browse/YARN-9344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhaohui Xin
>Assignee: Zhaohui Xin
>Priority: Major
> Attachments: YARN-9344.001.patch, YARN-9344.002.patch, 
> YARN-9344.003.patch
>
>







[jira] [Commented] (YARN-4741) RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event queue

2019-03-07 Thread Sathish (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787570#comment-16787570
 ] 

Sathish commented on YARN-4741:
---

I see the issue could be because log aggregation failed for those jobs due to the incorrect permissions on the remote log directory:

 

2016-02-18 01:39:34,164 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
 Remote Root Log Dir [/var/log/hadoop-yarn] already exist, but with incorrect 
permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have 
problems with multiple users.

 

Since those events were kept in the NM's local leveldb and the NM was trying to replay them, we ended up in this situation.

Have you tried fixing the permissions of those directories and made sure log aggregation was working fine from those NodeManagers?

 

> RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async 
> dispatcher event queue
> ---
>
> Key: YARN-4741
> URL: https://issues.apache.org/jira/browse/YARN-4741
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sangjin Lee
>Priority: Critical
> Attachments: nm.log
>
>
> We had a pretty major incident with the RM where it was continually flooded 
> with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event 
> queue.
> In our setup, we had the RM HA or stateful restart *disabled*, but NM 
> work-preserving restart *enabled*. Due to other issues, we did a cluster-wide 
> NM restart.
> Some time during the restart (which took multiple hours), we started seeing 
> the async dispatcher event queue building. Normally it would log 1,000. In 
> this case, it climbed all the way up to tens of millions of events.
> When we looked at the RM log, it was full of the following messages:
> {noformat}
> 2016-02-18 01:47:29,530 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid 
> event FINISHED_CONTAINERS_PULLED_BY_AM on Node  worker-node-foo.bar.net:8041
> 2016-02-18 01:47:29,535 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle 
> this event at current state
> 2016-02-18 01:47:29,535 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid 
> event FINISHED_CONTAINERS_PULLED_BY_AM on Node  worker-node-foo.bar.net:8041
> 2016-02-18 01:47:29,538 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle 
> this event at current state
> 2016-02-18 01:47:29,538 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid 
> event FINISHED_CONTAINERS_PULLED_BY_AM on Node  worker-node-foo.bar.net:8041
> {noformat}
> And that node in question was restarted a few minutes earlier.
> When we inspected the RM heap, it was full of 
> RMNodeFinishedContainersPulledByAMEvents.
> Suspecting the NM work-preserving restart, we disabled it and did another 
> cluster-wide rolling restart. Initially that seemed to have helped reduce the 
> queue size, but the queue built back up to several millions and continued for 
> an extended period. We had to restart the RM to resolve the problem.






[jira] [Commented] (YARN-8589) ATS TimelineACLsManager checkAccess is slow

2019-03-07 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787548#comment-16787548
 ] 

Prabhu Joseph commented on YARN-8589:
-

[~Rakesh_Shah] Did you try the below query as well?

{code}
curl  http://:8188/ws/v1/timeline/HIVE_QUERY_ID
{code}

> ATS TimelineACLsManager checkAccess is slow
> ---
>
> Key: YARN-8589
> URL: https://issues.apache.org/jira/browse/YARN-8589
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> The ATS REST API is very slow when there are more than 100,000 (1 lakh) entries if 
> yarn.acl.enable is set to true, as TimelineACLsManager has to check access for 
> every entry. We can't disable yarn.acl.enable because all the YARN ACLs use the 
> same config. We could have a separate config to provide read access to the ATS 
> entries.
> {code}
> curl  http://:8188/ws/v1/timeline/HIVE_QUERY_ID
> {code}






[jira] [Commented] (YARN-8589) ATS TimelineACLsManager checkAccess is slow

2019-03-07 Thread Rakesh Shah (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787543#comment-16787543
 ] 

Rakesh Shah commented on YARN-8589:
---

Hi [~Prabhu Joseph]

When I try to get the entities, I only get details of 99 entities using the API

[http://vm1:8188/ws/v1/timeline/DS_APP_ATTEMPT]

But we have more than two thousand jobs submitted, and I also increased the cache sizes.

Is there any other way to reproduce?

Or any preconditions?

> ATS TimelineACLsManager checkAccess is slow
> ---
>
> Key: YARN-8589
> URL: https://issues.apache.org/jira/browse/YARN-8589
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> The ATS REST API is very slow when there are more than 100,000 (1 lakh) entries if 
> yarn.acl.enable is set to true, as TimelineACLsManager has to check access for 
> every entry. We can't disable yarn.acl.enable because all the YARN ACLs use the 
> same config. We could have a separate config to provide read access to the ATS 
> entries.
> {code}
> curl  http://:8188/ws/v1/timeline/HIVE_QUERY_ID
> {code}






[jira] [Commented] (YARN-9335) [atsv2] Restrict the number of elements held in NM timeline collector when backend is unreachable

2019-03-07 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787541#comment-16787541
 ] 

Abhishek Modi commented on YARN-9335:
-

There are two major issues right now. The HBase client has a huge retry timeout, which causes threads to get blocked in write-entities calls for async writes. For sync writes, threads get blocked on synchronized blocks, which bloats the event queue, causing heavy memory pressure on the NM as well as delays in processing other events.
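
As a rough illustration of the "restrict the number of elements" idea for the async path, a bounded-buffer sketch; this is purely hypothetical and not how TimelineCollector is actually structured:

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Drops async writes once a configured bound is reached instead of holding
 *  unbounded data in NM memory while the backend is down. */
class BoundedAsyncEntityBuffer {
  private final BlockingQueue<String> pending;   // simplified entity payloads
  private final AtomicLong dropped = new AtomicLong();

  BoundedAsyncEntityBuffer(int maxPending) {
    this.pending = new LinkedBlockingQueue<>(maxPending);
  }

  /** Returns false (and counts a drop) when the buffer is full. */
  boolean offer(String entity) {
    boolean accepted = pending.offer(entity);
    if (!accepted) {
      dropped.incrementAndGet();
    }
    return accepted;
  }

  long droppedCount() {
    return dropped.get();
  }
}
{code}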

> [atsv2] Restrict the number of elements held in NM timeline collector when 
> backend is unreachable
> -
>
> Key: YARN-9335
> URL: https://issues.apache.org/jira/browse/YARN-9335
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vrushali C
>Assignee: Abhishek Modi
>Priority: Major
>
> For ATSv2, if the backend is unreachable, the amount of data held in the
> timeline collector's memory increases significantly. This is not good for the
> NM memory. 
> Filing this jira to set a limit on how much should be retained by the
> timeline collector in memory in case the backend is not reachable.






[jira] [Commented] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787494#comment-16787494
 ] 

Hadoop QA commented on YARN-9349:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9349 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961650/YARN-9349.trunk.patch 
|
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7fb817f02a47 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 373705f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23661/testReport/ |
| Max. process+thread count | 339 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23661/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Updated] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-03-07 Thread Anuhan Torgonshar (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuhan Torgonshar updated YARN-9349:

Attachment: (was: ApplicationImpl.java)

> When doTransition() method occurs exception, the log level practices are 
> inconsistent
> -
>
> Key: YARN-9349
> URL: https://issues.apache.org/jira/browse/YARN-9349
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0, 2.8.5
>Reporter: Anuhan Torgonshar
>Priority: Major
>  Labels: easyfix
> Attachments: YARN-9349.trunk.patch
>
>
> There are *inconsistent* log level practices when code catches 
> *_InvalidStateTransitionException_* for _*doTransition()*_ method.
> {code:java}
> **WARN level**
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\application\ApplicationImpl.java
>   log statement line number: 482
>   log level:warn
> **/
> try {
>// queue event requesting init of the same app
>newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\localizer\LocalizedResource.java
>   log statement line number: 200
>   log level:warn
> **/
> try {
>newState = this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\container\ContainerImpl.java
>   log statement line number: 1156
>   log level:warn
> **/
> try {
> newState =
> stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
> LOG.warn("Can't handle this event at current state: Current: ["
> + oldState + "], eventType: [" + event.getType() + "]", e);
> }
> **ERROR level*
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\RMAppAttemptImpl.java
> log statement line number:878
> log level: error
> **/
> try {
>/* keep the master in sync with the state machine */
>this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("App attempt: " + appAttemptID
>+ " can't handle this event at current state", e);
>onInvalidTranstion(event.getType(), oldState);
> }
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmnode\RMNodeImpl.java
> log statement line number:623
> log level: error
> **/
> try {
>stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("Can't handle this event at current state", e);
>LOG.error("Invalid event " + event.getType() + 
>" on Node " + this.nodeId);
> }
>  
> //There are 8 similar code snippets with ERROR log level.
> {code}
> After looking over the whole project, I found that there are 8 similar code 
> snippets that assign the ERROR level when doTransition() throws 
> *InvalidStateTransitionException*, and just 3 places that choose the 
> WARN level in the same situation. Therefore, I think these 3 log statements 
> should be assigned the ERROR level to stay consistent with the other code snippets.






[jira] [Updated] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-03-07 Thread Anuhan Torgonshar (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuhan Torgonshar updated YARN-9349:

Attachment: (was: ContainerImpl.java)

> When doTransition() method occurs exception, the log level practices are 
> inconsistent
> -
>
> Key: YARN-9349
> URL: https://issues.apache.org/jira/browse/YARN-9349
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0, 2.8.5
>Reporter: Anuhan Torgonshar
>Priority: Major
>  Labels: easyfix
> Attachments: ApplicationImpl.java, YARN-9349.trunk.patch
>
>
> There are *inconsistent* log level practices when code catches 
> *_InvalidStateTransitionException_* for _*doTransition()*_ method.
> {code:java}
> **WARN level**
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\application\ApplicationImpl.java
>   log statement line number: 482
>   log level:warn
> **/
> try {
>// queue event requesting init of the same app
>newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\localizer\LocalizedResource.java
>   log statement line number: 200
>   log level:warn
> **/
> try {
>newState = this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\container\ContainerImpl.java
>   log statement line number: 1156
>   log level:warn
> **/
> try {
> newState =
> stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
> LOG.warn("Can't handle this event at current state: Current: ["
> + oldState + "], eventType: [" + event.getType() + "]", e);
> }
> **ERROR level*
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\RMAppAttemptImpl.java
> log statement line number:878
> log level: error
> **/
> try {
>/* keep the master in sync with the state machine */
>this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("App attempt: " + appAttemptID
>+ " can't handle this event at current state", e);
>onInvalidTranstion(event.getType(), oldState);
> }
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmnode\RMNodeImpl.java
> log statement line number:623
> log level: error
> **/
> try {
>stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("Can't handle this event at current state", e);
>LOG.error("Invalid event " + event.getType() + 
>" on Node " + this.nodeId);
> }
>  
> //There are 8 similar code snippets with ERROR log level.
> {code}
> After looking over the whole project, I found that there are 8 similar code 
> snippets that assign the ERROR level when doTransition() throws 
> *InvalidStateTransitionException*, and just 3 places that choose the 
> WARN level in the same situation. Therefore, I think these 3 log statements 
> should be assigned the ERROR level to stay consistent with the other code snippets.






[jira] [Updated] (YARN-9349) When doTransition() method occurs exception, the log level practices are inconsistent

2019-03-07 Thread Anuhan Torgonshar (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuhan Torgonshar updated YARN-9349:

Attachment: (was: LocalizedResource.java)

> When doTransition() method occurs exception, the log level practices are 
> inconsistent
> -
>
> Key: YARN-9349
> URL: https://issues.apache.org/jira/browse/YARN-9349
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.0, 2.8.5
>Reporter: Anuhan Torgonshar
>Priority: Major
>  Labels: easyfix
> Attachments: ApplicationImpl.java, YARN-9349.trunk.patch
>
>
> There are *inconsistent* log level practices when code catches 
> *_InvalidStateTransitionException_* for _*doTransition()*_ method.
> {code:java}
> **WARN level**
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\application\ApplicationImpl.java
>   log statement line number: 482
>   log level:warn
> **/
> try {
>// queue event requesting init of the same app
>newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\localizer\LocalizedResource.java
>   log statement line number: 200
>   log level:warn
> **/
> try {
>newState = this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.warn("Can't handle this event at current state", e);
> }
> /**
>   file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\src\main\java\org\apache\hadoop\yarn\server\nodemanager\containermanager\container\ContainerImpl.java
>   log statement line number: 1156
>   log level:warn
> **/
> try {
> newState =
> stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
> LOG.warn("Can't handle this event at current state: Current: ["
> + oldState + "], eventType: [" + event.getType() + "]", e);
> }
> **ERROR level*
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmapp\attempt\RMAppAttemptImpl.java
> log statement line number:878
> log level: error
> **/
> try {
>/* keep the master in sync with the state machine */
>this.stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("App attempt: " + appAttemptID
>+ " can't handle this event at current state", e);
>onInvalidTranstion(event.getType(), oldState);
> }
> /**
> file path: 
> hadoop-2.8.5-src\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager\rmnode\RMNodeImpl.java
> log statement line number:623
> log level: error
> **/
> try {
>stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>LOG.error("Can't handle this event at current state", e);
>LOG.error("Invalid event " + event.getType() + 
>" on Node " + this.nodeId);
> }
>  
> //There are 8 similar code snippets with ERROR log level.
> {code}
> After a look at the whole project, I found that there are 8 similar code 
> snippets that assign the ERROR level when doTransition() throws an 
> *InvalidStateTransitionException*, and just 3 places that choose the 
> WARN level in the same situation. Therefore, I think these 3 log statements 
> should be assigned the ERROR level to keep consistent with the other code snippets.
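> For illustration, a minimal sketch of the proposed consistent handling (the 
> variable names follow the snippets above; only the catch block matters):
> {code:java}
> try {
>   newState = stateMachine.doTransition(event.getType(), event);
> } catch (InvalidStateTransitionException e) {
>   // Align with the 8 existing call sites: log at ERROR level, with context.
>   LOG.error("Can't handle event " + event.getType()
>       + " at current state " + oldState, e);
> }
> {code}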



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787445#comment-16787445
 ] 

Hudson commented on YARN-9239:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16159 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16159/])
YARN-9239. Document docker registry deployment with Ozone CSI driver. (wwei: 
rev 373705fceae498560a8c9e7f315fd99707bb577c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md


> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9239.001.patch
>
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787438#comment-16787438
 ] 

Hadoop QA commented on YARN-9239:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
40m 46s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9239 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961641/YARN-9239.001.patch |
| Optional Tests |  dupname  asflicense  mvnsite  |
| uname | Linux b8ce49a196d3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 064f38b |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23660/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9239.001.patch
>
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9341) Reentrant lock() before try

2019-03-07 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787432#comment-16787432
 ] 

Prabhu Joseph commented on YARN-9341:
-

Thanks [~eyang] and [~wilfreds]!

> Reentrant lock() before try
> ---
>
> Key: YARN-9341
> URL: https://issues.apache.org/jira/browse/YARN-9341
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9341-001.patch
>
>
> As a best practice, a Reentrant lock has to be acquired before the try clause. 
> https://stackoverflow.com/questions/31058681/java-locking-structure-best-pattern
> There are many places where the lock is obtained inside try.
> {code}
> try {
>this.writeLock.lock();
>   
> } finally {
>   this.writeLock.unlock();
> }
> {code}
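> For comparison, a minimal sketch of the recommended pattern (same hypothetical 
> writeLock field as above):
> {code:java}
> this.writeLock.lock();
> try {
>   // ... critical section ...
> } finally {
>   this.writeLock.unlock();
> }
> {code}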



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9255) Improve recommend applications order

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787431#comment-16787431
 ] 

Hadoop QA commented on YARN-9255:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
33s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp
 in trunk has 13 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp:
 The patch generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
44s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp
 generated 1 new + 0 unchanged - 13 fixed = 1 total (was 13) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 42s{color} 
| {color:red} hadoop-yarn-applications-catalog-webapp in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp
 |
|  |  org.apache.hadoop.yarn.appcatalog.model.Application defines equals but 
not hashCode  At Application.java:hashCode  At Application.java:[lines 69-78] |
| Timed out junit tests | 
org.apache.hadoop.yarn.appcatalog.application.TestAppCatalogSolrClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9255 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961639/YARN-9255.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| 

[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787429#comment-16787429
 ] 

Hadoop QA commented on YARN-9265:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 262 unchanged - 12 fixed = 262 total (was 274) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
45s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m  
6s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}116m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961633/YARN-9265-009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 55243639d776 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 

[jira] [Assigned] (YARN-9352) Multiple versions of createSchedulingRequest in FairSchedulerTestBase could be cleaned up

2019-03-07 Thread Siddharth Ahuja (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja reassigned YARN-9352:
-

Assignee: Siddharth Ahuja

> Multiple versions of createSchedulingRequest in FairSchedulerTestBase could 
> be cleaned up
> -
>
> Key: YARN-9352
> URL: https://issues.apache.org/jira/browse/YARN-9352
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Siddharth Ahuja
>Priority: Minor
>  Labels: newbie, newbie++
>
> createSchedulingRequest in FairSchedulerTestBase is overloaded many times.
> This could be cleaner if we introduced a builder instead of calling 
> various forms of this method.
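> As a rough sketch of the builder idea (all names below are hypothetical and 
> not existing test code):
> {code:java}
> // One builder replaces the many createSchedulingRequest(...) overloads.
> ApplicationAttemptId attemptId = new SchedulingRequestBuilder(scheduler)
>     .memory(1024)
>     .vcores(1)
>     .queue("queueA")
>     .user("user1")
>     .numContainers(1)
>     .submit();
> {code}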



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787390#comment-16787390
 ] 

Weiwei Yang commented on YARN-9239:
---

Thanks [~eyang], this looks good to me. +1 on my side, will commit today.

> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9239.001.patch
>
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9354) TestUtils#createResource calls should be replaced with ResourceTypesTestHelper#newResource

2019-03-07 Thread Siddharth Ahuja (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja reassigned YARN-9354:
-

Assignee: Siddharth Ahuja

> TestUtils#createResource calls should be replaced with 
> ResourceTypesTestHelper#newResource
> --
>
> Key: YARN-9354
> URL: https://issues.apache.org/jira/browse/YARN-9354
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Siddharth Ahuja
>Priority: Trivial
>  Labels: newbie, newbie++
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestUtils#createResource
>  has a very similar, though not identical, implementation to 
> org.apache.hadoop.yarn.resourcetypes.ResourceTypesTestHelper#newResource. 
> Since these 2 methods essentially do the same thing, and 
> ResourceTypesTestHelper is newer and more widely used, TestUtils#createResource 
> should be replaced with ResourceTypesTestHelper#newResource in all 
> occurrences.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9239:

Attachment: YARN-9239.001.patch

> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-9239.001.patch
>
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-9239:
---

Assignee: Eric Yang  (was: Weiwei Yang)

> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9239.001.patch
>
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9255) Improve recommend applications order

2019-03-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787369#comment-16787369
 ] 

Eric Yang commented on YARN-9255:
-

Patch 2 added some bug fixes and a test case.

> Improve recommend applications order
> 
>
> Key: YARN-9255
> URL: https://issues.apache.org/jira/browse/YARN-9255
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9255.001.patch, YARN-9255.002.patch
>
>
> When there is no search term in the application catalog, the recommended 
> application list is in random order.  The relevance can be fine-tuned by 
> sorting by number of downloads or in alphabetical order.
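> For illustration, a sketch of the sort-by-downloads idea (the Application 
> getters used here are assumptions, not the actual catalog model):
> {code:java}
> import java.util.Comparator;
> import java.util.List;
>
> // Sort recommended apps by download count (descending), then by name.
> static void sortRecommended(List<Application> apps) {
>   apps.sort(Comparator.comparingLong(Application::getDownloadCount).reversed()
>       .thenComparing(Application::getName));
> }
> {code}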



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9255) Improve recommend applications order

2019-03-07 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9255:

Attachment: YARN-9255.002.patch

> Improve recommend applications order
> 
>
> Key: YARN-9255
> URL: https://issues.apache.org/jira/browse/YARN-9255
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9255.001.patch, YARN-9255.002.patch
>
>
> When there is no search term in the application catalog, the recommended 
> application list is in random order.  The relevance can be fine-tuned by 
> sorting by number of downloads or in alphabetical order.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9239) Document docker registry deployment with Ozone CSI driver

2019-03-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787328#comment-16787328
 ] 

Eric Yang commented on YARN-9239:
-

Quote from [~anu]'s mail:

{quote}
The CSI interface does not have the write back capability yet. You might want 
to use S3 protocol -- that way you can read and write to the bucket if needed.

--Anu

We need to switch to another CSI driver that supports writes, but that is 
something we need to get to.
{quote}

There is more work on the Ozone side to enable proper writeback capabilities.  
This is not the right time to document how the Ozone CSI driver works with the 
Docker registry.  Therefore, I will remove the Docker Registry + CSI driver 
documentation for now.

> Document docker registry deployment with Ozone CSI driver
> -
>
> Key: YARN-9239
> URL: https://issues.apache.org/jira/browse/YARN-9239
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Major
>
> Another approach is to mount Docker registry storage at the file system 
> layer.  The deployment can use the YARN CSI driver to mount an Ozone volume 
> into the docker registry container.  This provides an alternate approach to 
> YARN-9229 for getting the Docker registry to use a distributed filesystem as 
> backend storage.  This task is to study the performance of deploying the 
> docker registry with the Ozone CSI driver.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9355) RMContainerRequestor#makeRemoteRequest has confusing log message

2019-03-07 Thread Siddharth Ahuja (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja reassigned YARN-9355:
-

Assignee: Siddharth Ahuja

> RMContainerRequestor#makeRemoteRequest has confusing log message
> 
>
> Key: YARN-9355
> URL: https://issues.apache.org/jira/browse/YARN-9355
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Siddharth Ahuja
>Priority: Trivial
>  Labels: newbie, newbie++
>
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#makeRemoteRequest 
> has this log: 
> {code:java}
> if (ask.size() > 0 || release.size() > 0) {
>   LOG.info("getResources() for " + applicationId + ":" + " ask="
>   + ask.size() + " release= " + release.size() + " newContainers="
>   + allocateResponse.getAllocatedContainers().size()
>   + " finishedContainers=" + numCompletedContainers
>   + " resourcelimit=" + availableResources + " knownNMs="
>   + clusterNmCount);
> }
> {code}
> The reason why "getResources()" is printed is that 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator#getResources 
> invokes makeRemoteRequest. This is not very informative and is error-prone, 
> as the name of getResources could change over time and the log would become 
> outdated. Moreover, it's not a good idea to print the name of a method that 
> sits below the current one in the stack.
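> Purely as an illustration, one possible direction (the exact wording is a 
> sketch, not a final message):
> {code:java}
> if (ask.size() > 0 || release.size() > 0) {
>   LOG.info("applicationId=" + applicationId + ": ask=" + ask.size()
>       + " release=" + release.size() + " newContainers="
>       + allocateResponse.getAllocatedContainers().size()
>       + " finishedContainers=" + numCompletedContainers
>       + " resourcelimit=" + availableResources + " knownNMs=" + clusterNmCount);
> }
> {code}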



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9265:
---
Attachment: YARN-9265-009.patch

> FPGA plugin fails to recognize Intel Processing Accelerator Card
> 
>
> Key: YARN-9265
> URL: https://issues.apache.org/jira/browse/YARN-9265
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-9265-001.patch, YARN-9265-002.patch, 
> YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, 
> YARN-9265-006.patch, YARN-9265-007.patch, YARN-9265-008.patch, 
> YARN-9265-009.patch
>
>
> The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).
> There are two major issues.
> Problem #1
> The output of aocl diagnose:
> {noformat}
> 
> Device Name:
> acl0
>  
> Package Pat:
> /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp
>  
> Vendor: Intel Corp
>  
> Physical Dev Name   StatusInformation
>  
> pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20)
>   PCIe 08:00.0
>   FPGA temperature = 79 degrees C.
>  
> DIAGNOSTIC_PASSED
> 
>  
> Call "aocl diagnose " to run diagnose for specified devices
> Call "aocl diagnose all" to run diagnose for all devices
> {noformat}
> The plugin fails to recognize this and fails with the following message:
> {noformat}
> 2019-01-25 06:46:02,834 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin:
>  Using FPGA vendor plugin: 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
> 2019-01-25 06:46:02,943 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer:
>  Trying to diagnose FPGA information ...
> 2019-01-25 06:46:03,085 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule:
>  Using traffic control bandwidth handler
> 2019-01-25 06:46:03,108 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
>  Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
> 2019-01-25 06:46:03,139 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl:
>  FPGA Plugin bootstrap success.
> 2019-01-25 06:46:03,247 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)bus:slot.func\s=\s.*, pattern
> 2019-01-25 06:46:03,248 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
> 2019-01-25 06:46:03,251 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Failed to get major-minor number from reading /dev/pac_a10_f30
> 2019-01-25 06:46:03,252 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to 
> bootstrap configured resource subsystems!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException:
>  No FPGA devices detected!
> {noformat}
> Problem #2
> The plugin assumes that the file name under {{/dev}} can be derived from the 
> "Physical Dev Name", but this is wrong. For example, it thinks that the 
> device file is {{/dev/pac_a10_f30}}, which is not the case; the actual 
> file is {{/dev/intel-fpga-port.0}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787300#comment-16787300
 ] 

Peter Bacsko commented on YARN-9265:


Patch v9: addressed failing unit test, ASF license problem and checkstyle stuff.

> FPGA plugin fails to recognize Intel Processing Accelerator Card
> 
>
> Key: YARN-9265
> URL: https://issues.apache.org/jira/browse/YARN-9265
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-9265-001.patch, YARN-9265-002.patch, 
> YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, 
> YARN-9265-006.patch, YARN-9265-007.patch, YARN-9265-008.patch, 
> YARN-9265-009.patch
>
>
> The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).
> There are two major issues.
> Problem #1
> The output of aocl diagnose:
> {noformat}
> 
> Device Name:
> acl0
>  
> Package Pat:
> /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp
>  
> Vendor: Intel Corp
>  
> Physical Dev Name   StatusInformation
>  
> pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20)
>   PCIe 08:00.0
>   FPGA temperature = 79 degrees C.
>  
> DIAGNOSTIC_PASSED
> 
>  
> Call "aocl diagnose " to run diagnose for specified devices
> Call "aocl diagnose all" to run diagnose for all devices
> {noformat}
> The plugin fails to recognize this and fails with the following message:
> {noformat}
> 2019-01-25 06:46:02,834 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin:
>  Using FPGA vendor plugin: 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
> 2019-01-25 06:46:02,943 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer:
>  Trying to diagnose FPGA information ...
> 2019-01-25 06:46:03,085 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule:
>  Using traffic control bandwidth handler
> 2019-01-25 06:46:03,108 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
>  Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
> 2019-01-25 06:46:03,139 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl:
>  FPGA Plugin bootstrap success.
> 2019-01-25 06:46:03,247 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)bus:slot.func\s=\s.*, pattern
> 2019-01-25 06:46:03,248 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
> 2019-01-25 06:46:03,251 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Failed to get major-minor number from reading /dev/pac_a10_f30
> 2019-01-25 06:46:03,252 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to 
> bootstrap configured resource subsystems!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException:
>  No FPGA devices detected!
> {noformat}
> Problem #2
> The plugin assumes that the file name under {{/dev}} can be derived from the 
> "Physical Dev Name", but this is wrong. For example, it thinks that the 
> device file is {{/dev/pac_a10_f30}}, which is not the case; the actual 
> file is {{/dev/intel-fpga-port.0}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787254#comment-16787254
 ] 

Hadoop QA commented on YARN-8805:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core:
 The patch generated 0 new + 7 unchanged - 1 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
48s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8805 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961620/YARN-8805.009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 73550b920a10 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1bc282e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23657/testReport/ |
| Max. process+thread count | 768 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 

[jira] [Commented] (YARN-9341) Reentrant lock() before try

2019-03-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787232#comment-16787232
 ] 

Eric Yang commented on YARN-9341:
-

[~wilfreds] You are right.  They have been handled correctly.  Thank you for 
the review.

+1

> Reentrant lock() before try
> ---
>
> Key: YARN-9341
> URL: https://issues.apache.org/jira/browse/YARN-9341
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9341-001.patch
>
>
> As a best practice, a Reentrant lock has to be acquired before the try clause. 
> https://stackoverflow.com/questions/31058681/java-locking-structure-best-pattern
> There are many places where the lock is obtained inside try.
> {code}
> try {
>this.writeLock.lock();
>   
> } finally {
>   this.writeLock.unlock();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9341) Reentrant lock() before try

2019-03-07 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787239#comment-16787239
 ] 

Hudson commented on YARN-9341:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16157 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16157/])
YARN-9341.  Fixed reentrant lock usage in YARN project. (eyang: rev 
39b4a37e02e929a698fcf9e32f1f71bb6b977635)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/IntraQueueCandidatesSelector.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/constraint/MemoryPlacementConstraintManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ManagedParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/PlacementManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NodeAttributesManagerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/volume/csi/lifecycle/VolumeImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractResourceUsage.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/queuemanagement/GuaranteedOrZeroCapacityOverTimePolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ReservationQueue.java
* (edit) 

[jira] [Updated] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support

2019-03-07 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8805:

Attachment: YARN-8805.009.patch

> Automatically convert the launch command to the exec form when using 
> entrypoint support
> ---
>
> Key: YARN-8805
> URL: https://issues.apache.org/jira/browse/YARN-8805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8805.001.patch, YARN-8805.004.patch, 
> YARN-8805.005.patch, YARN-8805.006.patch, YARN-8805.007.patch, 
> YARN-8805.008.patch, YARN-8805.009.patch
>
>
> When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a 
> launch command is provided, it is expected that the launch command is 
> provided by the user in exec form.
> For example:
> {code:java}
> "/usr/bin/sleep 6000"{code}
> must be changed to:
> {code}"/usr/bin/sleep,6000"{code}
> If this is not done, the container will never start and will be in a Created 
> state. We should automatically do this conversion instead of making the user 
> understand this nuance of using the entrypoint support. Docs should be 
> updated to reflect this change.
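> A minimal sketch of the kind of conversion meant here (the method name and 
> placement are illustrative only, not the actual patch):
> {code:java}
> // Convert a shell-form launch command into the comma-separated exec form,
> // e.g. "/usr/bin/sleep 6000" -> "/usr/bin/sleep,6000".
> static String toExecForm(String launchCommand) {
>   return String.join(",", launchCommand.trim().split("\\s+"));
> }
> {code}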



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support

2019-03-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787194#comment-16787194
 ] 

Eric Yang commented on YARN-8805:
-

Patch 009 fixes checkstyle issues.

> Automatically convert the launch command to the exec form when using 
> entrypoint support
> ---
>
> Key: YARN-8805
> URL: https://issues.apache.org/jira/browse/YARN-8805
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8805.001.patch, YARN-8805.004.patch, 
> YARN-8805.005.patch, YARN-8805.006.patch, YARN-8805.007.patch, 
> YARN-8805.008.patch, YARN-8805.009.patch
>
>
> When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a 
> launch command is provided, it is expected that the launch command is 
> provided by the user in exec form.
> For example:
> {code:java}
> "/usr/bin/sleep 6000"{code}
> must be changed to:
> {code}"/usr/bin/sleep,6000"{code}
> If this is not done, the container will never start and will be in a Created 
> state. We should automatically do this conversion instead of making the user 
> understand this nuance of using the entrypoint support. Docs should be 
> updated to reflect this change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6886) AllocationFileLoaderService.loadQueue() should validate that setting do not conflict with parent

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-6886:


Assignee: (was: Szilard Nemeth)

> AllocationFileLoaderService.loadQueue() should validate that setting do not 
> conflict with parent
> 
>
> Key: YARN-6886
> URL: https://issues.apache.org/jira/browse/YARN-6886
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Priority: Minor
>  Labels: newbie
>
> Some settings, like policy, are limited by the queue's parent queue's 
> configuration.  We should check those settings when we load the file.
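> A rough sketch of the kind of check meant here (the method and exception 
> wiring are hypothetical, not the actual loader code):
> {code:java}
> private void validateAgainstParent(FSQueue queue, FSQueue parent)
>     throws AllocationConfigurationException {
>   if (parent != null
>       && !parent.getPolicy().isChildPolicyAllowed(queue.getPolicy())) {
>     throw new AllocationConfigurationException("Queue " + queue.getName()
>         + " uses policy " + queue.getPolicy().getName()
>         + " which conflicts with its parent's policy "
>         + parent.getPolicy().getName());
>   }
> }
> {code}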



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6886) AllocationFileLoaderService.loadQueue() should validate that setting do not conflict with parent

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-6886:


Assignee: Szilard Nemeth

> AllocationFileLoaderService.loadQueue() should validate that setting do not 
> conflict with parent
> 
>
> Key: YARN-6886
> URL: https://issues.apache.org/jira/browse/YARN-6886
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Minor
>  Labels: newbie
>
> Some settings, like policy, are limited by the queue's parent queue's 
> configuration.  We should check those settings when we load the file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9339) Apps pending metric incorrect after moving app to a new queue

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787106#comment-16787106
 ] 

Hadoop QA commented on YARN-9339:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 455 unchanged - 0 fixed = 456 total (was 455) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 37s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9339 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961588/YARN-9339.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 215416b8049e 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 475011b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787095#comment-16787095
 ] 

Hadoop QA commented on YARN-9360:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
2m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 50 unchanged - 0 fixed = 51 total (was 50) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
24s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 53s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9360 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961590/YARN-9360.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8db45c07e5a1 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1bc282e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | 

[jira] [Assigned] (YARN-7713) Add parallel copying of directories into FSDownload

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-7713:


Assignee: (was: Szilard Nemeth)

> Add parallel copying of directories into FSDownload
> ---
>
> Key: YARN-7713
> URL: https://issues.apache.org/jira/browse/YARN-7713
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Miklos Szegedi
>Priority: Major
>  Labels: newbie
>
> YARN currently copies directories sequentially when localizing. This could be 
> improved to copy them in parallel, since the source blocks are normally on 
> different nodes.
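
A minimal sketch of the idea, using a plain ExecutorService on top of the public FileSystem/FileUtil APIs; the class and method below are hypothetical and not part of the existing FSDownload code.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ParallelDirCopy {
  // Copy the children of srcDir concurrently instead of one after another.
  static void copyDirectoryInParallel(FileSystem srcFs, Path srcDir,
      FileSystem dstFs, Path dstDir, Configuration conf, int threads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (FileStatus child : srcFs.listStatus(srcDir)) {
        futures.add(pool.submit(() -> {
          // The children usually live on different datanodes, so their copies
          // can proceed in parallel.
          FileUtil.copy(srcFs, child.getPath(), dstFs,
              new Path(dstDir, child.getPath().getName()), false, conf);
          return null;
        }));
      }
      for (Future<?> f : futures) {
        f.get();  // surface any copy failure
      }
    } catch (ExecutionException e) {
      throw new IOException("Parallel copy failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
{code}

The real change would also need a sensible thread-pool size and would have to preserve FSDownload's existing permission and error handling.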



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7548) TestCapacityOverTimePolicy.testAllocation is flaky

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-7548:


Assignee: (was: Szilard Nemeth)

> TestCapacityOverTimePolicy.testAllocation is flaky
> --
>
> Key: YARN-7548
> URL: https://issues.apache.org/jira/browse/YARN-7548
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: reservation system
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Priority: Major
>
> It failed in both the YARN-7337 and YARN-6921 Jenkins jobs.
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration
>  90,000,000, height 0.25, numSubmission 1, periodic 8640)]
> *Stacktrace*
> {code:java}
> junit.framework.AssertionFailedError: null
>  at junit.framework.Assert.fail(Assert.java:55)
>  at junit.framework.Assert.fail(Assert.java:64)
>  at junit.framework.TestCase.fail(TestCase.java:235)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136){code}
> *Standard Output*
> {code:java}
> 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (RMStateStore.java:transition(538)) - Storing reservation 
> allocation.reservation_-9026698577416205920_6337917439559340517
>  2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (MemoryRMStateStore.java:storeReservationState(247)) - Storing 
> reservationallocation for 
> reservation_-9026698577416205920_6337917439559340517 for plan dedicated
>  2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan 
> (InMemoryPlan.java:addReservation(373)) - Successfully added reservation: 
> reservation_-9026698577416205920_6337917439559340517 to plan.
>  In-memory Plan: Parent Queue: dedicatedTotal Capacity:  vCores:1000>Step: 1000reservation_-9026698577416205920_6337917439559340517 
> user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
>  [Period: 8640
>  0: 
>  3423748: 
>  86223748: 
>  8640: 
>  9223372036854775807: null
>  ]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7514) TestAggregatedLogDeletionService.testRefreshLogRetentionSettings is flaky

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-7514:


Assignee: (was: Szilard Nemeth)

> TestAggregatedLogDeletionService.testRefreshLogRetentionSettings is flaky
> -
>
> Key: YARN-7514
> URL: https://issues.apache.org/jira/browse/YARN-7514
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Priority: Major
>
> TestAggregatedLogDeletionService.testRefreshLogRetentionSettings fails 
> occasionally with 
> *Error Message*
> Argument(s) are different! Wanted:
> fileSystem.delete(
> mockfs://foo/tmp/logs/me/logs/application_1510201418065_0002,
> true
> );
> -> at 
> org.apache.hadoop.yarn.logaggregation.TestAggregatedLogDeletionService.testRefreshLogRetentionSettings(TestAggregatedLogDeletionService.java:300)
> Actual invocation has different arguments:
> fileSystem.delete(
> mockfs://foo/tmp/logs/me/logs/application_1510201418024_0001,
> true
> );
> -> at org.apache.hadoop.fs.FilterFileSystem.delete(FilterFileSystem.java:252)
> *Stacktrace*
> org.mockito.exceptions.verification.junit.ArgumentsAreDifferent: 
> Argument(s) are different! Wanted:
> fileSystem.delete(
> mockfs://foo/tmp/logs/me/logs/application_1510201418065_0002,
> true
> );
> -> at 
> org.apache.hadoop.yarn.logaggregation.TestAggregatedLogDeletionService.testRefreshLogRetentionSettings(TestAggregatedLogDeletionService.java:300)
> Actual invocation has different arguments:
> fileSystem.delete(
> mockfs://foo/tmp/logs/me/logs/application_1510201418024_0001,
> true
> );
> -> at org.apache.hadoop.fs.FilterFileSystem.delete(FilterFileSystem.java:252)
>   at 
> org.apache.hadoop.yarn.logaggregation.TestAggregatedLogDeletionService.testRefreshLogRetentionSettings(TestAggregatedLogDeletionService.java:300)
> *Standard Output*
> 2017-11-08 20:23:38,138 INFO  [Timer-0] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:run(79)) - aggregated log deletion started.
> 2017-11-08 20:23:38,146 INFO  [Timer-0] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:deleteOldLogDirsFrom(106)) - Deleting 
> aggregated logs in 
> mockfs://foo/tmp/logs/me/logs/application_1510201418024_0001
> 2017-11-08 20:23:38,146 INFO  [Timer-0] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:run(92)) - aggregated log deletion 
> finished.
> 2017-11-08 20:23:38,167 INFO  [Timer-1] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:run(79)) - aggregated log deletion started.
> 2017-11-08 20:23:38,172 INFO  [Timer-1] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:deleteOldLogDirsFrom(106)) - Deleting 
> aggregated logs in 
> mockfs://foo/tmp/logs/me/logs/application_1510201418024_0001
> 2017-11-08 20:23:38,173 INFO  [Timer-1] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:deleteOldLogDirsFrom(106)) - Deleting 
> aggregated logs in 
> mockfs://foo/tmp/logs/me/logs/application_1510201418065_0002
> 2017-11-08 20:23:38,181 INFO  [Timer-1] 
> logaggregation.AggregatedLogDeletionService 
> (AggregatedLogDeletionService.java:run(92)) - aggregated log deletion 
> finished.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-6272:


Assignee: (was: Szilard Nemeth)

> TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently
> -
>
> Key: YARN-6272
> URL: https://issues.apache.org/jira/browse/YARN-6272
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 3.0.0-alpha4
>Reporter: Ray Chiang
>Priority: Major
>
> I'm seeing this unit test fail fairly often in trunk:
> testAMRMClientWithContainerResourceChange(org.apache.hadoop.yarn.client.api.impl.TestAMRMClient)
>   Time elapsed: 5.113 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.doContainerResourceChange(TestAMRMClient.java:1087)
> at 
> org.apache.hadoop.yarn.client.api.impl.TestAMRMClient.testAMRMClientWithContainerResourceChange(TestAMRMClient.java:963)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-3884) App History status not updated when RMContainer transitions from RESERVED to KILLED

2019-03-07 Thread Bibin A Chundatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt resolved YARN-3884.

Resolution: Duplicate

> App History status not updated when RMContainer transitions from RESERVED to 
> KILLED
> ---
>
> Key: YARN-3884
> URL: https://issues.apache.org/jira/browse/YARN-3884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
> Environment: Suse11 Sp3
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>  Labels: oct16-easy
> Attachments: 0001-YARN-3884.patch, Apphistory Container Status.jpg, 
> Elapsed Time.jpg, Test Result-Container status.jpg, YARN-3884.0002.patch, 
> YARN-3884.0003.patch, YARN-3884.0004.patch, YARN-3884.0005.patch, 
> YARN-3884.0006.patch, YARN-3884.0007.patch, YARN-3884.0008.patch
>
>
> Setup
> ===
> 1 NM with 3072 MB memory and 16 cores each
> Steps to reproduce
> ===
> 1. Submit apps to Queue 1 with 512 MB and 1 core
> 2. Submit apps to Queue 2 with 512 MB and 5 cores
> Lots of containers get reserved and unreserved in this case.
> {code}
> 2015-07-02 20:45:31,169 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e24_1435849994778_0002_01_13 Container Transitioned from NEW to 
> RESERVED
> 2015-07-02 20:45:31,170 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> Reserved container  application=application_1435849994778_0002 
> resource= queue=QueueA: capacity=0.4, 
> absoluteCapacity=0.4, usedResources=, 
> usedCapacity=1.6410257, absoluteUsedCapacity=0.65625, numApps=1, 
> numContainers=5 usedCapacity=1.6410257 absoluteUsedCapacity=0.65625 
> used= cluster=
> 2015-07-02 20:45:31,170 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Re-sorting assigned queue: root.QueueA stats: QueueA: capacity=0.4, 
> absoluteCapacity=0.4, usedResources=, 
> usedCapacity=2.0317461, absoluteUsedCapacity=0.8125, numApps=1, 
> numContainers=6
> 2015-07-02 20:45:31,170 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> assignedContainer queue=root usedCapacity=0.96875 
> absoluteUsedCapacity=0.96875 used= 
> cluster=
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e24_1435849994778_0001_01_14 Container Transitioned from NEW to 
> ALLOCATED
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dsperf   
> OPERATION=AM Allocated ContainerTARGET=SchedulerApp 
> RESULT=SUCCESS  APPID=application_1435849994778_0001
> CONTAINERID=container_e24_1435849994778_0001_01_14
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: 
> Assigned container container_e24_1435849994778_0001_01_14 of capacity 
>  on host host-10-19-92-117:64318, which has 6 
> containers,  used and  available 
> after allocation
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignedContainer application attempt=appattempt_1435849994778_0001_01 
> container=Container: [ContainerId: 
> container_e24_1435849994778_0001_01_14, NodeId: host-10-19-92-117:64318, 
> NodeHttpAddress: host-10-19-92-117:65321, Resource: , 
> Priority: 20, Token: null, ] queue=default: capacity=0.2, 
> absoluteCapacity=0.2, usedResources=, 
> usedCapacity=2.0846906, absoluteUsedCapacity=0.4166, numApps=1, 
> numContainers=5 clusterResource=
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Re-sorting assigned queue: root.default stats: default: capacity=0.2, 
> absoluteCapacity=0.2, usedResources=, 
> usedCapacity=2.5016286, absoluteUsedCapacity=0.5, numApps=1, numContainers=6
> 2015-07-02 20:45:31,191 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> assignedContainer queue=root usedCapacity=1.0 absoluteUsedCapacity=1.0 
> used= cluster=
> 2015-07-02 20:45:32,143 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e24_1435849994778_0001_01_14 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2015-07-02 20:45:32,174 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1435849994778_0002 
> on node: host-10-19-92-143:64318
> 2015-07-02 20:45:32,174 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> Reserved container  application=application_1435849994778_0002 
> resource= queue=QueueA: 

[jira] [Assigned] (YARN-6813) TestAMRMProxy#testE2ETokenRenewal fails sporadically due to race conditions

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-6813:


Assignee: (was: Szilard Nemeth)

> TestAMRMProxy#testE2ETokenRenewal fails sporadically due to race conditions
> ---
>
> Key: YARN-6813
> URL: https://issues.apache.org/jira/browse/YARN-6813
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.1
>Reporter: Jason Lowe
>Priority: Major
>
> The testE2ETokenRenewal test lowers the AM and nodemanager expiry 
> intervals to only 1.5 seconds.  This leaves very little headroom over the 
> default heartbeat intervals of 1 second: if the AM hits a hiccup and 
> heartbeats a bit slower than expected, the unit test can fail because the RM 
> expires the AM.
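
One low-risk mitigation is to give the expiry intervals more headroom over the 1-second heartbeats. The sketch below uses the standard YarnConfiguration keys, but the values are illustrative, and whether the test would still exercise token renewal quickly enough would need checking.

{code:java}
// Sketch: widen the gap between the default 1-second heartbeats and the
// expiry intervals so a small hiccup no longer expires the AM or NM.
YarnConfiguration conf = new YarnConfiguration();
conf.setLong(YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS, 5000);
conf.setLong(YarnConfiguration.RM_NM_EXPIRY_INTERVAL_MS, 5000);
{code}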



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-2614) Cleanup synchronized method in SchedulerApplicationAttempt

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-2614:


Assignee: (was: Szilard Nemeth)

> Cleanup synchronized method in SchedulerApplicationAttempt
> --
>
> Key: YARN-2614
> URL: https://issues.apache.org/jira/browse/YARN-2614
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Wangda Tan
>Priority: Major
>
> According to discussions in YARN-2594, there are some methods in 
> SchedulerApplicationAttempt that will be accessed by other modules, which can 
> lead to potential deadlocks in the RM; we should clean them up as much as we 
> can.
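
A common shape for this kind of cleanup is to replace method-level synchronization with a java.util.concurrent.locks.ReentrantReadWriteLock; the snippet below only illustrates the pattern and is not the actual SchedulerApplicationAttempt change.

{code:java}
// Pattern sketch: read-only accessors take the read lock so callers from other
// modules cannot block behind (or deadlock with) long write-side operations.
private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

public Resource getCurrentConsumption() {
  lock.readLock().lock();
  try {
    return currentConsumption;
  } finally {
    lock.readLock().unlock();
  }
}
{code}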



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-2199) FairScheduler: Allow max-AM-share to be specified in the root queue

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-2199:


Assignee: (was: Szilard Nemeth)

> FairScheduler: Allow max-AM-share to be specified in the root queue
> ---
>
> Key: YARN-2199
> URL: https://issues.apache.org/jira/browse/YARN-2199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Robert Kanter
>Priority: Major
> Attachments: YARN-2199.patch, YARN-2199.patch
>
>
> If users want to specify the max-AM-share, they have to do it for each leaf 
> queue individually.  It would be convenient if they could also specify it in 
> the root queue so they'd only have to specify it once to apply to all queues. 
>  It could still be overridden in a specific leaf queue though.
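
The intended resolution order is easy to state in code. The sketch below is purely illustrative and does not reflect the actual AllocationConfiguration API; 0.5 is FairScheduler's default maxAMShare.

{code:java}
// Illustrative lookup: an explicit leaf value wins, then a root-level value,
// then the scheduler default.
static float resolveMaxAmShare(Float leafValue, Float rootValue) {
  final float defaultMaxAmShare = 0.5f;
  if (leafValue != null) {
    return leafValue;
  }
  if (rootValue != null) {
    return rootValue;
  }
  return defaultMaxAmShare;
}
{code}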



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5652) testRefreshNodesResourceWithResourceReturnInRegistration fails intermittently

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-5652:


Assignee: (was: Szilard Nemeth)

> testRefreshNodesResourceWithResourceReturnInRegistration fails intermittently
> -
>
> Key: YARN-5652
> URL: https://issues.apache.org/jira/browse/YARN-5652
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Jason Lowe
>Priority: Major
>
> Saw the following in a recent precommit:
> {noformat}
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 18.639 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
> testRefreshNodesResourceWithResourceReturnInRegistration(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService)
>   Time elapsed: 0.763 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<> but 
> was:<>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testRefreshNodesResourceWithResourceReturnInRegistration(TestRMAdminService.java:286)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9348) Build issues on hadoop-yarn-application-catalog-webapp

2019-03-07 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787010#comment-16787010
 ] 

Billie Rinaldi commented on YARN-9348:
--

Thanks [~eyang]!

> Build issues on hadoop-yarn-application-catalog-webapp
> --
>
> Key: YARN-9348
> URL: https://issues.apache.org/jira/browse/YARN-9348
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9348.001.patch, YARN-9348.002.patch, 
> YARN-9348.003.patch, YARN-9348.004.patch, YARN-9348.005.patch
>
>
> A couple of reports show Jenkins precommit builds failing due to integration 
> problems between the nodejs libraries and Yetus.  The problems are:
> # Nodejs third-party libraries are scanned by the whitespace check, which 
> generates many errors.  One possible solution is to move the nodejs libraries 
> from the project's top-level directory to the target directory so they no 
> longer trip the whitespace checks.
> # maven clean fails because the clean plugin tries to remove the target 
> directory and files inside the target/generated-sources directories, causing 
> race conditions.
> # Building on macOS triggers access to the OS X keychain when attempting to 
> log in to Docker Hub.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8218) Add application launch time to ATSV1

2019-03-07 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787003#comment-16787003
 ] 

Vrushali C commented on YARN-8218:
--

https://builds.apache.org/job/Hadoop-trunk-Commit/16152/ was successful 

> Add application launch time to ATSV1
> 
>
> Key: YARN-8218
> URL: https://issues.apache.org/jira/browse/YARN-8218
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Kanwaljeet Sachdev
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8218.001.patch
>
>
> YARN-7088 publishes application launch time to RMStore and also adds it to 
> the YARN UI. It would be a nice enhancement to have the launchTime event 
> published into the Application history server as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8218) Add application launch time to ATSV1

2019-03-07 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787006#comment-16787006
 ] 

Abhishek Modi commented on YARN-8218:
-

Thanks [~vrushalic] for the review and for committing it.

> Add application launch time to ATSV1
> 
>
> Key: YARN-8218
> URL: https://issues.apache.org/jira/browse/YARN-8218
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Kanwaljeet Sachdev
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8218.001.patch
>
>
> YARN-7088 publishes application launch time to RMStore and also adds it to 
> the YARN UI. It would be a nice enhancement to have the launchTime event 
> published into the Application history server as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786993#comment-16786993
 ] 

Sunil Govindan commented on YARN-9265:
--

[~pbacsko] could you please check the ASF license problem?

> FPGA plugin fails to recognize Intel Processing Accelerator Card
> 
>
> Key: YARN-9265
> URL: https://issues.apache.org/jira/browse/YARN-9265
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-9265-001.patch, YARN-9265-002.patch, 
> YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, 
> YARN-9265-006.patch, YARN-9265-007.patch, YARN-9265-008.patch
>
>
> The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).
> There are two major issues.
> Problem #1
> The output of aocl diagnose:
> {noformat}
> 
> Device Name:
> acl0
>  
> Package Pat:
> /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp
>  
> Vendor: Intel Corp
>  
> Physical Dev Name   StatusInformation
>  
> pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20)
>   PCIe 08:00.0
>   FPGA temperature = 79 degrees C.
>  
> DIAGNOSTIC_PASSED
> 
>  
> Call "aocl diagnose " to run diagnose for specified devices
> Call "aocl diagnose all" to run diagnose for all devices
> {noformat}
> The plugin fails to recognize this and fails with the following message:
> {noformat}
> 2019-01-25 06:46:02,834 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin:
>  Using FPGA vendor plugin: 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
> 2019-01-25 06:46:02,943 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer:
>  Trying to diagnose FPGA information ...
> 2019-01-25 06:46:03,085 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule:
>  Using traffic control bandwidth handler
> 2019-01-25 06:46:03,108 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
>  Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
> 2019-01-25 06:46:03,139 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl:
>  FPGA Plugin bootstrap success.
> 2019-01-25 06:46:03,247 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)bus:slot.func\s=\s.*, pattern
> 2019-01-25 06:46:03,248 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
> 2019-01-25 06:46:03,251 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Failed to get major-minor number from reading /dev/pac_a10_f30
> 2019-01-25 06:46:03,252 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to 
> bootstrap configured resource subsystems!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException:
>  No FPGA devices detected!
> {noformat}
> Problem #2
> The plugin assumes that the file name under {{/dev}} can be derived from the 
> "Physical Dev Name", but this is wrong. For example, it thinks that the 
> device file is {{/dev/pac_a10_f30}}, which is not the case; the actual 
> file is {{/dev/intel-fpga-port.0}}.
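
For problem #2, a more robust approach would be to enumerate the actual device nodes under /dev instead of deriving a name. The method and the file-name pattern below are assumptions for illustration (standard java.io/java.util imports assumed), not the plugin's real code.

{code:java}
// Sketch: list /dev entries that match the Intel FPGA port naming scheme,
// e.g. /dev/intel-fpga-port.0, rather than guessing from "Physical Dev Name".
static List<File> findIntelFpgaDeviceFiles() {
  File[] candidates = new File("/dev").listFiles(
      (dir, name) -> name.matches("intel-fpga-port\\.\\d+"));
  return candidates == null
      ? Collections.<File>emptyList()
      : Arrays.asList(candidates);
}
{code}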



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9362) Code cleanup in TestNMLeveldbStateStoreService

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9362:


 Summary: Code cleanup in TestNMLeveldbStateStoreService
 Key: YARN-9362
 URL: https://issues.apache.org/jira/browse/YARN-9362
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


There are many ways to improve TestNMLeveldbStateStoreService: 
1. RecoveredContainerState fields are asserted repeatedly. Extracting a simple 
assertion helper would make this much more readable (see the sketch below).
2. The tests are very long and hard to read in general: again, extracting 
methods to avoid code repetition would help. 
3. You name it.
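
For point 1, the extraction could be as small as a shared assertion helper; the sketch below is hypothetical, and the getters should be checked against the real RecoveredContainerState.

{code:java}
// Hypothetical helper for the repeated RecoveredContainerState assertions.
private static void assertRecoveredContainer(RecoveredContainerState rcs,
    RecoveredContainerStatus expectedStatus, int expectedExitCode,
    boolean expectedKilled) {
  assertEquals(expectedStatus, rcs.getStatus());
  assertEquals(expectedExitCode, rcs.getExitCode());
  assertEquals(expectedKilled, rcs.getKilled());
}
{code}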



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786990#comment-16786990
 ] 

Hadoop QA commented on YARN-9265:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 18s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 262 unchanged - 12 fixed = 267 total (was 274) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 56s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
57s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 17s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
47s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.TestFpgaDiscoverer
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961580/YARN-9265-008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle 

[jira] [Updated] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9360:
-
Attachment: YARN-9360.001.patch

> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-9360.001.patch
>
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.
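
A minimal sketch of what such a fill-in method might look like; the method name and the accessor on QueueMetricsForCustomResources are assumptions, not a final API.

{code:java}
// Hypothetical QueueMetrics method: copy non-zero custom resource values into
// the supplied Resource without handing the metrics internals to callers.
public void fillInCustomResources(Resource target) {
  if (queueMetricsForCustomResources == null) {
    return;  // no custom resource types configured
  }
  for (Map.Entry<String, Long> entry : queueMetricsForCustomResources
      .getCustomResourceValues().entrySet()) {  // assumed accessor
    long value = entry.getValue();
    if (value != 0) {
      target.setResourceValue(entry.getKey(), value);
    }
  }
}
{code}

FSLeafQueue#computeMaxAMResource could then call scheduler.getRootQueueMetrics().fillInCustomResources(maxResource) instead of reaching into the metrics object.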



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9361) Write testcase for FSLeafQueue that explicitly checks if non-zero AM-share values are not overwritten for custom resources

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-9361:


Assignee: (was: Szilard Nemeth)

> Write testcase for FSLeafQueue that explicitly checks if non-zero AM-share 
> values are not overwritten for custom resources
> --
>
> Key: YARN-9361
> URL: https://issues.apache.org/jira/browse/YARN-9361
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Priority: Major
>
> This is a follow-up for YARN-9323, covering changes regarding explicit zero 
> value check that has been discussed with [~templedf] earlier.
> YARN-9323 fixed a bug in FSLeafQueue#computeMaxAMResource, so that custom 
> resource values are also set to the AM share.
> We need a new test in TestFSLeafQueue that explicitly checks if the custom 
> resource value is only being set if the fairshare for that resource is zero.
> This way, we can make sure we don't overwrite any meaningful resource value.
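
A rough shape for the new test; the resource name, setup, and the method under test are placeholders, and the real test would reuse TestFSLeafQueue's existing fixtures and registered resource types.

{code:java}
@Test
public void testNonZeroCustomResourceAmShareIsNotOverwritten() {
  // Placeholder setup: "custom-res" must be registered as a resource type.
  Resource maxAmResource = Resource.newInstance(8192, 4);
  maxAmResource.setResourceValue("custom-res", 42L);

  // Hypothetical call under test: it should only fill in values that are zero.
  fillInZeroCustomResources(maxAmResource);

  assertEquals("non-zero custom resource value must not be overwritten",
      42L, maxAmResource.getResourceValue("custom-res"));
}
{code}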



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9361) Write testcase for FSLeafQueue that explicitly checks if non-zero values are not overwritten for custom resources

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9361:


 Summary: Write testcase for FSLeafQueue that explicitly checks if 
non-zero values are not overwritten for custom resources
 Key: YARN-9361
 URL: https://issues.apache.org/jira/browse/YARN-9361
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


This is a follow-up for YARN-9323, covering required changes as discussed with 
[~templedf] earlier.
After YARN-9323, 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
 gets the QueueMetricsForCustomResources object from 
scheduler.getRootQueueMetrics().
Instead, we should use a "fill-in" method in QueueMetrics that receives a 
Resource and fills in custom resource values if they are non-zero.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9361) Write testcase for FSLeafQueue that explicitly checks if non-zero AM-share values are not overwritten for custom resources

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9361:
-
Summary: Write testcase for FSLeafQueue that explicitly checks if non-zero 
AM-share values are not overwritten for custom resources  (was: Write testcase 
for FSLeafQueue that explicitly checks if non-zero values are not overwritten 
for custom resources)

> Write testcase for FSLeafQueue that explicitly checks if non-zero AM-share 
> values are not overwritten for custom resources
> --
>
> Key: YARN-9361
> URL: https://issues.apache.org/jira/browse/YARN-9361
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
>
> This is a follow-up for YARN-9323, covering changes regarding explicit zero 
> value check that has been discussed with [~templedf] earlier.
> YARN-9323 fixed a bug in FSLeafQueue#computeMaxAMResource, so that custom 
> resource values are also set to the AM share.
> We need a new test in TestFSLeafQueue that explicitly checks if the custom 
> resource value is only being set if the fairshare for that resource is zero.
> This way, we can make sure we don't overwrite any meaningful resource value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9361) Write testcase for FSLeafQueue that explicitly checks if non-zero values are not overwritten for custom resources

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9361:
-
Description: 
This is a follow-up for YARN-9323, covering changes regarding explicit zero 
value check that has been discussed with [~templedf] earlier.
YARN-9323 fixed a bug in FSLeafQueue#computeMaxAMResource, so that custom 
resource values are also set to the AM share.
We need a new test in TestFSLeafQueue that explicitly checks if the custom 
resource value is only being set if the fairshare for that resource is zero.
This way, we can make sure we don't overwrite any meaningful resource value.

  was:
This is a follow-up for YARN-9323, covering required changes as discussed with 
[~templedf] earlier.
After YARN-9323, 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
 gets the QueueMetricsForCustomResources object from 
scheduler.getRootQueueMetrics().
Instead, we should use a "fill-in" method in QueueMetrics that receives a 
Resource and fills in custom resource values if they are non-zero.




> Write testcase for FSLeafQueue that explicitly checks if non-zero values are 
> not overwritten for custom resources
> -
>
> Key: YARN-9361
> URL: https://issues.apache.org/jira/browse/YARN-9361
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
>
> This is a follow-up for YARN-9323, covering changes regarding explicit zero 
> value check that has been discussed with [~templedf] earlier.
> YARN-9323 fixed a bug in FSLeafQueue#computeMaxAMResource, so that custom 
> resource values are also set to the AM share.
> We need a new test in TestFSLeafQueue that explicitly checks if the custom 
> resource value is only being set if the fairshare for that resource is zero.
> This way, we can make sure we don't overwrite any meaningful resource value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9360:


 Summary: Do not expose innards of QueueMetrics object into 
FSLeafQueue#computeMaxAMResource
 Key: YARN-9360
 URL: https://issues.apache.org/jira/browse/YARN-9360
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


This is a follow-up for YARN-9323, covering required changes as discussed with 
[~templedf] earlier.
After YARN-9323, 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
 gets the QueueMetricsForCustomResources object from 
scheduler.getRootQueueMetrics().
Instead, we should use a "fill-in" method in QueueMetrics that receives a 
Resource and fills in custom resource values if they are non-zero.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9360) Do not expose innards of QueueMetrics object into FSLeafQueue#computeMaxAMResource

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth reassigned YARN-9360:


Assignee: Szilard Nemeth

> Do not expose innards of QueueMetrics object into 
> FSLeafQueue#computeMaxAMResource
> --
>
> Key: YARN-9360
> URL: https://issues.apache.org/jira/browse/YARN-9360
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
>
> This is a follow-up for YARN-9323, covering required changes as discussed 
> with [~templedf] earlier.
> After YARN-9323, 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue#computeMaxAMResource
>  gets the QueueMetricsForCustomResources object from 
> scheduler.getRootQueueMetrics().
> Instead, we should use a "fill-in" method in QueueMetrics that receives a 
> Resource and fills in custom resource values if they are non-zero.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9359) Avoid code duplication in Resources for calculation methods

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9359:
-
Labels: newbie newbie++  (was: )

> Avoid code duplication in Resources for calculation methods
> ---
>
> Key: YARN-9359
> URL: https://issues.apache.org/jira/browse/YARN-9359
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Priority: Minor
>  Labels: newbie, newbie++
>
> This is a follow-up for YARN-9318, dealing with code duplication issues, as 
> discussed with [~templedf] earlier.
> Resources has many very similar calculation methods like addTo, subtractFrom, 
> multiply, etc.
> These methods share extractable common code; the only difference is the 
> calculation they perform on the passed Resource object(s).
> They receive one or two Resource objects and perform some calculation on them.
> One caveat that needs attention is that some of them clone the Resource, do 
> the calculation on the clone and return the result (leaving the passed 
> Resource alone), while others perform the calculation on the passed Resource 
> object itself.
> The common code could be extracted like this: 
> {code:java}
> private static Resource applyFunctionOnValues(Resource lhs,
>     Function<Long, Long> valueFunction) {
>   int numResources = ResourceUtils.getNumberOfCountableResourceTypes();
>   for (int i = 0; i < numResources; i++) {
>     try {
>       ResourceInformation lhsValue = lhs.getResourceInformation(i);
>       Long modifiedValue = valueFunction.apply(lhsValue.getValue());
>       lhs.setResourceValue(i, modifiedValue);
>     } catch (ResourceNotFoundException ye) {
>       LOG.warn("Resource is missing:" + ye.getMessage());
>     }
>   }
>   return lhs;
> }
> {code}
> An example method could then look like this: 
> {code:java}
> public static Resource multiplyAndRoundUp(Resource lhs, double by) {
>   return applyFunctionOnValues(clone(lhs),
>       (value) -> (long) Math.ceil(value * by));
> }
> {code}
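
To illustrate the caveat about in-place methods, the same helper also covers the variants that mutate the passed Resource; {{scaleTo}} below is a hypothetical name used only for illustration.

{code:java}
// In-place variant: no clone(), so the passed Resource itself is modified.
public static Resource scaleTo(Resource lhs, double by) {
  return applyFunctionOnValues(lhs, (value) -> (long) (value * by));
}
{code}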



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9359) Avoid code duplication in Resources for calculation methods

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9359:


 Summary: Avoid code duplication in Resources for calculation 
methods
 Key: YARN-9359
 URL: https://issues.apache.org/jira/browse/YARN-9359
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


This is a follow-up for YARN-9318, dealing with code duplication issues, as 
discussed with [~templedf] earlier.

Resources has many very similar calculation methods like addTo, subtractFrom, 
multiply, etc.
These methods share extractable common code; the only difference is the 
calculation they perform on the passed Resource object(s).

They receive one or two Resource objects and perform some calculation on them.
One caveat that needs attention is that some of them clone the Resource, do the 
calculation on the clone and return the result (leaving the passed Resource 
alone), while others perform the calculation on the passed Resource object 
itself.

The common code could be extracted like this: 


{code:java}
private static Resource applyFunctionOnValues(Resource lhs,
    Function<Long, Long> valueFunction) {
  int numResources = ResourceUtils.getNumberOfCountableResourceTypes();
  for (int i = 0; i < numResources; i++) {
    try {
      ResourceInformation lhsValue = lhs.getResourceInformation(i);
      Long modifiedValue = valueFunction.apply(lhsValue.getValue());
      lhs.setResourceValue(i, modifiedValue);
    } catch (ResourceNotFoundException ye) {
      LOG.warn("Resource is missing:" + ye.getMessage());
    }
  }
  return lhs;
}
{code}

An example method could then look like this: 

{code:java}
public static Resource multiplyAndRoundUp(Resource lhs, double by) {
  return applyFunctionOnValues(clone(lhs),
      (value) -> (long) Math.ceil(value * by));
}
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card

2019-03-07 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9265:
---
Attachment: YARN-9265-008.patch

> FPGA plugin fails to recognize Intel Processing Accelerator Card
> 
>
> Key: YARN-9265
> URL: https://issues.apache.org/jira/browse/YARN-9265
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-9265-001.patch, YARN-9265-002.patch, 
> YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, 
> YARN-9265-006.patch, YARN-9265-007.patch, YARN-9265-008.patch
>
>
> The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).
> There are two major issues.
> Problem #1
> The output of aocl diagnose:
> {noformat}
> 
> Device Name:
> acl0
>  
> Package Pat:
> /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp
>  
> Vendor: Intel Corp
>  
> Physical Dev Name   StatusInformation
>  
> pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20)
>   PCIe 08:00.0
>   FPGA temperature = 79 degrees C.
>  
> DIAGNOSTIC_PASSED
> 
>  
> Call "aocl diagnose " to run diagnose for specified devices
> Call "aocl diagnose all" to run diagnose for all devices
> {noformat}
> The plugin fails to recognize this and fails with the following message:
> {noformat}
> 2019-01-25 06:46:02,834 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin:
>  Using FPGA vendor plugin: 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
> 2019-01-25 06:46:02,943 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer:
>  Trying to diagnose FPGA information ...
> 2019-01-25 06:46:03,085 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule:
>  Using traffic control bandwidth handler
> 2019-01-25 06:46:03,108 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
>  Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
> 2019-01-25 06:46:03,139 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl:
>  FPGA Plugin bootstrap success.
> 2019-01-25 06:46:03,247 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)bus:slot.func\s=\s.*, pattern
> 2019-01-25 06:46:03,248 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
> 2019-01-25 06:46:03,251 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
>  Failed to get major-minor number from reading /dev/pac_a10_f30
> 2019-01-25 06:46:03,252 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to 
> bootstrap configured resource subsystems!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException:
>  No FPGA devices detected!
> {noformat}
> Problem #2
> The plugin assumes that the file name under {{/dev}} can be derived from the 
> "Physical Dev Name", but this is wrong. For example, it assumes that the 
> device file is {{/dev/pac_a10_f30}}, which is not the case; the actual file 
> is {{/dev/intel-fpga-port.0}}.
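
A more robust direction (sketch only; the intel-fpga-port prefix is taken from 
the example above, everything else is illustrative) would be to scan /dev for 
the actual device nodes instead of deriving the name from the "Physical Dev 
Name":

{code:java}
// Illustrative sketch, not the actual patch: look for the real device nodes.
File[] fpgaDevices = new File("/dev").listFiles(
    (dir, name) -> name.startsWith("intel-fpga-port."));
if (fpgaDevices == null || fpgaDevices.length == 0) {
  throw new ResourceHandlerException("No FPGA devices detected!");
}
for (File device : fpgaDevices) {
  // major/minor numbers would then be read from the device node itself
  LOG.info("Found FPGA device: " + device.getAbsolutePath());
}
{code}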



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3881) Writing RM cluster-level metrics

2019-03-07 Thread Prabha Manepalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786861#comment-16786861
 ] 

Prabha Manepalli commented on YARN-3881:


[~BINGXUE QIU] If you are not currently working on this, may I take it up 
now? Thanks.

> Writing RM cluster-level metrics
> 
>
> Key: YARN-3881
> URL: https://issues.apache.org/jira/browse/YARN-3881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Major
>  Labels: YARN-5355
> Attachments: metrics.json
>
>
> RM has a bunch of metrics that we may want to write into the timeline 
> backend. I attached the metrics.json that I crawled via 
> {{http://localhost:8088/jmx?qry=Hadoop:*}}. IMHO, we need to pay attention to 
> three groups of metrics:
> 1. QueueMetrics
> 2. JvmMetrics
> 3. ClusterMetrics
> The problem is that unlike other metrics, which belong to a single 
> application, these belong to the RM or are cluster-wide. Therefore, the 
> current write path is not going to work for them because they don't have the 
> associated user/flow/app context info. We need to rethink how we model 
> cross-app metrics and the API to handle them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5714) ContainerExecutor does not order environment map

2019-03-07 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786869#comment-16786869
 ] 

Eric Payne commented on YARN-5714:
--

Okay, I will backport this through to branch-2.8

> ContainerExecutor does not order environment map
> 
>
> Key: YARN-5714
> URL: https://issues.apache.org/jira/browse/YARN-5714
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.1, 2.5.2, 2.7.3, 2.6.4, 3.0.0-alpha1
> Environment: all (linux and windows alike)
>Reporter: Remi Catherinot
>Assignee: Remi Catherinot
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.1.0
>
> Attachments: YARN-5714-branch-2.001.patch, 
> YARN-5714-branch-2.8.001.patch, YARN-5714.001.patch, YARN-5714.002.patch, 
> YARN-5714.003.patch, YARN-5714.004.patch, YARN-5714.005.patch, 
> YARN-5714.006.patch, YARN-5714.007.patch, YARN-5714.008.patch, 
> YARN-5714.009.patch
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> When dumping the launch container script, environment variables are dumped 
> in the order internally used by the map implementation (hash based). It does 
> not take into consideration that some env variables may refer to each other, 
> so some env variables must be declared before those referencing them.
> In my case, I ended up having LD_LIBRARY_PATH, which depends on 
> HADOOP_COMMON_HOME, dumped before HADOOP_COMMON_HOME. Thus it had a wrong 
> value, native libraries weren't loaded, and jobs were running but not at 
> their best efficiency. This is just one use case hitting that bug; I'm sure 
> others may happen as well.
> I already have a patch running in my production environment; I estimate 
> about 5 days to package the patch in the right fashion for JIRA and to do my 
> best to add tests.
> Note: the patch is not OS aware with a default empty implementation. I will 
> only implement the unix version in a 1st release. I'm not used to Windows env 
> variable syntax, so it will take me more time/research for it.
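
For illustration, the kind of ordering that addresses this could look like the 
sketch below (not the actual patch): a variable is only emitted once every 
variable it references has been emitted, and cycles fall back to the original 
order.

{code:java}
// Sketch only: order env entries so that a variable referencing $OTHER or
// ${OTHER} is emitted after OTHER. The substring check is deliberately naive.
public static Map<String, String> orderByDependencies(Map<String, String> env) {
  Map<String, String> ordered = new LinkedHashMap<>();
  List<Map.Entry<String, String>> pending = new ArrayList<>(env.entrySet());
  boolean progress = true;
  while (!pending.isEmpty() && progress) {
    progress = false;
    Iterator<Map.Entry<String, String>> it = pending.iterator();
    while (it.hasNext()) {
      Map.Entry<String, String> e = it.next();
      boolean ready = true;
      for (String other : env.keySet()) {
        if (!other.equals(e.getKey()) && !ordered.containsKey(other)
            && (e.getValue().contains("${" + other + "}")
                || e.getValue().contains("$" + other))) {
          ready = false; // depends on a variable that has not been emitted yet
          break;
        }
      }
      if (ready) {
        ordered.put(e.getKey(), e.getValue());
        it.remove();
        progress = true;
      }
    }
  }
  // anything left is part of a reference cycle; keep the original order
  for (Map.Entry<String, String> e : pending) {
    ordered.put(e.getKey(), e.getValue());
  }
  return ordered;
}
{code}

With such an ordering, HADOOP_COMMON_HOME would be written before 
LD_LIBRARY_PATH in the example above.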



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9358) Add javadoc to new methods introduced in FSQueueMetrics with YARN-9322

2019-03-07 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9358:
-
Description: 
This is a follow-up for YARN-9322, covering javadoc changes as discussed with 
[~templedf] earlier.
As discussed with Daniel, we need to add javadoc for the new methods introduced 
with YARN-9322 and also for the modified methods. 
The javadoc should refer to the fact that Resource Types are also included in 
the Resource object in case of get/set as well.
The methods are: 
1. getFairShare / setFairShare
2. getSteadyFairShare / setSteadyFairShare
3. getMinShare / setMinShare
4. getMaxShare / setMaxShare
5. getMaxAMShare / setMaxAMShare
6. getAMResourceUsage / setAMResourceUsage

Moreover, a javadoc could be added to the constructor of FSQueueMetrics as well.

  was:
This is a follow-up for YARN-9322.
As discussed with Daniel, we need to add javadoc for the new methods introduced 
with YARN-9322 and also for the modified methods. 
The javadoc should refer to the fact that Resource Types are also included in 
the Resource object in case of get/set as well.
The methods are: 
1. getFairShare / setFairShare
2. getSteadyFairShare / setSteadyFairShare
3. getMinShare / setMinShare
4. getMaxShare / setMaxShare
5. getMaxAMShare / setMaxAMShare
6. getAMResourceUsage / setAMResourceUsage

Moreover, a javadoc could be added to the constructor of FSQueueMetrics as well.


> Add javadoc to new methods introduced in FSQueueMetrics with YARN-9322
> --
>
> Key: YARN-9358
> URL: https://issues.apache.org/jira/browse/YARN-9358
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Priority: Major
>
> This is a follow-up for YARN-9322, covering javadoc changes as discussed with 
> [~templedf] earlier.
> As discussed with Daniel, we need to add javadoc for the new methods 
> introduced with YARN-9322 and also for the modified methods. 
> The javadoc should refer to the fact that Resource Types are also included in 
> the Resource object in case of get/set as well.
> The methods are: 
> 1. getFairShare / setFairShare
> 2. getSteadyFairShare / setSteadyFairShare
> 3. getMinShare / setMinShare
> 4. getMaxShare / setMaxShare
> 5. getMaxAMShare / setMaxAMShare
> 6. getAMResourceUsage / setAMResourceUsage
> Moreover, a javadoc could be added to the constructor of FSQueueMetrics as 
> well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9322) Store metrics for custom resource types into FSQueueMetrics and query them in FairSchedulerQueueInfo

2019-03-07 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786866#comment-16786866
 ] 

Szilard Nemeth commented on YARN-9322:
--

Hi [~adam.antal]!
Yep, thanks for pointing this out; your statement is most likely correct.

[~templedf]: Filed YARN-9358 as a follow-up, as we agreed earlier.

> Store metrics for custom resource types into FSQueueMetrics and query them in 
> FairSchedulerQueueInfo
> 
>
> Key: YARN-9322
> URL: https://issues.apache.org/jira/browse/YARN-9322
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Screen Shot 2019-02-21 at 12.06.46.png, 
> YARN-9322.001.patch, YARN-9322.002.patch, YARN-9322.003.patch, 
> YARN-9322.004.patch, YARN-9322.005.patch, YARN-9322.006.patch
>
>
> YARN-8842 implemented storing and exposing of metrics of custom resources.
> FSQueueMetrics should have a similar implementation.
> All metrics stored in this class should have their custom resource 
> counterpart.
> As a consequence of metrics not being stored for custom resource types, 
> FairSchedulerQueueInfo did not contain those values, so the UI v1 could not 
> show them.
> See that gpu is missing from the value of "AM Max Resources" on the attached 
> screenshot.
> Additionally, the callees of the following methods (in class 
> FairSchedulerQueueInfo) should consider querying values for custom resource 
> types too: 
> getMaxAMShareMB
> getMaxAMShareVCores
> getAMResourceUsageMB
> getAMResourceUsageVCores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3880) Writing more RM side app-level metrics

2019-03-07 Thread Prabha Manepalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786858#comment-16786858
 ] 

Prabha Manepalli commented on YARN-3880:


[~Naganarasimha] If you are not working on this now, may I take it up? 
Thanks.

> Writing more RM side app-level metrics
> --
>
> Key: YARN-3880
> URL: https://issues.apache.org/jira/browse/YARN-3880
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Naganarasimha G R
>Priority: Major
>  Labels: YARN-5355
>
> In YARN-3044, we implemented an analog of the metrics publisher for ATS v1. 
> While it helps to write app/attempt/container life cycle events, it doesn't 
> write many of the app-level system metrics that the RM now has. The metrics 
> I found missing:
> * runningContainers
> * memorySeconds
> * vcoreSeconds
> * preemptedResourceMB
> * preemptedResourceVCores
> * numNonAMContainerPreempted
> * numAMContainerPreempted
> Please feel free to add more to the list if you find something is not covered.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9358) Add javadoc to new methods introduced in FSQueueMetrics with YARN-9322

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9358:


 Summary: Add javadoc to new methods introduced in FSQueueMetrics 
with YARN-9322
 Key: YARN-9358
 URL: https://issues.apache.org/jira/browse/YARN-9358
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


This is a follow-up for YARN-9322.
As discussed with Daniel, we need to add javadoc for the new methods introduced 
with YARN-9322 and also for the modified methods. 
The javadoc should refer to the fact that Resource Types are also included in 
the Resource object in case of get/set as well.
The methods are: 
1. getFairShare / setFairShare
2. getSteadyFairShare / setSteadyFairShare
3. getMinShare / setMinShare
4. getMaxShare / setMaxShare
5. getMaxAMShare / setMaxAMShare
6. getAMResourceUsage / setAMResourceUsage

Moreover, a javadoc could be added to the constructor of FSQueueMetrics as well.
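
For illustration only, the javadoc on the getters could look roughly like this 
(the field name is made up; the exact wording is to be agreed on in the patch):

{code:java}
/**
 * Gets the fair share of this queue.
 * Besides memory and vcores, the returned Resource also carries the values
 * of all configured custom resource types.
 *
 * @return the fair share of the queue, including custom resource types
 */
public Resource getFairShare() {
  return fairShare; // existing implementation stays unchanged
}
{code}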



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9357) HBaseTimelineReaderImpl storage monitor log level change

2019-03-07 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9357:
---

 Summary: HBaseTimelineReaderImpl storage monitor log level change
 Key: YARN-9357
 URL: https://issues.apache.org/jira/browse/YARN-9357
 Project: Hadoop YARN
  Issue Type: Bug
  Components: ATSv2
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


HBaseTimelineReaderImpl's storage monitor logs the message below every minute. 
It should be changed to DEBUG level.

{code}
2019-03-07 13:48:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:49:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:50:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:51:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:52:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:53:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:54:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:55:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:56:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:57:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:58:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 13:59:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:00:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:01:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:02:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:03:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:04:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:05:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:06:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:07:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:08:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:09:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:10:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:11:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:12:28,764 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
2019-03-07 14:13:28,763 INFO 
org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineReaderImpl: 
Running HBase liveness monitor
{code}
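
The change itself is presumably just a log-level switch along these lines 
(sketch):

{code:java}
// before
LOG.info("Running HBase liveness monitor");

// after: only emitted when DEBUG logging is enabled for this class
if (LOG.isDebugEnabled()) {
  LOG.debug("Running HBase liveness monitor");
}
{code}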






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9356) Add more tests to ratio method in TestResourceCalculator

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9356:


 Summary: Add more tests to ratio method in TestResourceCalculator 
 Key: YARN-9356
 URL: https://issues.apache.org/jira/browse/YARN-9356
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


TestResourceCalculator has some edge-case test cases that verify how division 
by zero is handled by ResourceCalculator.
We need basic tests for the ratio method as well, like we have for the other 
ResourceCalculator methods.
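
A basic (non-edge-case) test could look roughly like the sketch below, assuming 
the usual ResourceCalculator#ratio(Resource, Resource) signature:

{code:java}
@Test
public void testRatioWithStandardResources() {
  ResourceCalculator calc = new DominantResourceCalculator();
  Resource a = Resource.newInstance(4096, 4);
  Resource b = Resource.newInstance(8192, 8);
  // both dimensions give 0.5, so the (dominant) ratio is 0.5
  assertEquals(0.5f, calc.ratio(a, b), 0.001f);
}
{code}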



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9314) Fair Scheduler: Queue Info mistake when configured same queue name at same level

2019-03-07 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786734#comment-16786734
 ] 

Wilfred Spiegelenburg commented on YARN-9314:
-

Hi [~fengyongshe], thank you for filing this and providing a patch.

I have a couple of comments:
* The text in the exception needs clarification:
{{queuename (" + queueName + ") repeated defining in Allocation File}}
something like this is clearer:
{{queue name (" + queueName + ") is defined multiple times, queues can only be 
defined once.}}
* The {{exists}} method can be simplified:
{code}
public boolean exists(String queueName) {
  for (FSQueueType queueType : FSQueueType.values()) {
if (configuredQueues.get(queueType).contains(queueName)) {
  return true;
}
  }
  return false;
}
{code}
* Instead of checking the text of the message in the exception, it is better 
to use {{(expected = AllocationConfigurationException.class)}} on the test 
(see the sketch below). If we change the text, the test would still pass, 
making maintenance easier. We already do that in a number of tests, 
{{testQueueAlongsideRoot}} for example.
* The patch introduces a number of new checkstyle issues which should be fixed.
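
A minimal sketch of that pattern (the helper name below is made up for 
illustration):

{code:java}
@Test(expected = AllocationConfigurationException.class)
public void testDuplicateQueueNameAtSameLevel() throws Exception {
  // hypothetical helper: writes an allocation file that defines the same
  // queue name twice at the same level and triggers a reload, which is
  // expected to throw AllocationConfigurationException
  loadAllocationFileWithDuplicateQueueName();
}
{code}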

> Fair Scheduler: Queue Info mistake when configured same queue name at same 
> level
> 
>
> Key: YARN-9314
> URL: https://issues.apache.org/jira/browse/YARN-9314
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: fengyongshe
>Priority: Major
> Attachments: Fair Scheduler Mistake when configured same queue at 
> same level.png, YARN-9341.patch
>
>
> The Queue Info is configured in fair-scheduler.xml like below:
> {noformat}
> <queue name="deva">
>   <minResources>3072mb,3vcores</minResources>
>   <maxResources>4096mb,4vcores</maxResources>
>   <queue name="sample">
>     <minResources>1024mb,1vcores</minResources>
>     <maxResources>2048mb,2vcores</maxResources>
>     <aclSubmitApps>Charlie</aclSubmitApps>
>   </queue>
> </queue>
> <queue name="deva">
>   <minResources>1024mb,1vcores</minResources>
>   <maxResources>2048mb,2vcores</maxResources>
> </queue>
> {noformat}
>  
> The queue root.deva configured last overrides the existing root.deva (the 
> parent of root.deva.sample), as shown in the attachment:
>  
>   root.deva
> ||Used Resources:||
> ||Min Resources:|.  => should be <3072mb,3vcores>|
> ||Max Resources:|.  => should be <4096mb,4vcores>|
> ||Reserved Resources:||
> ||Steady Fair Share:||
> ||Instantaneous Fair Share:||
>  
> root.deva.sample
> ||Min Resources:||
> ||Max Resources:||
> ||Reserved Resources:||
> ||Steady Fair Share:||
>      
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2

2019-03-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786697#comment-16786697
 ] 

Hadoop QA commented on YARN-8549:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} branch-2 passed with JDK v1.8.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} branch-2 passed with JDK v1.8.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed with JDK v1.8.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed with JDK v1.8.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:da67579 |
| JIRA Issue | YARN-8549 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961534/YARN-8549-branch-2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 85612c273098 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / d71cfe1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| Multi-JDK 

[jira] [Commented] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786686#comment-16786686
 ] 

tianjuan commented on YARN-9351:


@Sunil Govindan could you take a look at this?

> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
> point the used resource of queue Prod is (90G memory,10 vcores), at this time 
> even though yarn.scheduler.capacity..minimum-user-limit-percent 
> is set to 100 , users in queue A can't get more resource.
>  
> The reason for this is that in {color:#d04437}*computeUserLimit*{color}, 
> partitionResource is used for comparing consumed and queueCapacity, so in 
> the example (75G memory, 25 vcores) is the user limit:
> Resource currentCapacity = Resources.lessThan(resourceCalculator,
>     partitionResource, consumed, queueCapacity)
>     ? queueCapacity
>     : Resources.add(consumed, required);
> Resource userLimitResource = Resources.max(resourceCalculator,
>     partitionResource,
>     Resources.divideAndCeil(resourceCalculator, resourceUsed,
>         usersSummedByWeight),
>     Resources.divideAndCeil(resourceCalculator,
>         Resources.multiplyAndRoundDown(currentCapacity, getUserLimit()),
>         100));
>  
> but in *{color:#d04437}canAssignToUser{color}*,
> Resources.greaterThan(resourceCalculator, clusterResource,
>     user.getUsed(nodePartition), limit)
> uses *{color:#d04437}clusterResource{color}* {color:#33}for comparing *used* 
> and *limit*, so the result is *false*.{color}
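
In other words, the comparison would be consistent with computeUserLimit if it 
were made against the partition resource rather than the cluster resource; 
roughly (a sketch of the reporter's argument, not a reviewed fix):

{code:java}
// sketch: compare against the partition's resource, as computeUserLimit does
boolean canAssignToUser = Resources.greaterThan(resourceCalculator,
    partitionResource,                  // instead of clusterResource
    user.getUsed(nodePartition), limit);
{code}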



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when *{color:#d04437}canAssignToUser{color}* = 
Resources.greaterThan(resourceCalculator, clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing * *used and limit, the result is false.*{color}

  was:
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when *{color:#d04437}canAssignToUser{color}* = 
Resources.greaterThan(resourceCalculator, clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing * *used and limit, the result *is false.*{color}


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> 

[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when *{color:#d04437}canAssignToUser{color}* = 
Resources.greaterThan(resourceCalculator, clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing  *used and limit, the result is false.*{color}

  was:
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when *{color:#d04437}canAssignToUser{color}* = 
Resources.greaterThan(resourceCalculator, clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing * *used and limit, the result is false.*{color}


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> 

[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Affects Version/s: (was: 2.9.2)
   3.1.2

> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
> point the used resource of queue Prod is (90G memory,10 vcores), at this time 
> even though yarn.scheduler.capacity..minimum-user-limit-percent 
> is set to 100 , users in queue A can't get more resource.
>  
> the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
> partitionResource is used for comparing consumed, queueCapacity, so in the 
> example (75G memory, 25 vcores) is the user limit. 
> Resource currentCapacity = Resources.lessThan(resourceCalculator,
>  partitionResource, consumed, queueCapacity)
>  ? queueCapacity
>  : Resources.add(consumed, required);
> Resource userLimitResource = Resources.max(resourceCalculator, 
> partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,
> usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
>  getUserLimit()),100));
>  
> but when *{color:#d04437}canAssignToUser{color}* = 
> Resources.greaterThan(resourceCalculator, clusterResource,
>  user.getUsed(nodePartition), limit)
> *{color:#d04437}clusterResource{color}* {color:#33}is used for for 
> comparing * *used and limit, the result *is false.*{color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9355) RMContainerRequestor#makeRemoteRequest has confusing log message

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9355:


 Summary: RMContainerRequestor#makeRemoteRequest has confusing log 
message
 Key: YARN-9355
 URL: https://issues.apache.org/jira/browse/YARN-9355
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#makeRemoteRequest 
has this log: 

{code:java}
if (ask.size() > 0 || release.size() > 0) {
  LOG.info("getResources() for " + applicationId + ":" + " ask="
  + ask.size() + " release= " + release.size() + " newContainers="
  + allocateResponse.getAllocatedContainers().size()
  + " finishedContainers=" + numCompletedContainers
  + " resourcelimit=" + availableResources + " knownNMs="
  + clusterNmCount);
}
{code}
The reason why "getResources()" is printed because 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator#getResources invokes 
makeRemoteRequest. This is not too informative and error-prone as name of 
getResources could change over time and the log will be outdated. Moreover, 
it's not a good idea to print a method name from a method below the current one 
in the stack.
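
One possible direction (a sketch only, keeping the existing fields): describe 
what makeRemoteRequest itself did instead of hard-coding the caller's name:

{code:java}
if (ask.size() > 0 || release.size() > 0) {
  LOG.info("Sent allocate request for " + applicationId + ": ask="
      + ask.size() + " release=" + release.size() + " newContainers="
      + allocateResponse.getAllocatedContainers().size()
      + " finishedContainers=" + numCompletedContainers
      + " resourceLimit=" + availableResources + " knownNMs="
      + clusterNmCount);
}
{code}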




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when *{color:#d04437}canAssignToUser{color}* = 
Resources.greaterThan(resourceCalculator, clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing * *used and limit, the result *is false.*{color}

  was:
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when canAssignToUser = Resources.greaterThan(resourceCalculator, 
clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing ** used and limit, the result *is false.*{color}


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> 

[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though yarn.scheduler.capacity..minimum-user-limit-percent is 
set to 100 , users in queue A can't get more resource.

 

the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
partitionResource is used for comparing consumed, queueCapacity, so in the 
example (75G memory, 25 vcores) is the user limit. 

Resource currentCapacity = Resources.lessThan(resourceCalculator,
 partitionResource, consumed, queueCapacity)
 ? queueCapacity
 : Resources.add(consumed, required);

Resource userLimitResource = Resources.max(resourceCalculator, 
partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,

usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
 getUserLimit()),100));

 

but when canAssignToUser = Resources.greaterThan(resourceCalculator, 
clusterResource,
 user.getUsed(nodePartition), limit)

*{color:#d04437}clusterResource{color}* {color:#33}is used for for 
comparing ** used and limit, the result *is false.*{color}

  was:
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
 

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though  


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
> point the used resource of queue Prod is (90G memory,10 vcores), at this time 
> even though yarn.scheduler.capacity..minimum-user-limit-percent 
> is set to 100 , users in queue A can't get more resource.
>  
> the reason for this is that  when {color:#d04437}*computeUserLimit*{color}, 
> partitionResource is used for comparing consumed, queueCapacity, so in the 
> example (75G memory, 25 vcores) is the user limit. 
> Resource currentCapacity = Resources.lessThan(resourceCalculator,
>  partitionResource, consumed, queueCapacity)
>  ? queueCapacity
>  : Resources.add(consumed, required);
> Resource userLimitResource = Resources.max(resourceCalculator, 
> partitionResource,Resources.divideAndCeil(resourceCalculator, resourceUsed,
> usersSummedByWeight),Resources.divideAndCeil(resourceCalculator,Resources.multiplyAndRoundDown(currentCapacity,
>  getUserLimit()),100));
>  
> but when canAssignToUser = Resources.greaterThan(resourceCalculator, 
> clusterResource,
>  

[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
 

and 
yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]

yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
point the used resource of queue Prod is (90G memory,10 vcores), at this time 
even though  

  was:
if we configure queue capacity in absolute term, users can't use total resource 
of one partition even 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
 for example there are two partition A,B, partition A has (120G memory,30 
vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
configured with (75G memory, 25 vcores) partition A resource, like 

yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25]


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.2
>Reporter: tianjuan
>Priority: Major
>
> if we configure queue capacity in absolute term, users can't use total 
> resource of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
>  for example there are two partition A,B, partition A has (120G memory,30 
> vcores), and partition B has (180G memory,60 vcores), and Queue Prod is 
> configured with (75G memory, 25 vcores) partition A resource, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25],
>  
> and 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity=[memory=120Gi,vcores=30]
> yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent=100, and at one 
> point the used resource of queue Prod is (90G memory,10 vcores), at this time 
> even though  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9354) TestUtils#createResource calls should be replaced with ResourceTypesTestHelper#newResource

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9354:


 Summary: TestUtils#createResource calls should be replaced with 
ResourceTypesTestHelper#newResource
 Key: YARN-9354
 URL: https://issues.apache.org/jira/browse/YARN-9354
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestUtils#createResource
 has a very similar (though not identical) implementation to 
org.apache.hadoop.yarn.resourcetypes.ResourceTypesTestHelper#newResource. 
Since these two methods do essentially the same thing and 
ResourceTypesTestHelper is newer and more widely used, all occurrences of 
TestUtils#createResource should be replaced with 
ResourceTypesTestHelper#newResource.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2

2019-03-07 Thread Prabha Manepalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabha Manepalli updated YARN-8549:
---
Attachment: YARN-8549-branch-2.001.patch

> Adding a NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549-branch-2.001.patch, YARN-8549.v1.patch, 
> YARN-8549.v2.patch, YARN-8549.v4.patch, YARN-8549.v5.patch
>
>
> Stub implementations of the TimelineReader and TimelineWriter classes. 
> These are useful for functional testing of the writer and reader paths for ATSv2.





[jira] [Created] (YARN-9353) TestNMWebFilter should be renamed

2019-03-07 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created YARN-9353:


 Summary: TestNMWebFilter should be renamed
 Key: YARN-9353
 URL: https://issues.apache.org/jira/browse/YARN-9353
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth


TestNMWebFilter should be renamed, as there is no class named NMWebFilter; the 
class it tests is NMWebAppFilter. The javadoc of the class is also outdated.





[jira] [Updated] (YARN-9351) user can't use total resources of one partition even yarn.scheduler.capacity..minimum-user-limit-percent is set to 100

2019-03-07 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9351:
---
Description: 
If we configure queue capacity in absolute terms, users can't use the total 
resources of one partition even when 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100. 
For example, there are two partitions, A and B; partition A has (120G memory, 
30 vcores), partition B has (180G memory, 60 vcores), and queue Prod is 
configured with (75G memory, 25 vcores) of partition A's resources, like 

yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25]

  was:
If we configure queue capacity in absolute terms, users can't use the total 
resources of one partition even when 
yarn.scheduler.capacity..minimum-user-limit-percent is set to 100. 
For example, there are two partitions, A and B; partition A has (120G 
memory, 30 vcores), and partition B 


> user can't use total resources of one partition even 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100 
> ---
>
> Key: YARN-9351
> URL: https://issues.apache.org/jira/browse/YARN-9351
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.2
>Reporter: tianjuan
>Priority: Major
>
> If we configure queue capacity in absolute terms, users can't use the total 
> resources of one partition even when 
> yarn.scheduler.capacity..minimum-user-limit-percent is set to 100. 
> For example, there are two partitions, A and B; partition A has (120G memory, 
> 30 vcores), partition B has (180G memory, 60 vcores), and queue Prod is 
> configured with (75G memory, 25 vcores) of partition A's resources, like 
> yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity=[memory=75Gi,vcores=25]
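
A minimal sketch of the configuration described in this report, written against 
org.apache.hadoop.conf.Configuration purely for illustration; in a real cluster 
these keys live in capacity-scheduler.xml, and the class name below is 
hypothetical. The property names and values are taken from the report itself.

```
import org.apache.hadoop.conf.Configuration;

// Sketch of the capacity-scheduler settings from the report. Setting them on
// a Configuration object here is only for illustration; in practice they are
// edited in capacity-scheduler.xml.
public class ProdQueueAbsoluteCapacityExample {
  public static Configuration prodQueueConfig() {
    Configuration conf = new Configuration(false);
    // Queue Prod gets an absolute share of partition (node label) A.
    conf.set("yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.capacity",
        "[memory=75Gi,vcores=25]");
    conf.set("yarn.scheduler.capacity.root.Prod.accessible-node-labels.A.maximum-capacity",
        "[memory=120Gi,vcores=30]");
    // A single user is expected to be able to consume the whole queue.
    conf.set("yarn.scheduler.capacity.root.Prod.minimum-user-limit-percent", "100");
    return conf;
  }
}
```

With this configuration, the report observes that a user's usage stops growing 
below the partition's total resources even though minimum-user-limit-percent 
is 100.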





[jira] [Updated] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2

2019-03-07 Thread Prabha Manepalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabha Manepalli updated YARN-8549:
---
Attachment: (was: YARN-8549-branch-2.03.patch)

> Adding a NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549.v1.patch, YARN-8549.v2.patch, 
> YARN-8549.v4.patch, YARN-8549.v5.patch
>
>
> Stub implementations of the TimelineReader and TimelineWriter classes. 
> These are useful for functional testing of the writer and reader paths for ATSv2.





[jira] [Updated] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2

2019-03-07 Thread Prabha Manepalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabha Manepalli updated YARN-8549:
---
Attachment: (was: YARN-8549-branch-2.04.patch)

> Adding a NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549.v1.patch, YARN-8549.v2.patch, 
> YARN-8549.v4.patch, YARN-8549.v5.patch
>
>
> Stub implementations of the TimelineReader and TimelineWriter classes. 
> These are useful for functional testing of the writer and reader paths for ATSv2.





[jira] [Commented] (YARN-9343) Replace isDebugEnabled with SLF4J parameterized log messages

2019-03-07 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786572#comment-16786572
 ] 

Wilfred Spiegelenburg commented on YARN-9343:
-

Yes, I am fine with that. This patch is big enough to leave it as is.

I did not see any issues in the latest patch besides the ones for which new 
JIRAs will be opened. +1 (non-binding)

> Replace isDebugEnabled with SLF4J parameterized log messages
> 
>
> Key: YARN-9343
> URL: https://issues.apache.org/jira/browse/YARN-9343
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9343-001.patch, YARN-9343-002.patch, 
> YARN-9343-003.patch
>
>
> Replace isDebugEnabled with SLF4J parameterized log messages. 
> https://www.slf4j.org/faq.html
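
A minimal before/after sketch of the pattern this issue applies; 
HypotheticalService and its method are invented for illustration and are not 
taken from the patch.

```
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: HypotheticalService is not a real Hadoop class.
public class HypotheticalService {
  private static final Logger LOG =
      LoggerFactory.getLogger(HypotheticalService.class);

  void handle(String containerId, long elapsedMs) {
    // Before: explicit guard plus string concatenation.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Processed container " + containerId + " in " + elapsedMs + " ms");
    }

    // After: SLF4J parameterized message; the string is only built when
    // DEBUG is enabled, so the explicit guard becomes unnecessary.
    LOG.debug("Processed container {} in {} ms", containerId, elapsedMs);
  }
}
```

An explicit isDebugEnabled guard then remains worthwhile only when computing 
one of the log arguments is itself expensive.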




