date:20201204

[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still 
reports incorrect allocated values over JMX, such as 
{color:#660e7a}allocatedMB{color}, {color:#660e7a}allocatedVCores{color} and 
{color:#660e7a}allocatedContainers{color}, when the node partition is 
updated from "DEFAULT" to another label while applications are running.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application finishes, the metrics are wrong (screenshots 
attached).

==

 

FiCaSchedulerApp doesn't update queue metrics when CapacityScheduler handles 
the {color:#660e7a}NODE_LABELS_UPDATE{color} event.

So we should release the container's resource from the old partition and add 
it as used resource to the new partition, just as is done when updating 
queueUsage.
{code:java}
public void nodePartitionUpdated(RMContainer rmContainer, String oldPartition,
    String newPartition) {
  Resource containerResource = rmContainer.getAllocatedResource();
  this.attemptResourceUsage.decUsed(oldPartition, containerResource);
  this.attemptResourceUsage.incUsed(newPartition, containerResource);
  getCSLeafQueue().decUsedResource(oldPartition, containerResource, this);
  getCSLeafQueue().incUsedResource(newPartition, containerResource, this);

  // Update new partition name if container is AM and also update AM resource
  if (rmContainer.isAMContainer()) {
    setAppAMNodePartitionName(newPartition);
    this.attemptResourceUsage.decAMUsed(oldPartition, containerResource);
    this.attemptResourceUsage.incAMUsed(newPartition, containerResource);
    getCSLeafQueue().decAMUsedResource(oldPartition, containerResource, this);
    getCSLeafQueue().incAMUsedResource(newPartition, containerResource, this);
  }
}
{code}
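
The idea of releasing from the old partition and adding to the new one can be pictured with a small, self-contained sketch. This is not the YARN patch and does not use the real QueueMetrics API: the class PartitionMetricsSketch, its methods, the partition key "DEFAULT" and the container size are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-partition allocated counters (not the real QueueMetrics). */
public class PartitionMetricsSketch {

  /** Allocated counters tracked for a single partition. */
  static final class Allocated {
    long memoryMB;
    int vCores;
    int containers;
  }

  private final Map<String, Allocated> byPartition = new HashMap<>();

  /** Returns the counters for a partition, creating them on first use. */
  Allocated get(String partition) {
    return byPartition.computeIfAbsent(partition, p -> new Allocated());
  }

  /** Records one container allocated on the given partition. */
  void allocate(String partition, long memoryMB, int vCores) {
    Allocated a = get(partition);
    a.memoryMB += memoryMB;
    a.vCores += vCores;
    a.containers++;
  }

  /** Records one container released from the given partition. */
  void release(String partition, long memoryMB, int vCores) {
    Allocated a = get(partition);
    a.memoryMB -= memoryMB;
    a.vCores -= vCores;
    a.containers--;
  }

  /** Moves a running container's allocation from the old partition to the new one. */
  void nodePartitionUpdated(String oldPartition, String newPartition,
      long memoryMB, int vCores) {
    release(oldPartition, memoryMB, vCores);
    allocate(newPartition, memoryMB, vCores);
  }

  public static void main(String[] args) {
    PartitionMetricsSketch metrics = new PartitionMetricsSketch();
    // Container starts on the default partition.
    metrics.allocate("DEFAULT", 2048, 2);
    // The node is relabelled while the container is still running.
    metrics.nodePartitionUpdated("DEFAULT", "tpcds", 2048, 2);
    // Container finishes and is released from the partition it now belongs to.
    metrics.release("tpcds", 2048, 2);
    System.out.println("DEFAULT allocatedMB=" + metrics.get("DEFAULT").memoryMB
        + ", tpcds allocatedMB=" + metrics.get("tpcds").memoryMB);
  }
}
```

In this sketch, skipping the nodePartitionUpdated call would leave 2048 MB recorded on "DEFAULT" and drive "tpcds" to -2048 MB after the release, which is the kind of drift the steps to reproduce describe.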

  was:
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAULT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: YARN-10517-branch-3.2.001.patch, wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAULT" to other label and there are  running applications.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).
> ==
>  
> FiCaSchedulerApp doesn't update queue metrics when CapacityScheduler handles 
> this event {color:#660e7a}NODE_LABELS_UPDATE.{color}
> So we should release container resource from old partition and add used 
> resource to new partition, just as updating queueUsage.
> {code:java}
> // code placeholder
> public void nodePartitionUpdated(RMContainer rmContainer, String oldPartition,
> String newPartition) {
>   Resource containerResource = rmContainer.getAllocatedResource();
>   this.attemptResourceUsage.decUsed(oldPartition, containerResource);
>   this.attemptResourceUsage.incUsed(newPartition, containerResource);
>   getCSLeafQueue().decUsedResource(oldPartition, containerResource, t

[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAULT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).

  was:
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: YARN-10517-branch-3.2.001.patch, wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAULT" to other label and there are  running applications.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).

  was:
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUlT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: YARN-10517-branch-3.2.001.patch, wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAUT" to other label and there are  running applications.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Attachment: YARN-10517-branch-3.2.001.patch

> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: YARN-10517-branch-3.2.001.patch, wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAUlT" to other label and there are  running applications.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).




[jira] [Commented] (YARN-10491) Fix deprecation warnings in SLSWebApp.java

2020-12-04 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244162#comment-17244162
 ] 

Hadoop QA commented on YARN-10491:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 31m  
7s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 34m 
54s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 53s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
46s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} 
hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 0 new + 0 unchanged - 6 fixed 
= 0 total (was 6) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green}{color} | {color:green} 
hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 
with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 0 new + 
0 unchanged - 6 fixed = 0 total (was 6) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 48s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client

[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2020-12-04 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244055#comment-17244055
 ] 

Hadoop QA commented on YARN-4783:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
32s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
 3s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 43s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
21s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/361/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 11 new + 58 unchanged - 0 fixed = 69 total (was 58) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  6s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {col

[jira] [Commented] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2020-12-04 Thread Szilard Nemeth (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244009#comment-17244009
 ] 

Szilard Nemeth commented on YARN-4783:
--

Hi [~gandras],

The latest patch caused javadoc and checkstyle issues, but I was not able to 
open the link as it's no longer available.

Reuploaded the latest patch to kick off Jenkins.

> Log aggregation failure for application when Nodemanager is restarted 
> --
>
> Key: YARN-4783
> URL: https://issues.apache.org/jira/browse/YARN-4783
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-4783.001.patch, YARN-4783.002.patch, 
> YARN-4783.003.patch, YARN-4783.004.patch, YARN-4783.005.patch, 
> YARN-4783.005.patch
>
>
> Scenario :
>  =
> 1.Start NM with user dsperf:hadoop
>  2.Configure linux-execute user as dsperf
>  3.Submit application with yarn user 
>  4.Once few containers are allocated to NM 1
>  5.Nodemanager 1 is stopped (wait for expiry )
>  6.Start node manager after application is completed
>  7.Check the log aggregation is happening for the containers log in NMLocal 
> directory
> Expected Output :
>  ===
>  Log aggregation should be successful
> Actual Output :
>  ===
>  Log aggregation not successful




[jira] [Updated] (YARN-4783) Log aggregation failure for application when Nodemanager is restarted

2020-12-04 Thread Szilard Nemeth (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-4783:
-
Attachment: YARN-4783.005.patch

> Log aggregation failure for application when Nodemanager is restarted 
> --
>
> Key: YARN-4783
> URL: https://issues.apache.org/jira/browse/YARN-4783
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-4783.001.patch, YARN-4783.002.patch, 
> YARN-4783.003.patch, YARN-4783.004.patch, YARN-4783.005.patch, 
> YARN-4783.005.patch
>
>
> Scenario :
>  =
> 1.Start NM with user dsperf:hadoop
>  2.Configure linux-execute user as dsperf
>  3.Submit application with yarn user 
>  4.Once few containers are allocated to NM 1
>  5.Nodemanager 1 is stopped (wait for expiry )
>  6.Start node manager after application is completed
>  7.Check the log aggregation is happening for the containers log in NMLocal 
> directory
> Expected Output :
>  ===
>  Log aggregation should be successful
> Actual Output :
>  ===
>  Log aggregation not successful




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUlT" to other label and there are  running applications.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).

  was:
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUlT" to other label.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAUlT" to other label and there are  running applications.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Attachment: wrong metrics.png

> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
> Attachments: wrong metrics.png
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
> {color}{color:#660e7a}allocatedVCores and 
> {color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
> updated from "DEFAUlT" to other label.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics get wrong (screenshots 
> attached).




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated jmx, such as  {color:#660e7a}allocatedMB, 
{color}{color:#660e7a}allocatedVCores and 
{color}{color:#660e7a}allocatedContainers, {color}when the node partition is 
updated from "DEFAUlT" to other label.

Steps to reproduce

==
 # Configure capacity-scheduler.xml with label configuration
 # Submit one application to default partition and run
 # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
"tpcds" when the above application is running
 # Note down "VCores Used" at Web UI
 # When the application is finished, the metrics get wrong (screenshots 
attached).

  was:
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated resources, such as allocatedMB, allocatedVCores and 
allocatedContainers. 

 


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated JMX metrics, such as allocatedMB, allocatedVCores and 
> allocatedContainers, when the node partition is updated from "DEFAULT" to 
> another label.
> Steps to reproduce
> ==
>  # Configure capacity-scheduler.xml with label configuration
>  # Submit one application to default partition and run
>  # Add label "tpcds" to cluster and replace label on node1 and node2 to be 
> "tpcds" when the above application is running
>  # Note down "VCores Used" at Web UI
>  # When the application is finished, the metrics are wrong (screenshots 
> attached).
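
The reproduction steps above can be modelled with a small toy counter. This is only a sketch of the bookkeeping problem, not Hadoop's QueueMetrics: the class PartitionUsageSketch and all of its methods are hypothetical, and it assumes that a container's release is accounted against the node's new partition once the label has changed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy per-partition vcore counter (hypothetical; not Hadoop's QueueMetrics). */
final class PartitionUsageSketch {
  private final Map<String, Integer> allocatedVCores = new HashMap<>();

  /** Record vcores allocated in the given partition. */
  void allocate(String partition, int vcores) {
    allocatedVCores.merge(partition, vcores, Integer::sum);
  }

  /** Record vcores released from the given partition. */
  void release(String partition, int vcores) {
    allocatedVCores.merge(partition, -vcores, Integer::sum);
  }

  /** Move a running container's usage when its node changes partition. */
  void nodePartitionUpdated(String oldPartition, String newPartition, int vcores) {
    release(oldPartition, vcores);
    allocate(newPartition, vcores);
  }

  int allocated(String partition) {
    return allocatedVCores.getOrDefault(partition, 0);
  }

  public static void main(String[] args) {
    // Without the move: allocated under DEFAULT, released under tpcds.
    PartitionUsageSketch stale = new PartitionUsageSketch();
    stale.allocate("DEFAULT", 2);
    stale.release("tpcds", 2);
    System.out.println(stale.allocated("DEFAULT") + " " + stale.allocated("tpcds")); // 2 -2

    // With the move applied when the label changes, both counters return to zero.
    PartitionUsageSketch moved = new PartitionUsageSketch();
    moved.allocate("DEFAULT", 2);
    moved.nodePartitionUpdated("DEFAULT", "tpcds", 2);
    moved.release("tpcds", 2);
    System.out.println(moved.allocated("DEFAULT") + " " + moved.allocated("tpcds")); // 0 0
  }
}
```

In this model the counters only balance if usage is moved between partitions at the moment the node label changes.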




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: 
After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
incorrect allocated resources, such as allocatedMB, allocatedVCores and 
allocatedContainers. 

 

  was:After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still 
has incorrect  . 


> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect allocated resources, such as allocatedMB, allocatedVCores and 
> allocatedContainers. 
>  




[jira] [Updated] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sibyl.lv updated YARN-10517:

Description: After https://issues.apache.org/jira/browse/YARN-9596, 
QueueMetrics still has incorrect  . 

> QueueMetrics has incorrect Allocated Resource when labelled partitions updated
> --
>
> Key: YARN-10517
> URL: https://issues.apache.org/jira/browse/YARN-10517
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.8.0, 3.3.0
>Reporter: sibyl.lv
>Priority: Major
> Fix For: 3.3.1, 3.2.3
>
>
> After https://issues.apache.org/jira/browse/YARN-9596, QueueMetrics still has 
> incorrect  . 




[jira] [Created] (YARN-10517) QueueMetrics has incorrect Allocated Resource when labelled partitions updated

2020-12-04 Thread sibyl.lv (Jira)

sibyl.lv created YARN-10517:
---

 Summary: QueueMetrics has incorrect Allocated Resource when 
labelled partitions updated
 Key: YARN-10517
 URL: https://issues.apache.org/jira/browse/YARN-10517
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.8.0, 3.3.0
Reporter: sibyl.lv
 Fix For: 3.3.1, 3.2.3







[jira] [Commented] (YARN-10491) Fix deprecation warnings in SLSWebApp.java

2020-12-04 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243921#comment-17243921
 ] 

Hadoop QA commented on YARN-10491:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 2m 32s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 35m 30s | | trunk passed |
| +1 | compile | 0m 25s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | compile | 0m 24s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 | checkstyle | 0m 20s | | trunk passed |
| +1 | mvnsite | 0m 27s | | trunk passed |
| +1 | shadedclient | 18m 2s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 24s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 | javadoc | 0m 22s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| 0 | spotbugs | 0m 47s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 0m 44s | | trunk passed |
|| || || || Patch Compile Tests || ||
| -1 | mvninstall | 0m 15s | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2519/1/artifact/out/patch-mvninstall-hadoop-tools_hadoop-sls.txt | hadoop-sls in the patch failed. |
| -1 | compile | 0m 16s | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2519/1/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt | hadoop-sls in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. |
| -1 | javac | 0m 16s | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2519/1/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt | hadoop-sls in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. |
| -1 | compile | 0m 15s | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2519/1/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt | hadoop-sls in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. |
| -1 | javac | 0m 15s | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2519/1/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt | hadoop-sls in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. |
| +1 | checkstyle | 0m

[jira] [Updated] (YARN-10491) Fix deprecation warnings in SLSWebApp.java

2020-12-04 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YARN-10491:
--
Labels: newbie pull-request-available  (was: newbie)

> Fix deprecation warnings in SLSWebApp.java
> --
>
> Key: YARN-10491
> URL: https://issues.apache.org/jira/browse/YARN-10491
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Ankit Kumar
>Priority: Minor
>  Labels: newbie, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
>   simulateInfoTemplate = IOUtils.toString(
>   cl.getResourceAsStream("html/simulate.info.html.template"));
>   simulateTemplate = IOUtils.toString(
>   cl.getResourceAsStream("html/simulate.html.template"));
>   trackTemplate = IOUtils.toString(
>   cl.getResourceAsStream("html/track.html.template"));
> {code}
> {{IOUtils.toString(InputStream, Charset)}} should be used.
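
As a rough illustration of the point above, the sketch below decodes a stream with an explicit charset instead of relying on the platform default. It uses only the JDK so that it stays self-contained. With Commons IO the equivalent call would be IOUtils.toString(stream, StandardCharsets.UTF_8). The class name TemplateLoaderSketch is hypothetical, and UTF-8 is an assumption about the template encoding, not something the issue states.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: read a whole stream using an explicit charset. */
final class TemplateLoaderSketch {

  static String readFully(InputStream in, Charset charset) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    int n;
    // Copy until end of stream, then decode once with the given charset.
    while ((n = in.read(chunk)) != -1) {
      buffer.write(chunk, 0, n);
    }
    return new String(buffer.toByteArray(), charset);
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for cl.getResourceAsStream("html/simulate.html.template").
    byte[] bytes = "<h1>caf\u00e9</h1>".getBytes(StandardCharsets.UTF_8);
    try (InputStream in = new ByteArrayInputStream(bytes)) {
      System.out.println(readFully(in, StandardCharsets.UTF_8));
    }
  }
}
```

Passing the charset explicitly keeps the decoded template identical across machines whose default encodings differ.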




[jira] [Commented] (YARN-7200) SLS generates a realtimetrack.json file but that file is missing the closing ']'

2020-12-04 Thread Andras Gyori (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243816#comment-17243816
 ] 

Andras Gyori commented on YARN-7200:


Thank you [~akshink], it seems good to me. +1 (non-binding).

> SLS generates a realtimetrack.json file but that file is missing the closing 
> ']'
> 
>
> Key: YARN-7200
> URL: https://issues.apache.org/jira/browse/YARN-7200
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler-load-simulator
>Reporter: Grant Sohn
>Assignee: Agshin Kazimli
>Priority: Minor
>  Labels: newbie, newbie++
> Attachments: YARN-7200-branch-trunk.patch, YARN-7200.002.patch, 
> YARN-7200.003.patch, snemeth-testing-20201113.zip
>
>
> File 
> hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SchedulerMetrics.java
>  shows:
> {noformat}
>   void tearDown() throws Exception {
> if (metricsLogBW != null)  {
>   metricsLogBW.write("]");
>   metricsLogBW.close();
> }
> 
> {noformat}
> So the exit logic is flawed.
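
One way to make sure the closing ']' is always written is to tie it to close() and use try-with-resources. The sketch below is a minimal illustration under that assumption. It is not the attached patch, and JsonArrayLogSketch is a hypothetical class, not part of SLS.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical writer that emits a JSON array and always closes it with ']'. */
final class JsonArrayLogSketch implements Closeable {
  private final Writer out;
  private boolean first = true;
  private boolean closed = false;

  JsonArrayLogSketch(Writer out) throws IOException {
    this.out = out;
    out.write("[");
  }

  /** Append one already-serialized JSON value, comma-separated. */
  void append(String jsonValue) throws IOException {
    if (!first) {
      out.write(",");
    }
    out.write(jsonValue);
    first = false;
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return; // idempotent: never write a second ']'
    }
    closed = true;
    try {
      out.write("]");
    } finally {
      out.close();
    }
  }

  public static void main(String[] args) throws IOException {
    StringWriter sink = new StringWriter();
    try (JsonArrayLogSketch log = new JsonArrayLogSketch(sink)) {
      log.append("{\"t\":1}");
      log.append("{\"t\":2}");
    }
    System.out.println(sink); // [{"t":1},{"t":2}]
  }
}
```

Because close() runs when the try block exits, the bracket is written even if an exception interrupts the loop that appends entries.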




[jira] [Updated] (YARN-10514) Introduce a dominant resource based schedule policy to increase the resource utilization, avoid heavy cluster resource fragments.

2020-12-04 Thread zhuqi (Jira)



 [ 
https://issues.apache.org/jira/browse/YARN-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-10514:
-
Description: 
Whether we use the multi-node lookup policy for async scheduling or plain 
heartbeat-based scheduling, we run into resource fragmentation. When jobs are 
CPU-intensive, GPU-intensive, memory-intensive, etc., the cluster wastes a lot 
of resources. This issue proposes adding dominant-resource-based scheduling to 
the scheduler, to improve cluster resource utilization and to balance resource 
distribution across NodeManagers.

 

  was:Whether we use the multi-node lookup policy for async scheduling or plain 
heartbeat-based scheduling, we run into resource fragmentation. When jobs are 
CPU-intensive, GPU-intensive, memory-intensive, etc., the cluster wastes a lot 
of resources. This issue proposes adding dominant-resource-based scheduling to 
the scheduler, to improve cluster resource utilization.


> Introduce a dominant resource based schedule policy to increase the resource 
> utilization, avoid heavy cluster resource fragments.
> -
>
> Key: YARN-10514
> URL: https://issues.apache.org/jira/browse/YARN-10514
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0, 3.4.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: YARN-10514.001.patch
>
>
> Whether we use the multi-node lookup policy for async scheduling or plain 
> heartbeat-based scheduling, we run into resource fragmentation. When jobs are 
> CPU-intensive, GPU-intensive, memory-intensive, etc., the cluster wastes a lot 
> of resources. This issue proposes adding dominant-resource-based scheduling to 
> the scheduler, to improve cluster resource utilization and to balance resource 
> distribution across NodeManagers.
>  
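
The thread does not show the attached patch's policy, so the following is only one possible reading of "dominant resource based" scheduling. A request's dominant resource is the type for which demand divided by cluster capacity is largest, and the sketch picks the node with the most free capacity of that type. DominantResourceSketch, its methods and the [memoryMB, vcores] layout are all hypothetical.

```java
/** Hypothetical illustration of dominant-resource-aware node selection. */
final class DominantResourceSketch {

  /** Index of the resource type with the largest demand/capacity ratio. */
  static int dominantIndex(long[] demand, long[] capacity) {
    int best = 0;
    double bestShare = -1.0;
    for (int i = 0; i < demand.length; i++) {
      double share = capacity[i] == 0 ? 0.0 : (double) demand[i] / capacity[i];
      if (share > bestShare) {
        bestShare = share;
        best = i;
      }
    }
    return best;
  }

  /** True if every resource type of the demand fits in what is available. */
  static boolean fits(long[] demand, long[] available) {
    for (int i = 0; i < demand.length; i++) {
      if (demand[i] > available[i]) {
        return false;
      }
    }
    return true;
  }

  /**
   * Among nodes that can hold the request, choose the one with the most free
   * capacity in the request's dominant resource; returns -1 if none fits.
   */
  static int pickNode(long[] demand, long[] clusterCapacity, long[][] nodeAvailable) {
    int dom = dominantIndex(demand, clusterCapacity);
    int chosen = -1;
    for (int n = 0; n < nodeAvailable.length; n++) {
      if (!fits(demand, nodeAvailable[n])) {
        continue;
      }
      if (chosen < 0 || nodeAvailable[n][dom] > nodeAvailable[chosen][dom]) {
        chosen = n;
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    long[] cluster = {100_000L, 100L};            // [memoryMB, vcores]
    long[][] nodes = {{8192L, 8L}, {2048L, 32L}}; // free resources per node
    long[] cpuHeavy = {1024L, 8L};                // vcores dominate: 8/100 > 1024/100000
    System.out.println(pickNode(cpuHeavy, cluster, nodes)); // 1
  }
}
```

Steering CPU-heavy requests toward nodes with spare vcores (and likewise for memory) is one simple way to leave fewer unusable fragments, though a real policy would weigh more factors.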




[jira] [Commented] (YARN-10507) Add the capability to fs2cs to write the converted placement rules inside capacity-scheduler.xml

2020-12-04 Thread Andras Gyori (Jira)



[ 
https://issues.apache.org/jira/browse/YARN-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243813#comment-17243813
 ] 

Andras Gyori commented on YARN-10507:
-

Thank you [~pbacsko] for the patch! The overall logic seems good to me, but I 
have some minor additions:
 * Console Mode and File Mode together confused me at first, but I later found 
out that console mode works as a dry-run option. However, the warning message 
is not extended to mapping-rules.json, as it is for the other XML 
configurations. It might be a good idea to signal this behaviour. 
 * A very minor nit, but getOutputStreamForJson could be reduced a little bit 
to this:

{code:java}
if (consoleMode && rulesToFile) {
  return System.out;
} else if (rulesToFile) {
  File mappingRulesFile = new File(outputDirectory,
  MAPPING_RULES_JSON);
  return new FileOutputStream(mappingRulesFile);
} else {
  return new ByteArrayOutputStream();
}{code}

 * In TestFSConfigToCSConfigConverter, these lines are not used anymore, I suppose:

{code:java}
ByteArrayOutputStream jsonOutStream = new ByteArrayOutputStream();
converter.setMappingRulesOutputStream(jsonOutStream);
{code}
Because you are getting the json file from the config afterwards as a string.

 

> Add the capability to fs2cs to write the converted placement rules inside 
> capacity-scheduler.xml
> 
>
> Key: YARN-10507
> URL: https://issues.apache.org/jira/browse/YARN-10507
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>  Labels: fs2cs
> Attachments: YARN-10507-001.patch, YARN-10507-002.patch, 
> YARN-10507-003.patch, YARN-10507-004.patch, YARN-10507-005.patch, 
> YARN-10507-006.patch
>
>
> Currently, fs2cs tool generates a separate {{mapping-rules.json}} file when 
> it converts the placement rules.
> However, we also support having the JSON inlined inside 
> {{capacity-scheduler.xml}}.  Add a command line switch so that we can choose 
> the desired output.



