[jira] [Assigned] (YARN-11607) TestTimelineAuthFilterForV2 fails intermittently

2023-11-03 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11607:


Assignee: Susheel Gupta

> TestTimelineAuthFilterForV2 fails intermittently 
> -
>
> Key: YARN-11607
> URL: https://issues.apache.org/jira/browse/YARN-11607
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Susheel Gupta
>Priority: Major
>
> Ref:
> https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1398/testReport/junit/org.apache.hadoop.yarn.server.timelineservice.security/TestTimelineAuthFilterForV2/testPutTimelineEntities_boolean__boolean__3_/
> {noformat}
> org.opentest4j.AssertionFailedError: expected: <2> but was: <1>
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
>   at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
>   at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
>   at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
>   at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:527)
>   at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.publishAndVerifyEntity(TestTimelineAuthFilterForV2.java:324)
>   at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.publishWithRetries(TestTimelineAuthFilterForV2.java:337)
>   at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:383)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-11420) Stabilize TestNMClient

2023-10-05 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11420:


Assignee: Susheel Gupta  (was: Bence Kosztolnik)

> Stabilize TestNMClient
> --
>
> Key: YARN-11420
> URL: https://issues.apache.org/jira/browse/YARN-11420
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Bence Kosztolnik
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> The TestNMClient test methods can get stuck if the test container fails while 
> the test expects it to be in the running state. This can happen, for example, 
> if the container fails due to low memory. To fix this, the test should 
> tolerate failures like this.
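
A minimal sketch of the kind of tolerance described above, assuming a hypothetical State enum and a Supplier that polls the container state. None of these names come from TestNMClient. The wait returns as soon as a terminal state is observed, so the caller can retry or skip instead of blocking until the expected state appears.

```java
import java.util.function.Supplier;

final class TolerantWaitSketch {
  // Hypothetical stand-in for a container state; not the YARN enum.
  enum State { NEW, RUNNING, COMPLETE }

  /**
   * Polls until the expected state is seen, the terminal state COMPLETE is
   * seen, or maxPolls is exhausted. Returns the last state observed
   * (null if maxPolls is 0). A real test would sleep between polls.
   */
  static State waitFor(Supplier<State> poll, State expected, int maxPolls) {
    State last = null;
    for (int i = 0; i < maxPolls; i++) {
      last = poll.get();
      if (last == expected || last == State.COMPLETE) {
        return last;
      }
    }
    return last;
  }
}
```

A caller that gets back COMPLETE while waiting for RUNNING knows the container died early and can decide how to proceed.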






[jira] [Commented] (YARN-11535) Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause transitive dependency issue with 2.12.7

2023-08-21 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756877#comment-17756877
 ] 

Susheel Gupta commented on YARN-11535:
--

Jackson-dataformat-yaml 2.15.2 is *not* *compatible* with jackson-core 2.12.7, 
so we cannot upgrade jackson-dataformat-yaml alone; jackson-core and 
jackson-databind must be upgraded along with it. Currently there is no test 
case in the Hadoop repo that reproduces this failure.

> Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause 
> transitive dependency issue with 2.12.7
> 
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a Hadoop 
> jar with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will resolve the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml-2.0 from hadoop-project.






[jira] [Comment Edited] (YARN-11535) Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause transitive dependency issue with 2.12.7

2023-08-21 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756877#comment-17756877
 ] 

Susheel Gupta edited comment on YARN-11535 at 8/21/23 12:35 PM:


Reopening this jira to revert the above merged commit because 
jackson-dataformat-yaml 2.15.2 is *not* *compatible* with jackson-core 2.12.7, 
so we cannot upgrade jackson-dataformat-yaml alone; jackson-core and 
jackson-databind must be upgraded along with it. Currently there is no test 
case in the Hadoop repo that reproduces this failure.


was (Author: JIRAUSER299573):
Jackson-dataformat-yaml 2.15.2 is *not* *compatible* with jackson-core 2.12.7, 
so we cannot update only jackson-dataformat-yaml, but we need to upgrade 
jackson-core and jackson-databind along with jackson-dataformat-yaml. Currently 
we don't have testcase in hadoop repo which reproduces this failure.

> Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause 
> transitive dependency issue with 2.12.7
> 
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a Hadoop 
> jar with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will resolve the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml-2.0 from hadoop-project.






[jira] [Reopened] (YARN-11535) Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause transitive dependency issue with 2.12.7

2023-08-21 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reopened YARN-11535:
--

> Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause 
> transitive dependency issue with 2.12.7
> 
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a Hadoop 
> jar with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will resolve the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml-2.0 from hadoop-project.






[jira] [Updated] (YARN-10680) Revisit try blocks without catch blocks but having finally blocks

2022-09-27 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-10680:
-
Attachment: YARN-10860.001.patch

> Revisit try blocks without catch blocks but having finally blocks
> -
>
> Key: YARN-10680
> URL: https://issues.apache.org/jira/browse/YARN-10680
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Susheel Gupta
>Priority: Minor
>  Labels: newbie, trivial
> Attachments: YARN-10860.001.patch
>
>
> This jira is to revisit all try blocks without catch blocks but having 
> finally blocks in SLS.
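
The issue does not say which rewrite it has in mind. One common way to revisit such a block, sketched here with a hypothetical Resource class rather than any SLS code, is to replace a hand-written try/finally with try-with-resources so the cleanup cannot be forgotten.

```java
final class TryFinallySketch {
  // Hypothetical resource that records whether close() ran.
  static final class Resource implements AutoCloseable {
    boolean closed;
    int read() { return 42; }
    @Override public void close() { closed = true; }
  }

  // Before: try/finally with no catch; cleanup is written by hand.
  static int readOld(Resource r) {
    try {
      return r.read();
    } finally {
      r.close();
    }
  }

  // After: try-with-resources performs the same cleanup implicitly.
  static int readNew(Resource r) {
    try (Resource res = r) {
      return res.read();
    }
  }
}
```

Both variants close the resource on every exit path; the second also keeps a suppressed close() exception attached to the primary one instead of replacing it.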






[jira] [Assigned] (YARN-11404) Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test failure

2023-01-10 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11404:


Assignee: Susheel Gupta

> Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test 
> failure
> -
>
> Key: YARN-11404
> URL: https://issues.apache.org/jira/browse/YARN-11404
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> We need to add Junit 5 dependency in
> {code:java}
> /hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/pom.xml{code}
> as the test cases TestAMWebServicesJobConf, TestAMWebServicesJobs, 
> TestAMWebServices, TestAMWebServicesAttempts, and TestAMWebServicesTasks 
> passed locally but failed in the Jenkins build at this 
> [link|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5119/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt]
>  for YARN-5607






[jira] [Updated] (YARN-11404) Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test failure

2023-01-02 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11404:
-
Description: 
We need to add Junit 5 dependency in
{code:java}
/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/pom.xml{code}
as the test cases TestAMWebServicesJobConf, TestAMWebServicesJobs, 
TestAMWebServices, TestAMWebServicesAttempts, and TestAMWebServicesTasks 
passed locally but failed in the Jenkins build at this 
[link|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5119/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt]
 for YARN-5607

  was:
We need to add Junit 5 dependency in
{code:java}
/Users/susheel.gupta/Documents/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/pom.xml{code}
as the testcase TestAMWebServicesJobConf, TestAMWebServicesJobs, 
TestAMWebServices, TestAMWebServicesAttempts, TestAMWebServicesTasks were 
passing locally but failed at jenkins build in this 
[link|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5119/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt]
 for [YARN-5607|https://issues.apache.org/jira/browse/YARN-5607]


> Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test 
> failure
> -
>
> Key: YARN-11404
> URL: https://issues.apache.org/jira/browse/YARN-11404
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Susheel Gupta
>Priority: Major
>
> We need to add Junit 5 dependency in
> {code:java}
> /hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/pom.xml{code}
> as the test cases TestAMWebServicesJobConf, TestAMWebServicesJobs, 
> TestAMWebServices, TestAMWebServicesAttempts, and TestAMWebServicesTasks 
> passed locally but failed in the Jenkins build at this 
> [link|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5119/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt]
>  for YARN-5607






[jira] [Updated] (YARN-11408) Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor method

2023-01-04 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11408:
-
Description: 
user-limit-factor should be set to -1 only for queues that are leaf queues and 
have auto-queue-creation disabled. 

Follow-up of YARN-11393

  was:It is required to add user-limit-factor to -1 only for those queues which 
are leafqueue and auto-queue-creation is disabled. 


> Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor 
> method  
> 
>
> Key: YARN-11408
> URL: https://issues.apache.org/jira/browse/YARN-11408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> user-limit-factor should be set to -1 only for queues that are leaf queues and 
> have auto-queue-creation disabled. 
> Follow-up of YARN-11393
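
The condition above can be sketched as follows. This is an illustration only: the method signature and the two boolean flags are hypothetical and do not reflect the actual emitDefaultUserLimitFactor code in fs2cs; the property prefix is assumed to be the usual Capacity Scheduler one.

```java
import java.util.Map;

final class UserLimitFactorSketch {
  // Assumed Capacity Scheduler property prefix.
  static final String PREFIX = "yarn.scheduler.capacity.";

  /**
   * Writes user-limit-factor = -1 for the queue only when it is a leaf
   * queue and auto queue creation is disabled for it. Parameters are
   * hypothetical flags, not fs2cs fields.
   */
  static void emitDefaultUserLimitFactor(Map<String, String> out,
      String queuePath, boolean isLeaf, boolean autoQueueCreationEnabled) {
    if (isLeaf && !autoQueueCreationEnabled) {
      out.put(PREFIX + queuePath + ".user-limit-factor", "-1");
    }
  }
}
```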






[jira] [Updated] (YARN-11408) Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor method

2023-01-04 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11408:
-
Fix Version/s: 3.4.0

> Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor 
> method  
> 
>
> Key: YARN-11408
> URL: https://issues.apache.org/jira/browse/YARN-11408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
> Fix For: 3.4.0
>
>
> user-limit-factor should be set to -1 only for queues that are leaf queues and 
> have auto-queue-creation disabled. 
> Follow-up of YARN-11393






[jira] [Assigned] (YARN-11416) FS2CS should use CapacitySchedulerConfiguration in FSQueueConverterBuilder

2023-01-11 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11416:


Assignee: Susheel Gupta

> FS2CS should use CapacitySchedulerConfiguration in FSQueueConverterBuilder 
> ---
>
> Key: YARN-11416
> URL: https://issues.apache.org/jira/browse/YARN-11416
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Benjamin Teke
>Assignee: Susheel Gupta
>Priority: Major
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSQueueConverter
>  and its builder store the variable capacitySchedulerConfig as a plain 
> Configuration object instead of a CapacitySchedulerConfiguration. This is 
> misleading, as the name capacitySchedulerConfig suggests that it is indeed a 
> CapacitySchedulerConfiguration, and it loses access to the convenience methods 
> for checking various properties. Because of this, every time a property getter 
> is changed, FS2CS must be checked to see whether it reimplements the same 
> logic; otherwise there might be behaviour differences or even bugs.
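
To illustrate the concern, here is a small sketch using hypothetical stand-in classes (PlainConfig and SchedulerConfig are not Hadoop's Configuration or CapacitySchedulerConfiguration). The key name and the default value of 1.0 are assumptions chosen for the example. With only the plain type in hand, the caller has to duplicate the key layout and default that the typed getter already encapsulates.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedConfigSketch {
  // Stand-in for a generic key/value configuration.
  static class PlainConfig {
    private final Map<String, String> values = new HashMap<>();
    void set(String key, String value) { values.put(key, value); }
    String get(String key, String def) { return values.getOrDefault(key, def); }
  }

  // Stand-in for a scheduler-specific subclass with a convenience getter.
  static class SchedulerConfig extends PlainConfig {
    static final String PREFIX = "yarn.scheduler.capacity.";
    float getUserLimitFactor(String queuePath) {
      return Float.parseFloat(get(PREFIX + queuePath + ".user-limit-factor", "1.0"));
    }
  }

  // Reimplementation forced by the plain type: key layout and default are
  // duplicated here and can silently drift from the getter above.
  static float readViaPlain(PlainConfig conf, String queuePath) {
    return Float.parseFloat(
        conf.get("yarn.scheduler.capacity." + queuePath + ".user-limit-factor", "1.0"));
  }
}
```

If the getter in the typed class ever changes its default or key, readViaPlain keeps the old behaviour, which is the kind of divergence the issue warns about.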






[jira] [Created] (YARN-11404) Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test failure

2023-01-02 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11404:


 Summary: Add junit5 dependency to hadoop-mapreduce-client-app to 
fix few unit test failure
 Key: YARN-11404
 URL: https://issues.apache.org/jira/browse/YARN-11404
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Susheel Gupta


We need to add Junit 5 dependency in
{code:java}
/Users/susheel.gupta/Documents/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/pom.xml{code}
as the test cases TestAMWebServicesJobConf, TestAMWebServicesJobs, 
TestAMWebServices, TestAMWebServicesAttempts, and TestAMWebServicesTasks 
passed locally but failed in the Jenkins build at this 
[link|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5119/7/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt]
 for [YARN-5607|https://issues.apache.org/jira/browse/YARN-5607]






[jira] [Commented] (YARN-10879) Incorrect WARN text in ACL check for application tag based placement

2022-12-09 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645301#comment-17645301
 ] 

Susheel Gupta commented on YARN-10879:
--

[~sahuja] 

Do you mind if I start working on this?

> Incorrect WARN text in ACL check for application tag based placement
> 
>
> Key: YARN-10879
> URL: https://issues.apache.org/jira/browse/YARN-10879
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Brian Goerlitz
>Assignee: Siddharth Ahuja
>Priority: Minor
>
> After YARN-10070 the queue permissions check for application tag based 
> placement is performed for the proxy user instead of the end user, but the 
> warning message for permissions failure indicates that the end user is 
> missing permissions.
> {code:java}
>  LOG.warn("User '{}' from application tag does not have access to " +
>   " queue '{}'. " + "The placement is done for user '{}'",
>   userNameFromAppTag, queue, user);
> {code}
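
One possible rewording is sketched below. The wording is hypothetical and is not the text of any committed fix; it uses String.format instead of the logger so that the example is self-contained. The point is only that the message names the user the check was actually run for.

```java
final class AclWarnSketch {
  /**
   * Builds a warning that attributes the failed ACL check to the proxy
   * user. Parameter names and wording are illustrative assumptions.
   */
  static String aclWarning(String proxyUser, String queue, String userFromAppTag) {
    return String.format(
        "Proxy user '%s' does not have access to queue '%s', so user '%s' from "
            + "application tag is ignored. The placement is done for user '%s'",
        proxyUser, queue, userFromAppTag, proxyUser);
  }
}
```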






[jira] [Updated] (YARN-11393) Fs2cs could be extended to set ULF to -1 upon conversion

2022-12-09 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11393:
-
Description: 
A global configuration to set the default User Limit Factor to -1 on newly 
created queues.

The proposed solution is to make fs2cs (Fair Scheduler to Capacity Scheduler 
tool) add the user-limit-factor value -1 to the conversion by default. 

  was:A global configuration to set the default User Limit Factor to -1 on 
newly created queues.


> Fs2cs could be extended to set ULF to -1 upon conversion
> 
>
> Key: YARN-11393
> URL: https://issues.apache.org/jira/browse/YARN-11393
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> A global configuration to set the default User Limit Factor to -1 on newly 
> created queues.
> The proposed solution is to make fs2cs (Fair Scheduler to Capacity Scheduler 
> tool) add the user-limit-factor value -1 to the conversion by default. 






[jira] [Created] (YARN-11393) Fs2cs could be extended to set ULF to -1 upon conversion

2022-12-09 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11393:


 Summary: Fs2cs could be extended to set ULF to -1 upon conversion
 Key: YARN-11393
 URL: https://issues.apache.org/jira/browse/YARN-11393
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Susheel Gupta
Assignee: Susheel Gupta


A global configuration to set the default User Limit Factor to -1 on newly 
created queues.






[jira] [Assigned] (YARN-10879) Incorrect WARN text in ACL check for application tag based placement

2022-12-12 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-10879:


Assignee: Susheel Gupta  (was: Siddharth Ahuja)

> Incorrect WARN text in ACL check for application tag based placement
> 
>
> Key: YARN-10879
> URL: https://issues.apache.org/jira/browse/YARN-10879
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Brian Goerlitz
>Assignee: Susheel Gupta
>Priority: Minor
>
> After YARN-10070 the queue permissions check for application tag based 
> placement is performed for the proxy user instead of the end user, but the 
> warning message for permissions failure indicates that the end user is 
> missing permissions.
> {code:java}
>  LOG.warn("User '{}' from application tag does not have access to " +
>   " queue '{}'. " + "The placement is done for user '{}'",
>   userNameFromAppTag, queue, user);
> {code}






[jira] [Assigned] (YARN-11079) Make an AbstractParentQueue to store common ParentQueue and ManagedParentQueue functionality

2022-12-19 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11079:


Assignee: Susheel Gupta

> Make an AbstractParentQueue to store common ParentQueue and 
> ManagedParentQueue functionality
> 
>
> Key: YARN-11079
> URL: https://issues.apache.org/jira/browse/YARN-11079
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Benjamin Teke
>Assignee: Susheel Gupta
>Priority: Major
>
> ParentQueue is an instantiable class which stores the necessary functionality 
> of parent queues, however it is also extended by the 
> AbstractManagedParentQueue, which is an abstract class for storing managed 
> parent queue functionality. Since legacy AQC doesn't allow dynamic queues 
> next to static ones, managed parent queues technically behave like leaf 
> queues by not having any static child queues when created. This structure and 
> behaviour are error prone. For example, if someone unaware of this changes the 
> order of the checks in a method like 
> MappingRuleValidationContextImpl.isDynamicParent so that it first tests whether 
> the queue is a ParentQueue, the method can return a completely wrong value: a 
> ManagedParent is a dynamic parent, but it is currently also a ParentQueue, and 
> a ManagedParent cannot have isEligibleForAutoQueueCreation set to true, so the 
> method would return false. 
> {code:java}
>   private boolean isDynamicParent(CSQueue queue) {
>     if (queue == null) {
>       return false;
>     }
>     if (queue instanceof ManagedParentQueue) {
>       return true;
>     }
>     if (queue instanceof ParentQueue) {
>       return ((ParentQueue)queue).isEligibleForAutoQueueCreation();
>     }
>     return false;
>   }
> {code}
> Similarly to YARN-11024 an AbstractParentQueue class should be created to 
> completely separate the managed parents from the instantiable ParentQueue 
> class.
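
A rough sketch of the proposed separation, using simplified hypothetical classes rather than the real scheduler types. Once both concrete classes extend a shared abstract base, neither is an instance of the other, so the result of isDynamicParent no longer depends on the order of the instanceof checks.

```java
final class ParentQueueSketch {
  // Hypothetical hierarchy; the real classes carry far more state.
  static abstract class AbstractParentQueue {}

  static final class ParentQueue extends AbstractParentQueue {
    private final boolean eligible;
    ParentQueue(boolean eligible) { this.eligible = eligible; }
    boolean isEligibleForAutoQueueCreation() { return eligible; }
  }

  static final class ManagedParentQueue extends AbstractParentQueue {}

  // The ParentQueue check deliberately comes first: with the split
  // hierarchy a ManagedParentQueue can no longer match it.
  static boolean isDynamicParent(Object queue) {
    if (queue instanceof ParentQueue) {
      return ((ParentQueue) queue).isEligibleForAutoQueueCreation();
    }
    return queue instanceof ManagedParentQueue;
  }
}
```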






[jira] [Resolved] (YARN-4944) Handle lack of ResourceCalculatorPlugin gracefully

2022-11-29 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta resolved YARN-4944.
-
Resolution: Duplicate

> Handle lack of ResourceCalculatorPlugin gracefully
> --
>
> Key: YARN-4944
> URL: https://issues.apache.org/jira/browse/YARN-4944
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: newbie++, trivial
>
> On some systems (e.g. mac), the NM might not be able to instantiate a 
> ResourceCalculatorPlugin, which leads to a number of logged error messages. We 
> could improve the way we handle this. 






[jira] [Created] (YARN-11408) Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor method

2023-01-04 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11408:


 Summary: Add a check of autoQueueCreation is disabled for 
emitDefaultUserLimitFactor method  
 Key: YARN-11408
 URL: https://issues.apache.org/jira/browse/YARN-11408
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Susheel Gupta


user-limit-factor should be set to -1 only for queues that are leaf queues and 
have auto-queue-creation disabled. 






[jira] [Assigned] (YARN-11408) Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor method

2023-01-04 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11408:


Assignee: Susheel Gupta

> Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor 
> method  
> 
>
> Key: YARN-11408
> URL: https://issues.apache.org/jira/browse/YARN-11408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> user-limit-factor should be set to -1 only for queues that are leaf queues and 
> have auto-queue-creation disabled. 






[jira] [Updated] (YARN-11427) Pull up the versioned imports in pom of hadoop-mapreduce-client-app to hadoop-project pom

2023-03-14 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11427:
-
Description: 
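The versioned imports in the pom.xml of hadoop-mapreduce-client-app can be 
pulled up to the hadoop-project pom, as this is better for version maintenance 
and makes it easier to use an IDE to find where things are used.

{code:java}
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-junit-jupiter</artifactId>
      <version>4.11.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-core</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-jupiter</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency> {code}
This jira also includes YARN-11404 (the commit of this jira was reverted because 
of some test case failures caused by a transitive dependency 
[issue|https://github.com/apache/hadoop/pull/5295#issuecomment-1439366770]).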

 

 

  was:
The versioned imports in pom.xml of hadoop-mapreduce-client-app can be pulled 
up to hadoop-project pom as it is better for version maintenance and  ease of 
using an IDE to find where things are used

 
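{code:java}
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-junit-jupiter</artifactId>
      <version>4.11.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-core</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-jupiter</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency> {code}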
 

 

 


> Pull up the versioned imports in pom of hadoop-mapreduce-client-app to 
> hadoop-project pom
> -
>
> Key: YARN-11427
> URL: https://issues.apache.org/jira/browse/YARN-11427
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Minor
>  Labels: pull-request-available
>
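> The versioned imports in the pom.xml of hadoop-mapreduce-client-app can be 
> pulled up to the hadoop-project pom, as this is better for version maintenance 
> and makes it easier to use an IDE to find where things are used.
>  
> {code:java}
>     <dependency>
>       <groupId>org.mockito</groupId>
>       <artifactId>mockito-junit-jupiter</artifactId>
>       <version>4.11.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-core</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-jupiter</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency> {code}
> This jira also includes YARN-11404 (the commit of this jira was reverted 
> because of some test case failures caused by a transitive dependency 
> [issue|https://github.com/apache/hadoop/pull/5295#issuecomment-1439366770]).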
>  
>  






[jira] [Created] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-04-12 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11464:


 Summary:  queue element is added to any other  leaf 
queue, it's queueType becomes QueueType.PARENT_QUEUE
 Key: YARN-11464
 URL: https://issues.apache.org/jira/browse/YARN-11464
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.3.4
Reporter: Susheel Gupta


This testcase clearly reproduces the issue. There is a missing dot before 
"auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
{code:java}
@Test
public void testAutoCreateV2FlagsInWeightMode() {
  converter = builder.withPercentages(false).build();

  converter.convertQueueHierarchy(rootQueue);

  assertTrue("root autocreate v2 flag",
  csConfig.getBoolean(
  PREFIX + "root.auto-queue-creation-v2.enabled", false));
  assertTrue("root.admins autocreate v2 flag",
  csConfig.getBoolean(
  PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
  assertTrue("root.users autocreate v2 flag",
  csConfig.getBoolean(
  PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
  assertTrue("root.misc autocreate v2 flag",
  csConfig.getBoolean(
  PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));

  Set<String> leafs = Sets.difference(ALL_QUEUES,
  Sets.newHashSet("root",
  "root.default",
  "root.admins",
  "root.users",
  "root.misc"));
  assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
  csConfig);
} {code}
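
As a rough illustration of why the missing dot matters, the sketch below assumes that the assertion helper simply concatenates a prefix, the queue path, and the supplied suffix into a configuration key. The class MissingDotDemo, its PREFIX value, and the hasValue helper are hypothetical stand-ins, not the actual TestFSQueueConverter code. Under that assumption, a suffix without the leading dot yields a key that can never exist, so a "no value" check passes regardless of what the converter wrote.

```java
import java.util.HashMap;
import java.util.Map;

class MissingDotDemo {
  // Assumed prefix; stands in for the PREFIX constant used by the test.
  static final String PREFIX = "yarn.scheduler.capacity.";

  // Hypothetical helper: builds the key by plain concatenation, adding no dot.
  static String key(String queue, String suffix) {
    return PREFIX + queue + suffix;
  }

  // True when the configuration holds a value for queue + suffix.
  static boolean hasValue(Map<String, String> conf, String queue, String suffix) {
    return conf.get(key(queue, suffix)) != null;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Simulate a converter that (wrongly) enabled the flag on a leaf queue.
    conf.put(PREFIX + "root.users.joe.auto-queue-creation-v2.enabled", "true");

    // Suffix without the dot: the key "...root.users.joeauto-queue-..." is never set.
    System.out.println(
        hasValue(conf, "root.users.joe", "auto-queue-creation-v2.enabled")); // false

    // Suffix with the dot: the unexpected value becomes visible.
    System.out.println(
        hasValue(conf, "root.users.joe", ".auto-queue-creation-v2.enabled")); // true
  }
}
```

With the dot in place, the check inspects the real key, which is the only way an unexpected value on a leaf queue can surface.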






[jira] [Assigned] (YARN-11312) [UI2] Refresh buttons don't work after EmberJS upgrade

2023-04-25 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11312:


Assignee: Susheel Gupta

> [UI2] Refresh buttons don't work after EmberJS upgrade
> --
>
> Key: YARN-11312
> URL: https://issues.apache.org/jira/browse/YARN-11312
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Brian Goerlitz
>Assignee: Susheel Gupta
>Priority: Minor
>
> After YARN-10826 and YARN-10858, UI2 uses EmberJS 2.8.0, but the refresh 
> buttons do not work anymore. The following error is thrown in the Chrome 
> console, but other browsers also fail.
> {noformat}
> yarn-ui.js:38 Uncaught TypeError: Cannot read properties of undefined 
> (reading 'send')
>     at Class.refresh (yarn-ui.js:38:311)
>     at Class.send (vendor.js:2504:107)
>     at Class.superWrapper [as send] (vendor.js:1875:112)
>     at vendor.js:1165:144
>     at Object.flaggedInstrument (vendor.js:1583:187)
>     at runRegisteredAction (vendor.js:1165:68)
>     at Backburner.run (vendor.js:738:228)
>     at Object.run [as default] (vendor.js:1840:517)
>     at Object.handler (vendor.js:1164:178)
>     at HTMLButtonElement.<anonymous> (vendor.js:2534:128){noformat}
> Downgrading the ember version to 2.7.0 seems to resolve the issue, but this 
> also requires a jquery downgrade.






[jira] [Assigned] (YARN-11427) Pullup the versioned imports in pom of hadoop-mapreduce-client-app to hadoop-project pom

2023-02-03 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11427:


Assignee: Susheel Gupta

> Pullup the versioned imports in pom of hadoop-mapreduce-client-app to 
> hadoop-project pom
> 
>
> Key: YARN-11427
> URL: https://issues.apache.org/jira/browse/YARN-11427
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Minor
>
> The versioned imports in pom.xml of hadoop-mapreduce-client-app can be pulled 
> up to the hadoop-project pom, which is better for version maintenance and makes 
> it easier to use an IDE to find where things are used.
>  
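> {code:java}
>     <dependency>
>       <groupId>org.mockito</groupId>
>       <artifactId>mockito-junit-jupiter</artifactId>
>       <version>4.11.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-core</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-jupiter</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency> {code}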
>  
>  
>  






[jira] [Created] (YARN-11427) Pullup the versioned imports in pom of hadoop-mapreduce-client-app to hadoop-project pom

2023-02-03 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11427:


 Summary: Pullup the versioned imports in pom of 
hadoop-mapreduce-client-app to hadoop-project pom
 Key: YARN-11427
 URL: https://issues.apache.org/jira/browse/YARN-11427
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Susheel Gupta


The versioned imports in pom.xml of hadoop-mapreduce-client-app can be pulled 
up to the hadoop-project pom, which is better for version maintenance and makes 
it easier to use an IDE to find where things are used.

 
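{code:java}
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-junit-jupiter</artifactId>
      <version>4.11.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-core</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>uk.org.webcompere</groupId>
      <artifactId>system-stubs-jupiter</artifactId>
      <version>1.1.0</version>
      <scope>test</scope>
    </dependency> {code}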
 

 

 






[jira] [Updated] (YARN-11427) Pullup the versioned imports in pom of hadoop-mapreduce-client-app to hadoop-project pom

2023-02-03 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11427:
-
Issue Type: Task  (was: Bug)

> Pullup the versioned imports in pom of hadoop-mapreduce-client-app to 
> hadoop-project pom
> 
>
> Key: YARN-11427
> URL: https://issues.apache.org/jira/browse/YARN-11427
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Reporter: Susheel Gupta
>Priority: Minor
>
> The versioned imports in pom.xml of hadoop-mapreduce-client-app can be pulled 
> up to the hadoop-project pom, which is better for version maintenance and makes 
> it easier to use an IDE to find where things are used.
>  
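> {code:java}
>     <dependency>
>       <groupId>org.mockito</groupId>
>       <artifactId>mockito-junit-jupiter</artifactId>
>       <version>4.11.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-core</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency>
>     <dependency>
>       <groupId>uk.org.webcompere</groupId>
>       <artifactId>system-stubs-jupiter</artifactId>
>       <version>1.1.0</version>
>       <scope>test</scope>
>     </dependency> {code}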
>  
>  
>  






[jira] [Created] (YARN-11513) Applications submitted to ambiguous queue fail during recovery if "Specified" Placement Rule is used

2023-06-14 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11513:


 Summary: Applications submitted to ambiguous queue fail during 
recovery if "Specified" Placement Rule is used
 Key: YARN-11513
 URL: https://issues.apache.org/jira/browse/YARN-11513
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.3.4
Reporter: Susheel Gupta


When an app is submitted to an ambiguous queue using the full queue path and is 
placed in that pool via a {{%specified}} mapping Placement Rule, the queue in 
the stored ApplicationSubmissionContext will be the short name for the queue. 
During recovery from an RM failover, the placement rule will be evaluated using 
the stored short name of the queue, resulting in the RM killing the app as it 
cannot resolve the ambiguous queue name.
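
The failure mode can be sketched in a few lines, under loudly stated assumptions: the queue hierarchy, the class AmbiguousQueueDemo, and the resolve and shortName helpers below are hypothetical and do not reflect the actual ResourceManager or placement-rule code. The sketch only shows that a full path resolves uniquely, while the short name derived from it does not once two leaf queues share that short name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class AmbiguousQueueDemo {
  // Hypothetical hierarchy: two leaf queues share the short name "batch".
  static final List<String> QUEUES =
      Arrays.asList("root.a.batch", "root.b.batch", "root.default");

  // Last path segment, e.g. "root.a.batch" -> "batch".
  static String shortName(String path) {
    return path.substring(path.lastIndexOf('.') + 1);
  }

  // Resolves a full path directly; resolves a short name only if it is unique.
  static Optional<String> resolve(String name) {
    if (QUEUES.contains(name)) {
      return Optional.of(name);
    }
    List<String> matches = new ArrayList<>();
    for (String q : QUEUES) {
      if (shortName(q).equals(name)) {
        matches.add(q);
      }
    }
    return matches.size() == 1 ? Optional.of(matches.get(0)) : Optional.empty();
  }

  public static void main(String[] args) {
    String submitted = "root.a.batch";      // what the user specified
    String stored = shortName(submitted);   // what is assumed to be persisted

    System.out.println(resolve(submitted)); // Optional[root.a.batch]
    System.out.println(resolve(stored));    // Optional.empty, the name is ambiguous
  }
}
```

In this simplified model, persisting the full path instead of the short name would let the same lookup succeed during recovery.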






[jira] [Assigned] (YARN-11513) Applications submitted to ambiguous queue fail during recovery if "Specified" Placement Rule is used

2023-06-14 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11513:


Assignee: Susheel Gupta

> Applications submitted to ambiguous queue fail during recovery if "Specified" 
> Placement Rule is used
> 
>
> Key: YARN-11513
> URL: https://issues.apache.org/jira/browse/YARN-11513
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> When an app is submitted to an ambiguous queue using the full queue path and 
> is placed in that pool via a {{%specified}} mapping Placement Rule, the queue 
> in the stored ApplicationSubmissionContext will be the short name for the 
> queue. During recovery from an RM failover, the placement rule will be 
> evaluated using the stored short name of the queue, resulting in the RM 
> killing the app as it cannot resolve the ambiguous queue name.






[jira] [Assigned] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-07-06 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11464:


Assignee: Susheel Gupta

>  queue element is added to any other  leaf queue, it's queueType 
> becomes QueueType.PARENT_QUEUE
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}






[jira] [Updated] (YARN-11464) TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot before auto-queue-creation-v2.enabled for method assertNoValueForQueues

2023-07-10 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11464:
-
Summary: TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a 
missing dot before auto-queue-creation-v2.enabled for method 
assertNoValueForQueues  (was: 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot before 
for method assertNoValueForQueues)

> TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot 
> before auto-queue-creation-v2.enabled for method assertNoValueForQueues
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}






[jira] [Updated] (YARN-11464) TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot before auto-queue-creation-v2.enabled for method call assertNoValueForQueues

2023-07-10 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11464:
-
Summary: TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a 
missing dot before auto-queue-creation-v2.enabled for method call 
assertNoValueForQueues  (was: 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot before 
auto-queue-creation-v2.enabled for method assertNoValueForQueues)

> TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot 
> before auto-queue-creation-v2.enabled for method call assertNoValueForQueues
> -
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}






[jira] [Updated] (YARN-11464) TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot before for method assertNoValueForQueues

2023-07-10 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11464:
-
Summary: TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a 
missing dot before for method assertNoValueForQueues  (was:  queue 
element is added to any other  leaf queue, it's queueType becomes 
QueueType.PARENT_QUEUE)

> TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode has a missing dot 
> before for method assertNoValueForQueues
> -
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}






[jira] [Updated] (YARN-11535) Snakeyaml should be excluded from jackson-dataformat-yaml-2.12.7 as it may cause transitive dependency issue.

2023-07-19 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11535:
-
Affects Version/s: 3.4.0

> Snakeyaml should be excluded from jackson-dataformat-yaml-2.12.7 as it may 
> cause transitive dependency issue.
> -
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Priority: Major
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a hadoop 
> jar together with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will use the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml 2.0 from hadoop-project.






[jira] [Assigned] (YARN-11535) Snakeyaml should be excluded from jackson-dataformat-yaml-2.12.7 as it may cause transitive dependency issue.

2023-07-19 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11535:


Assignee: Susheel Gupta

> Snakeyaml should be excluded from jackson-dataformat-yaml-2.12.7 as it may 
> cause transitive dependency issue.
> -
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a hadoop 
> jar together with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will use the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml 2.0 from hadoop-project.






[jira] [Created] (YARN-11535) Snakeyaml should be excluded from jackson-dataformat-yaml-2.12.7 as it may cause transitive dependency issue.

2023-07-19 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11535:


 Summary: Snakeyaml should be excluded from 
jackson-dataformat-yaml-2.12.7 as it may cause transitive dependency issue.
 Key: YARN-11535
 URL: https://issues.apache.org/jira/browse/YARN-11535
 Project: Hadoop YARN
  Issue Type: Task
  Components: yarn
Reporter: Susheel Gupta


Hadoop-project uses  
[snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
 and 
[jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
[snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
This may cause a transitive dependency issue in other services that use a hadoop 
jar together with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
will use the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
snakeyaml 2.0 from hadoop-project.
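
The "nearest dependency" behaviour mentioned above can be modelled with a small sketch. It assumes Maven's nearest-definition mediation, where the candidate closest to the consuming project in the dependency tree wins and ties go to the first declaration. The class NearestWinsDemo, the Candidate type, and the depths used are hypothetical and chosen only for illustration; they are not measured from a real Hadoop dependency tree.

```java
import java.util.Arrays;
import java.util.List;

class NearestWinsDemo {
  // One occurrence of an artifact in the tree: its version and its depth below the consumer.
  static final class Candidate {
    final String version;
    final int depth;

    Candidate(String version, int depth) {
      this.version = version;
      this.depth = depth;
    }
  }

  // Picks the shallowest candidate; on equal depth the earlier declaration is kept.
  static String mediate(List<Candidate> inDeclarationOrder) {
    Candidate best = null;
    for (Candidate c : inDeclarationOrder) {
      if (best == null || c.depth < best.depth) {
        best = c;
      }
    }
    return best == null ? null : best.version;
  }

  public static void main(String[] args) {
    // Hypothetical consumer tree: snakeyaml 2.0 arrives at depth 3,
    // snakeyaml 1.27 arrives at depth 2 via jackson-dataformat-yaml.
    List<Candidate> snakeyaml = Arrays.asList(
        new Candidate("2.0", 3),
        new Candidate("1.27", 2));

    System.out.println(mediate(snakeyaml)); // 1.27
  }
}
```

In such a tree the older version is selected even though a newer one is present, which is why excluding snakeyaml from jackson-dataformat-yaml, or upgrading jackson-dataformat-yaml itself, are the kinds of remedies to consider.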






[jira] [Updated] (YARN-11535) Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause transitive dependency issue with 2.12.7

2023-08-03 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11535:
-
Summary: Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may 
cause transitive dependency issue with 2.12.7  (was: Snakeyaml should be 
excluded from jackson-dataformat-yaml-2.12.7 as it may cause transitive 
dependency issue.)

> Jackson-dataformat-yaml should be upgraded to 2.15.2 as it may cause 
> transitive dependency issue with 2.12.7
> 
>
> Key: YARN-11535
> URL: https://issues.apache.org/jira/browse/YARN-11535
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: build, yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>  Labels: pull-request-available
>
> Hadoop-project uses  
> [snakeyaml.version-2.0|https://github.com/apache/hadoop/blame/trunk/hadoop-project/pom.xml#L198]
>  and 
> [jackson-dataformat-yaml-2.12.7|https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72].
> Internally, however, jackson-dataformat-yaml-2.12.7 declares a compile dependency on 
> [snakeyaml.version-1.27|https://mvnrepository.com/artifact/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.12.7].
> This may cause a transitive dependency issue in other services that use a hadoop 
> jar together with jackson-dataformat-yaml-2.12.7, because jackson-dataformat-yaml-2.12.7 
> will use the nearest available snakeyaml dependency, i.e. 1.27, and ignore 
> snakeyaml 2.0 from hadoop-project.






[jira] [Assigned] (YARN-11518) Renew Knox Tokens for Log Aggregation

2023-06-23 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11518:


Assignee: Susheel Gupta

> Renew Knox Tokens for Log Aggregation
> -
>
> Key: YARN-11518
> URL: https://issues.apache.org/jira/browse/YARN-11518
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> In Public Cloud environments log aggregation is generally configured to write 
> to a filesystem behind IDBroker (e.g. S3), which requires credential renewal 
> after the configured expiration. Currently, log aggregation fails for any 
> application running past this period, which may be common in streaming 
> environments.






[jira] [Created] (YARN-11518) Renew Knox Tokens for Log Aggregation

2023-06-23 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11518:


 Summary: Renew Knox Tokens for Log Aggregation
 Key: YARN-11518
 URL: https://issues.apache.org/jira/browse/YARN-11518
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Susheel Gupta


In Public Cloud environments log aggregation is generally configured to write 
to a filesystem behind IDBroker (e.g. S3), which requires credential renewal 
after the configured expiration. Currently, log aggregation fails for any 
application running past this period, which may be common in streaming 
environments.






[jira] [Comment Edited] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-05-05 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719699#comment-17719699
 ] 

Susheel Gupta edited comment on YARN-11464 at 5/5/23 7:14 AM:
--

I think this is not just a test issue. There was a missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}", which is one issue. 
However, when I added a dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" for the method call 
assertNoValueForQueues in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admin.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though we don't have the 
AQCv2 feature enabled to true for leaf queues{*}).

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admin.alice}} and 
{{{}root.admin.bob{}}}. When the properties for root.admin.alice were defined, 
the <reservation> queue element was added, which is correct since it is a leaf 
queue. However, due to the reservation queue element in root.admin.alice, its 
queueType becomes QueueType.PARENT_QUEUE (I guess), and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation queue element to 
other queues as well, namely {{{}root.admin.bob, root.users.john, and 
root.users.joe{}}}. All of these queues get AQCv2 set to true when the 
<reservation> queue element is added.


was (Author: JIRAUSER295692):
I think this is not just a test issue. There was a missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}", which is one issue. 
However, when I added a dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" for the method call 
assertNoValueForQueues in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admin.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though we don't have the 
AQCv2 feature enabled to true for leaf queues{*}).

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admin.alice}} and 
{{{}root.admin.bob{}}}. When the properties for root.admin.alice were defined, 
the <reservation> queue element was added, which is correct since it is a leaf 
queue. However, due to the reservation queue element in root.admin.alice, its 
queueType becomes QueueType.PARENT_QUEUE, and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation queue element to 
other queues as well, namely {{{}root.admin.bob, root.users.john, and 
root.users.joe{}}}. All of these queues get AQCv2 set to true when the 
<reservation> queue element is added.

>  queue element is added to any other  leaf queue, it's queueType 
> becomes QueueType.PARENT_QUEUE
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Priority: Major
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   

[jira] [Commented] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-05-05 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719699#comment-17719699
 ] 

Susheel Gupta commented on YARN-11464:
--

I think this is not just a test issue. The missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is one problem. 
However, when I added the dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" in the 
assertNoValueForQueues call in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admins.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though AQCv2 is not 
supposed to be enabled for leaf queues{*}).
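
To show why the missing dot hid the problem, here is a minimal, self-contained sketch. It is not the real test helper: the class MissingDotSketch, the method hasNoValue and the plain Map standing in for csConfig are hypothetical, and the PREFIX value is an assumption. The only point is that the key is built by plain string concatenation, so without the leading dot the lookup targets a key that can never exist and the "no value" check passes vacuously.

```java
import java.util.HashMap;
import java.util.Map;

class MissingDotSketch {
  // Assumed prefix; stands in for the PREFIX constant used by the test.
  static final String PREFIX = "yarn.scheduler.capacity.";

  // Hypothetical stand-in for the check done per queue:
  // true when nothing is stored under PREFIX + queue + suffix.
  static boolean hasNoValue(Map<String, String> conf, String queue, String suffix) {
    return conf.get(PREFIX + queue + suffix) == null;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // The converter wrongly set the flag for the leaf queue.
    conf.put(PREFIX + "root.admins.alice.auto-queue-creation-v2.enabled", "true");

    // Without the dot the key is "...aliceauto-queue-creation-v2.enabled": never present.
    System.out.println(hasNoValue(conf, "root.admins.alice",
        "auto-queue-creation-v2.enabled"));   // prints true (bug goes unnoticed)

    // With the dot the real key is checked and the wrong value is detected.
    System.out.println(hasNoValue(conf, "root.admins.alice",
        ".auto-queue-creation-v2.enabled"));  // prints false
  }
}
```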

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admins.alice}} and 
{{root.admins.bob}}. When the properties for root.admins.alice were defined, 
the <reservation> queue element was added, which is valid since alice is a leaf 
queue. However, because of the reservation element in root.admins.alice, its 
queueType becomes QueueType.PARENT_QUEUE, and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation element to the 
other leaf queues as well, namely {{root.admins.bob}}, {{root.users.john}}, and 
{{root.users.joe}}. Each of these queues gets AQCv2 set to true once the 
<reservation> element is added.
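
The behaviour described above can be modelled roughly as follows. This is only a simplified sketch of what I observed, not the actual converter code: the class ReservationQueueTypeSketch, its enum and both methods are hypothetical names, and the two rules (a reservation element makes the queue a parent; the flag is emitted for every parent) are assumptions drawn from the test result.

```java
class ReservationQueueTypeSketch {
  enum QueueType { LEAF_QUEUE, PARENT_QUEUE }

  // Assumed rule: a queue is treated as a parent if it has child queues
  // OR if it carries a reservation element, even when it has no children.
  static QueueType classify(boolean hasChildren, boolean hasReservation) {
    return (hasChildren || hasReservation)
        ? QueueType.PARENT_QUEUE
        : QueueType.LEAF_QUEUE;
  }

  // Assumed rule for weight mode: the AQCv2 flag is set for every parent queue.
  static boolean aqcV2Enabled(QueueType type) {
    return type == QueueType.PARENT_QUEUE;
  }

  public static void main(String[] args) {
    QueueType alice = classify(false, true);   // leaf with a reservation element
    QueueType bob = classify(false, false);    // plain leaf
    System.out.println("alice: " + alice + ", AQCv2=" + aqcV2Enabled(alice));
    System.out.println("bob: " + bob + ", AQCv2=" + aqcV2Enabled(bob));
  }
}
```

Under these assumptions alice ends up as PARENT_QUEUE with the flag set, while bob stays a leaf without it, which matches what the test reports.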

>  queue element is added to any other  leaf queue, it's queueType 
> becomes QueueType.PARENT_QUEUE
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Priority: Major
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7548) TestCapacityOverTimePolicy.testAllocation is flaky

2024-02-21 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-7548:

Description: 
*Reported at: 15/Nov/18 20:32*

It failed in both YARN-7337 and YARN-6921 jenkins jobs.

org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration
 90,000,000, height 0.25, numSubmission 1, periodic 8640)]

*Stacktrace*
{code:java}
junit.framework.AssertionFailedError: null
 at junit.framework.Assert.fail(Assert.java:55)
 at junit.framework.Assert.fail(Assert.java:64)
 at junit.framework.TestCase.fail(TestCase.java:235)
 at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
 at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136){code}
*Standard Output*
{code:java}
2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
(RMStateStore.java:transition(538)) - Storing reservation 
allocation.reservation_-9026698577416205920_6337917439559340517
 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
(MemoryRMStateStore.java:storeReservationState(247)) - Storing 
reservationallocation for reservation_-9026698577416205920_6337917439559340517 
for plan dedicated
 2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan 
(InMemoryPlan.java:addReservation(373)) - Successfully added reservation: 
reservation_-9026698577416205920_6337917439559340517 to plan.
 In-memory Plan: Parent Queue: dedicatedTotal Capacity: Step: 1000reservation_-9026698577416205920_6337917439559340517 
user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
 [Period: 8640
 0: 
 3423748: 
 86223748: 
 8640: 
 9223372036854775807: null
 ]

{code}
*Reported at: 21/Feb/24*

Ran the TestCapacityOverTimePolicy test case locally 100 times in a row; it 
failed 5 times with the error below:

[INFO] Running 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
[ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.503 
s <<< FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
[ERROR] testAllocation[Duration 60,000, height 0.25, numSubmission 3, periodic 
720)](org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy)
  Time elapsed: 0.009 s  <<< ERROR!
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningQuotaException:
 Integral (avg over time) quota capacity 0.25 over a window of 86400 seconds,  
would be exceeded by accepting reservation: 
reservation_-7619846766601560789_3793931544284185119
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:206)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.InMemoryPlan.addReservation(InMemoryPlan.java:348)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:141)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at 

[jira] [Updated] (YARN-7548) TestCapacityOverTimePolicy.testAllocation is flaky

2024-02-21 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-7548:

Description: 
*Reported at: 15/Nov/18 20:32*

It failed in both YARN-7337 and YARN-6921 jenkins jobs.

org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration
 90,000,000, height 0.25, numSubmission 1, periodic 8640)]

*Stacktrace*
{code:java}
junit.framework.AssertionFailedError: null
 at junit.framework.Assert.fail(Assert.java:55)
 at junit.framework.Assert.fail(Assert.java:64)
 at junit.framework.TestCase.fail(TestCase.java:235)
 at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
 at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136){code}
*Standard Output*
{code:java}
2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
(RMStateStore.java:transition(538)) - Storing reservation 
allocation.reservation_-9026698577416205920_6337917439559340517
 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
(MemoryRMStateStore.java:storeReservationState(247)) - Storing 
reservationallocation for reservation_-9026698577416205920_6337917439559340517 
for plan dedicated
 2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan 
(InMemoryPlan.java:addReservation(373)) - Successfully added reservation: 
reservation_-9026698577416205920_6337917439559340517 to plan.
 In-memory Plan: Parent Queue: dedicatedTotal Capacity: Step: 1000reservation_-9026698577416205920_6337917439559340517 
user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
 [Period: 8640
 0: 
 3423748: 
 86223748: 
 8640: 
 9223372036854775807: null
 ]

{code}

*Reported at: 21/Feb/24*

Ran the TestCapacityOverTimePolicy test case 100 times in a row; it failed 
5 times with the error below:

[INFO] Running 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
[ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.503 
s <<< FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
[ERROR] testAllocation[Duration 60,000, height 0.25, numSubmission 3, periodic 
720)](org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy)
  Time elapsed: 0.009 s  <<< ERROR!
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningQuotaException:
 Integral (avg over time) quota capacity 0.25 over a window of 86400 seconds,  
would be exceeded by accepting reservation: 
reservation_-7619846766601560789_3793931544284185119
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:206)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.InMemoryPlan.addReservation(InMemoryPlan.java:348)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:141)
        at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at 

[jira] [Commented] (YARN-7548) TestCapacityOverTimePolicy.testAllocation is flaky

2024-02-21 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819141#comment-17819141
 ] 

Susheel Gupta commented on YARN-7548:
-

[~snemeth] Can I assign this Jira to myself and start working on it?

> TestCapacityOverTimePolicy.testAllocation is flaky
> --
>
> Key: YARN-7548
> URL: https://issues.apache.org/jira/browse/YARN-7548
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: reservation system
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Assignee: Szilard Nemeth
>Priority: Major
>
> *Reported at: 15/Nov/18 20:32*
> It failed in both YARN-7337 and YARN-6921 jenkins jobs.
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration
>  90,000,000, height 0.25, numSubmission 1, periodic 8640)]
> *Stacktrace*
> {code:java}
> junit.framework.AssertionFailedError: null
>  at junit.framework.Assert.fail(Assert.java:55)
>  at junit.framework.Assert.fail(Assert.java:64)
>  at junit.framework.TestCase.fail(TestCase.java:235)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136){code}
> *Standard Output*
> {code:java}
> 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (RMStateStore.java:transition(538)) - Storing reservation 
> allocation.reservation_-9026698577416205920_6337917439559340517
>  2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (MemoryRMStateStore.java:storeReservationState(247)) - Storing 
> reservationallocation for 
> reservation_-9026698577416205920_6337917439559340517 for plan dedicated
>  2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan 
> (InMemoryPlan.java:addReservation(373)) - Successfully added reservation: 
> reservation_-9026698577416205920_6337917439559340517 to plan.
>  In-memory Plan: Parent Queue: dedicatedTotal Capacity:  vCores:1000>Step: 1000reservation_-9026698577416205920_6337917439559340517 
> user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
>  [Period: 8640
>  0: 
>  3423748: 
>  86223748: 
>  8640: 
>  9223372036854775807: null
>  ]
> {code}
> *Reported at: 21/Feb/24*
> Ran the TestCapacityOverTimePolicy test case locally 100 times in a row; it 
> failed 5 times with the error below:
> [INFO] Running 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 0.503 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] testAllocation[Duration 60,000, height 0.25, numSubmission 3, 
> periodic 
> 720)](org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy)
>   Time elapsed: 0.009 s  <<< ERROR!
> org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningQuotaException:
>  Integral (avg over time) quota capacity 0.25 over a window of 86400 seconds, 
>  would be exceeded by accepting reservation: 
> reservation_-7619846766601560789_3793931544284185119
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:206)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.InMemoryPlan.addReservation(InMemoryPlan.java:348)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:141)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at 
> 

[jira] [Created] (YARN-11621) Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal

2023-11-29 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11621:


 Summary: Fix intermittently failing unit test: 
TestAMRMProxy.testAMRMProxyTokenRenewal
 Key: YARN-11621
 URL: https://issues.apache.org/jira/browse/YARN-11621
 Project: Hadoop YARN
  Issue Type: Test
  Components: yarn
Reporter: Susheel Gupta


This test seems to be flaky: it failed 3 times out of 200 runs on trunk.
An earlier fix in YARN-7020 does not seem to have covered all of the 
flakiness.
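
For reference, a failure rate like the one above can be measured with a small repeat loop. The sketch below is only an illustration and is not how the numbers in this issue were produced: the class FlakyRunCounter and the method countFailures are hypothetical, and the simulated test that fails on every 66th call is a deterministic stand-in for the real flaky test.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

class FlakyRunCounter {
  // Runs the given test body the requested number of times and
  // counts how many runs end in an exception or an assertion error.
  static int countFailures(Callable<Void> test, int runs) {
    int failures = 0;
    for (int i = 0; i < runs; i++) {
      try {
        test.call();
      } catch (Exception | AssertionError e) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Stand-in for a flaky test: fails on calls 66, 132 and 198.
    Callable<Void> simulated = () -> {
      if (calls.incrementAndGet() % 66 == 0) {
        throw new AssertionError("simulated intermittent failure");
      }
      return null;
    };
    int failures = countFailures(simulated, 200);
    System.out.println(failures + " failures out of 200 runs");  // prints "3 failures out of 200 runs"
  }
}
```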



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-11621) Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal

2023-11-29 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11621:
-
Affects Version/s: 3.3.6

> Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal
> -
>
> Key: YARN-11621
> URL: https://issues.apache.org/jira/browse/YARN-11621
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Affects Versions: 3.3.6
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> This test seems to be flaky as it failed 3 times out of 200 runs based on the 
> trunk.
> This was fixed earlier with YARN-7020, but it seems it didn't cover all the 
> flakiness.
> {code:java}
> Error Message
> Application attempt appattempt_1630750910491_0001_01 doesn't exist in 
> ApplicationMasterService cache. at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>  at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> Stacktrace
> org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: 
> Application attempt appattempt_1630750910491_0001_01 doesn't exist in 
> ApplicationMasterService cache. at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>  at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894) at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(RPCUtil.java:75) 
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:116) 
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at 

[jira] [Assigned] (YARN-11621) Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal

2023-11-29 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11621:


Assignee: Susheel Gupta

> Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal
> -
>
> Key: YARN-11621
> URL: https://issues.apache.org/jira/browse/YARN-11621
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> This test seems to be flaky as it failed 3 times out of 200 runs based on the 
> trunk.
> This was fixed earlier with YARN-7020, but it seems it didn't cover all the 
> flakiness.
> {code:java}
> Error Message
> Application attempt appattempt_1630750910491_0001_01 doesn't exist in 
> ApplicationMasterService cache. at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>  at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> Stacktrace
> org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: 
> Application attempt appattempt_1630750910491_0001_01 doesn't exist in 
> ApplicationMasterService cache. at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
>  at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>  at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894) at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(RPCUtil.java:75) 
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:116) 
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at 

[jira] [Updated] (YARN-11621) Fix intermittently failing unit test: TestAMRMProxy.testAMRMProxyTokenRenewal

2023-11-29 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11621:
-
Description: 
This test seems to be flaky as it failed 3 times out of 200 runs based on the 
trunk.
This was fixed earlier with YARN-7020, but it seems it didn't cover all the 
flakiness.


{code:java}
Error Message
Application attempt appattempt_1630750910491_0001_01 doesn't exist in 
ApplicationMasterService cache. at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
 at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
 at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
 at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
 at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
 at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) at 
java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)

Stacktrace
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1630750910491_0001_01 doesn't exist in ApplicationMasterService cache.
 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:407)
 at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
 at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
 at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:329)
 at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
 at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
 at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(RPCUtil.java:75)
 at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:116)
 at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
 at com.sun.proxy.$Proxy90.allocate(Unknown Source)
 at

[jira] [Assigned] (YARN-11682) Legacy auto created queue in absolute mode has zero capacity after creation during app recovery

2024-05-09 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-11682:


Assignee: Susheel Gupta

> Legacy auto created queue in absolute mode has zero capacity after creation 
> during app recovery
> ---
>
> Key: YARN-11682
> URL: https://issues.apache.org/jira/browse/YARN-11682
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Brian Goerlitz
>Assignee: Susheel Gupta
>Priority: Major
>
> During recovery of a running app in a legacy auto created queue configured in 
> absolute mode, the configured min resources are set to zero because 
> NodeManagers have not registered yet (clusterResource is zero).
> {code:java}
> // GuaranteedOrZeroCapacityOverTimePolicy.getInitialLeafQueueConfiguration(
> //     AbstractAutoCreatedLeafQueue leafQueue)
> ...
> if (availableCapacity >= leafQueueTemplateCapacities
>     .getAbsoluteCapacity(nodeLabel)) {
>   updateCapacityFromTemplate(capacities, nodeLabel);
>   activate(leafQueue, nodeLabel);
> } else {
>   updateToZeroCapacity(capacities, nodeLabel, leafQueue);
> }
>
> // GuaranteedOrZeroCapacityOverTimePolicy.updateToZeroCapacity(
> //     QueueCapacities capacities, String nodeLabel, AbstractLeafQueue leafQueue)
> private void updateToZeroCapacity(QueueCapacities capacities,
>     String nodeLabel, AbstractLeafQueue leafQueue) {
>   capacities.setCapacity(nodeLabel, 0.0f);
>   capacities.setMaximumCapacity(nodeLabel,
>       leafQueueTemplateCapacities.getMaximumCapacity(nodeLabel));
>   leafQueue.getQueueResourceQuotas()
>       .setConfiguredMinResource(nodeLabel, Resource.newInstance(0, 0));
> }
> {code}
> When a NodeManager registers, 
> {{AbstractCSQueue.updateEffectiveResources(Resource clusterResource)}} is 
> called, but absolute mode queues are updated from the configured min 
> resource, which is now zero.
> {code:java}
> // AbstractCSQueue.updateEffectiveResources(Resource clusterResource)
> ...
> if (getCapacityConfigType().equals(
>     CapacityConfigType.ABSOLUTE_RESOURCE)) {
>   newEffectiveMinResource = createNormalizedMinResource(
>       usageTracker.getQueueResourceQuotas().getConfiguredMinResource(label),
>       ((AbstractParentQueue) parent).getEffectiveMinRatio(label));
> ...
> usageTracker.getQueueResourceQuotas().setEffectiveMinResource(label,
>     newEffectiveMinResource);
> {code}
> Reinitializing the queue via a config change will correctly recalculate the 
> capacity based on current clusterResource.
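
The sequence described in the issue can be sketched with a small stand-alone model. This is only an illustration under simplifying assumptions, not the CapacityScheduler code: the class AbsoluteQueueSketch, its fields, and the use of a single memory value in MB are all hypothetical, and the ratio-based normalization done by createNormalizedMinResource is replaced by a plain minimum.

```java
// Hypothetical, simplified model of the reported behaviour. None of these
// names are real YARN types; resources are reduced to one memory value (MB).
final class AbsoluteQueueSketch {
  private final long templateMinMb; // min resource from the leaf queue template
  private long configuredMinMb;     // what the queue remembers as configured
  private long effectiveMinMb;      // what the scheduler actually guarantees

  AbsoluteQueueSketch(long templateMinMb) {
    this.templateMinMb = templateMinMb;
  }

  // Shaped like getInitialLeafQueueConfiguration: if the cluster cannot cover
  // the template, the configured min resource is overwritten with zero.
  void initialize(long clusterMb) {
    configuredMinMb = (clusterMb >= templateMinMb) ? templateMinMb : 0L;
  }

  // Shaped like updateEffectiveResources in absolute mode: the effective value
  // is derived only from the configured min (assumption: a simple minimum).
  void updateEffective(long clusterMb) {
    effectiveMinMb = Math.min(configuredMinMb, clusterMb);
  }

  // Models the reinitialization that a config change triggers.
  void reinitialize(long clusterMb) {
    initialize(clusterMb);
    updateEffective(clusterMb);
  }

  long getEffectiveMinMb() {
    return effectiveMinMb;
  }

  public static void main(String[] args) {
    AbsoluteQueueSketch queue = new AbsoluteQueueSketch(4096);
    queue.initialize(0);          // recovery: no NodeManager registered yet
    queue.updateEffective(16384); // a NodeManager registers later
    System.out.println("after registration: " + queue.getEffectiveMinMb());
    queue.reinitialize(16384);    // config change recalculates the capacity
    System.out.println("after reinitialize: " + queue.getEffectiveMinMb());
  }
}
```

In this model the effective min stays at zero after the NodeManager registers because the template value was already discarded during recovery, and only the reinitialization path restores it.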



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-11696) Add debug-level logs in RMAppImpl#aggregateLogReport and RMAppImpl#getLogAggregationStatusForAppReport

2024-05-13 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11696:


 Summary: Add debug-level logs in RMAppImpl#aggregateLogReport and 
RMAppImpl#getLogAggregationStatusForAppReport
 Key: YARN-11696
 URL: https://issues.apache.org/jira/browse/YARN-11696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Susheel Gupta
Assignee: Susheel Gupta


Events keep accumulating in the event queue and many event threads are blocked.
To discover the deadlocking threads, add a few debug level logs to 
RMAppImpl#aggregateLogReport and RMAppImpl#getLogAggregationStatusForAppReport.
{code:java}
"RM Event dispatcher" #93 prio=5 os_prio=0 tid=0x7fcb67120800 nid=0x13e62 
waiting on condition [0x7fbef632a000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x7fc44cada248> (a 
java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at 
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.aggregateLogReport(RMAppImpl.java:1799)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handleLogAggregationStatus(RMNodeImpl.java:1478)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.access$500(RMNodeImpl.java:104)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1239)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1195)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
        at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
        - locked <0x7fc04c0b6970> (a 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:667)
        at 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:101)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1124)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1108)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
        at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
        at java.lang.Thread.run(Thread.java:748) {code}
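
A minimal sketch of the kind of debug logging proposed here, assuming a write lock is taken the way the thread dump suggests. The class LockDebugSketch and its members are hypothetical stand-ins and not the RMAppImpl code; java.util.logging at FINE level is used only to keep the example self-contained, whereas Hadoop itself logs through SLF4J.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a class guarded by a read/write lock.
final class LockDebugSketch {
  private static final Logger LOG =
      Logger.getLogger(LockDebugSketch.class.getName());

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private int reportCount;

  void aggregateLogReport(String nodeId) {
    // Log before blocking, so a stuck thread leaves a trace of what it waits on.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("aggregateLogReport: " + Thread.currentThread().getName()
          + " waiting for write lock, node=" + nodeId
          + ", readLockCount=" + lock.getReadLockCount()
          + ", writeLocked=" + lock.isWriteLocked()
          + ", queuedThreads=" + lock.getQueueLength());
    }
    lock.writeLock().lock();
    try {
      LOG.fine("aggregateLogReport: write lock acquired, node=" + nodeId);
      reportCount++;
    } finally {
      lock.writeLock().unlock();
      LOG.fine("aggregateLogReport: write lock released, node=" + nodeId);
    }
  }

  int getReportCount() {
    lock.readLock().lock();
    try {
      return reportCount;
    } finally {
      lock.readLock().unlock();
    }
  }

  boolean isWriteLocked() {
    return lock.isWriteLocked();
  }

  public static void main(String[] args) {
    LockDebugSketch app = new LockDebugSketch();
    app.aggregateLogReport("node-1");
    System.out.println("reports: " + app.getReportCount());
  }
}
```

Pairing a "waiting" message with an "acquired" message makes it possible to see in the log which thread requested the lock and never obtained it.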






[jira] [Updated] (YARN-11696) Add debug-level logs in RMAppImpl

2024-05-13 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11696:
-
Summary: Add debug-level logs in RMAppImpl  (was: Add debug-level logs in 
RMAppImpl#aggregateLogReport and RMAppImpl#getLogAggregationStatusForAppReport)

> Add debug-level logs in RMAppImpl
> -
>
> Key: YARN-11696
> URL: https://issues.apache.org/jira/browse/YARN-11696
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Minor
>
> Events keep accumulating in the event queue and many event threads are blocked.
> To discover the deadlocking threads, add a few debug level logs to 
> RMAppImpl#aggregateLogReport and 
> RMAppImpl#getLogAggregationStatusForAppReport.
> {code:java}
> "RM Event dispatcher" #93 prio=5 os_prio=0 tid=0x7fcb67120800 nid=0x13e62 
> waiting on condition [0x7fbef632a000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x7fc44cada248> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.aggregateLogReport(RMAppImpl.java:1799)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handleLogAggregationStatus(RMNodeImpl.java:1478)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.access$500(RMNodeImpl.java:104)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1239)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl$StatusUpdateWhenHealthyTransition.transition(RMNodeImpl.java:1195)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>         - locked <0x7fc04c0b6970> (a 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:667)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:101)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1124)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1108)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:219)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:133)
>         at java.lang.Thread.run(Thread.java:748) {code}






[jira] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-14 Thread Susheel Gupta (Jira)


[ https://issues.apache.org/jira/browse/YARN-11661 ]


Susheel Gupta deleted comment on YARN-11661:
--

was (Author: JIRAUSER299573):
Closing this as workaround exists.

> Adding new property to configure the "SameSite" cookie attribute on YARN UI 
> 
>
> Key: YARN-11661
> URL: https://issues.apache.org/jira/browse/YARN-11661
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
> requests, which makes cross-site sessions ineffective.
> Using SameSite=None with TLS is more secure than using it without TLS, but it 
> does not provide complete security. Because cross-site sessions are needed, 
> SameSite=None together with TLS provides a reasonable level of security.






[jira] [Resolved] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-14 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta resolved YARN-11661.
--
Hadoop Flags: Reviewed
  Resolution: Workaround

Closing this as workaround exists.

> Adding new property to configure the "SameSite" cookie attribute on YARN UI 
> 
>
> Key: YARN-11661
> URL: https://issues.apache.org/jira/browse/YARN-11661
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
> requests, which makes cross-site sessions ineffective.
> Using SameSite=None with TLS is more secure than using it without TLS, but it 
> does not provide complete security. Because cross-site sessions are needed, 
> SameSite=None together with TLS provides a reasonable level of security.






[jira] [Updated] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-14 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11661:
-
Affects Version/s: 3.4.0

> Adding new property to configure the "SameSite" cookie attribute on YARN UI 
> 
>
> Key: YARN-11661
> URL: https://issues.apache.org/jira/browse/YARN-11661
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
> requests, which makes cross-site sessions ineffective.
> Using SameSite=None with TLS is more secure than using it without TLS, but it 
> does not provide complete security. Because cross-site sessions are needed, 
> SameSite=None together with TLS provides a reasonable level of security.






[jira] [Commented] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-14 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826972#comment-17826972
 ] 

Susheel Gupta commented on YARN-11661:
--

Closing this ticket as a workaround exists.
{code:xml}
<property>
  <name>hadoop.http.header.Set-Cookie</name>
  <value>SameSite=None; Secure</value>
</property>
{code}
Adding this property in yarn-site.xml will fix this issue.

The "Secure" attribute must be added as well: the browser blocked the 
Set-Cookie header because it had the "SameSite=None" attribute but not the 
"Secure" attribute, which is required in order to use "SameSite=None".
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#:~:text=This%20Set%2DCookie%20was%20blocked%20because%20it%20had%20the%20%22SameSite%3DNone%22%20attribute%20but%20did%20not%20have%20the%20%22Secure%22%20attribute%2C%20which%20is%20required%20in%20order%20to%20use%20%22SameSite%3DNone%22.
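
The rule quoted above (a cookie with "SameSite=None" also needs "Secure") can be checked before a value is put into the configuration. The following is a minimal sketch; the class SameSiteCheckSketch and the method isAcceptable are hypothetical helpers, not part of Hadoop or of any servlet API, and the check covers only this one rule.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper: validates a semicolon-separated cookie attribute string.
final class SameSiteCheckSketch {

  // Returns false only when SameSite=None is present without Secure.
  static boolean isAcceptable(String attributes) {
    List<String> parts = Arrays.stream(attributes.split(";"))
        .map(part -> part.trim().toLowerCase(Locale.ROOT))
        .collect(Collectors.toList());
    boolean sameSiteNone = parts.contains("samesite=none");
    boolean secure = parts.contains("secure");
    return !sameSiteNone || secure;
  }

  public static void main(String[] args) {
    System.out.println(isAcceptable("SameSite=None; Secure"));
    System.out.println(isAcceptable("SameSite=None"));
  }
}
```

Attribute names are compared case-insensitively here because cookie attribute names are not case-sensitive.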

> Adding new property to configure the "SameSite" cookie attribute on YARN UI 
> 
>
> Key: YARN-11661
> URL: https://issues.apache.org/jira/browse/YARN-11661
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Assignee: Susheel Gupta
>Priority: Major
>
> If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
> requests, which makes cross-site sessions ineffective.
> Using SameSite=None with TLS is more secure than using it without TLS, but it 
> does not provide complete security. Because cross-site sessions are needed, 
> SameSite=None together with TLS provides a reasonable level of security.






[jira] [Updated] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-13 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta updated YARN-11661:
-
Docs Text:   (was:  If we use 'SameSite=Strict,' the browser would only 
send the cookie for same-site requests, rendering cross-site sessions 
ineffective.
However, it’s worth noting that while using SameSite=None with TLS does enhance 
the security of your cookies compared to using it without TLS, it doesn’t 
provide complete security. Nevertheless, considering the necessity for 
cross-site sessions, utilizing SameSite=None along with TLS can provide a 
reasonable level of security.)

> Adding new property to configure the "SameSite" cookie attribute on YARN UI 
> 
>
> Key: YARN-11661
> URL: https://issues.apache.org/jira/browse/YARN-11661
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Susheel Gupta
>Priority: Major
>
> If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
> requests, which makes cross-site sessions ineffective.
> Using SameSite=None with TLS is more secure than using it without TLS, but it 
> does not provide complete security. Because cross-site sessions are needed, 
> SameSite=None together with TLS provides a reasonable level of security.






[jira] [Created] (YARN-11661) Adding new property to configure the "SameSite" cookie attribute on YARN UI

2024-03-13 Thread Susheel Gupta (Jira)
Susheel Gupta created YARN-11661:


 Summary: Adding new property to configure the "SameSite" cookie 
attribute on YARN UI 
 Key: YARN-11661
 URL: https://issues.apache.org/jira/browse/YARN-11661
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Susheel Gupta


If we use 'SameSite=Strict', the browser sends the cookie only for same-site 
requests, which makes cross-site sessions ineffective.
Using SameSite=None with TLS is more secure than using it without TLS, but it 
does not provide complete security. Because cross-site sessions are needed, 
SameSite=None together with TLS provides a reasonable level of security.






[jira] [Assigned] (YARN-7548) TestCapacityOverTimePolicy.testAllocation is flaky

2024-02-22 Thread Susheel Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susheel Gupta reassigned YARN-7548:
---

Assignee: Susheel Gupta  (was: Szilard Nemeth)

> TestCapacityOverTimePolicy.testAllocation is flaky
> --
>
> Key: YARN-7548
> URL: https://issues.apache.org/jira/browse/YARN-7548
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: reservation system
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Assignee: Susheel Gupta
>Priority: Major
>
> *Reported at: 15/Nov/18 20:32*
> It failed in both YARN-7337 and YARN-6921 jenkins jobs.
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration
>  90,000,000, height 0.25, numSubmission 1, periodic 8640)]
> *Stacktrace*
> {code:java}
> junit.framework.AssertionFailedError: null
>  at junit.framework.Assert.fail(Assert.java:55)
>  at junit.framework.Assert.fail(Assert.java:64)
>  at junit.framework.TestCase.fail(TestCase.java:235)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136){code}
> *Standard Output*
> {code:java}
> 2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (RMStateStore.java:transition(538)) - Storing reservation 
> allocation.reservation_-9026698577416205920_6337917439559340517
>  2017-11-20 23:57:03,759 INFO [main] recovery.RMStateStore 
> (MemoryRMStateStore.java:storeReservationState(247)) - Storing 
> reservationallocation for 
> reservation_-9026698577416205920_6337917439559340517 for plan dedicated
>  2017-11-20 23:57:03,760 INFO [main] reservation.InMemoryPlan 
> (InMemoryPlan.java:addReservation(373)) - Successfully added reservation: 
> reservation_-9026698577416205920_6337917439559340517 to plan.
>  In-memory Plan: Parent Queue: dedicatedTotal Capacity:  vCores:1000>Step: 1000reservation_-9026698577416205920_6337917439559340517 
> user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
>  [Period: 8640
>  0: 
>  3423748: 
>  86223748: 
>  8640: 
>  9223372036854775807: null
>  ]
> {code}
> *Reported at: 21/Feb/24*
> Ran TestCapacityOverTimePolicy testcase locally 100 times in a row and found 
> it failed 5 times with the below error:
> [INFO] Running 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] Tests run: 30, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 0.503 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy
> [ERROR] testAllocation[Duration 60,000, height 0.25, numSubmission 3, 
> periodic 
> 720)](org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy)
>   Time elapsed: 0.009 s  <<< ERROR!
> org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningQuotaException:
>  Integral (avg over time) quota capacity 0.25 over a window of 86400 seconds, 
>  would be exceeded by accepting reservation: 
> reservation_-7619846766601560789_3793931544284185119
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.CapacityOverTimePolicy.validate(CapacityOverTimePolicy.java:206)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.InMemoryPlan.addReservation(InMemoryPlan.java:348)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:141)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at