[jira] [Comment Edited] (YARN-8036) Memory Available shows a negative value after running updateNodeResource

2020-06-17 Thread yinghua_zh (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138925#comment-17138925
 ] 

yinghua_zh edited comment on YARN-8036 at 6/18/20, 12:53 AM:
-

2020-06-16 15:10:16,235 [INFO] [main] |app.DAGAppMaster|: In Session mode. Waiting for DAG over RPC
2020-06-16 15:10:16,261 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: *App total resource memory: -2048 cpu: 0 taskAllocations: 0*
2020-06-16 15:10:16,262 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Allocated:  Free:  pendingRequests: 0 delayedContainers: 0 heartbeats: 1 lastPreemptionHeartbeat: 0
2020-06-16 15:10:16,264 [INFO] [Dispatcher thread {Central}] |node.PerSourceNodeTracker|: Num cluster nodes = 11


was (Author: yinghua_zh):
2020-06-16 15:10:16,235 [INFO] [main] |app.DAGAppMaster|: In Session mode. Waiting for DAG over RPC
2020-06-16 15:10:16,261 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: *App total resource memory: -2048 cpu: 0 taskAllocations: 0*
2020-06-16 15:10:16,262 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Allocated:  Free:  pendingRequests: 0 delayedContainers: 0 heartbeats: 1 lastPreemptionHeartbeat: 0
2020-06-16 15:10:16,264 [INFO] [Dispatcher thread {Central}] |node.PerSourceNodeTracker|: Num cluster nodes = 11

> Memory Available shows a negative value after running updateNodeResource
> 
>
> Key: YARN-8036
> URL: https://issues.apache.org/jira/browse/YARN-8036
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Major
>
> Running updateNodeResource for a node that already has applications running 
> on it doesn't update Memory Available with the right values. It may end up 
> showing negative values based on the requirements of the application. 
> Attached a screenshot for reference.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8036) Memory Available shows a negative value after running updateNodeResource

2020-06-17 Thread yinghua_zh (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138925#comment-17138925
 ] 

yinghua_zh commented on YARN-8036:
--

2020-06-16 15:10:16,235 [INFO] [main] |app.DAGAppMaster|: In Session mode. Waiting for DAG over RPC
2020-06-16 15:10:16,261 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: *App total resource memory: -2048 cpu: 0 taskAllocations: 0*
2020-06-16 15:10:16,262 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Allocated:  Free:  pendingRequests: 0 delayedContainers: 0 heartbeats: 1 lastPreemptionHeartbeat: 0
2020-06-16 15:10:16,264 [INFO] [Dispatcher thread {Central}] |node.PerSourceNodeTracker|: Num cluster nodes = 11

> Memory Available shows a negative value after running updateNodeResource
> 
>
> Key: YARN-8036
> URL: https://issues.apache.org/jira/browse/YARN-8036
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Major
>
> Running updateNodeResource for a node that already has applications running 
> on it doesn't update Memory Available with the right values. It may end up 
> showing negative values based on the requirements of the application. 
> Attached a screenshot for reference.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138899#comment-17138899
 ] 

Eric Badger commented on YARN-9809:
---

I can see pros and cons to both approaches. On the one hand, if the health 
check script fails to execute properly, that's not good and could imply 
something bad. But health check scripts are pretty dangerous since they can 
take out an entire cluster if they're written improperly. So if someone updates 
the script and all of a sudden the script errors out, the whole cluster is 
unhealthy. Or the health check script could rely on querying a service, and that 
service times out. The node is healthy, but the health check script returned an 
error. Unless you are parsing for specific error codes, you can no longer 
differentiate between the health check script failing internally and the health 
check script successfully reporting that the node is unhealthy.

Regardless of this discussion though, this is outside of the scope of this 
JIRA. That's an issue with how the health check script is handled, while this 
JIRA is just about providing a health status at NM startup.

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138894#comment-17138894
 ] 

Eric Yang commented on YARN-9809:
-

[~ebadger] Sorry, my statement was not clear.  If the script name is incorrect, 
the resulting exit code is non-zero, or the execution exit code is non-zero, the 
health check will still report the node as healthy.  I think those conditions 
must be considered unhealthy, in the event that the check script does not have 
the proper prerequisites.  The errors can be caught.  Is this something that we 
can fix to make this more user friendly?

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138892#comment-17138892
 ] 

Eric Badger commented on YARN-9809:
---

{noformat:title=NodeHealthScriptRunner.newInstance()}
if (!shouldRun(scriptName, nodeHealthScript)) {
  return null;
}
{noformat}

{noformat:title=NodeHealthScriptRunner.shouldRun()}
  static boolean shouldRun(String script, String healthScript) {
if (healthScript == null || healthScript.trim().isEmpty()) {
  LOG.info("Missing location for the node health check script \"{}\".",
  script);
  return false;
}
{noformat}

If the health check script doesn't exist, then {{shouldRun}} will return false 
and {{newInstance}} will return null. This will cause the health reporter not to 
be added as a service. So at the end of the day, your statement is correct: if 
the health check script doesn't exist, the node will report as healthy.
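
A hedged sketch of the caller-side effect described above (the {{newInstance}} signature 
and the {{addIfService}} wiring are assumptions about the surrounding NodeManager code, 
not a quote of it):

{code:java}
// If no usable health script is configured, newInstance() returns null and the
// runner is simply skipped, so no health-reporting service ever runs and the
// node defaults to healthy.
NodeHealthScriptRunner runner =
    NodeHealthScriptRunner.newInstance(scriptName, conf);
if (runner != null) {
  addIfService(runner);   // only added when a valid script exists
}
{code}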

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138853#comment-17138853
 ] 

Eric Yang commented on YARN-9809:
-

[~Jim_Brennan] Thank you for the instruction.  I updated my check script 
accordingly:

{code}
#!/bin/bash
echo "ERROR test"
{code}

This works.  The script must also return a 0 exit code; otherwise, the node 
will report as healthy.  This implies that if the health check script doesn't 
exist, it reports as healthy.  Is this right?


> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138833#comment-17138833
 ] 

Jim Brennan commented on YARN-9809:
---

[~eyang] I believe the health check script output must contain a line that 
begins with the string "ERROR" for the node to be marked as unhealthy.  The 
exit code does not have any effect.  
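
A small self-contained sketch of the check Jim describes (illustrative only, not the 
actual NodeHealthScriptRunner code):

{code:java}
public class HealthOutputCheck {
  // A node is flagged unhealthy only when some output line starts with
  // "ERROR"; the script's exit code is never consulted.
  static boolean reportsError(String scriptOutput) {
    for (String line : scriptOutput.split("\n")) {
      if (line.startsWith("ERROR")) {
        return true;   // would mark the node unhealthy
      }
    }
    return false;      // a non-zero exit code alone never does
  }

  public static void main(String[] args) {
    System.out.println(reportsError("OK\nERROR test"));    // true
    System.out.println(reportsError("something failed"));  // false
  }
}
{code}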

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-9809) NMs should supply a health status when registering with RM

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138685#comment-17138685
 ] 

Eric Yang commented on YARN-9809:
-

[~ebadger] Thank you for the patch.  The patch looks very close to the final 
product.  I have confirmed that the test case failure doesn't happen if there is 
a sufficient amount of RAM on the testing node.  I also validated that the new 
node manager can work with an unpatched resource manager.  However, I could not 
get the health check script to fail in a way that causes the node to register as 
unhealthy.

Here is my check script:
{code}
#!/bin/bash
echo "i am here" > /tmp/hello
exit 1
{code}

It would be nice to have a verbose message showing the exit code of the health 
check script in the node manager log file.  The script is executed, but the node 
shows healthy.  What am I doing wrong?
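
A minimal sketch of the kind of logging being asked for here (an assumption about how it 
could look, not existing NodeManager code; SLF4J is assumed as the logging API):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HealthScriptExitLogger {
  private static final Logger LOG =
      LoggerFactory.getLogger(HealthScriptExitLogger.class);

  // Log the script's exit code so operators can tell why a non-zero exit
  // did not mark the node unhealthy.
  static void logExit(String scriptPath, int exitCode) {
    LOG.info("Node health script {} finished with exit code {}",
        scriptPath, exitCode);
  }
}
{code}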

> NMs should supply a health status when registering with RM
> --
>
> Key: YARN-9809
> URL: https://issues.apache.org/jira/browse/YARN-9809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.






[jira] [Commented] (YARN-10321) Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios

2020-06-17 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138655#comment-17138655
 ] 

Hadoop QA commented on YARN-10321:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m  
4s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 3 unchanged - 11 fixed = 5 total (was 14) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 52s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 46s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
|   | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy 
|
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26175/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10321 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005876/YARN-10321.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 82b9c11426f6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Comment Edited] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138637#comment-17138637
 ] 

Bilwa S T edited comment on YARN-10310 at 6/17/20, 4:50 PM:


Hi [~eyang]
I could see that ServiceClient#submitapp adds YarnServiceConstants.APP_TYPE to 
all applications .
{code:java}
submissionContext.setQueue(queue);
submissionContext.setApplicationName(serviceName);
submissionContext.setApplicationType(YarnServiceConstants.APP_TYPE);
{code}

I debugged it and found that yarnClient.getApplications(request) fails at the 
case below:
{code:java}
  if (users != null && !users.isEmpty() &&
  !users.contains(application.getUser())) {
continue;
  }
{code}

as the applicationType was the same, whereas the users list had hdfs/had...@hadoop.com 
and application.getUser() was hdfs. 

Yes, I think we should also change the appType based on the parameter; currently we are 
adding yarn-service by default to all apps. Thanks



was (Author: bilwast):
Hi [~eyang]
I could see that ServiceClient#submitapp adds YarnServiceConstants.APP_TYPE to 
all applications .
{code:java}
submissionContext.setQueue(queue);
submissionContext.setApplicationName(serviceName);
submissionContext.setApplicationType(YarnServiceConstants.APP_TYPE);
{code}

I debugged it and found that yarnClient.getApplications(request) fails at below 
case
{code:java}
 if (applicationTypes != null && !applicationTypes.isEmpty()) {
String appTypeToMatch =
StringUtils.toLowerCase(application.getApplicationType());
if (!applicationTypes.contains(appTypeToMatch)) {
  continue;
}
  }

  if (applicationStates != null && !applicationStates.isEmpty()) {
if (!applicationStates.contains(application
.createApplicationState())) {
  continue;
}
  }

  if (users != null && !users.isEmpty() &&
  !*users.contains(application.getUser())*) {
continue;
  }
{code}

as applicationType was same but whereas users list had hdfs/had...@hadoop.com 
and application.getuser() was hdfs. 

Yes I think we should change appType also based on parameter. currently we are 
adding yarn-service by default to all apps. Thanks


> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> As ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get user whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set application 
> username.
> In case of user with name hdfs/had...@hadoop.com. below condition fails
> ClientRMService#getApplications()
> {code:java}
> if (users != null && !users.isEmpty() &&
>   !users.contains(application.getUser())) {
> continue;
>  }
> {code}






[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138637#comment-17138637
 ] 

Bilwa S T commented on YARN-10310:
--

Hi [~eyang]
I could see that ServiceClient#submitapp adds YarnServiceConstants.APP_TYPE to 
all applications .
{code:java}
submissionContext.setQueue(queue);
submissionContext.setApplicationName(serviceName);
submissionContext.setApplicationType(YarnServiceConstants.APP_TYPE);
{code}

I debugged it and found that yarnClient.getApplications(request) fails at the 
case below:
{code:java}
 if (applicationTypes != null && !applicationTypes.isEmpty()) {
String appTypeToMatch =
StringUtils.toLowerCase(application.getApplicationType());
if (!applicationTypes.contains(appTypeToMatch)) {
  continue;
}
  }

  if (applicationStates != null && !applicationStates.isEmpty()) {
if (!applicationStates.contains(application
.createApplicationState())) {
  continue;
}
  }

  if (users != null && !users.isEmpty() &&
  !*users.contains(application.getUser())*) {
continue;
  }
{code}

as the applicationType was the same, whereas the users list had hdfs/had...@hadoop.com 
and application.getUser() was hdfs. 

Yes, I think we should also change the appType based on the parameter; currently we are 
adding yarn-service by default to all apps. Thanks
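
A self-contained illustration of the mismatch described above (the principal and host 
names are made-up examples, not the reporter's actual values; this is not code from the 
patch):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class UserNameMismatch {
  public static void main(String[] args) throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    // ServiceClient filters getApplications() with the full principal...
    String fullName = ugi.getUserName();        // e.g. "hdfs/host.example.com@EXAMPLE.COM"
    // ...while the RM stored the application user as the short name.
    String shortName = ugi.getShortUserName();  // e.g. "hdfs"
    // users.contains(application.getUser()) therefore fails and the
    // already-running service instance is never found.
    System.out.println(fullName + " vs " + shortName);
  }
}
{code}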


> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> As ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get user whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set application 
> username.
> In case of user with name hdfs/had...@hadoop.com. below condition fails
> ClientRMService#getApplications()
> {code:java}
> if (users != null && !users.isEmpty() &&
>   !users.contains(application.getUser())) {
> continue;
>  }
> {code}






[jira] [Commented] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138630#comment-17138630
 ] 

Bilwa S T commented on YARN-10308:
--

Thanks [~eyang]

> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}
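
A hedged sketch of the rename suggested in point 2 above (the identifier {{remoteKeytab}} 
is illustrative, not necessarily the name used in the committed patch):

{code:java}
// The keytab URI may point at any Hadoop-compatible filesystem, not just
// HDFS, so the variable name should not mention hdfs.
Path remoteKeytab = new Path(keytabURI);
if (!fileSystem.getFileSystem().exists(remoteKeytab)) {
  LOG.warn(service.getName() + "'s keytab (principalName = "
      + principalName + ") doesn't exist at: " + remoteKeytab);
  return;
}
{code}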






[jira] [Commented] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138619#comment-17138619
 ] 

Hudson commented on YARN-10308:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18359 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18359/])
YARN-10308. Update javadoc and variable names for YARN service.  
(eyang: rev 89689c52c39cdcc498d04508dbd235c6036ec17c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/KerberosPrincipal.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java


> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}






[jira] [Comment Edited] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138608#comment-17138608
 ] 

Eric Yang edited comment on YARN-10308 at 6/17/20, 4:07 PM:


+1 I just committed patch 002 to trunk.  Thank you [~BilwaST] for the patch.


was (Author: eyang):
+1 I just committed this to trunk.  Thank you [~BilwaST] for the patch.

> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}






[jira] [Updated] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-17 Thread Eric Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-10308:
-
   Fix Version/s: 3.4.0
Target Version/s: 3.4.0

> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}






[jira] [Comment Edited] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138573#comment-17138573
 ] 

Eric Yang edited comment on YARN-10310 at 6/17/20, 3:56 PM:


[~BilwaST] The root cause is the parameter -appTypes unit-test.

Using hdfs/had...@example.com principal, the error message is same as using 
h...@example.com.

{code}
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:17,867 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,320 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,323 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:21,104 INFO client.ApiServiceClient: Application ID: 
application_1592406514799_0003
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:32,401 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,971 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,974 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:35,320 ERROR client.ApiServiceClient: Service name abc is 
already taken.
{code}

verifyNoLiveAppInRM only looks for appTypes == YarnServiceConstants.APP_TYPE.
The correct fix might be adding appTypes == unit-test to the 
GetApplicationsRequest to obtain the correct type of applications.

The HDFS error message "Dir existing on hdfs." is there to safeguard the case where an 
instance of the yarn-service application is in suspended mode (with no copy running 
in the RM) but its working directory exists.  The error message is not wrong for 
the suspended use case, and I agree that there might be a better way to support the 
--appTypes flag so the YARN service API yields consistent output.  Could you 
refine the patch accordingly?  Thanks
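
A minimal sketch of the suggested direction (assuming the standard YARN client protocol 
records; "unit-test" is the application type from the scenario above, not a constant in 
the codebase):

{code:java}
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationsRequest;

public class AppTypeFilter {
  // Build a request that filters by the launched application's type instead
  // of hard-coding YarnServiceConstants.APP_TYPE.
  static GetApplicationsRequest requestFor(String appType) {
    GetApplicationsRequest request = GetApplicationsRequest.newInstance();
    request.setApplicationTypes(Collections.singleton(appType));
    return request;
  }
}
{code}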


was (Author: eyang):
[~BilwaST] The root cause is the parameter -appTypes unit-test.

Using hdfs/had...@example.com principal, the error message is same as using 
h...@example.com.

{code}
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:17,867 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,320 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,323 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:21,104 INFO client.ApiServiceClient: Application ID: 
application_1592406514799_0003
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:32,401 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,971 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,974 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:35,320 ERROR client.ApiServiceClient: Service name abc is 
already taken.
{code}

verifyNoLiveAppInRM only look for appTypes == YarnServiceConstants.APP_TYPE.
The correct fix might be adding appTypes == unit-test to the 
GetApplicationRequest to obtain the correct type of applications.  

HDFS error message "Dir existing on hdfs." is to safe guard that a instance of 
the yarn-service application in suspended mode (where there is no copy running 
in RM), and it's working directory.  The error message is not wrong for the 
suspended use case, and I agree that there might be better way to support 
--appTypes flag for YARN service API to yield consistent output.  Could you 
refine the patch according it?  Thanks

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> As 

[jira] [Commented] (YARN-10311) Yarn Service should support obtaining tokens from multiple name services

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138597#comment-17138597
 ] 

Eric Yang commented on YARN-10311:
--

[~prabhujoseph] [~kyungwan nam], thank you for your input clarifying the 
use case.  I have found it difficult to manage multiple delegation tokens from 
multiple namenodes and to use the appropriate token with the corresponding namenode.  
However, that is a longer conversation to have in hadoop-common for Hadoop 
security.  While I think this is a good addition to address the immediate 
problem, I do not have the ability to spin up multiple HDFS clusters at this 
time.  The patch looks good on the surface, and a test case would really help to 
prevent regressions.  I would appreciate it if you could step in to review this patch 
and test it on real clusters.  Thanks

> Yarn Service should support obtaining tokens from multiple name services
> 
>
> Key: YARN-10311
> URL: https://issues.apache.org/jira/browse/YARN-10311
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10311.001.patch, YARN-10311.002.patch
>
>
> Currently yarn services support single name service tokens. We can add a new 
> conf called
> "yarn.service.hdfs-servers" for supporting this






[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-17 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138573#comment-17138573
 ] 

Eric Yang commented on YARN-10310:
--

[~BilwaST] The root cause is the parameter -appTypes unit-test.

Using hdfs/had...@example.com principal, the error message is same as using 
h...@example.com.

{code}
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:17,867 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,320 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:18,323 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:21,104 INFO client.ApiServiceClient: Application ID: 
application_1592406514799_0003
[hdfs@kerberos hadoop-3.4.0-SNAPSHOT]$ ./bin/yarn app -launch abc sleeper 
2020-06-17 08:17:32,401 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,971 INFO client.DefaultNoHARMFailoverProxyProvider: 
Connecting to ResourceManager at kerberos.example.com/192.168.1.9:8032
2020-06-17 08:17:32,974 INFO client.ApiServiceClient: Loading service 
definition from local FS: 
/usr/local/hadoop-3.4.0-SNAPSHOT/share/hadoop/yarn/yarn-service-examples/sleeper/sleeper.json
2020-06-17 08:17:35,320 ERROR client.ApiServiceClient: Service name abc is 
already taken.
{code}

verifyNoLiveAppInRM only look for appTypes == YarnServiceConstants.APP_TYPE.
The correct fix might be adding appTypes == unit-test to the 
GetApplicationRequest to obtain the correct type of applications.  

HDFS error message "Dir existing on hdfs." is to safe guard that a instance of 
the yarn-service application in suspended mode (where there is no copy running 
in RM), and it's working directory.  The error message is not wrong for the 
suspended use case, and I agree that there might be better way to support 
--appTypes flag for YARN service API to yield consistent output.  Could you 
refine the patch according it?  Thanks

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> As ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get user whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set application 
> username.
> In case of user with name hdfs/had...@hadoop.com. below condition fails
> ClientRMService#getApplications()
> {code:java}
> if (users != null && !users.isEmpty() &&
>   !users.contains(application.getUser())) {
> continue;
>  }
> {code}






[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-06-17 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138552#comment-17138552
 ] 

Hudáky Márton Gyula commented on YARN-9460:
---

+1 (non-binding)

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9460.001.patch, YARN-9460.002.patch, 
> YARN-9460.003.patch, YARN-9460.004.patch, YARN-9460.005.patch
>
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.






[jira] [Commented] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138550#comment-17138550
 ] 

Hadoop QA commented on YARN-10281:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 30m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.3 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
54s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} branch-3.3 passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
44s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} branch-3.3 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 67 unchanged - 0 fixed = 70 total (was 67) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 19s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}193m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26174/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10281 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005863/YARN-10281.branch-3.3.001.patch
 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux b34d0ad1c712 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | branch-3.3 / c1ef247 |
| Default Java | 

[jira] [Updated] (YARN-10321) Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios

2020-06-17 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10321:
--
Attachment: YARN-10321.001.patch

> Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios
> 
>
> Key: YARN-10321
> URL: https://issues.apache.org/jira/browse/YARN-10321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10321.001.patch
>
>
> org.apache.hadoop.yarn.server.resourcemanager.placement.TestUserGroupMappingPlacementRule#testMapping
>  is very large and hard to read/maintain and moreover, error-prone.
> We should break this testcase down into several separate testcases.
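
A hedged sketch of what the breakdown could look like ({{verifyQueueMapping}} is a 
hypothetical helper standing in for the existing assertion logic; the mapping strings use 
standard CapacityScheduler u: syntax):

{code:java}
import org.junit.Test;

// Each mapping scenario becomes its own small, named test instead of one
// branch inside a single huge testMapping().
public class TestUserGroupMappingScenarios {

  @Test
  public void testSpecificUserMapsToSpecificQueue() throws Exception {
    verifyQueueMapping("u:alice:root.analytics", "alice", "root.analytics");
  }

  @Test
  public void testUserMappedToQueueNamedAfterUser() throws Exception {
    verifyQueueMapping("u:%user:%user", "bob", "bob");
  }

  // Hypothetical helper: parse the mapping, run placement for the user, and
  // assert that the resolved queue equals expectedQueue.
  private void verifyQueueMapping(String mapping, String user,
      String expectedQueue) throws Exception {
  }
}
{code}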






[jira] [Commented] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138412#comment-17138412
 ] 

Hudson commented on YARN-10281:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18357 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18357/])
YARN-10281. Redundant QueuePath usage in UserGroupMappingPlacementRule 
(snemeth: rev 5b1a56f9f1aec7d75b14a60d0c42192b04407356)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/AppNameMappingPlacementRule.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/QueuePlacementRuleUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/UserGroupMappingPlacementRule.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueMappings.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/QueueMapping.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/QueuePath.java


> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the 
> aforementioned classes, but these technically store the same kind of 
> information, yet we keep converting between them, let's examine if we can use 
> only the QueueMapping(Entity) instead, since that holds more information.






[jira] [Commented] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138401#comment-17138401
 ] 

Szilard Nemeth commented on YARN-10281:
---

Thanks [~shuzirra] for workin on this!

Patch LGTM, committed to trunk and branch-3.3.
Resolving jira.

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the 
> aforementioned classes, but these technically store the same kind of 
> information, yet we keep converting between them, let's examine if we can use 
> only the QueueMapping(Entity) instead, since that holds more information.






[jira] [Comment Edited] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138401#comment-17138401
 ] 

Szilard Nemeth edited comment on YARN-10281 at 6/17/20, 12:37 PM:
--

Thanks [~shuzirra] for working on this!

Patch LGTM, committed to trunk and branch-3.3.
Resolving jira.


was (Author: snemeth):
Thanks [~shuzirra] for workin on this!

Patch LGTM, committed to trunk and branch-3.3.
Resolving jira.

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the
> aforementioned classes, but these technically store the same kind of
> information, yet we keep converting between them. Let's examine whether we can
> use only the QueueMapping(Entity) instead, since it holds more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10281:
--
Fix Version/s: 3.3.1
   3.4.0

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the
> aforementioned classes, but these technically store the same kind of
> information, yet we keep converting between them. Let's examine whether we can
> use only the QueueMapping(Entity) instead, since it holds more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2020-06-17 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138388#comment-17138388
 ] 

Peter Bacsko commented on YARN-9930:


So after talking about the problems over voice chat, here is our conclusion:

_"So AFAIU it is absolutely normal that some queue is above its limit if the 
configurations have been changed. Doesn't it need some special attention in 
your algorithm when you recursively update the parents to search for queues 
where new apps could be submitted?"_

No, I tested this case manually: first, 4 running apps were allowed to run, but
no more. Then it went down to 3, then to 2. After that, it stayed at 2 running
apps and everything else was accepted. Functionality was consistent during the
test run.

 

_"I'd prefer your solution as its more clear, but since we already have the 
existing logic, the questions arises: why do we need a separate enforcer 
object? Couldn't it be implemented similarly? Or am I missing something here?"_

Yes, this approach is different from the max-applications calculation. In theory,
having a consistent implementation across a module is often desirable, but
this patch duplicates a battle-tested algorithm from {{MaxRunningAppsEnforcer}},
which was then adapted to CS, so this class can be trusted. Rewriting the
current patch would take a lot of time. I'm being very practical here, but I
don't think it's a huge violation of coding principles (apart from the
duplication, but that was also necessary IMO).

 

_"The existing implementation for max apps (that considers both running and 
pending ones) calls the {{OrderingPolicy#getNumSchedulableEntities()}} and 
compare it the to limit inside {{LeafQueue"}}_

This could be a bug! Apps that were marked as non-runnable are actually missing 
from {{schedulableEntities}} (precisely to prevent them from being scheduled). 
Looks like this needs a little change plus a unit test.
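
To make this concrete, here is a small self-contained sketch (class and field
names are hypothetical, not actual CapacityScheduler code) of how a limit check
that only looks at the schedulable collection undercounts once non-runnable
apps are parked elsewhere:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustration only: apps held back as non-runnable live outside the
// schedulable collection, so a limit check based solely on that collection's
// size lets more applications in than the configured maximum.
public final class MaxAppsCheckSketch {

  private final int maxApplications;
  private final List<String> schedulableApps = new ArrayList<>();
  // Parked by the max-running-apps logic so the scheduler will not run them.
  private final List<String> nonRunnableApps = new ArrayList<>();

  MaxAppsCheckSketch(int maxApplications) {
    this.maxApplications = maxApplications;
  }

  // Buggy shape: mirrors comparing only the schedulable-entity count
  // (getNumSchedulableEntities-style) against the limit.
  boolean acceptBuggy() {
    return schedulableApps.size() < maxApplications;
  }

  // Corrected shape: every accepted application counts toward the limit,
  // whether it is currently runnable or parked as non-runnable.
  boolean acceptFixed() {
    return schedulableApps.size() + nonRunnableApps.size() < maxApplications;
  }
}
{code}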

[~bteke]'s comments are also valid.

I'll address these issues and upload patch v5 soon.

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9930-001.patch, YARN-9930-002.patch, 
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch, 
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch, 
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> In FairScheduler, there is a limit on the maximum number of running
> applications, which leaves excess applications pending.
> But CapacityScheduler has no such max-running-app feature; it only has a
> max-applications limit, and jobs beyond it are rejected directly on the client.
> In this jira I want to implement this semantic for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-10281:
-

Assignee: Adam Antal  (was: Gergely Pollak)

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the
> aforementioned classes, but these technically store the same kind of
> information, yet we keep converting between them. Let's examine whether we can
> use only the QueueMapping(Entity) instead, since it holds more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-10281:
-

Assignee: Gergely Pollak  (was: Adam Antal)

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the
> aforementioned classes, but these technically store the same kind of
> information, yet we keep converting between them. Let's examine whether we can
> use only the QueueMapping(Entity) instead, since it holds more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10321) Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios

2020-06-17 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10321:
--
Description: 
org.apache.hadoop.yarn.server.resourcemanager.placement.TestUserGroupMappingPlacementRule#testMapping
 is very large, hard to read and maintain, and error-prone.
We should break this test case down into several separate test cases.
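
As a generic illustration of the direction (this is not the real
TestUserGroupMappingPlacementRule code; the helper and scenario names are made
up), each mapping scenario could become its own small, named test method
instead of one branch inside a single huge test:
{code:java}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical example of splitting one large data-driven test into
// separate scenario methods built around a shared helper.
public class TestQueueMappingScenarios {

  // Stand-in for the placement rule under test: only the "u:%user:%user"
  // rule is understood here; anything else falls back to the default queue.
  private String resolveQueue(String user, String mapping) {
    if ("u:%user:%user".equals(mapping)) {
      return user;
    }
    return "default";
  }

  @Test
  public void testUserIsMappedToQueueNamedAfterUser() {
    assertEquals("alice", resolveQueue("alice", "u:%user:%user"));
  }

  @Test
  public void testNonMatchingMappingFallsBackToDefaultQueue() {
    assertEquals("default", resolveQueue("alice", "u:bob:root.analytics"));
  }
}
{code}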

> Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios
> 
>
> Key: YARN-10321
> URL: https://issues.apache.org/jira/browse/YARN-10321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> org.apache.hadoop.yarn.server.resourcemanager.placement.TestUserGroupMappingPlacementRule#testMapping
>  is very large, hard to read and maintain, and error-prone.
> We should break this test case down into several separate test cases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10321) Break down TestUserGroupMappingPlacementRule#testMapping into test scenarios

2020-06-17 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10321:
-

 Summary: Break down TestUserGroupMappingPlacementRule#testMapping 
into test scenarios
 Key: YARN-10321
 URL: https://issues.apache.org/jira/browse/YARN-10321
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-06-17 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak updated YARN-10281:
--
Attachment: YARN-10281.branch-3.3.001.patch

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10281.001.patch, YARN-10281.002.patch, 
> YARN-10281.003.patch, YARN-10281.004.patch, YARN-10281.branch-3.3.001.patch
>
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the
> aforementioned classes, but these technically store the same kind of
> information, yet we keep converting between them. Let's examine whether we can
> use only the QueueMapping(Entity) instead, since it holds more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10320) Replace FSDataInputStream#read with readFully in Log Aggregation

2020-06-17 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-10320:


Assignee: Tanu Ajmera

> Replace FSDataInputStream#read with readFully in Log Aggregation
> 
>
> Key: YARN-10320
> URL: https://issues.apache.org/jira/browse/YARN-10320
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Tanu Ajmera
>Priority: Major
>
> Have observed that the Log Aggregation code uses FSDataInputStream#read instead
> of readFully in multiple places like the one below. One of those places was
> fixed by YARN-8106.
> This Jira targets fixing all the other places.
> LogAggregationIndexedFileController#loadUUIDFromLogFile
> {code}
>   byte[] b = new byte[uuid.length];
>   int actual = fsDataInputStream.read(b);
>   if (actual != uuid.length || Arrays.equals(b, uuid)) {
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10320) Replace FSDataInputStream#read with readFully in Log Aggregation

2020-06-17 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10320:


 Summary: Replace FSDataInputStream#read with readFully in Log 
Aggregation
 Key: YARN-10320
 URL: https://issues.apache.org/jira/browse/YARN-10320
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation
Affects Versions: 3.3.0
Reporter: Prabhu Joseph


Have observed that the Log Aggregation code uses FSDataInputStream#read instead of
readFully in multiple places like the one below. One of those places was fixed by YARN-8106.
This Jira targets fixing all the other places.

LogAggregationIndexedFileController#loadUUIDFromLogFile
{code}
  byte[] b = new byte[uuid.length];
  int actual = fsDataInputStream.read(b);
  if (actual != uuid.length || Arrays.equals(b, uuid)) {
{code}
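
A minimal sketch of the kind of change being proposed, assuming the UUID length
is known up front (helper and class names here are illustrative, not the actual
patch):
{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.fs.FSDataInputStream;

public final class UuidCheckSketch {

  // Unsafe shape: read() may return fewer than uuid.length bytes even though
  // more data is available, so a short read can look like a UUID mismatch.
  static boolean matchesWithPlainRead(FSDataInputStream in, byte[] uuid)
      throws IOException {
    byte[] b = new byte[uuid.length];
    int actual = in.read(b);
    return actual == uuid.length && Arrays.equals(b, uuid);
  }

  // Safer shape: readFully() keeps reading until the buffer is full, or
  // throws EOFException if the stream ends before that.
  static boolean matchesWithReadFully(FSDataInputStream in, byte[] uuid)
      throws IOException {
    byte[] b = new byte[uuid.length];
    try {
      in.readFully(b);
    } catch (EOFException e) {
      return false;
    }
    return Arrays.equals(b, uuid);
  }

  private UuidCheckSketch() {
  }
}
{code}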



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8106) Update LogAggregationIndexedFileController to use readFully instead read to avoid IOException while loading log meta

2020-06-17 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-8106:

Summary: Update LogAggregationIndexedFileController to use readFully 
instead read to avoid IOException while loading log meta  (was: Update 
LogAggregationIndexedFileController to use readFull instead read to avoid 
IOException while loading log meta)

> Update LogAggregationIndexedFileController to use readFully instead read to 
> avoid IOException while loading log meta
> 
>
> Key: YARN-8106
> URL: https://issues.apache.org/jira/browse/YARN-8106
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Critical
> Fix For: 3.1.1
>
> Attachments: YARN-8106.1.patch
>
>
> The yarn logs command fails with the below error message. Found that
> LogAggregationIndexedFileController uses read() to read the contents of the
> remote log into a byte array; read() sometimes does not read the full contents,
> so the actual number of bytes read is not the same as the offset, leading to an
> IOException. readFully() can be used instead.
> {code}
> WARN ifile.LogAggregationIndexedFileController: Can not get log meta from the 
> log 
> Error on loading log meta from 
>  if (actual != offset) {
> throw new IOException("Error on loading log meta from "
> + remoteLogPath);
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10311) Yarn Service should support obtaining tokens from multiple name services

2020-06-17 Thread kyungwan nam (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138226#comment-17138226
 ] 

kyungwan nam commented on YARN-10311:
-

Hi, I've hit the same issue in YARN-9905.
I wanted to separate the HDFS used for log-aggregation under HDFS federation,
but it doesn't work due to this issue.
Thanks~




> Yarn Service should support obtaining tokens from multiple name services
> 
>
> Key: YARN-10311
> URL: https://issues.apache.org/jira/browse/YARN-10311
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10311.001.patch, YARN-10311.002.patch
>
>
> Currently YARN Services support tokens for a single name service only. We can
> add a new conf called
> "yarn.service.hdfs-servers" to support this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10311) Yarn Service should support obtaining tokens from multiple name services

2020-06-17 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138189#comment-17138189
 ] 

Prabhu Joseph commented on YARN-10311:
--

Hi [~eyang], [~BilwaST]

I have faced the same issue. This issue (Log Aggregation fails for the job)
happens for any YARN job that uses a FileSystem for the input/output path
(fs.defaultFS) other than the one configured for YARN logs
(yarn.nodemanager.remote-app-log-dir).

For example, a MapReduce job whose input path is on HDFS while
yarn.nodemanager.remote-app-log-dir is set to ABFS will succeed, but Log
Aggregation will fail because the NM won't have an ABFS token from the end
user's credentials to aggregate the logs.

MR jobs and Spark jobs can list the other filesystems for which delegation
tokens will be obtained on the client side, using mapreduce.job.hdfs-servers
and spark.yarn.access.hadoopFileSystems respectively.

But this option is not there in Tez, Yarn Native Service and custom YARN
application types. I think it will be useful to have a generic fix that can
obtain a token for the filesystem specified at
yarn.nodemanager.remote-app-log-dir for any YARN job.
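
A rough sketch of that generic idea, at the point where an app's client (or the
RM on its behalf) gathers delegation tokens; this is illustrative code, not the
actual Yarn Service patch, and the configuration key is read as a plain string:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;

public final class LogDirTokenSketch {

  // Fetch tokens for the default filesystem and, if different, for the
  // filesystem backing yarn.nodemanager.remote-app-log-dir so the NM can
  // aggregate logs there.
  static void addLogAggregationTokens(Configuration conf, String renewer,
      Credentials credentials) throws IOException {
    FileSystem defaultFs = FileSystem.get(conf);
    defaultFs.addDelegationTokens(renewer, credentials);

    String remoteLogDir = conf.get("yarn.nodemanager.remote-app-log-dir");
    if (remoteLogDir != null) {
      FileSystem logFs = new Path(remoteLogDir).getFileSystem(conf);
      if (!logFs.getUri().equals(defaultFs.getUri())) {
        logFs.addDelegationTokens(renewer, credentials);
      }
    }
  }

  private LogDirTokenSketch() {
  }
}
{code}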
  


> Yarn Service should support obtaining tokens from multiple name services
> 
>
> Key: YARN-10311
> URL: https://issues.apache.org/jira/browse/YARN-10311
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10311.001.patch, YARN-10311.002.patch
>
>
> Currently YARN Services support tokens for a single name service only. We can
> add a new conf called
> "yarn.service.hdfs-servers" to support this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10311) Yarn Service should support obtaining tokens from multiple name services

2020-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138170#comment-17138170
 ] 

Bilwa S T commented on YARN-10311:
--

Hi [~eyang]

I had set "yarn.nodemanager.remote-app-log-dir" to different nameservice, not 
default filesystem.

So i get below exception in NM log
{code:java}
Failed on local exception: java.io.IOException: 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]
{code}

> Yarn Service should support obtaining tokens from multiple name services
> 
>
> Key: YARN-10311
> URL: https://issues.apache.org/jira/browse/YARN-10311
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10311.001.patch, YARN-10311.002.patch
>
>
> Currently YARN Services support tokens for a single name service only. We can
> add a new conf called
> "yarn.service.hdfs-servers" to support this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org