[jira] [Commented] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687610#comment-16687610
 ] 

Hadoop QA commented on YARN-8303:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 55s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 31 unchanged - 0 fixed = 32 total (was 31) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 
32s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8303 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948262/YARN-8303.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b7b37abe01e4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / df5e863 |
| maven | 

[jira] [Commented] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687588#comment-16687588
 ] 

Xun Liu commented on YARN-8986:
---

[~Charo Zhang],[~eyang]

Because Hadoop Submarine relies on this feature, it also needs to be committed 
to the trunk branch.

> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> It's better to publish all exposed ports to random ports (-P) or to support 
> port mapping (-p) when using the bridge network for a docker container.
>  






[jira] [Commented] (YARN-8833) compute shares may lock the scheduling process

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687606#comment-16687606
 ] 

ASF GitHub Bot commented on YARN-8833:
--

GitHub user yoelee opened a pull request:

https://github.com/apache/hadoop/pull/439

YARN-8833 fix compute shares may lock the scheduling process

When computing fair shares, there is a chance of triggering an integer 
overflow and entering an infinite loop, which blocks the scheduling 
process.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yoelee/hadoop trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #439


commit 39a6f7cab193be910bfb34265ceb696ddbd78da5
Author: liyakun.hit 
Date:   2018-11-15T07:28:34Z

YARN-8833 fix compute shares may  lock the scheduling process




> compute shares may  lock the scheduling process
> ---
>
> Key: YARN-8833
> URL: https://issues.apache.org/jira/browse/YARN-8833
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: liyakun
>Assignee: liyakun
>Priority: Major
>
> When w2rRatio is used to compute fair shares, there is a chance of triggering 
> an integer overflow and entering an infinite loop.
> Since the compute-share thread holds the writeLock, it may block the 
> scheduling thread.
> This issue occurred in a production environment with 8500 nodes, and we have 
> already fixed it.
>  
> added 2018-10-29: elaborate the problem
> {code:java}
> /**
>  * Compute the resources that would be used given a weight-to-resource ratio
>  * w2rRatio, for use in the computeFairShares algorithm as described in #
>  */
> private static int resourceUsedWithWeightToResourceRatio(double w2rRatio,
>     Collection<? extends Schedulable> schedulables, String type) {
>   int resourcesTaken = 0;
>   for (Schedulable sched : schedulables) {
>     int share = computeShare(sched, w2rRatio, type);
>     resourcesTaken += share;
>   }
>   return resourcesTaken;
> }
> {code}
> The variable resourcesTaken is an int. It accumulates the results of 
> computeShare(Schedulable sched, double w2rRatio, String type), each of which 
> is a value between the min share and max share of a queue.
> For example, when there are 3 queues, each with min share = max share = 
> Integer.MAX_VALUE, resourcesTaken will overflow the Integer range and become 
> a negative number.
> When resourceUsedWithWeightToResourceRatio(double w2rRatio, Collection<? 
> extends Schedulable> schedulables, String type) returns a negative number, 
> the loop in computeSharesInternal(), which holds the scheduler lock, may 
> never exit.
>  
> {code:java}
> // org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares
> while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
>     < totalResource) {
>   rMax *= 2.0;
> }
> {code}
> This may block the scheduling thread.
>  
>  
>  
>  
>  
>  
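A minimal sketch, assuming the same Schedulable/computeShare signatures quoted above, of one way to avoid the overflow (illustrative only, not necessarily the committed fix): accumulate in a long and clamp the return value so the comparison against totalResource in the doubling loop stays monotonic and the loop can terminate.
{code:java}
// Sketch only: accumulate in a long to avoid int overflow, then clamp.
private static int resourceUsedWithWeightToResourceRatio(double w2rRatio,
    Collection<? extends Schedulable> schedulables, String type) {
  long resourcesTaken = 0;
  for (Schedulable sched : schedulables) {
    // computeShare returns a value between the queue's min and max share.
    resourcesTaken += computeShare(sched, w2rRatio, type);
  }
  // Clamp so callers comparing against an int totalResource never see a
  // negative value caused by overflow.
  return (int) Math.min(resourcesTaken, Integer.MAX_VALUE);
}
{code}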






[jira] [Commented] (YARN-8953) Add CSI driver adaptor module

2018-11-14 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687604#comment-16687604
 ] 

Weiwei Yang commented on YARN-8953:
---

Hi [~sunilg]

Thanks for the comments.

#1 fixed

#2 fixed

#3 GetPluginInfoRequest is an abstract class.

#4 Yes, we can start with supporting one capability for now, but at the API 
level we should be compatible with CSI. I just fixed this.

#5 fixed

#6 Valid concern, I have the same feeling; let me fix this.

#7 No, YarnCsiAdaptor.proto will be compiled against protobuf-2.5.0; it 
defines the protocol used by the RM and the CSI adaptor. The only thing that 
will be compiled with protobuf 3 is the hadoop-yarn-csi module, because it 
ships the 3rd-party CSI proto.

#8 Are you saying we could just use the proto messages defined by 
yarn_csi_adaptor.proto in the abstract class 
{{ValidateVolumeCapabilitiesRequest}}? I thought this class should be 
self-contained, I am not sure...

#9 You are correct. hadoop-yarn-common/hadoop-yarn-api doesn't depend on 
hadoop-yarn-csi.

#10 {{adaptorServiceAddress}} is expected to be a remote address; this is the 
address of the adaptor, and the RM will need to talk to it. It can be deployed 
on any NM. We will have another config for the unix domain socket; that is a 
config per CSI driver and will be a local UDS file path. We will need to sort 
out these configs too.

#11 {{CsiAdaptorProtocolService}} will be started along with the NM; if it 
fails, the NM also fails to start. No other handling is needed as far as I can 
think of right now.

#12 When a new message is added, we need to add a transformer class. This 
class handles the message transformation from the YARN proto to the CSI proto. 
For pre-provisioned volumes, we probably need to add 2 or 3 more transformers 
to get the complete lifecycle managed. The reason it is in the hadoop-yarn-csi 
module is that it needs to talk with the CSI proto. In the future, if we 
stabilize the YARN volume API, then we only need to update the transformer 
code when the CSI proto updates; if the YARN volume API updates, we only need 
to update the PBImpl classes in hadoop-yarn-api and the corresponding 
transformer class.
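To illustrate the transformer pattern discussed in #12, here is a minimal, hypothetical sketch; all class and field names below are placeholders, not the actual hadoop-yarn-api or hadoop-yarn-csi types.
{code:java}
// Hypothetical illustration only: one small transformer per message that maps
// a YARN-side value object onto the CSI-side representation. If the CSI proto
// changes, only the toCsi() mapping changes; if the YARN volume API changes,
// only the YARN-side class and this mapping change.
public final class VolumeCapabilityTransformer {

  /** Placeholder for a YARN-side volume capability (hadoop-yarn-api side). */
  public static final class YarnVolumeCapability {
    final String accessMode;
    final String volumeType;
    public YarnVolumeCapability(String accessMode, String volumeType) {
      this.accessMode = accessMode;
      this.volumeType = volumeType;
    }
  }

  /** Placeholder for the CSI-side message (hadoop-yarn-csi side). */
  public static final class CsiVolumeCapability {
    final String accessMode;
    final String volumeType;
    public CsiVolumeCapability(String accessMode, String volumeType) {
      this.accessMode = accessMode;
      this.volumeType = volumeType;
    }
  }

  /** Map the YARN-side object onto the CSI-side representation. */
  public static CsiVolumeCapability toCsi(YarnVolumeCapability yarn) {
    return new CsiVolumeCapability(yarn.accessMode, yarn.volumeType);
  }

  private VolumeCapabilityTransformer() { }
}
{code}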

 

> Add CSI driver adaptor module
> -
>
> Key: YARN-8953
> URL: https://issues.apache.org/jira/browse/YARN-8953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8953.001.patch, YARN-8953.002.patch, 
> YARN-8953.003.patch, YARN-8953.004.patch, csi_adaptor_workflow.png
>
>
> CSI adaptor is a layer between YARN and a CSI driver; it transforms YARN 
> internal concepts and boxes them according to the CSI protocol, then forwards 
> the call to the CSI driver. The adaptor should support the controller, node, 
> and identity services.






[jira] [Commented] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687590#comment-16687590
 ] 

Rohith Sharma K S commented on YARN-8303:
-

Digging into more details, it appears that 
NMTimelinePublisher#publishContainerCreatedEvent 
{code}entityInfo.put(ContainerMetricsConstants.ALLOCATED_PRIORITY_INFO,
container.getPriority().toString());{code} is publishing a String value, 
which is the reason for the failure. So, let's not change the publisher; 
rather, let's change the converter!
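A minimal sketch of the converter-side handling suggested above, reading ALLOCATED_PRIORITY_INFO defensively so both Integer and String values work (illustrative only, not the actual patch; entityInfo is the entity info map from the timeline entity):
{code:java}
// Sketch only: tolerate both Integer and String representations of the
// allocated priority published into the entity info map.
Object priorityInfo =
    entityInfo.get(ContainerMetricsConstants.ALLOCATED_PRIORITY_INFO);
int allocatedPriority =
    priorityInfo == null ? 0 : Integer.parseInt(priorityInfo.toString());
{code}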

> YarnClient should contact TimelineReader for application/attempt/container 
> report
> -
>
> Key: YARN-8303
> URL: https://issues.apache.org/jira/browse/YARN-8303
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8303.001.patch, YARN-8303.002.patch, 
> YARN-8303.003.patch, YARN-8303.004.patch, YARN-8303.poc.patch
>
>
> YarnClient gets app/attempt/container information from the RM. If the RM 
> doesn't have it, the ahsClient is queried. When only ATSv2 is enabled, 
> yarnClient returns empty results. 
> Since YarnClient is used by many users, this results in empty 
> app/attempt/container reports. 
> The proposal is to add an adapter in the yarn client so that 
> app/attempt/container reports can be generated from AHSv2Client, which calls 
> the TimelineReader REST API, gets the entity, and converts it into an 
> app/attempt/container report.






[jira] [Assigned] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-5168:
---

Assignee: Xun Liu  (was: Eric Yang)

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Xun Liu
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Commented] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687572#comment-16687572
 ] 

Rohith Sharma K S commented on YARN-8303:
-

Thanks [~abmodi] for the patch! I see some issues while testing this patch!
# All entity info will be in String, so we need to parse the string to an int.
{code}
2018-11-15 12:25:35,354 WARN impl.YarnClientImpl: Got an error while fetching 
container report from ATSv2
java.lang.ClassCastException: java.lang.String cannot be cast to 
java.lang.Integer
at 
org.apache.hadoop.yarn.util.timeline.TimelineEntityV2Converter.convertToContainerReport(TimelineEntityV2Converter.java:97)
at 
org.apache.hadoop.yarn.client.api.impl.AHSv2ClientImpl.getContainers(AHSv2ClientImpl.java:142)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getContainerReportFromHistory(YarnClientImpl.java:922)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getContainers(YarnClientImpl.java:872)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.listContainers(ApplicationCLI.java:1244)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:487)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

It looks like a similar issue exists in all the other converters, which need 
to be re-examined based on the variable type.

> YarnClient should contact TimelineReader for application/attempt/container 
> report
> -
>
> Key: YARN-8303
> URL: https://issues.apache.org/jira/browse/YARN-8303
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8303.001.patch, YARN-8303.002.patch, 
> YARN-8303.003.patch, YARN-8303.004.patch, YARN-8303.poc.patch
>
>
> YarnClient gets app/attempt/container information from the RM. If the RM 
> doesn't have it, the ahsClient is queried. When only ATSv2 is enabled, 
> yarnClient returns empty results. 
> Since YarnClient is used by many users, this results in empty 
> app/attempt/container reports. 
> The proposal is to add an adapter in the yarn client so that 
> app/attempt/container reports can be generated from AHSv2Client, which calls 
> the TimelineReader REST API, gets the entity, and converts it into an 
> app/attempt/container report.






[jira] [Commented] (YARN-8925) Updating distributed node attributes only when necessary

2018-11-14 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687576#comment-16687576
 ] 

Weiwei Yang commented on YARN-8925:
---

[~Tao Yang], the UT failure looks related, can you please take a look?

> Updating distributed node attributes only when necessary
> 
>
> Key: YARN-8925
> URL: https://issues.apache.org/jira/browse/YARN-8925
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
>  Labels: performance
> Attachments: YARN-8925.001.patch, YARN-8925.002.patch, 
> YARN-8925.003.patch, YARN-8925.004.patch, YARN-8925.005.patch, 
> YARN-8925.006.patch
>
>
> Currently, if distributed node attributes exist, the update for distributed 
> node attributes happens in every heartbeat between the NM and the RM, even 
> when there is no change. The update process holds 
> NodeAttributesManagerImpl#writeLock and may have some impact in a large 
> cluster. We have found that the nodes UI of a large cluster opens slowly, and 
> most of the time it is waiting for the lock in NodeAttributesManagerImpl. I 
> think this update should be performed only when necessary, to improve the 
> performance of the related process.
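A minimal sketch of the idea in this description, with hypothetical names only (not the actual YARN-8925 patch): skip the manager call, and therefore the write lock, when the attributes reported in this heartbeat equal what was reported last time.
{code:java}
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch: attribute values are modeled as plain Strings purely
// for illustration; the real code works with NodeAttribute objects.
public class DistributedAttributesUpdater {
  private Set<String> lastReported = Collections.emptySet();

  public void onHeartbeat(Set<String> reported) {
    if (reported.equals(lastReported)) {
      return; // unchanged: skip the update and avoid the writeLock entirely
    }
    pushToAttributesManager(reported); // placeholder for the real manager call
    lastReported = reported;
  }

  private void pushToAttributesManager(Set<String> attributes) {
    // In the real code path this would call into NodeAttributesManagerImpl.
  }
}
{code}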






[jira] [Comment Edited] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687575#comment-16687575
 ] 

Eric Yang edited comment on YARN-5168 at 11/15/18 7:12 AM:
---

[~liuxun323] This JIRA is to aggregate the information and save it in 
ContainerStatus. The YARN service will automatically include this information 
in the yarn app -status call. [~Charo Zhang] is already working on YARN-8986 
to add -P to docker run. I assigned this JIRA to you.


was (Author: eyang):
[~liuxun323] This JIRA is to aggregate the information and save it in 
ContainerStatus. The YARN service will automatically include this information 
in the yarn app -status call. [~Charo Zhang] is already working on YARN-8986 
to add -P to docker run.

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Xun Liu
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687575#comment-16687575
 ] 

Eric Yang commented on YARN-5168:
-

[~liuxun323] This JIRA is to aggregate the information and save it in 
ContainerStatus. The YARN service will automatically include this information 
in the yarn app -status call. [~Charo Zhang] is already working on YARN-8986 
to add -P to docker run.

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Xun Liu
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Commented] (YARN-8925) Updating distributed node attributes only when necessary

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687574#comment-16687574
 ] 

Hadoop QA commented on YARN-8925:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
11s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 23s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 346 unchanged - 0 fixed = 347 total (was 346) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
50s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
41s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
43s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 26s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 15s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}230m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 

[jira] [Comment Edited] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687527#comment-16687527
 ] 

Xun Liu edited comment on YARN-5168 at 11/15/18 7:01 AM:
-

[~eyang]

Thank you for your reply. I agree with removing support for "-p port1:port2".
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When YARN starts Docker, will this "-P" parameter be added? This way we can 
expose the services inside the container. For example, we can add this inside 
the Dockerfile:
{code:java}
EXPOSE 1000
EXPOSE 2000
EXPOSE 3000
{code}
After launching Docker via YARN, we can get the external ports automatically 
assigned by the host:
{code:java}
$ docker run -d -P --name exposed-ports-in-dockerfile exposed-ports
63264dae9db85c5d667a37dac77e0da7c8d2d699f49b69ba992485242160ad3a
$ docker port exposed-ports-in-dockerfile
1000/tcp -> 0.0.0.0:49156
2000/tcp -> 0.0.0.0:49157
3000/tcp -> 0.0.0.0:49158
{code}
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When will this feature be added? I feel that waiting for the Hadoop 3.3 
release is a bit late. Can you assign this JIRA to me and let me contribute 
the patch? Thank you!

 


was (Author: liuxun323):
[~eyang]

Thank you for your reply. I agree with removing support for "-p port1:port2".

 
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When YARN starts Docker, will this "-P" parameter be added? This way we can 
expose the services inside the container. For example, we can add this inside 
the Dockerfile:
{code:java}
EXPOSE 1000
EXPOSE 2000
EXPOSE 3000
{code}
After launching Docker via YARN, we can get the external ports automatically 
assigned by the host:
{code:java}
$ docker run -d -P --name exposed-ports-in-dockerfile exposed-ports
63264dae9db85c5d667a37dac77e0da7c8d2d699f49b69ba992485242160ad3a
$ docker port exposed-ports-in-dockerfile
1000/tcp -> 0.0.0.0:49156
2000/tcp -> 0.0.0.0:49157
3000/tcp -> 0.0.0.0:49158
{code}
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When will this feature be added? I feel that waiting for the Hadoop 3.3 
release is a bit late. Can you assign this JIRA to me and let me contribute 
the patch? Thank you!

 

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Commented] (YARN-8953) Add CSI driver adaptor module

2018-11-14 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687561#comment-16687561
 ] 

Sunil Govindan commented on YARN-8953:
--

Hi [~cheersyang]

Thanks for working on this patch. A few comments.
 # In {{CsiAdaptorPB.java}}, the first line, which contains the package name, 
is not correct. It should come after the ASF license.
 # Missing javadocs in CsiAdaptorProtocol.
 # Does GetPluginInfoRequest need to be an abstract class?
 # It seems there could be various VolumeCapability instances for a volume. 
However, the ValidateVolumeCapabilitiesRequest ctor supports adding only one. 
If this is added for a long-term vision, could we also change the ctor in such 
a way that it takes a List with one item (List<VolumeCapability> 
volumeCapability)?
 # In {{ValidateVolumeCapabilitiesResponse}}, is it more like diagnostics or a 
return feedback which will be set from {{setMessage}}? I think it can be 
renamed to diag or response message, etc.
 # I think {{NM_CSI_ADAPTOR_ADDRESSES}} and {{NM_SIMPLE_CSI_ADAPTOR_ADDRESS}} 
are confusing names. Could we avoid one of them, or give a better name? SIMPLE 
looks like a default adaptor address.
 # Is YarnCsiAdaptor.proto going to be compiled against protobuf 3.0? If so, I 
think there is an ambiguity. This file is present in the same proto folder. 
Not sure how to categorize it for the future, where some protos belong to 3.0 
and some to 2.5. What do you think?
 # Since yarn_csi_adaptor.proto has the AccessMode and VolumeType enums, do we 
still need the same in ValidateVolumeCapabilitiesRequest?
 # Now hadoop-yarn-csi/pom.xml needs hadoop-yarn-common, hadoop-yarn-api, etc. 
Vice versa is not needed, correct? What I meant was, we might NOT need 
hadoop-yarn-common to depend on hadoop-yarn-csi, correct?
 # Can adaptorServiceAddress be a remote host address?
 # In CsiAdaptorProtocolService#serviceStart, is any exception handling needed 
due to the ops in YarnRPC?
 # Is any change needed for the transformer classes if we add a new field in 
the CSI protos later? Earlier, when we had a change in a proto, we usually 
changed the PBImpl classes and the respective converter class. Now the PBImpls 
are in the yarn-api package and the transformer code is in the csi module. Is 
any guide or change in code structure needed to ease this in the future?

> Add CSI driver adaptor module
> -
>
> Key: YARN-8953
> URL: https://issues.apache.org/jira/browse/YARN-8953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8953.001.patch, YARN-8953.002.patch, 
> YARN-8953.003.patch, YARN-8953.004.patch, csi_adaptor_workflow.png
>
>
> CSI adaptor is a layer between YARN and a CSI driver; it transforms YARN 
> internal concepts and boxes them according to the CSI protocol, then forwards 
> the call to the CSI driver. The adaptor should support the controller, node, 
> and identity services.






[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687558#comment-16687558
 ] 

Hadoop QA commented on YARN-8960:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 0 new + 61 unchanged - 1 fixed = 61 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8960 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948261/YARN-8960.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ab7dee708bc4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / df5e863 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22544/testReport/ |
| Max. process+thread count | 414 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22544/console |
| Powered by | Apache Yetus 0.8.0   

[jira] [Assigned] (YARN-8953) Add CSI driver adaptor module

2018-11-14 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan reassigned YARN-8953:


Assignee: Sunil Govindan  (was: Weiwei Yang)

> Add CSI driver adaptor module
> -
>
> Key: YARN-8953
> URL: https://issues.apache.org/jira/browse/YARN-8953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-8953.001.patch, YARN-8953.002.patch, 
> YARN-8953.003.patch, YARN-8953.004.patch, csi_adaptor_workflow.png
>
>
> CSI adaptor is a layer between YARN and a CSI driver; it transforms YARN 
> internal concepts and boxes them according to the CSI protocol, then forwards 
> the call to the CSI driver. The adaptor should support the controller, node, 
> and identity services.






[jira] [Assigned] (YARN-8953) Add CSI driver adaptor module

2018-11-14 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan reassigned YARN-8953:


Assignee: Weiwei Yang  (was: Sunil Govindan)

> Add CSI driver adaptor module
> -
>
> Key: YARN-8953
> URL: https://issues.apache.org/jira/browse/YARN-8953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8953.001.patch, YARN-8953.002.patch, 
> YARN-8953.003.patch, YARN-8953.004.patch, csi_adaptor_workflow.png
>
>
> CSI adaptor is a layer between YARN and a CSI driver; it transforms YARN 
> internal concepts and boxes them according to the CSI protocol, then forwards 
> the call to the CSI driver. The adaptor should support the controller, node, 
> and identity services.






[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687527#comment-16687527
 ] 

Xun Liu commented on YARN-5168:
---

[~eyang]

Thank you for your reply. I agree with removing support for "-p port1:port2".

 
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When YARN starts Docker, will this "-P" parameter be added? This way we can 
expose the services inside the container. For example, we can add this inside 
the Dockerfile:
EXPOSE 1000
EXPOSE 2000
EXPOSE 3000
After launching Docker via YARN, we can get the external ports automatically 
assigned by the host:
$ docker run -d -P --name exposed-ports-in-dockerfile exposed-ports
63264dae9db85c5d667a37dac77e0da7c8d2d699f49b69ba992485242160ad3a
$ docker port exposed-ports-in-dockerfile
1000/tcp -> 0.0.0.0:49156
2000/tcp -> 0.0.0.0:49157
3000/tcp -> 0.0.0.0:49158
 
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When will this feature be added? I feel that waiting for the Hadoop 3.3 
release is a bit late. Can you assign this JIRA to me and let me contribute 
the patch? Thank you!

 

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Comment Edited] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687527#comment-16687527
 ] 

Xun Liu edited comment on YARN-5168 at 11/15/18 6:11 AM:
-

[~eyang]

Thank you for your reply. I agree with removing support for "-p port1:port2".

 
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When YARN starts Docker, will this "-P" parameter be added? This way we can 
expose the services inside the container. For example, we can add this inside 
the Dockerfile:
{code:java}
EXPOSE 1000
EXPOSE 2000
EXPOSE 3000
{code}
After launching Docker via YARN, we can get the external ports automatically 
assigned by the host:
{code:java}
$ docker run -d -P --name exposed-ports-in-dockerfile exposed-ports
63264dae9db85c5d667a37dac77e0da7c8d2d699f49b69ba992485242160ad3a
$ docker port exposed-ports-in-dockerfile
1000/tcp -> 0.0.0.0:49156
2000/tcp -> 0.0.0.0:49157
3000/tcp -> 0.0.0.0:49158
{code}
{quote}1. Add "-P" to map the docker container's exposed ports automatically.
{quote}
When will this feature be added? I feel that waiting for the Hadoop 3.3 
release is a bit late. Can you assign this JIRA to me and let me contribute 
the patch? Thank you!

 


was (Author: liuxun323):
[~eyang]

Thank you for your reply, remove the support of "-p port1:port2", I think so.

 
{quote}1. Add "-P" to map docker container's exposed ports to automatically.
{quote}
When the yarn starts docker, will this "-P" parameter be added? This way we can 
expose the services inside the container, for example:

So we can add it inside the Dockerfile
EXPOSE 1000
EXPOSE 2000
EXPOSE 3000
After launching docker via YARN, We can get the external port automatically 
assigned by the host.
$ docker run -d -P --name exposed-ports-in-dockerfile exposed-ports
63264dae9db85c5d667a37dac77e0da7c8d2d699f49b69ba992485242160ad3a
$ docker port exposed-ports-in-dockerfile
1000/tcp -> 0.0.0.0:49156
2000/tcp -> 0.0.0.0:49157
3000/tcp -> 0.0.0.0:49158
 
{quote}1. Add "-P" to map docker container's exposed ports to automatically.
{quote}
When will this feature be added? I feel that waiting for hadoop3.3 to release 
is a bit late, Can you assign this JIRA to me and let me contribute the patch? 
thank you!

 

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses 
> the bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.






[jira] [Updated] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi updated YARN-8303:

Attachment: YARN-8303.004.patch

> YarnClient should contact TimelineReader for application/attempt/container 
> report
> -
>
> Key: YARN-8303
> URL: https://issues.apache.org/jira/browse/YARN-8303
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8303.001.patch, YARN-8303.002.patch, 
> YARN-8303.003.patch, YARN-8303.004.patch, YARN-8303.poc.patch
>
>
> YarnClient gets app/attempt/container information from the RM. If the RM 
> doesn't have it, the ahsClient is queried. When only ATSv2 is enabled, 
> yarnClient returns empty results. 
> Since YarnClient is used by many users, this results in empty 
> app/attempt/container reports. 
> The proposal is to add an adapter in the yarn client so that 
> app/attempt/container reports can be generated from AHSv2Client, which calls 
> the TimelineReader REST API, gets the entity, and converts it into an 
> app/attempt/container report.






[jira] [Commented] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687516#comment-16687516
 ] 

Eric Yang commented on YARN-8986:
-

[~Charo Zhang] {quote}"If you don’t want to preface the docker command with 
sudo, create a Unix group called docker and add users to it", it means we just 
need add YARN user who start NM java process to docker group, and we have done 
like this in our cluster. In another case,If we try to use /sys/fs/cgroup, we 
must grant YARN user the access to /sys/fs/cgroup/cpu,cpuacct, so we add YARN 
user do docker group is not giving too much power, it just can run docker 
command without sudo access.{quote}

Unfortunately, this is not an acceptable answer to the YARN community in 
general. The docker command can be abused via parameter hijacking to get into 
other people's containers or to cause damage at the host level. For example, 
using YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING=:88,22 --privileged can 
result in the --privileged flag being included in the parameters passed to 
docker run, if no additional validation is done.

This is the reason that container-executor does a lot of validation before 
invoking the docker commands that it crafts internally. This makes it harder 
to obtain full docker power by hacking the yarn user. For a secure cluster, 
the right approach to using cgroups is to create /sys/fs/cgroup/cpu/yarn with 
yarn user permission to modify only this subtree, to prevent the yarn user 
from damaging other programs' cgroup controls. We play by the rules that the 
Hadoop community set for us.

{quote}At same time, i am going to try performing docker operations in 
container-executor to make process of adding "-P".{quote}

Thanks for looking into doing this in container-executor.

> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> It's better to publish all exposed ports to random ports (-P) or to support 
> port mapping (-p) when using the bridge network for a docker container.
>  






[jira] [Updated] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: YARN-8960.007.patch

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, 
> YARN-8960.006.patch, YARN-8960.007.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack trace in the resourcemanager log is:
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definition, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}






[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687504#comment-16687504
 ] 

Eric Yang commented on YARN-5168:
-

[~liuxun323] {quote}What command or parameter should I use to tell YARN, Do I 
need to expose the service port in my container?
Exposing the specified port of the service should be specified at runtime. If 
the port mapping needs to use the EXPOSE command in the Dockerfile, it is too 
difficult to use.{quote}

If this is done right, specifying the bridge network will make YARN 
automatically expose the service ports. This would be the ideal solution. The 
EXPOSE directive in the Dockerfile is the easiest way to ensure we don't have 
to track additional metadata or create redundant config that may have implicit 
conflicts elsewhere.

I cannot think of a reasonable syntax for the user to fill in the yarnfile to 
express service port routes at large scale. It would be easier to use internal 
ports to communicate with peers, and only keep a user interface such as a web 
port exposed to the outside, where the user can click through a hyperlink. 
Hence, the only reasonable solution is to automatically add "-P" when the 
bridge network is specified along with an EXPOSE directive in the Dockerfile. 
This JIRA only aggregates the service port information and displays it on the 
UI for the user to click on.
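A minimal sketch of that rule, with illustrative names only (this is not the actual DockerLinuxContainerRuntime or container-executor code): publish all EXPOSEd ports by appending "-P" only when the container runs on the bridge network.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class BridgePortPublishing {
  /** Build docker run arguments; publish all EXPOSEd ports only on bridge. */
  public static List<String> dockerRunArgs(String network, String image) {
    List<String> args =
        new ArrayList<>(Arrays.asList("docker", "run", "-d", "--net=" + network));
    if ("bridge".equals(network)) {
      args.add("-P"); // map every EXPOSEd port to a random host port
    }
    args.add(image);
    return args;
  }

  private BridgePortPublishing() { }
}
{code}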

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need support port mapping when docker container uses bridge 
> network.
> The following problems are what we faced:
> 1. Add "-P" to map docker container's exposed ports to automatically.
> 2. Add "-p" to let user specify specific ports to map.
> 3. Add service registry support for bridge network case, then app could find 
> each other. It could be done out of YARN, however it might be more convenient 
> to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687499#comment-16687499
 ] 

Hadoop QA commented on YARN-8960:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 1 new + 61 unchanged - 1 fixed = 62 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8960 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948254/YARN-8960.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5bf2a0b9f5c6 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / df5e863 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22542/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22542/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
U: 

[jira] [Commented] (YARN-8937) TestLeaderElectorService hangs

2018-11-14 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687492#comment-16687492
 ] 

Akira Ajisaka commented on YARN-8937:
-

The situation has changed in CURATOR-409. Now I'm trying 
[https://github.com/risdenk/curator/tree/test-CURATOR-409-2.x] 

> TestLeaderElectorService hangs
> --
>
> Key: YARN-8937
> URL: https://issues.apache.org/jira/browse/YARN-8937
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Priority: Major
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to 
> start and eventually gets killed by the surefire timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687457#comment-16687457
 ] 

Zac Zhou edited comment on YARN-8960 at 11/15/18 4:18 AM:
--

Add a parameter named distribute_keytab, which can be used to specify whether 
to distribute the local keytab across the cluster. 

A submarine job can be submitted like this:
{code:java}
./yarn jar 
/home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
--env DOCKER_JAVA_HOME=/opt/java \
--env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
--env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
--worker_docker_image 0.0.0.0:5000/gpu-cuda9.0-tf1.8.0-with-models \
--input_path hdfs://mldev/tmp/cifar-10-data \
--checkpoint_path hdfs://mldev/user/hadoop/tf-distributed-checkpoint \
--num_ps 1 \
--ps_resources memory=4G,vcores=2,gpu=0 \
--ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --num-gpus=0" \
--ps_docker_image 0.0.0.0:5000/dockerfile-cpu-tf1.8.0-with-models \
--worker_resources memory=4G,vcores=2,gpu=1 --verbose \
--num_workers 2 \
--worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --train-steps=500 
--eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1" \
--keytab /tmp/keytabs/hadoop.keytab \
--principal hadoop/ad...@corp.com \
--distribute_keytab{code}
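
Once the job is submitted with the keytab and principal, the status query from 
the issue description should work again. A quick sketch (the service name is 
the --name value used in the submission above):
{code}
# Query the status of the submitted submarine service.
yarn app -status distributed-tf-gpu
{code}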




 

 


was (Author: yuan_zac):
Add a parameter, named distribute_keytab, which can be used to specify whether 
to distribute local keytab across the cluster. 

A submarine job can be submitted like this:

./yarn jar 
/home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 --env DOCKER_JAVA_HOME=/opt/java \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
 --env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
 --worker_docker_image 0.0.0.0:5000/gpu-cuda9.0-tf1.8.0-with-models \
 --input_path hdfs://mldev/tmp/cifar-10-data \
 --checkpoint_path hdfs://mldev/user/hadoop/tf-distributed-checkpoint \
 --num_ps 1 \
 --ps_resources memory=4G,vcores=2,gpu=0 \
 --ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --num-gpus=0" \
 --ps_docker_image 0.0.0.0:5000/dockerfile-cpu-tf1.8.0-with-models \
 --worker_resources memory=4G,vcores=2,gpu=1 --verbose \
 --num_workers 2 \
 --worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --train-steps=500 
--eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1" \
 --keytab /tmp/keytabs/hadoop.keytab \
 --principal hadoop/ad...@corp.com \
 --distribute_keytab

 

 

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, 
> YARN-8960.006.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> 

[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687457#comment-16687457
 ] 

Zac Zhou commented on YARN-8960:


Add a parameter named distribute_keytab, which can be used to specify whether 
to distribute the local keytab across the cluster. 

A submarine job can be submitted like this:

./yarn jar 
/home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 --env DOCKER_JAVA_HOME=/opt/java \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
 --env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
 --worker_docker_image 0.0.0.0:5000/gpu-cuda9.0-tf1.8.0-with-models \
 --input_path hdfs://mldev/tmp/cifar-10-data \
 --checkpoint_path hdfs://mldev/user/hadoop/tf-distributed-checkpoint \
 --num_ps 1 \
 --ps_resources memory=4G,vcores=2,gpu=0 \
 --ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --num-gpus=0" \
 --ps_docker_image 0.0.0.0:5000/dockerfile-cpu-tf1.8.0-with-models \
 --worker_resources memory=4G,vcores=2,gpu=1 --verbose \
 --num_workers 2 \
 --worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py 
--data-dir=hdfs://mldev/tmp/cifar-10-data 
--job-dir=hdfs://mldev/tmp/cifar-10-jobdir --train-steps=500 
--eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1" \
 --keytab /tmp/keytabs/hadoop.keytab \
 --principal hadoop/ad...@corp.com \
 --distribute_keytab

 

 

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, 
> YARN-8960.006.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: YARN-8960.006.patch

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, 
> YARN-8960.006.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: YARN-8960.006.patch

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: (was: YARN-8960.006.patch)

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8925) Updating distributed node attributes only when necessary

2018-11-14 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687430#comment-16687430
 ] 

Tao Yang commented on YARN-8925:


Thanks [~cheersyang] for your suggestion.
Attached v6 patch to fix "Variable 'xxx' must be private..." checkstyle 
warnings.

> Updating distributed node attributes only when necessary
> 
>
> Key: YARN-8925
> URL: https://issues.apache.org/jira/browse/YARN-8925
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
>  Labels: performance
> Attachments: YARN-8925.001.patch, YARN-8925.002.patch, 
> YARN-8925.003.patch, YARN-8925.004.patch, YARN-8925.005.patch, 
> YARN-8925.006.patch
>
>
> Currently if distributed node attributes exist, even though there is no 
> change, updating for distributed node attributes will happen in every 
> heartbeat between NM and RM. Updating process will hold 
> NodeAttributesManagerImpl#writeLock and may have some influence in a large 
> cluster. We have found nodes UI of a large cluster is opened slowly and most 
> time it's waiting for the lock in NodeAttributesManagerImpl. I think this 
> updating should be called only when necessary to enhance the performance of 
> related process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8925) Updating distributed node attributes only when necessary

2018-11-14 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8925:
---
Attachment: YARN-8925.006.patch

> Updating distributed node attributes only when necessary
> 
>
> Key: YARN-8925
> URL: https://issues.apache.org/jira/browse/YARN-8925
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
>  Labels: performance
> Attachments: YARN-8925.001.patch, YARN-8925.002.patch, 
> YARN-8925.003.patch, YARN-8925.004.patch, YARN-8925.005.patch, 
> YARN-8925.006.patch
>
>
> Currently if distributed node attributes exist, even though there is no 
> change, updating for distributed node attributes will happen in every 
> heartbeat between NM and RM. Updating process will hold 
> NodeAttributesManagerImpl#writeLock and may have some influence in a large 
> cluster. We have found nodes UI of a large cluster is opened slowly and most 
> time it's waiting for the lock in NodeAttributesManagerImpl. I think this 
> updating should be called only when necessary to enhance the performance of 
> related process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework

2018-11-14 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-8881:
---
Attachment: YARN-8881-trunk.010.patch

> Phase 1 - Add basic pluggable device plugin framework
> -
>
> Key: YARN-8881
> URL: https://issues.apache.org/jira/browse/YARN-8881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, 
> YARN-8881-trunk.009.patch, YARN-8881-trunk.010.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement and the 
> adapter to decouple plugin and YARN internals. And the vendor device resource 
> discovery will be ready after this support



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687426#comment-16687426
 ] 

Xun Liu commented on YARN-5168:
---

[~eyang]

I saw [YARN-8569|https://issues.apache.org/jira/browse/YARN-8569]. If I can get 
the container's internal port and the port mapping information allocated on the 
host through service.json, that would also meet my needs.

My new question is:
{quote}We probably want to drop support for adhoc "-p" 
{quote}
What command or parameter should I use to tell YARN that I need to expose a 
service port in my container?
Exposing a specific service port should be something specified at runtime; if 
the port mapping has to rely on the EXPOSE command in the Dockerfile, it is too 
difficult to use.

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need support port mapping when docker container uses bridge 
> network.
> The following problems are what we faced:
> 1. Add "-P" to map docker container's exposed ports to automatically.
> 2. Add "-p" to let user specify specific ports to map.
> 3. Add service registry support for bridge network case, then app could find 
> each other. It could be done out of YARN, however it might be more convenient 
> to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework

2018-11-14 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687406#comment-16687406
 ] 

Zhankun Tang edited comment on YARN-8881 at 11/15/18 2:28 AM:
--

[~csingh], Thanks for the review!
{quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} 
via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required?
{quote}
Zhankun => Good question. isAssignableFrom only checks whether the interface is 
the plugin class's parent. But it doesn't ensure that the methods actually 
implemented in the plugin class match what YARN needs. For instance, 
"FakeTestDevicePlugin4" implements a method "getOldRegisterRequestInfo" (it 
should be "getRegisterRequestInfo") but can still be loaded through reflection, 
and without this check it would only throw an error once 
_plugin.getRegisterRequestInfo_ is called. This could happen because the plugin 
depends on an outdated NM API, or for some other reason we don't know.

Here _checkInterfaceCompatibility_ does a basic fail-fast check to ensure that 
the methods invoked by YARN exist. We'll add more sanity checks later to ensure 
that the plugin's return values are correct, the implemented methods are 
stateless, etc.
{quote}DevicePluginAdapter -> resourceName and devicePlugin can be final.
 LOG -> can be private.
{quote}
Zhankun => Thanks. Will fix it
{quote}The braces {} don't need to be surrounded by quotes. It is surrounded by 
quotes at quite a few places.
{quote}
Zhankun => Yeah. Will remove all quotes.
{quote}serialVersionUID is 1 for all the serializable classes. Use a time based 
or random large number.
{quote}
Zhankun => Yeah. Will change them.


was (Author: tangzhankun):
[~csingh], Thanks for the review!
{quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} 
via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required?
{quote}
Zhankun => Good question. isAssiginableFrom only check whether the interface is 
the plugin class's parent. But it doesn't ensure the real methods implemented 
in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" 
implement a method "getOldRegisterRequestInfo"(should be 
"getRegisterRequestInfo") but can also be reflected and will throw an error 
until _plugin.getRegisterRequestInfo_ is called. This could happen due to the 
plugin depend on an outdated NM API or some reason we don't know.

Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods 
invoked by YARN exists. We'll add more sanity-check later to ensure plugin that 
its return value is correct, the method implemented is stateless .etc.
{quote}DevicePluginAdapter -> resourceName and devicePlugin can be final.
 LOG -> can be private.
{quote}
Zhankun => Thanks. Will fix it
{quote}The braces {} don't need to be surrounded by quotes. It is surrounded by 
quotes at quite a few places.
{quote}
Zhankun => Yeah. Will remove all quotes.
{quote}serialVersionUID is 1 for all the serializable classes. Use a time based 
or random large number.
{quote}
Zhankun => Yeah. Will change them.

> Phase 1 - Add basic pluggable device plugin framework
> -
>
> Key: YARN-8881
> URL: https://issues.apache.org/jira/browse/YARN-8881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, 
> YARN-8881-trunk.009.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement and the 
> adapter to decouple plugin and YARN internals. And the vendor device resource 
> discovery will be ready after this support



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework

2018-11-14 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687406#comment-16687406
 ] 

Zhankun Tang edited comment on YARN-8881 at 11/15/18 2:27 AM:
--

[~csingh], Thanks for the review!
{quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} 
via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required?
{quote}
Zhankun => Good question. isAssignableFrom only checks whether the interface is 
the plugin class's parent. But it doesn't ensure that the methods actually 
implemented in the plugin class match what YARN needs. For instance, 
"FakeTestDevicePlugin4" implements a method "getOldRegisterRequestInfo" (it 
should be "getRegisterRequestInfo") but can still be loaded through reflection, 
and would only throw an error once _plugin.getRegisterRequestInfo_ is called. 
This could happen because the plugin depends on an outdated NM API, or for some 
other reason we don't know.

Here _checkInterfaceCompatibility_ does a basic fail-fast check to ensure that 
the methods invoked by YARN exist. We'll add more sanity checks later to ensure 
that the plugin's return values are correct, the implemented methods are 
stateless, etc.
{quote}DevicePluginAdapter -> resourceName and devicePlugin can be final.
 LOG -> can be private.
{quote}
Zhankun => Thanks. Will fix it
{quote}The braces {} don't need to be surrounded by quotes. It is surrounded by 
quotes at quite a few places.
{quote}
Zhankun => Yeah. Will remove all quotes.
{quote}serialVersionUID is 1 for all the serializable classes. Use a time based 
or random large number.
{quote}
Zhankun => Yeah. Will change them.


was (Author: tangzhankun):
[~csingh], Thanks for the review!
{quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} 
via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required?
{quote}
Zhankun => Good question. isAssiginableFrom only check whether the interface is 
the plugin class's parent. But it doesn't ensure the real methods implemented 
in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" 
doesn't implement a method "getOldRegisterRequestInfo" but can also be 
reflected and will throw an error until _plugin.getRegisterRequestInfo_ is 
called. This could happen due to the plugin depend on an outdated NM API or 
something we don't know.

Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods 
invoked by YARN exists. We'll add more sanity-check later to ensure plugin that 
its return value is correct, the method implemented is stateless .etc.
{quote}DevicePluginAdapter -> resourceName and devicePlugin can be final.
 LOG -> can be private.
{quote}
Zhankun => Thanks. Will fix it
{quote}The braces {} don't need to be surrounded by quotes. It is surrounded by 
quotes at quite a few places.
{quote}
Zhankun => Yeah. Will remove all quotes.
{quote}serialVersionUID is 1 for all the serializable classes. Use a time based 
or random large number.
{quote}
Zhankun => Yeah. Will change them.

> Phase 1 - Add basic pluggable device plugin framework
> -
>
> Key: YARN-8881
> URL: https://issues.apache.org/jira/browse/YARN-8881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, 
> YARN-8881-trunk.009.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement and the 
> adapter to decouple plugin and YARN internals. And the vendor device resource 
> discovery will be ready after this support



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687411#comment-16687411
 ] 

Hadoop QA commented on YARN-8917:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 28s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8917 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12944836/YARN-8917.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e45d1da8920d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 21ec4bd |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22540/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22540/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework

2018-11-14 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687406#comment-16687406
 ] 

Zhankun Tang commented on YARN-8881:


[~csingh], Thanks for the review!
{quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} 
via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required?
{quote}
Zhankun => Good question. isAssignableFrom only checks whether the interface is 
the plugin class's parent. But it doesn't ensure that the methods actually 
implemented in the plugin class match what YARN needs. For instance, 
"FakeTestDevicePlugin4" doesn't implement a method "getOldRegisterRequestInfo" 
but can still be loaded through reflection, and would only throw an error once 
_plugin.getRegisterRequestInfo_ is called. This could happen because the plugin 
depends on an outdated NM API or something we don't know.

Here _checkInterfaceCompatibility_ does a basic fail-fast check to ensure that 
the methods invoked by YARN exist. We'll add more sanity checks later to ensure 
that the plugin's return values are correct, the implemented methods are 
stateless, etc.
{quote}DevicePluginAdapter -> resourceName and devicePlugin can be final.
 LOG -> can be private.
{quote}
Zhankun => Thanks. Will fix it
{quote}The braces {} don't need to be surrounded by quotes. It is surrounded by 
quotes at quite a few places.
{quote}
Zhankun => Yeah. Will remove all quotes.
{quote}serialVersionUID is 1 for all the serializable classes. Use a time based 
or random large number.
{quote}
Zhankun => Yeah. Will change them.

> Phase 1 - Add basic pluggable device plugin framework
> -
>
> Key: YARN-8881
> URL: https://issues.apache.org/jira/browse/YARN-8881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, 
> YARN-8881-trunk.009.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement and the 
> adapter to decouple plugin and YARN internals. And the vendor device resource 
> discovery will be ready after this support



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Charo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687402#comment-16687402
 ] 

Charo Zhang commented on YARN-8986:
---

[~eyang]
1. YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING supports binding to a specific 
host IP in the uploaded patches, for example:
-shell_env 
YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING=127.0.0.1:8080:80,1234:1234,:88,:
which is the same as "docker run -p 127.0.0.1:8080:80 -p 1234:1234 -p :88 -p :"

2. "If you don’t want to preface the docker command with sudo, create a Unix 
group called docker and add users to it" means we only need to add the YARN 
user that starts the NM java process to the docker group, and we have already 
done this in our cluster. Similarly, if we try to use /sys/fs/cgroup, we must 
grant the YARN user access to /sys/fs/cgroup/cpu,cpuacct, so adding the YARN 
user to the docker group is not giving it too much power; it only lets it run 
docker commands without sudo access.

At the same time, I am going to try performing the docker operations in 
container-executor to handle adding "-P".
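
To make the mapping concrete, here is a rough sketch of a submission that uses 
the proposed env variable; the distributed shell invocation, jar path, image, 
and port numbers are illustrative only and depend on the uploaded patches being 
applied:
{code}
# Sketch: run a docker container on a bridge network with explicit port
# mappings via the env variable proposed in the uploaded patches.
yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar ${DSHELL_JAR} \
  -shell_command "sleep 3600" \
  -num_containers 1 \
  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=nginx:latest \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=bridge \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING=127.0.0.1:8080:80,1234:1234

# Intended docker port flags generated for the container:
#   -p 127.0.0.1:8080:80 -p 1234:1234
{code}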


> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> it's better to publish all exposed ports to random ports(-P) or support port 
> mapping(-p) for bridge network when using bridge network for docker container.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4404) Typo in comment in SchedulerUtils

2018-11-14 Thread Dinesh Chitlangia (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687367#comment-16687367
 ] 

Dinesh Chitlangia commented on YARN-4404:
-

+1 LGTM

> Typo in comment in SchedulerUtils
> -
>
> Key: YARN-4404
> URL: https://issues.apache.org/jira/browse/YARN-4404
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Yesha Vora
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-4404.001.patch
>
>
> The comment starting on line 254 says:
> {code}
>   /**
>* Utility method to validate a resource request, by insuring that the
>* requested memory/vcore is non-negative and not greater than max
>* 
>* @throws InvalidResourceRequestException when there is invalid request
>*/
> {code}
> "Insuring" should be "ensuring."



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource

2018-11-14 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8917:
-
Target Version/s: 3.2.1

> Absolute (maximum) capacity of level3+ queues is wrongly calculated for 
> absolute resource
> -
>
> Key: YARN-8917
> URL: https://issues.apache.org/jira/browse/YARN-8917
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Attachments: YARN-8917.001.patch, YARN-8917.002.patch
>
>
> Absolute capacity should be equal to multiply capacity by parent-queue's 
> absolute-capacity,
> but currently it's calculated as dividing capacity by parent-queue's 
> absolute-capacity.
> Calculation for absolute-maximum-capacity has the same problem.
> For example: 
> root.a   capacity=0.4   maximum-capacity=0.8
> root.a.a1   capacity=0.5  maximum-capacity=0.6
> Absolute capacity of root.a.a1 should be 0.2 but is wrongly calculated as 1.25
> Absolute maximum capacity of root.a.a1 should be 0.48 but is wrongly 
> calculated as 0.75
> Moreover:
> {{childQueue.getQueueCapacities().getCapacity()}}  should be changed to 
> {{childQueue.getQueueCapacities().getCapacity(label)}} to avoid getting wrong 
> capacity from default partition when calculating for a non-default partition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource

2018-11-14 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8917:
-
Priority: Critical  (was: Major)

> Absolute (maximum) capacity of level3+ queues is wrongly calculated for 
> absolute resource
> -
>
> Key: YARN-8917
> URL: https://issues.apache.org/jira/browse/YARN-8917
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Attachments: YARN-8917.001.patch, YARN-8917.002.patch
>
>
> Absolute capacity should be equal to multiply capacity by parent-queue's 
> absolute-capacity,
> but currently it's calculated as dividing capacity by parent-queue's 
> absolute-capacity.
> Calculation for absolute-maximum-capacity has the same problem.
> For example: 
> root.a   capacity=0.4   maximum-capacity=0.8
> root.a.a1   capacity=0.5  maximum-capacity=0.6
> Absolute capacity of root.a.a1 should be 0.2 but is wrongly calculated as 1.25
> Absolute maximum capacity of root.a.a1 should be 0.48 but is wrongly 
> calculated as 0.75
> Moreover:
> {{childQueue.getQueueCapacities().getCapacity()}}  should be changed to 
> {{childQueue.getQueueCapacities().getCapacity(label)}} to avoid getting wrong 
> capacity from default partition when calculating for a non-default partition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7512) Support service upgrade via YARN Service API and CLI

2018-11-14 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7512:

Release Note: 
Support upgrade of a Yarn Service (long running application):
1. In place upgrade of service instances.
2. Option to upgrade all the instances of a component and multiple components.
3. API to get instances with the following filter options- component name, 
state of the instance, version.
4. Option to perform express upgrade of the service.
5. Option to cancel an ongoing upgrade.


  was:
Support for upgrade of a Yarn Service (long running application):
1. In place upgrade of service instances.
2. Option to upgrade all the instances of a component and multiple components.
3. API to get instances with the following filter options- component name, 
state of the instance, version.
4. Option to perform express upgrade of the service.
5. Option to cancel an ongoing upgrade.



> Support service upgrade via YARN Service API and CLI
> 
>
> Key: YARN-7512
> URL: https://issues.apache.org/jira/browse/YARN-7512
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-native-services
>Reporter: Gour Saha
>Assignee: Chandni Singh
>Priority: Major
> Attachments: _In-Place Upgrade of Long-Running Applications in 
> YARN_v1.pdf, _In-Place Upgrade of Long-Running Applications in YARN_v2.pdf, 
> _In-Place Upgrade of Long-Running Applications in YARN_v3.pdf
>
>
> YARN Service API and CLI needs to support service (and containers) upgrade in 
> line with what Slider supported in SLIDER-787 
> (http://slider.incubator.apache.org/docs/slider_specs/application_pkg_upgrade.html)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7512) Support service upgrade via YARN Service API and CLI

2018-11-14 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7512:

Release Note: 
Support for upgrade of a Yarn Service (long running application):
1. In place upgrade of service instances.
2. Option to upgrade all the instances of a component and multiple components.
3. API to get instances with the following filter options- component name, 
state of the instance, version.
4. Option to perform express upgrade of the service.
5. Option to cancel an ongoing upgrade.


  was:
Further functionality support for Long Running Services in YARN includes:
1. In place service (and containers) upgrade
2. Option to cancel an ongoing upgrade.


> Support service upgrade via YARN Service API and CLI
> 
>
> Key: YARN-7512
> URL: https://issues.apache.org/jira/browse/YARN-7512
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-native-services
>Reporter: Gour Saha
>Assignee: Chandni Singh
>Priority: Major
> Attachments: _In-Place Upgrade of Long-Running Applications in 
> YARN_v1.pdf, _In-Place Upgrade of Long-Running Applications in YARN_v2.pdf, 
> _In-Place Upgrade of Long-Running Applications in YARN_v3.pdf
>
>
> YARN Service API and CLI needs to support service (and containers) upgrade in 
> line with what Slider supported in SLIDER-787 
> (http://slider.incubator.apache.org/docs/slider_specs/application_pkg_upgrade.html)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8981) Virtual IP address support

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687237#comment-16687237
 ] 

Eric Yang commented on YARN-8981:
-

@Ruslan Dautkhanov I think this is already supported in YARN.  [YARN 
service|https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/Overview.html]
 supports [Service 
Discovery|https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/ServiceDiscovery.html].
  The container hostname is encoded using this format:
{code}
[component-name]-[instance-number].[application-name].[user].[domain]
{code}

An example is:
{code}
httpd-0.myapp.john.example.com
{code}

The IP address is automatically discovered and mapped to the pre-defined 
hostname.  This supports the virtual IP use case, where the container may move 
to another host and the IP address of the DNS entry is updated within seconds.  
Hence, the end user only needs to know the hostname and is still able to access 
the application.
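
A minimal sketch of resolving such a hostname from a client, assuming the 
client's resolver forwards the service domain to the YARN Registry DNS server 
(the hostname is the hypothetical example above):
{code}
import java.net.InetAddress;

public class ResolveServiceContainer {
  public static void main(String[] args) throws Exception {
    // Resolves to the current IP of the container, wherever it is running.
    InetAddress addr = InetAddress.getByName("httpd-0.myapp.john.example.com");
    System.out.println(addr.getHostAddress());
  }
}
{code}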

> Virtual IP address support
> --
>
> Key: YARN-8981
> URL: https://issues.apache.org/jira/browse/YARN-8981
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 3.0.2, 3.1.1
>Reporter: Ruslan Dautkhanov
>Priority: Major
>  Labels: DNS, Docker, docker, service, service-engine, 
> service-orchestration, virtual_hosts
>
> I couldn't find support for virtual IP addresses in the YARN framework. 
> This would be great if we have a docker-on-yarn service: if it has to be 
> failed over to another physical host, for example, clients can still find it. 
> So the idea is for YARN to bring up that virtual IP address (an 
> additional/secondary IP address ) on a physical host where that particular 
> docker container is running, so the clients that use that container's 
> services don't have to change connection details every time that container 
> moves around in YARN cluster.
> Similarly to virtual IP addresses in Kubernetes world:
> [https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies]
> One implementation could be through `ip address add` \ `ip address remove`.
> Kubernetes uses a more complicated `kube-proxy`, similar to the `docker-proxy` 
> process in pure Docker / non-Kubernetes docker deployments. 
> Another approach is running a separate DNS service for a DNS subdomain (the 
> main DNS server would have to forward all requests for that DNS subdomain to 
> a YARN DNS service). In Oracle Clusterware a similar process is called GNS: 
> https://docs.oracle.com/en/database/oracle/oracle-database/12.2/cwsol/about-the-grid-naming-service-vip-address.html#GUID-A4EE0CC6-A5F1-4507-82D6-D5C43E0F1584
> It would be great to have support for either virtual IP addresses managed by 
> YARN directly or something similar to Oracle's GNS DNS service.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework

2018-11-14 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687244#comment-16687244
 ] 

Chandni Singh commented on YARN-8881:
-

Hi [~tangzhankun], Thanks for the patch. I have some questions/comments

1.
{code}
if (!DevicePlugin.class.isAssignableFrom(pluginClazz)) {
  throw new YarnRuntimeException("Class: " + pluginClassName
      + " not instance of " + DevicePlugin.class.getCanonicalName());
}
// sanity-check before initialization
checkInterfaceCompatibility(DevicePlugin.class, pluginClazz);
{code}
It checks whether pluginClazz is an implementation of {{DevicePlugin}} via 
{{isAssignableFrom}}. Why is {{checkInterfaceCompatibility}} required as well? 

2. Nitpick: in DevicePluginAdapter, resourceName and devicePlugin can be final, 
   and LOG can be private (rough sketches for points 2-4 follow after point 4).
{code}
  final static Log LOG = LogFactory.getLog(DevicePluginAdapter.class);

  private String resourceName;
  private DevicePlugin devicePlugin;
{code}

3.
{code}
LOG.debug("Checking implemented interface's compatibility: \"{}\"",
    expectedClass.getSimpleName());
{code}
  The braces {} don't need to be surrounded by quotes. They are surrounded by 
quotes in quite a few places.

4. serialVersionUID is 1 for all the serializable classes. Use a time-based or 
random large number.
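
Rough sketches of the changes suggested in points 2-4, using the class and 
field names from the snippets above (the serialVersionUID value below is an 
arbitrary example):

Point 2, narrowing visibility and making the fields immutable:
{code}
private static final Log LOG =
    LogFactory.getLog(DevicePluginAdapter.class);

private final String resourceName;
private final DevicePlugin devicePlugin;
{code}

Point 3, dropping the quotes around the {} placeholder:
{code}
LOG.debug("Checking implemented interface's compatibility: {}",
    expectedClass.getSimpleName());
{code}

Point 4, using a large (e.g. time-based) value instead of 1:
{code}
private static final long serialVersionUID = 1542230400123L;
{code}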

> Phase 1 - Add basic pluggable device plugin framework
> -
>
> Key: YARN-8881
> URL: https://issues.apache.org/jira/browse/YARN-8881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, 
> YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, 
> YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, 
> YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, 
> YARN-8881-trunk.009.patch
>
>
> It includes adding support in "ResourcePluginManager" to load plugin classes 
> based on configuration, an interface for the vendor to implement, and an 
> adapter to decouple the plugin from YARN internals. The vendor device resource 
> discovery will be ready after this support is in place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse

2018-11-14 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687190#comment-16687190
 ] 

Subru Krishnan commented on YARN-8898:
--

{quote}Unfortunately we didn't write 
ApplicationHomeSubCluster.getProto.getBytes to znode{quote}

Thanks [~bibinchundatt] for bringing this to my attention. The intention was to 
persist _ApplicationHomeSubCluster_ and that's why it was defined as a proto 
object in the first place.

So I feel it might be better to fix it, since at least the API is correct: I 
mean, add the trimmed _ApplicationSubmissionContext_ to 
_ApplicationHomeSubCluster_ and persist the entire _ApplicationHomeSubCluster_ 
in _ZK_.

For SQL, it's adding a new column so it should be safe as well.
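
A rough sketch of persisting the record's proto bytes to a znode, assuming a 
CuratorFramework client and that the record exposes getProto() as quoted above 
(variable names are hypothetical):
{code}
byte[] data = appHomeSubCluster.getProto().toByteArray();
if (curator.checkExists().forPath(znodePath) == null) {
  // First write for this application: create the znode with the record bytes.
  curator.create().creatingParentsIfNeeded().forPath(znodePath, data);
} else {
  // Subsequent writes overwrite the persisted record.
  curator.setData().forPath(znodePath, data);
}
{code}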

 

> Fix FederationInterceptor#allocate to set application priority in 
> allocateResponse
> --
>
> Key: YARN-8898
> URL: https://issues.apache.org/jira/browse/YARN-8898
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8898.wip.patch
>
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in 
> the returned response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9009) Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687131#comment-16687131
 ] 

Hadoop QA commented on YARN-9009:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
23s{color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9009 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948206/YARN-9009-trunk-001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 20479278c8d6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22539/testReport/ |
| Max. process+thread count | 414 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22539/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Fix flaky test 

[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2018-11-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687103#comment-16687103
 ] 

Hudson commented on YARN-8672:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15427 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15427/])
YARN-8672.  Improve token filename management for localization.  
(eyang: rev 21ec4bdaef4b68adbbf4f33a6f74494c074f803c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDefaultContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/WindowsSecureContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerRelaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java


> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-8672.001.patch, YARN-8672.002.patch, 
> YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch, 
> YARN-8672.006.patch, YARN-8672.007.patch, YARN-8672.008.patch
>
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687070#comment-16687070
 ] 

Hadoop QA commented on YARN-8917:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 39s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}157m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8917 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12944836/YARN-8917.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 718c9a0d28cc 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22538/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22538/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 

[jira] [Updated] (YARN-9009) Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs

2018-11-14 Thread OrDTesters (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

OrDTesters updated YARN-9009:
-
Attachment: (was: YARN-9009-trunk-001.patch)

> Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs
> ---
>
> Key: YARN-9009
> URL: https://issues.apache.org/jira/browse/YARN-9009
> Project: Hadoop YARN
>  Issue Type: Bug
> Environment: Ubuntu 18.04
> java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>  
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-17T13:33:14-05:00)
>Reporter: OrDTesters
>Priority: Minor
> Attachments: YARN-9009-trunk-001.patch
>
>
> In TestEntityGroupFSTimelineStore, testCleanLogs fails when run after 
> testMoveToDone.
> testCleanLogs fails because testMoveToDone moves a file into the same 
> directory that testCleanLogs cleans, causing testCleanLogs to clean 3 files, 
> instead of 2 as testCleanLogs expects.
> To fix the failure of testCleanLogs, we can delete the file after the file is 
> moved by testMoveToDone.
> Pull request link: [https://github.com/apache/hadoop/pull/438]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params

2018-11-14 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687058#comment-16687058
 ] 

Chandni Singh commented on YARN-8299:
-

[~eyang] [~leftnoteasy] Could you please help back-port this change? 
It is needed for the backport of https://issues.apache.org/jira/browse/YARN-8160

> Yarn Service Upgrade: Add GET APIs that returns instances matching query 
> params
> ---
>
> Key: YARN-8299
> URL: https://issues.apache.org/jira/browse/YARN-8299
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8299-branch-3.1.001.patch, YARN-8299.001.patch, 
> YARN-8299.002.patch, YARN-8299.003.patch, YARN-8299.004.patch, 
> YARN-8299.005.patch
>
>
> We need APIs that return containers that match the query params. These are 
> needed so that we can find out what containers have been upgraded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9009) Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs

2018-11-14 Thread OrDTesters (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

OrDTesters updated YARN-9009:
-
Attachment: YARN-9009-trunk-001.patch

> Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs
> ---
>
> Key: YARN-9009
> URL: https://issues.apache.org/jira/browse/YARN-9009
> Project: Hadoop YARN
>  Issue Type: Bug
> Environment: Ubuntu 18.04
> java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>  
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-17T13:33:14-05:00)
>Reporter: OrDTesters
>Priority: Minor
> Attachments: YARN-9009-trunk-001.patch
>
>
> In TestEntityGroupFSTimelineStore, testCleanLogs fails when run after 
> testMoveToDone.
> testCleanLogs fails because testMoveToDone moves a file into the same 
> directory that testCleanLogs cleans, causing testCleanLogs to clean 3 files, 
> instead of 2 as testCleanLogs expects.
> To fix the failure of testCleanLogs, we can delete the file after the file is 
> moved by testMoveToDone.
> Pull request link: [https://github.com/apache/hadoop/pull/438]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9022) MiniYarnCluster 3.1.0 RESTAPI not working for some cases

2018-11-14 Thread liehuo chen (JIRA)
liehuo chen created YARN-9022:
-

 Summary: MiniYarnCluster 3.1.0 RESTAPI not working for some cases
 Key: YARN-9022
 URL: https://issues.apache.org/jira/browse/YARN-9022
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.1.0
Reporter: liehuo chen


Actually I am not sure whether I should open this Jira in the Hadoop or the 
Spark project.

Trying Spark 2.4 RC5 with Hadoop 3.1.0, 4 tests failed in the test suite 
org.apache.spark.deploy.yarn.YarnClusterSuite. The reason is that those tests 
try to access logs from the UI like 
[http://$RM_ADDRESS:49363/node/containerlogs/$container_id/user/stdout?start=-4096,|http://192.168.0.30:49363/node/containerlogs/container_1542175195899_0001_02_02/user/stdout?start=-4096,]
and fail with the following message: 
{code:java}
java.lang.AbstractMethodError: 
javax.ws.rs.core.UriBuilder.uri(Ljava/lang/String;)Ljavax/ws/rs/core/UriBuilder;
 at javax.ws.rs.core.UriBuilder.fromUri(UriBuilder.java:119) at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:911)
 at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
 at 
org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebAppFilter.doFilter(NMWebAppFilter.java:73)
 at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
 at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
 at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) at 
com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) at 
com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) at 
com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
 at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at 
org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1601)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) 
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) 
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) 
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
 at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) 
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
 at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
 at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) 
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
 at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) 
at org.eclipse.jetty.server.Server.handle(Server.java:539) at 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333) at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
 at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) 
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
 at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) 
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by 

[jira] [Comment Edited] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686938#comment-16686938
 ] 

Eric Yang edited comment on YARN-8986 at 11/14/18 6:23 PM:
---

[~Charo Zhang] Thank you for the patch.  This patch assumes YARN user has 
ability to run "docker" command line.  This is not true in secure clusters.  
[Docker access|https://docs.docker.com/install/linux/linux-postinstall/] should 
be given to trusted system admin with sudo access only.  YARN user can only 
acquire privileges to run docker command via C version of container-executor 
binary.  This ensures that we are not giving too much power to YARN user.

We should route "docker network ls" check through C version of 
container-executor to perform docker operations.  The decision making process 
of adding "-P" probably belongs to get_docker_run_command.

YARN_CONTAINER_RUNTIME_DOCKER_PORTS_MAPPING looks ok.  Do you plan to support 
specific binding of a host IP, i.e. 127.0.0.1:8080:80, to restrict container 
port 80 to map only to host 127.0.0.1:8080?


was (Author: eyang):
[~Charo Zhang] Thank you for the patch.  This patch assumes YARN user has 
ability to run "docker" command line.  This is not true in secure clusters.  
[Docker access|https://docs.docker.com/install/linux/linux-postinstall/] should 
be given to trusted system admin with sudo access only.  YARN user can only 
acquire privileges to run docker command via C version of container-executor 
binary.  This ensures that we are not giving too much power to YARN user.

We should route "docker network ls" check through C version of 
container-executor to perform docker operations.  The decision making process 
of adding "-P" probably belongs to get_docker_run_command.

> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> It's better to publish all exposed ports to random ports (-P) or to support 
> port mapping (-p) when using the bridge network for a docker container.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686938#comment-16686938
 ] 

Eric Yang commented on YARN-8986:
-

[~Charo Zhang] Thank you for the patch.  This patch assumes YARN user has 
ability to run "docker" command line.  This is not true in secure clusters.  
[Docker access|https://docs.docker.com/install/linux/linux-postinstall/] should 
be given to trusted system admin with sudo access only.  YARN user can only 
acquire privileges to run docker command via C version of container-executor 
binary.  This ensures that we are not giving too much power to YARN user.

We should route "docker network ls" check through C version of 
container-executor to perform docker operations.  The decision making process 
of adding "-P" probably belongs to get_docker_run_command.

> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> It's better to publish all exposed ports to random ports (-P) or to support 
> port mapping (-p) when using the bridge network for a docker container.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8856) TestTimelineReaderWebServicesHBaseStorage tests failing with NoClassDefFoundError

2018-11-14 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686921#comment-16686921
 ] 

Íñigo Goiri commented on YARN-8856:
---

[~vrushalic], did you have a chance to verify?
[~rohithsharma], can you also take a look?
I'd like to get this in soon.

> TestTimelineReaderWebServicesHBaseStorage tests failing with 
> NoClassDefFoundError
> -
>
> Key: YARN-8856
> URL: https://issues.apache.org/jira/browse/YARN-8856
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jason Lowe
>Assignee: Sushil Ks
>Priority: Major
> Attachments: YARN-8856.001.patch
>
>
> TestTimelineReaderWebServicesHBaseStorage has been failing in nightly builds 
> with NoClassDefFoundError in the tests.  Sample error and stacktrace to 
> follow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-9020) set a wrong AbsoluteCapacity when call ParentQueue#setAbsoluteCapacity

2018-11-14 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-9020.
--
Resolution: Duplicate

Thanks [~jutia] for reporting this. It is a valid issue.

This is a dup of YARN-8917; [~Tao Yang] has put up a patch already. Closing 
this as a dup.

> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Assignee: tianjuan
>Priority: Major
>
> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686910#comment-16686910
 ] 

Eric Yang commented on YARN-5168:
-

[~liuxun323] {quote}What we need to do is to send the conflict information to 
the user if the conflict occurs, and let the user create a port.{quote}

How do you propose the user create a port while the service is still in the 
middle of being constructed?  We probably want to drop support for an ad hoc 
"-p" because the user will not have a chance to get all the port mappings 
correct.  Some components may depend on ephemeral port information that does 
not exist yet.  If the user runs with an overlay network, components run with 
ports that are known without conflict.  There is no mapping necessary.

When the host-chosen port information is aggregated, ephemeral port information 
can be published via YARN SysFS (YARN-8569).  Xun, is this in alignment with 
your view?

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses the 
> bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so that apps can 
> find each other. It could be done outside of YARN; however, it might be more 
> convenient to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8917) Absolute (maximum) capacity of level3+ queues is wrongly calculated for absolute resource

2018-11-14 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686903#comment-16686903
 ] 

Wangda Tan commented on YARN-8917:
--

This JIRA somehow dropped off our radar; retriggering the Jenkins job, and I 
will get it committed.

> Absolute (maximum) capacity of level3+ queues is wrongly calculated for 
> absolute resource
> -
>
> Key: YARN-8917
> URL: https://issues.apache.org/jira/browse/YARN-8917
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8917.001.patch, YARN-8917.002.patch
>
>
> Absolute capacity should be calculated by multiplying capacity by the 
> parent-queue's absolute-capacity, but currently it is calculated by dividing 
> capacity by the parent-queue's absolute-capacity.
> Calculation for absolute-maximum-capacity has the same problem.
> For example: 
> root.a   capacity=0.4   maximum-capacity=0.8
> root.a.a1   capacity=0.5  maximum-capacity=0.6
> Absolute capacity of root.a.a1 should be 0.2 (0.5 x 0.4) but is wrongly 
> calculated as 1.25 (0.5 / 0.4).
> Absolute maximum capacity of root.a.a1 should be 0.48 (0.6 x 0.8) but is 
> wrongly calculated as 0.75 (0.6 / 0.8).
> Moreover:
> {{childQueue.getQueueCapacities().getCapacity()}}  should be changed to 
> {{childQueue.getQueueCapacities().getCapacity(label)}} to avoid getting wrong 
> capacity from default partition when calculating for a non-default partition.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6223) [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation on YARN

2018-11-14 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-6223:


Assignee: Wangda Tan  (was: Antal Bálint Steinbach)

> [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation 
> on YARN
> 
>
> Key: YARN-6223
> URL: https://issues.apache.org/jira/browse/YARN-6223
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-6223.Natively-support-GPU-on-YARN-v1.pdf, 
> YARN-6223.wip.1.patch, YARN-6223.wip.2.patch, YARN-6223.wip.3.patch
>
>
> As a variety of workloads move to YARN, including machine learning / deep 
> learning workloads that can be sped up by leveraging GPU computation power, 
> workloads should be able to request GPUs from YARN as simply as CPU and 
> memory (a rough illustrative sketch follows after this description).
> *To make a complete GPU story, we should support following pieces:*
> 1) GPU discovery/configuration: Admin can either config GPU resources and 
> architectures on each node, or more advanced, NodeManager can automatically 
> discover GPU resources and architectures and report to ResourceManager 
> 2) GPU scheduling: YARN scheduler should account GPU as a resource type just 
> like CPU and memory.
> 3) GPU isolation/monitoring: once a task is launched with GPU resources, the 
> NodeManager should properly isolate and monitor the task's resource usage.
> For #2, YARN-3926 can support it natively. For #3, YARN-3611 has introduced 
> an extensible framework to support isolation for different resource types and 
> different runtimes.
> *Related JIRAs:*
> There're a couple of JIRAs (YARN-4122/YARN-5517) filed with similar goals but 
> different solutions:
> For scheduling:
> - YARN-4122/YARN-5517 are all adding a new GPU resource type to Resource 
> protocol instead of leveraging YARN-3926.
> For isolation:
> - And YARN-4122 proposed to use CGroups to do isolation which cannot solve 
> the problem listed at 
> https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation#challenges such as 
> minor device number mapping; load nvidia_uvm module; mismatch of CUDA/driver 
> versions, etc.
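
A rough sketch of what such a request could look like with the extensible 
resource-types API, assuming the GPU resource type is named yarn.io/gpu and an 
AMRMClient instance named amRMClient:
{code}
// Sketch only: ask for 4 GB, 4 vcores and 2 GPUs in a container request.
Resource capability = Resource.newInstance(4096, 4);
capability.setResourceValue("yarn.io/gpu", 2);
AMRMClient.ContainerRequest request =
    new AMRMClient.ContainerRequest(capability, null, null,
        Priority.newInstance(1));
amRMClient.addContainerRequest(request);
{code}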



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9021) Service AM support for restoring decommissioned component instances

2018-11-14 Thread Billie Rinaldi (JIRA)
Billie Rinaldi created YARN-9021:


 Summary: Service AM support for restoring decommissioned component 
instances
 Key: YARN-9021
 URL: https://issues.apache.org/jira/browse/YARN-9021
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Billie Rinaldi


YARN-8761 added support for decommissioning component instances. This ticket is 
for restoring decommissioned instances, which would involve removing the 
component instance from the list of decommissioned instances in the service 
specification and flexing the number of component instances up. Additional work 
would be needed if we wanted to preserve any component instance state while the 
component was decommissioned, such as the last host where the instance ran.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7171) RM UI should sort memory / cores numerically

2018-11-14 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686821#comment-16686821
 ] 

Eric Payne commented on YARN-7171:
--

bq. it has been fixed in YARN-3466
This is not accurate. This problem has not been fixed.

> RM UI should sort memory / cores numerically
> 
>
> Key: YARN-7171
> URL: https://issues.apache.org/jira/browse/YARN-7171
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Eric Maynard
>Priority: Major
>
> Currently, the RM web UI sorts allocated memory and cores in lexicographic 
> order which can be quite obtuse. When there are a large number of running 
> jobs, it can be opaque to find the job which is allocating the most amount of 
> memory or view jobs with similar allocation. Sorting these values numerically 
> would improve usability.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686823#comment-16686823
 ] 

Hadoop QA commented on YARN-8986:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
57m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 14m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 11m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}133m 25s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}247m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
 |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8986 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948122/YARN-8986.001.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux a6c15384bdcb 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a948281 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/22536/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22536/testReport/ |
| Max. process+thread count | 4547 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22536/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, 

[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686777#comment-16686777
 ] 

Hadoop QA commented on YARN-8960:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 10 new + 60 unchanged - 1 fixed = 70 total (was 61) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8960 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948150/YARN-8960.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cca30e2c52b6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b57cc73 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/22537/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22537/testReport/ |
| Max. process+thread count | 471 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
U: 

[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686692#comment-16686692
 ] 

Zac Zhou commented on YARN-8960:


 

Thanks, [~leftnoteasy]

For comment 1
{quote}1) doLoginIfSecure, could u print login user if keytab/principal is 
empty? (Assume the user has login using kinit). We should fail the job 
submission if user doesn't login using kinit AND no keytab/principal specified 
AND security is enabled. And suggest to use Log.info instead of debug.
{quote}
The doLoginIfSecure logic has been changed accordingly.
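To illustrate, a minimal sketch of that check (the method signature, field names and 
message text are only assumptions, not the actual patch), based on 
org.apache.hadoop.security.UserGroupInformation:
{code:java}
// Illustrative sketch only: fail the submission when security is enabled but
// neither a keytab/principal pair nor an existing kinit login is available.
private void doLoginIfSecure(String keytab, String principal)
    throws IOException, YarnException {
  if (!UserGroupInformation.isSecurityEnabled()) {
    return;
  }
  if (keytab == null || keytab.isEmpty()
      || principal == null || principal.isEmpty()) {
    UserGroupInformation user = UserGroupInformation.getCurrentUser();
    if (!user.hasKerberosCredentials()) {
      throw new YarnException("Security is enabled, but no keytab/principal "
          + "was specified and the user has not logged in via kinit.");
    }
    LOG.info("Using login user " + user.getUserName());
    return;
  }
  UserGroupInformation.loginUserFromKeytab(principal, keytab);
}
{code}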

For comment 2
{quote}2) Regarding to upload keytab, I'm a bit concerned about this behavior, 
instead of doing that, should we assume keytabs will be placed under all 
machine's directory? For example, if "zac" user has 
/security/keytabs/zac.keytab, the remote machine should have the same keytab on 
the same folder. Passing around keytab could be a high risk of the cluster.

If you think #2 is necessary, please at least make uploading keytab to an 
optional parameter, and add a note to command line description (Such as 
"distributing keytab to other machines is a risky operation to your 
credentials. Please consider options pre-distribute your keytab by admin as an 
alternative and more safety solution").
{quote}

Yeah, I agree with you. Publishing the keytab to the cluster does seem risky. 
But I think we need to support it, as it makes it easier for users to submit a 
submarine job. I checked the spark code (Client.prepareLocalResource) for its 
--keytab/--principal parameters. Spark uploads the user's keytab to HDFS to 
resolve the AM delegation-token renewer issue for long-running apps 
(AMDelegationTokenRenewer). As the keytab is uploaded to the user's home 
directory, we can set its permission to 400 so that others cannot read it. Once 
[YARN-8725|https://issues.apache.org/jira/browse/YARN-8725]
is done, the staging dir will be cleaned up after the job finishes. I think it's 
a controllable risk.
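For reference, a rough sketch of that upload path with the plain FileSystem API 
(the target directory, variable names and job name are only placeholders, not the 
actual patch):
{code:java}
// Sketch only: copy the local keytab into the submitting user's HDFS home
// directory and restrict it to owner read-only (0400). The ".submarine/<job>"
// location and the conf/jobName/localKeytabPath variables are placeholders.
FileSystem fs = FileSystem.get(conf);
Path keytabDst = new Path(fs.getHomeDirectory(),
    ".submarine/" + jobName + "/user.keytab");
fs.mkdirs(keytabDst.getParent());
fs.copyFromLocalFile(false, true, new Path(localKeytabPath), keytabDst);
fs.setPermission(keytabDst, new FsPermission((short) 0400));
{code}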

Your advice is great; keytab uploading has been changed to be optional and a 
warning has been added.

Thanks

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-11-14 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8960:
---
Attachment: YARN-8960.005.patch

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definitio
> n, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686634#comment-16686634
 ] 

Xun Liu commented on YARN-5168:
---

I think what [~shaneku...@gmail.com] meant is that port conflicts would be the result:
{quote}I believe supporting -P will lead to port conflicts. What if two 
containers running on the same NM both expose 8080?
{quote}
If both applications use the -p 8080:8080 parameter to create a port mapping for 
their containers, the second application, which also pins the host's port 8080, 
will definitely conflict.
However, this is not a new problem: kubernetes has the same conflict if you pin a 
host port when creating a pod. What we need to do is report the conflict to the 
user when it occurs and let the user choose another port.

I think the point [~eyang] made is very important.
{quote}2. Add "-p" to let user specify specific ports to map.
{quote}
The -P parameter is very useful. You can do -P 8080 and let the host pick an 
unoccupied port and bind it to the container's port 8080. The user can then find 
out which physical port is used by fetching the container information.
{quote}3. Add service registry support for bridge network case, then app could 
find each other. It could be done out of YARN, however it might be more 
convenient to support it natively in YARN.
{quote}
Just like in kubernetes, the container services in the YARN cluster would be 
reverse-proxied by nginx or traefik, so that they can be reached regardless of 
container changes.
With these network features supported, YARN can provide better online services 
through docker.

[~eyang], the hadoop submarine project needs a port mapping feature very much. 
Can you assign this JIRA to me and let me contribute the patch? Thank you!

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need support port mapping when docker container uses bridge 
> network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let user specify specific ports to map.
> 3. Add service registry support for bridge network case, then app could find 
> each other. It could be done out of YARN, however it might be more convenient 
> to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6223) [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation on YARN

2018-11-14 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach reassigned YARN-6223:


Assignee: Antal Bálint Steinbach  (was: Wangda Tan)

> [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation 
> on YARN
> 
>
> Key: YARN-6223
> URL: https://issues.apache.org/jira/browse/YARN-6223
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Antal Bálint Steinbach
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-6223.Natively-support-GPU-on-YARN-v1.pdf, 
> YARN-6223.wip.1.patch, YARN-6223.wip.2.patch, YARN-6223.wip.3.patch
>
>
> As a variety of workloads move to YARN, including machine learning / deep 
> learning workloads that can be sped up by leveraging GPU computation power, 
> workloads should be able to request GPUs from YARN as simply as CPU and memory.
> *To make a complete GPU story, we should support following pieces:*
> 1) GPU discovery/configuration: Admin can either config GPU resources and 
> architectures on each node, or more advanced, NodeManager can automatically 
> discover GPU resources and architectures and report to ResourceManager 
> 2) GPU scheduling: YARN scheduler should account GPU as a resource type just 
> like CPU and memory.
> 3) GPU isolation/monitoring: once a task with GPU resources is launched, 
> NodeManager should properly isolate and monitor the task's resource usage.
> For #2, YARN-3926 can support it natively. For #3, YARN-3611 has introduced 
> an extensible framework to support isolation for different resource types and 
> different runtimes.
> *Related JIRAs:*
> There're a couple of JIRAs (YARN-4122/YARN-5517) filed with similar goals but 
> different solutions:
> For scheduling:
> - YARN-4122/YARN-5517 are all adding a new GPU resource type to Resource 
> protocol instead of leveraging YARN-3926.
> For isolation:
> - And YARN-4122 proposed to use CGroups to do isolation which cannot solve 
> the problem listed at 
> https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation#challenges such as 
> minor device number mapping; load nvidia_uvm module; mismatch of CUDA/driver 
> versions, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686556#comment-16686556
 ] 

Rohith Sharma K S commented on YARN-8303:
-

[~abmodi] along with the above comments, 2 of the checkstyle warnings could also 
be fixed.

> YarnClient should contact TimelineReader for application/attempt/container 
> report
> -
>
> Key: YARN-8303
> URL: https://issues.apache.org/jira/browse/YARN-8303
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8303.001.patch, YARN-8303.002.patch, 
> YARN-8303.003.patch, YARN-8303.poc.patch
>
>
> YarnClient gets app/attempt/container information from the RM. If the RM 
> doesn't have it, the ahsClient is queried. When only ATSv2 is enabled, 
> YarnClient will return empty results. 
> YarnClient is used by many users, which results in empty information for the 
> app/attempt/container report. 
> The proposal is to have an adapter in the yarn client so that 
> app/attempt/container reports can be generated from an AHSv2Client, which calls 
> the TimelineReader REST API, gets the entity, and converts it into an 
> app/attempt/container report.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8303) YarnClient should contact TimelineReader for application/attempt/container report

2018-11-14 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686554#comment-16686554
 ] 

Rohith Sharma K S commented on YARN-8303:
-

Some comments on YarnClientImpl:
 # Most of the APIs don't have a *return* after ahsV2Client.() is called. 
This causes the ATS 1.5 API to always be called as well!
{code:java}
if (timelineV2ServiceEnabled) {
  try {
    ahsV2Client.getApplicationAttemptReport(appAttemptId);
  } catch (Exception ex) {
    LOG.warn("Failed to fetch application attempt report from "
        + "ATS v2", ex);
  }
}
{code}

 # The newly added method getContainerReportFromHistory has an ambiguity: after 
catching the exception, ahsV2Client is called again instead of historyClient! (A 
possible corrected form is sketched after the code below.)
{code:java}
  private List<ContainerReport> getContainerReportFromHistory(
      ApplicationAttemptId applicationAttemptId)
      throws IOException, YarnException {
    List<ContainerReport> containersListFromAHS = null;
    if (timelineV2ServiceEnabled) {
      try {
        containersListFromAHS =
            ahsV2Client.getContainers(applicationAttemptId);
      } catch (Exception e) {
        LOG.warn("Got an error while fetching container report from ATSv2", e);
        if (historyServiceEnabled) {
          containersListFromAHS = ahsV2Client.getContainers(
              applicationAttemptId);
        } else {
          throw e;
        }
      }
    } else if (historyServiceEnabled) {
      containersListFromAHS =
          historyClient.getContainers(applicationAttemptId);
    }
    return containersListFromAHS;
  }
{code}
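For both points, a possible corrected shape could look like the following sketch 
(only a sketch reusing the field names from the snippets above, not the actual 
patch): return the ATSv2 result directly, and fall back to historyClient rather 
than ahsV2Client when the ATSv2 call fails.
{code:java}
private List<ContainerReport> getContainerReportFromHistory(
    ApplicationAttemptId applicationAttemptId)
    throws IOException, YarnException {
  if (timelineV2ServiceEnabled) {
    try {
      // Return directly so the ATS 1.5 path is not also hit on success.
      return ahsV2Client.getContainers(applicationAttemptId);
    } catch (Exception e) {
      LOG.warn("Got an error while fetching container report from ATSv2", e);
      if (historyServiceEnabled) {
        // Fall back to the ATS 1.5 history client, not ahsV2Client again.
        return historyClient.getContainers(applicationAttemptId);
      }
      throw e;
    }
  } else if (historyServiceEnabled) {
    return historyClient.getContainers(applicationAttemptId);
  }
  return null;
}
{code}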

> YarnClient should contact TimelineReader for application/attempt/container 
> report
> -
>
> Key: YARN-8303
> URL: https://issues.apache.org/jira/browse/YARN-8303
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Abhishek Modi
>Priority: Critical
> Attachments: YARN-8303.001.patch, YARN-8303.002.patch, 
> YARN-8303.003.patch, YARN-8303.poc.patch
>
>
> YarnClient gets app/attempt/container information from the RM. If the RM 
> doesn't have it, the ahsClient is queried. When only ATSv2 is enabled, 
> YarnClient will return empty results. 
> YarnClient is used by many users, which results in empty information for the 
> app/attempt/container report. 
> The proposal is to have an adapter in the yarn client so that 
> app/attempt/container reports can be generated from an AHSv2Client, which calls 
> the TimelineReader REST API, gets the entity, and converts it into an 
> app/attempt/container report.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9007) CS preemption monitor should only select GUARANTEED containers as candidates for queue and reserved container preemption

2018-11-14 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686547#comment-16686547
 ] 

Wilfred Spiegelenburg commented on YARN-9007:
-

[~sunilg] OPPORTUNISTIC containers should not be counted by the scheduler as 
containers that take resources on the node. When I look at 
{{SchedulerNode.allocateContainer()}} and its counterpart 
{{updateResourceForReleasedContainer()}}, they only update the node's used 
resources for GUARANTEED containers.
That would mean that even if an OPPORTUNISTIC container gets pre-empted by the 
scheduler it will not release resources and thus not help with scheduling a 
GUARANTEED container for which the pre-emption runs. Based on that the 
pre-emption in both schedulers should ignore OPPORTUNISTIC container types. The 
NM handles the killing/preemption of those containers that run on it based on 
different triggers.

I might be missing something but I don't think the upgrade/downgrade of 
OPPORTUNISTIC containers is relevant for the pre-emption of GUARANTEED 
containers.
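
If the preemption code does ignore them, the guard could be as small as the 
following sketch (illustrative only; the candidates/selected collections are 
placeholders, and it assumes the selector can read the execution type of each 
candidate's RMContainer):
{code:java}
// Sketch only: skip non-GUARANTEED containers when collecting preemption
// candidates, since killing an OPPORTUNISTIC container frees no resources
// tracked by the scheduler and so cannot help place a GUARANTEED container.
for (RMContainer rmContainer : candidates) {
  if (rmContainer.getExecutionType() != ExecutionType.GUARANTEED) {
    continue;
  }
  selected.add(rmContainer);
}
{code}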

> CS preemption monitor should only select GUARANTEED containers as candidates 
> for queue and reserved container preemption
> 
>
> Key: YARN-9007
> URL: https://issues.apache.org/jira/browse/YARN-9007
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9007.001.patch
>
>
> Currently the CS preemption monitor doesn't consider the execution type of 
> containers, so OPPORTUNISTIC containers may be selected and killed without 
> effect.
> In some scenarios with OPPORTUNISTIC containers, not only can preemption fail 
> to balance resources properly, but some apps with OPPORTUNISTIC containers may 
> also be affected and unable to work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse

2018-11-14 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686081#comment-16686081
 ] 

Bibin A Chundatt edited comment on YARN-8898 at 11/14/18 1:40 PM:
--

[~subru]

 Are you suggesting adding the _ApplicationSubmissionContext_ to 
_ApplicationHomeSubCluster_ and then writing to 2 different znodes/tables on the 
store side?

If you are suggesting the same znode, that might not be feasible given the 
existing (old) data. I don't think a rolling upgrade scenario is possible with 
the existing ZK and SQL implementations.

In the current implementation we write only subClusterIdProto to the 
*APPLICATION/APPID* znode; in SQL only the subcluster id is written to the column:
{code:java}
String appZNode = getNodePath(appsZNode, appId.toString());
SubClusterIdProto proto =
((SubClusterIdPBImpl)subClusterId).getProto();
byte[] data = proto.toByteArray();
put(appZNode, data, update);
{code}

This is not extensible, right? Unfortunately we didn't write 
ApplicationHomeSubCluster.getProto.getBytes to the znode :(


was (Author: bibinchundatt):
[~subru] are you suggesting add the _ApplicationSubmissionContext_ to 
_ApplicationHomeSubCluster_ and then  write to 2 different znodes/tables at 
store side ??

If you are suggesting same znode then might not be feasible .I think not 
possible to have a rolling upgrade scearios with existing zk and sql 
implementation.

We are writing only subClusterIdProto to *APPLICATION/APPID* node.
SQL only the subcluster is is written,

{code:java}
String appZNode = getNodePath(appsZNode, appId.toString());
SubClusterIdProto proto =
((SubClusterIdPBImpl)subClusterId).getProto();
byte[] data = proto.toByteArray();
put(appZNode, data, update);
{code}

This is not extendable rt ?? Unfortunately we didnt write 
ApplicationHomeSubCluster.getProto.getBytes to znode :(



> Fix FederationInterceptor#allocate to set application priority in 
> allocateResponse
> --
>
> Key: YARN-8898
> URL: https://issues.apache.org/jira/browse/YARN-8898
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8898.wip.patch
>
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in 
> the returned response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9020) set a wrong AbsoluteCapacity when call ParentQueue#setAbsoluteCapacity

2018-11-14 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9020:
---
Description: 
A wrong AbsoluteCapacity is set when ParentQueue#setAbsoluteCapacity is called:

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

// 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
 / getQueueCapacities().getAbsoluteCapacity(label));

 

{color:#d04437}should be{color} 
childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) 
childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
 / getQueueCapacities().getAbsoluteCapacity(label));

  was:
bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

// 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
 / getQueueCapacities().getAbsoluteCapacity(label));

 

{color:#d04437}should be{color} 
childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) 
childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
 / getQueueCapacities().getAbsoluteCapacity(label));


> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Priority: Major
>
> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9020) bug when setAbsoluteCapacity(String label, float value)

2018-11-14 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9020:
---
Description: 
bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

// 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
 / getQueueCapacities().getAbsoluteCapacity(label));

 

{color:#d04437}should be{color} 
childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) 
childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
 / getQueueCapacities().getAbsoluteCapacity(label));

  was:
bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

// 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().getCapacity()
 / getQueueCapacities().getAbsoluteCapacity(label));

 

should be childQueue.getQueueCapacities().setAbsoluteCapacity(label,
(float) childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
/ getQueueCapacities().getAbsoluteCapacity(label));


> bug when setAbsoluteCapacity(String label, float value)
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Priority: Major
>
> bug at setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9020) bug when setAbsoluteCapacity(String label, float value)

2018-11-14 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9020:
---
Description: 
bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

// 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().getCapacity()
 / getQueueCapacities().getAbsoluteCapacity(label));

 

should be childQueue.getQueueCapacities().setAbsoluteCapacity(label,
(float) childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
/ getQueueCapacities().getAbsoluteCapacity(label));

  was:
bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

 /*
 * In case when queues are configured with absolute resources, it is better
 * to update capacity/max-capacity etc w.r.t absolute resource as well. In
 * case of computation, these values wont be used any more. However for
 * metrics and UI, its better these values are pre-computed here itself.
 */

 // 1. Update capacity as a float based on parent's minResource
 childQueue.getQueueCapacities().setCapacity(label,
 rc.divide(clusterResource,
 childQueue.getQueueResourceQuotas().getEffectiveMinResource(label),
 getQueueResourceQuotas().getEffectiveMinResource(label)));

 // 2. Update max-capacity as a float based on parent's maxResource
 childQueue.getQueueCapacities().setMaximumCapacity(label,
 rc.divide(clusterResource,
 childQueue.getQueueResourceQuotas().getEffectiveMaxResource(label),
 getQueueResourceQuotas().getEffectiveMaxResource(label)));

 // 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().getCapacity()
 / getQueueCapacities().getAbsoluteCapacity(label));


> bug when setAbsoluteCapacity(String label, float value)
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Priority: Major
>
> bug at setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().getCapacity()
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> should be childQueue.getQueueCapacities().setAbsoluteCapacity(label,
> (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
> / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9020) set a wrong AbsoluteCapacity when call ParentQueue#setAbsoluteCapacity

2018-11-14 Thread tianjuan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686469#comment-16686469
 ] 

tianjuan commented on YARN-9020:


[~leftnoteasy] could you take a look this?

> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Assignee: tianjuan
>Priority: Major
>
> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9020) set a wrong AbsoluteCapacity when call ParentQueue#setAbsoluteCapacity

2018-11-14 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan reassigned YARN-9020:
--

Assignee: tianjuan

> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Assignee: tianjuan
>Priority: Major
>
> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9020) set a wrong AbsoluteCapacity when call ParentQueue#setAbsoluteCapacity

2018-11-14 Thread tianjuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tianjuan updated YARN-9020:
---
Summary: set a wrong AbsoluteCapacity when call  
ParentQueue#setAbsoluteCapacity  (was: bug when setAbsoluteCapacity(String 
label, float value))

> set a wrong AbsoluteCapacity when call  ParentQueue#setAbsoluteCapacity
> ---
>
> Key: YARN-9020
> URL: https://issues.apache.org/jira/browse/YARN-9020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: tianjuan
>Priority: Major
>
> bug at setAbsoluteCapacity
> private void deriveCapacityFromAbsoluteConfigurations(String label,
>  Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {
> // 3. Update absolute capacity as a float based on parent's minResource and
>  // cluster resource.
>  childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) childQueue.getQueueCapacities().{color:#d04437}getCapacity(){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));
>  
> {color:#d04437}should be{color} 
> childQueue.getQueueCapacities().setAbsoluteCapacity(label,
>  (float) 
> childQueue.getQueueCapacities().{color:#f6c342}getCapacity(label){color}
>  / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9020) bug when setAbsoluteCapacity(String label, float value)

2018-11-14 Thread tianjuan (JIRA)
tianjuan created YARN-9020:
--

 Summary: bug when setAbsoluteCapacity(String label, float value)
 Key: YARN-9020
 URL: https://issues.apache.org/jira/browse/YARN-9020
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: tianjuan


bug at setAbsoluteCapacity

private void deriveCapacityFromAbsoluteConfigurations(String label,
 Resource clusterResource, ResourceCalculator rc, CSQueue childQueue) {

 /*
 * In case when queues are configured with absolute resources, it is better
 * to update capacity/max-capacity etc w.r.t absolute resource as well. In
 * case of computation, these values wont be used any more. However for
 * metrics and UI, its better these values are pre-computed here itself.
 */

 // 1. Update capacity as a float based on parent's minResource
 childQueue.getQueueCapacities().setCapacity(label,
 rc.divide(clusterResource,
 childQueue.getQueueResourceQuotas().getEffectiveMinResource(label),
 getQueueResourceQuotas().getEffectiveMinResource(label)));

 // 2. Update max-capacity as a float based on parent's maxResource
 childQueue.getQueueCapacities().setMaximumCapacity(label,
 rc.divide(clusterResource,
 childQueue.getQueueResourceQuotas().getEffectiveMaxResource(label),
 getQueueResourceQuotas().getEffectiveMaxResource(label)));

 // 3. Update absolute capacity as a float based on parent's minResource and
 // cluster resource.
 childQueue.getQueueCapacities().setAbsoluteCapacity(label,
 (float) childQueue.getQueueCapacities().getCapacity()
 / getQueueCapacities().getAbsoluteCapacity(label));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8986) publish all exposed ports to random ports when using bridge network

2018-11-14 Thread Charo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charo Zhang updated YARN-8986:
--
Description: 
it's better to publish all exposed ports to random ports(-P) or support port 
mapping(-p) for bridge network when using bridge network for docker container.

 

  was:
it's better to publish all exposed ports to random ports or support port 
mapping for bridge network when using bridge network for docker container.

 


> publish all exposed ports to random ports when using bridge network
> ---
>
> Key: YARN-8986
> URL: https://issues.apache.org/jira/browse/YARN-8986
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Charo Zhang
>Assignee: Charo Zhang
>Priority: Minor
>  Labels: Docker
> Fix For: 3.1.2
>
> Attachments: 20181108155450.png, YARN-8986.001.patch, 
> YARN-8986.002.patch, YARN-8986.003.patch
>
>
> It's better to publish all exposed ports to random ports (-P) or support port 
> mapping (-p) when using the bridge network for docker containers.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse

2018-11-14 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686415#comment-16686415
 ] 

Bibin A Chundatt edited comment on YARN-8898 at 11/14/18 11:57 AM:
---

We can probably take one of 2 approaches:

[solution 
1|https://issues.apache.org/jira/browse/YARN-8898?focusedCommentId=16685683=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16685683]

Or

As you mentioned, add it to ApplicationHomeSubCluster and have the new 
addHomeCluster calls write to a layout like the following:
{code:java}
 * |--- APPLICATIONDATA
 *   |- APP1
 *   |- APP2
{code}
While getting the app from ZK, get ApplicationHomeSubCluster from 
*APPLICATIONDATA/APP1*; if it doesn't exist, fall back to *APPLICATION/APP1* (a 
rough sketch of this read path is below).
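
A rough sketch of that fallback read, only to illustrate the idea (the getData 
helper and the appsDataZNode field are hypothetical, mirroring the put/getNodePath 
helpers quoted earlier; the proto/PBImpl type names follow the existing federation 
store records):
{code:java}
// Sketch only: read the new APPLICATIONDATA/APP1 znode first and fall back to
// the old APPLICATION/APP1 layout. getData() and appsDataZNode are hypothetical.
String dataZNode = getNodePath(appsDataZNode, appId.toString());
byte[] data = getData(dataZNode);
if (data != null) {
  // New layout: the full ApplicationHomeSubCluster proto is stored.
  return new ApplicationHomeSubClusterPBImpl(
      ApplicationHomeSubClusterProto.parseFrom(data));
}
// Old layout: only the SubClusterId proto is stored.
byte[] oldData = getData(getNodePath(appsZNode, appId.toString()));
SubClusterId homeSubCluster =
    new SubClusterIdPBImpl(SubClusterIdProto.parseFrom(oldData));
return ApplicationHomeSubCluster.newInstance(appId, homeSubCluster);
{code}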


was (Author: bibinchundatt):
Probably we can take 2 approaches

solution 1

Or

As you mentioned add to ApplicationHomeSubCluster write new addHomeCluster 
calls as following
{code:java}
 * |--- APPLICATIONDATA
 *   |- APP1
 *   |- APP2
{code}
While getting app from ZK get ApplicationHomeSubCluster from 
*APPLICATIONDATA/APP1* if doesnt exists get *APPLICATION/APP1*

> Fix FederationInterceptor#allocate to set application priority in 
> allocateResponse
> --
>
> Key: YARN-8898
> URL: https://issues.apache.org/jira/browse/YARN-8898
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8898.wip.patch
>
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in 
> the returned response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse

2018-11-14 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686415#comment-16686415
 ] 

Bibin A Chundatt commented on YARN-8898:


Probably we can take 2 approaches

solution 1

Or

As you mentioned add to ApplicationHomeSubCluster write new addHomeCluster 
calls as following
{code:java}
 * |--- APPLICATIONDATA
 *   |- APP1
 *   |- APP2
{code}
While getting app from ZK get ApplicationHomeSubCluster from 
*APPLICATIONDATA/APP1* if doesnt exists get *APPLICATION/APP1*

> Fix FederationInterceptor#allocate to set application priority in 
> allocateResponse
> --
>
> Key: YARN-8898
> URL: https://issues.apache.org/jira/browse/YARN-8898
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8898.wip.patch
>
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in 
> the returned response.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9000) Add missing data access methods to webapp entities classes

2018-11-14 Thread Oleksandr Shevchenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Shevchenko reassigned YARN-9000:
--

Assignee: Oleksandr Shevchenko

> Add missing data access methods to webapp entities classes
> --
>
> Key: YARN-9000
> URL: https://issues.apache.org/jira/browse/YARN-9000
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Oleksandr Shevchenko
>Assignee: Oleksandr Shevchenko
>Priority: Minor
>
> From Hadoop side, we have entity classes which represent the data which can 
> be accessed via REST. All these classes are placed in .../webapp/dao packages 
> (for example 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.NodeInfo).
> Typically these classes are created via constructors (some classes have 
> setters) in controllers and then is marshaled to XML/JSON format for data 
> transfer. Therefore, these classes are used more like as DTO.
> We want to write some UI tests to verify the both YARN Web UIs (current ui 
> and ui2). We need to get some information from REST and compare with 
> information which displayed on UI.
> The problem is we can't use for it the same entities from Hadoop. Because we 
> can't create these entities and set needed data from UI since many getters 
> and setters are missed. So, we will forced to write some layer which 
> represents the same data and exactly copies webapp/dao classes but includes 
> needed getters and setters.
> Access methods are not unified. Some classes have only getters, some have 
> several setters, some have all the necessary getters and setters. In all 
> classes, we have a different set of methods, this is not controlled, new 
> methods are added as necessary. We open a lot of tickets for adding a 
> particular method to a particular class, this lead to some overhead.
> In this ticket, I propose to unify access to the data and add all getters and 
> setters for all YARN webapp/dao classes (I will create a separated ticket for 
> MapReduce project if the idea will be approved and I will start working on 
> this issue).
> Thanks a lot for any comments and attention to this problem!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8964) UI2 should use clusters/{cluster name} for all ATSv2 REST APIs

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686376#comment-16686376
 ] 

Hadoop QA commented on YARN-8964:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
38m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8964 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12948105/YARN-8964.001.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux d2b03298596f 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 
17 11:07:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3fade86 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 333 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22535/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> UI2 should use clusters/{cluster name} for all ATSv2 REST APIs
> --
>
> Key: YARN-8964
> URL: https://issues.apache.org/jira/browse/YARN-8964
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rohith Sharma K S
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-8964.001.patch
>
>
> UI2 makes a REST call to TimelineReader without cluster name. It is advised 
> to make a REST call with clusters/{cluster name} so that remote 
> TimelineReader daemon could serve for different clusters.
> *Example*:
> *Current*: /ws/v2/timeline/flows/
> *Change*: /ws/v2/timeline/*clusters/\{cluster name\}*/flows/
> *yarn.resourcemanager.cluster-id* is configured with the cluster id, so this 
> config could be used to get the cluster-id.
> cc: [~sunilg] [~akhilpb]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9016) DocumentStore as a backend for ATSv2

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686256#comment-16686256
 ] 

Hadoop QA commented on YARN-9016:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-assemblies hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-assemblies hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-yarn-server-timelineservice-documentstore in 
the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-assemblies in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}145m 26s{color} 
| {color:red} hadoop-yarn-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
16s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-yarn-server-timelineservice-documentstore in 
the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
41s{color} | {color:red} The patch generated 5 ASF License warnings. {color} |
| {color:black}{color} 

[jira] [Commented] (YARN-8953) Add CSI driver adaptor module

2018-11-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686215#comment-16686215
 ] 

Hadoop QA commented on YARN-8953:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 44s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
34s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 55s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}208m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | 

[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-11-14 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684794#comment-16684794
 ] 

Xun Liu edited comment on YARN-8714 at 11/14/18 9:02 AM:
-

1) Support multiple files or directories.

2) The original state must be maintained (folder subdirectory structure, zip 
file or plain file); do not make any changes to it.

3) Mount the specified file or folder at an absolute path in the container. 
Because the user can customize the container's WORKDIR through the Dockerfile, 
and the containers are dedicated and custom made, the user knows exactly where 
to mount the file.

4) Keep consistent with docker usage: use `{color:#ff}:{color}` to split the 
source directory and the destination directory.

5) Parameter format: -localizations 
hdfs:///user/yarn{color:#ff}:{color}/absolute/path

Requirements document: 
[https://docs.google.com/document/d/16YN8Kjmxt1Ym3clx5pDnGNXGajUT36hzQxjaik1cP4A/edit#heading=h.s07ukakieg7q]

 


was (Author: liuxun323):
1) Support for multiple files or directories.

2) Must maintain the original state, such as folder subdirectory structure, zip 
file or normal file, do not make any changes.

3) Mount the specified file or folder to the absolute path in the container. 
Because the user can customize the WORKDIR of the container through the 
Dockerfile, And the containers are dedicated and custom made, So the user knows 
exactly where to mount the file.

4) Keep consistent with the use of docker, use`{color:#FF}:{color}` split 
source directory and destination directory.

5) Parameter format: -localizations 
hdfs:///user/yarn{color:#FF}:{color}/absolute/path

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch
>
>
> See 
> https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7,
>  {{job run --localizations ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org