[jira] [Comment Edited] (YARN-6289) Fail to achieve data locality when running MapReduce and Spark on HDFS
[ https://issues.apache.org/jira/browse/YARN-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968691#comment-15968691 ] Huangkaixuan edited comment on YARN-6289 at 4/14/17 6:33 AM: - Thanks, [~leftnoteasy]. It seems to work for me. I noticed the patch has been merged into the Hadoop 3.0 snapshot. Actually, I am using Hadoop 2.7 and 2.8. Is there a way to port this patch to 2.7 and 2.8? was (Author: huangkx6810): Thanks, [~leftnoteasy]. It seems to work for me. I noticed the patch has been merged into the Hadoop 3.0 snapshot. Actually, I am using Hadoop 2.7 and 2.8. Is there a way to port this patch to 2.7 and 2.8? > Fail to achieve data locality when running MapReduce and Spark on HDFS > - > > Key: YARN-6289 > URL: https://issues.apache.org/jira/browse/YARN-6289 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-scheduling > Environment: Hardware configuration > CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz /15M Cache 6-Core 12-Thread > Memory: 128GB Memory (16x8GB) 1600MHz > Disk: 600GBx2 3.5-inch with RAID-1 > Network bandwidth: 968Mb/s > Software configuration > Spark-1.6.2 Hadoop-2.7.1 >Reporter: Huangkaixuan > Attachments: Hadoop_Spark_Conf.zip, YARN-DataLocality.docx, > YARN-RackAwareness.docx > > > When running a simple wordcount experiment on YARN, I noticed that the task > failed to achieve data locality, even though no other job was running on > the cluster at the same time. The experiment was done in a 7-node (1 master, > 6 data nodes/node managers) cluster and the input of the wordcount job (both > Spark and MapReduce) is a single-block file in HDFS which is two-way > replicated (replication factor = 2). I ran wordcount on YARN 10 times. > The results show that only 30% of tasks can achieve data locality, which > seems like the result of a random placement of tasks. The experiment details > are in the attachments; feel free to reproduce the experiments. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6289) Fail to achieve data locality when running MapReduce and Spark on HDFS
[ https://issues.apache.org/jira/browse/YARN-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968691#comment-15968691 ] Huangkaixuan commented on YARN-6289: Thanks, [~leftnoteasy]. It seems to work for me. I noticed the patch has been merged into the Hadoop 3.0 snapshot. Actually, I am using Hadoop 2.7 and 2.8. Is there a way to port this patch to 2.7 and 2.8? > Fail to achieve data locality when running MapReduce and Spark on HDFS > - > > Key: YARN-6289 > URL: https://issues.apache.org/jira/browse/YARN-6289 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-scheduling > Environment: Hardware configuration > CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz /15M Cache 6-Core 12-Thread > Memory: 128GB Memory (16x8GB) 1600MHz > Disk: 600GBx2 3.5-inch with RAID-1 > Network bandwidth: 968Mb/s > Software configuration > Spark-1.6.2 Hadoop-2.7.1 >Reporter: Huangkaixuan > Attachments: Hadoop_Spark_Conf.zip, YARN-DataLocality.docx, > YARN-RackAwareness.docx > > > When running a simple wordcount experiment on YARN, I noticed that the task > failed to achieve data locality, even though no other job was running on > the cluster at the same time. The experiment was done in a 7-node (1 master, > 6 data nodes/node managers) cluster and the input of the wordcount job (both > Spark and MapReduce) is a single-block file in HDFS which is two-way > replicated (replication factor = 2). I ran wordcount on YARN 10 times. > The results show that only 30% of tasks can achieve data locality, which > seems like the result of a random placement of tasks. The experiment details > are in the attachments; feel free to reproduce the experiments. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
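The 30% locality rate reported above is consistent with effectively random task placement. A quick sanity check, assuming the single map task is placed uniformly at random across the 6 DataNodes and the block has 2 replicas:
{code}
P(node-local) = replicas / DataNodes = 2 / 6 ≈ 0.33
{code}
which is close to the observed ~30% node-local rate and supports the "random placement" interpretation in the report.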
[jira] [Commented] (YARN-6344) Add parameter for rack locality delay in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968690#comment-15968690 ] Huangkaixuan commented on YARN-6344: Thanks, [~kkaranasos]. I noticed the patch has been merged into the Hadoop 3.0 snapshot. Actually, I am using Hadoop 2.7 and 2.8. Is there a way to port this patch to 2.7 and 2.8? > Add parameter for rack locality delay in CapacityScheduler > -- > > Key: YARN-6344 > URL: https://issues.apache.org/jira/browse/YARN-6344 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6344.001.patch, YARN-6344.002.patch, > YARN-6344.003.patch, YARN-6344.004.patch > > > When relaxing locality from node to rack, the {{node-locality-parameter}} is > used: when scheduling opportunities for a scheduler key are more than the > value of this parameter, we relax locality and try to assign the container to > a node in the corresponding rack. > On the other hand, when relaxing locality to off-switch (i.e., assign the > container anywhere in the cluster), we are using a {{localityWaitFactor}}, > which is computed as the number of outstanding requests for a specific > scheduler key divided by the size of the cluster. > In the case of applications that request containers in big batches (e.g., > traditional MR jobs), and for relatively small clusters, the > localityWaitFactor does not affect relaxing locality much. > However, in the case of applications that request containers in small batches, > this load factor takes a very small value, which leads to assigning > off-switch containers too soon. This situation is even more pronounced in big > clusters. > For example, if an application requests only one container per request, the > locality will be relaxed after a single missed scheduling opportunity. > The purpose of this JIRA is to rethink the way we are relaxing locality for > off-switch assignments. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
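To make the two relaxation thresholds described above concrete, here is a minimal illustrative Java sketch. This is not the actual CapacityScheduler code; the method names and the exact threshold formula are simplified assumptions, but it shows why a small outstanding-request count relaxes to off-switch almost immediately:
{code}
// Illustrative only: simplified model of the two locality-relaxation checks.
public class LocalityRelaxationSketch {

  // NODE_LOCAL -> RACK_LOCAL: relax after a fixed number of missed
  // scheduling opportunities (the node-locality-delay parameter).
  static boolean canRelaxToRack(long missedOpportunities, int nodeLocalityDelay) {
    return missedOpportunities >= nodeLocalityDelay;
  }

  // RACK_LOCAL -> OFF_SWITCH: the threshold scales with the localityWaitFactor,
  // i.e. outstanding requests for the scheduler key divided by cluster size.
  static boolean canRelaxToOffSwitch(long missedOpportunities,
      int outstandingRequests, int clusterNodes) {
    double localityWaitFactor =
        Math.min(1.0, (double) outstandingRequests / Math.max(1, clusterNodes));
    long threshold = Math.max(1, (long) (clusterNodes * localityWaitFactor));
    // With outstandingRequests = 1 on a 100-node cluster, threshold == 1,
    // so locality is relaxed after a single missed scheduling opportunity.
    return missedOpportunities >= threshold;
  }
}
{code}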
[jira] [Commented] (YARN-6457) Allow custom SSL configuration to be supplied in WebApps
[ https://issues.apache.org/jira/browse/YARN-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968667#comment-15968667 ] ASF GitHub Bot commented on YARN-6457: -- GitHub user sanjaypujare opened a pull request: https://github.com/apache/hadoop/pull/213 YARN-6457 use existing conf object as sslConf object in WebApps for the builder to use for the HttpServer2 You can merge this pull request into a Git repository by running: $ git pull https://github.com/sanjaypujare/hadoop YARN-6457.sanjay Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/213.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #213 commit daa1fc2cc42617b362cae17c886d706d48a0b84f Author: DTAdmin Date: 2017-04-09T23:19:30Z YARN-6457 use existing conf object as sslConf object in WebApps for the builder to use for the HttpServer2 > Allow custom SSL configuration to be supplied in WebApps > > > Key: YARN-6457 > URL: https://issues.apache.org/jira/browse/YARN-6457 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp, yarn >Reporter: Sanjay M Pujare > Original Estimate: 96h > Remaining Estimate: 96h > > Currently a custom SSL store cannot be passed on to WebApps which forces the > embedded web-server to use the default keystore set up in ssl-server.xml for > the whole Hadoop cluster. There are cases where the Hadoop app needs to use > its own/custom keystore. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
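As a rough illustration of what this improvement enables, the sketch below builds a Configuration carrying an application-specific keystore instead of relying on the cluster-wide ssl-server.xml. The property names follow the standard ssl-server.xml keys; the step where WebApps hands such a conf to HttpServer2 is the behavior proposed in the pull request, not an existing public API, and the paths/passwords are placeholders:
{code}
// org.apache.hadoop.conf.Configuration; "false" skips loading default resources.
Configuration sslConf = new Configuration(false);
sslConf.set("ssl.server.keystore.location", "/path/to/app-keystore.jks");
sslConf.set("ssl.server.keystore.password", "example-password");
sslConf.set("ssl.server.keystore.type", "jks");
// The patch proposes letting WebApps pass a conf like this through to the
// HttpServer2 builder so the embedded web server loads the app's own keystore.
{code}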
[jira] [Commented] (YARN-6396) Call verifyAndCreateRemoteLogDir at service initialization instead of application initialization to decrease load for name node
[ https://issues.apache.org/jira/browse/YARN-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968642#comment-15968642 ] Jian He commented on YARN-6396: --- Hi [~zxu], my only concern is that if a valid permission later becomes invalid, perhaps by accident, verifyAndCreateRemoteLogDir will no longer be invoked to validate it, and log aggregation will eventually fail. I think it's a trade-off between validation and efficiency. I'm OK with the current approach. [~xgong], do you have comments? > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for name node > --- > > Key: YARN-6396 > URL: https://issues.apache.org/jira/browse/YARN-6396 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Affects Versions: 3.0.0-alpha2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Attachments: YARN-6396.000.patch > > > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for the name node. > Currently, for every application on each node, verifyAndCreateRemoteLogDir > is called before doing log aggregation. This is a non-trivial > overhead for the name node in a large cluster since verifyAndCreateRemoteLogDir > calls getFileStatus. Once the remote log directory is created successfully, > it is not necessary to call it again. It would be better to call > verifyAndCreateRemoteLogDir at LogAggregationService service initialization. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
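A minimal sketch of the proposed change (the verifyAndCreateRemoteLogDir call and its placement follow the description above; the surrounding code is assumed for illustration and is not the actual patch):
{code}
// In LogAggregationService: verify/create the remote log dir once when the
// service initializes, so the NameNode sees one getFileStatus/mkdirs round-trip
// per NM start instead of one per application.
@Override
protected void serviceInit(Configuration conf) throws Exception {
  super.serviceInit(conf);
  verifyAndCreateRemoteLogDir(conf);  // previously invoked once per application
}
{code}
As noted in the comment above, the trade-off is that if the directory permissions later become invalid, this check will not run again until the NodeManager restarts.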
[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968640#comment-15968640 ] Manikandan R commented on YARN-3839: Adding [~jianhe], [~vinodkv].. > Quit throwing NMNotYetReadyException > > > Key: YARN-3839 > URL: https://issues.apache.org/jira/browse/YARN-3839 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Karthik Kambatla > > Quit throwing NMNotYetReadyException when NM has not yet registered with the > RM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2985) YARN should support to delete the aggregated logs for Non-MapReduce applications
[ https://issues.apache.org/jira/browse/YARN-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968609#comment-15968609 ] Steven Rand commented on YARN-2985: --- [~jlowe], thanks for the thoughtful response. Based on that information, it seems like the most straightforward way to proceed, at least for branch-2, is to add a configuration option for running the deletion service in only the timeline server, and not the JHS. Something like {{yarn.log-aggregation.run-in-timeline-server}} that defaults to {{false}} for backcompat, but when set to {{true}}, prevents the JHS from performing retention, and tells the timeline server to do it instead. Does that seem reasonable? If so I'll update the patch to do that, but certainly open to alternatives if there's a better way. For trunk, I imagine it might be worth just removing retention from the JHS and moving it to the timeline server entirely, since my understanding is that the timeline server is supposed to replace the JHS, even for deployments that only run MR jobs, and 3.0 seems like a reasonable enough point at which to require the switch from JHS to timeline server. I might be misunderstanding the relationship between the two though, so please correct me if that doesn't make sense. > YARN should support to delete the aggregated logs for Non-MapReduce > applications > > > Key: YARN-2985 > URL: https://issues.apache.org/jira/browse/YARN-2985 > Project: Hadoop YARN > Issue Type: New Feature > Components: log-aggregation, nodemanager >Affects Versions: 2.8.0 >Reporter: Xu Yang >Assignee: Steven Rand > Attachments: YARN-2985-branch-2-001.patch > > > Before Hadoop 2.6, the LogAggregationService is started in the NodeManager, but > the AggregatedLogDeletionService is started in MapReduce's JobHistoryServer. > Therefore, non-MapReduce applications can aggregate their logs to HDFS, > but cannot delete those logs. The NodeManager needs to take over the function of > aggregated log deletion. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
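If that approach is adopted, the switch might be set in yarn-site.xml as below. Note that {{yarn.log-aggregation.run-in-timeline-server}} is only the property name proposed in the comment above; it does not exist in any released Hadoop version:
{code}
<!-- Hypothetical property from the proposal above: have the timeline server,
     rather than the JobHistoryServer, run aggregated-log retention/deletion.
     Proposed default is false for backwards compatibility. -->
<property>
  <name>yarn.log-aggregation.run-in-timeline-server</name>
  <value>true</value>
</property>
{code}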
[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968591#comment-15968591 ] Manikandan R commented on YARN-3839: {quote}My understanding is the same. It looks like the existing cases when we throw it will already be covered by the NMToken or ContainerToken so we know whether the launch is valid or not.{quote} Ok. Shall I proceed further? {quote}NMNotYetReadyException class around for compatibility with clients but the NM would stop throwing the exception.{quote} Yes. > Quit throwing NMNotYetReadyException > > > Key: YARN-3839 > URL: https://issues.apache.org/jira/browse/YARN-3839 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Karthik Kambatla > > Quit throwing NMNotYetReadyException when NM has not yet registered with the > RM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3666) Federation Intercepting and propagating AM-RM communications
[ https://issues.apache.org/jira/browse/YARN-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968553#comment-15968553 ] Hadoop QA commented on YARN-3666: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 55s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 49s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 17s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s{color} | {color:green} YARN-2915 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} YARN-2915 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 28s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}146m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | | hadoop.yarn.server.resourcemanager.federation.TestFederationRMStateStoreService | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612578f | | JIRA Issue | YARN-3666 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863403/YARN-3666-YARN-2915.v5.patch | | Optional Tests | asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle | | uname | Linux
[jira] [Commented] (YARN-6406) Remove SchedulerRequestKeys when no more pending ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968542#comment-15968542 ] Hadoop QA commented on YARN-6406: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6406 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863411/YARN-6406-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15638/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Remove SchedulerRequestKeys when no more pending ResourceRequest > > > Key: YARN-6406 > URL: https://issues.apache.org/jira/browse/YARN-6406 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6406.001.patch, YARN-6406.002.patch, > YARN-6406-branch-2.001.patch > > > YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys > from the AppSchedulingInfo. It looks like after YARN-6040, > SchedulerRequestKeys are removed only if the application sends a request with > 0 numContainers. Earlier, the outstanding schedulerKeys were > also removed as soon as a container was allocated. > An additional optimization we were hoping to include is to remove the > ResourceRequests themselves once numContainers == 0, since we see in our > clusters that the RM heap space consumption increases drastically due to a > large number of ResourceRequests with 0 num containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968541#comment-15968541 ] Naganarasimha G R commented on YARN-6467: - [~maniraj...@gmail.com], yep as simple as that. Hope you could attach the patch for the same. > CSQueueMetrics needs to update the current metrics for default partition only > - > > Key: YARN-6467 > URL: https://issues.apache.org/jira/browse/YARN-6467 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > > As a followup to YARN-6195, we need to update existing metrics to only > default Partition. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6406) Remove SchedulerRequestKeys when no more pending ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6406: -- Attachment: YARN-6406-branch-2.001.patch Attaching a patch for branch-2. [~leftnoteasy], can you take a look? > Remove SchedulerRequestKeys when no more pending ResourceRequest > > > Key: YARN-6406 > URL: https://issues.apache.org/jira/browse/YARN-6406 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6406.001.patch, YARN-6406.002.patch, > YARN-6406-branch-2.001.patch > > > YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys > from the AppSchedulingInfo. It looks like after YARN-6040, > SchedulerRequestKeys are removed only if the application sends a request with > 0 numContainers. Earlier, the outstanding schedulerKeys were > also removed as soon as a container was allocated. > An additional optimization we were hoping to include is to remove the > ResourceRequests themselves once numContainers == 0, since we see in our > clusters that the RM heap space consumption increases drastically due to a > large number of ResourceRequests with 0 num containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
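A rough Java sketch of the optimization being discussed; this is illustrative pseudocode, not the actual AppSchedulingInfo change, and the helper method and map layout shown here are assumptions:
{code}
// When an update drives a ResourceRequest down to zero outstanding containers,
// drop the request itself and, if no requests remain for that
// SchedulerRequestKey, retire the key as well. This keeps RM heap usage from
// growing with large numbers of zero-container ResourceRequests.
void updateResourceRequest(SchedulerRequestKey key, String resourceName,
    ResourceRequest request,
    Map<SchedulerRequestKey, Map<String, ResourceRequest>> requestsByKey) {
  Map<String, ResourceRequest> requests = requestsByKey.get(key);
  if (requests == null) {
    requests = new HashMap<>();
    requestsByKey.put(key, requests);
  }
  if (request.getNumContainers() <= 0) {
    requests.remove(resourceName);      // satisfied or cancelled request
    if (requests.isEmpty()) {
      requestsByKey.remove(key);        // no more pending requests for this key
    }
  } else {
    requests.put(resourceName, request);
  }
}
{code}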
[jira] [Reopened] (YARN-6406) Remove SchedulerRequestKeys when no more pending ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reopened YARN-6406: --- Reopening the issue to add a patch for branch-2. > Remove SchedulerRequestKeys when no more pending ResourceRequest > > > Key: YARN-6406 > URL: https://issues.apache.org/jira/browse/YARN-6406 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6406.001.patch, YARN-6406.002.patch, > YARN-6406-branch-2.001.patch > > > YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys > from the AppSchedulingInfo. It looks like after YARN-6040, > SchedulerRequestKeys are removed only if the application sends a request with > 0 numContainers. Earlier, the outstanding schedulerKeys were > also removed as soon as a container was allocated. > An additional optimization we were hoping to include is to remove the > ResourceRequests themselves once numContainers == 0, since we see in our > clusters that the RM heap space consumption increases drastically due to a > large number of ResourceRequests with 0 num containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6406) Remove SchedulerRequestKeys when no more pending ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6406: -- Fix Version/s: 2.9.0 > Remove SchedulerRequestKeys when no more pending ResourceRequest > > > Key: YARN-6406 > URL: https://issues.apache.org/jira/browse/YARN-6406 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6406.001.patch, YARN-6406.002.patch > > > YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys > from the AppSchedulingInfo. It looks like after YARN-6040, > SchedulerRequestKeys are removed only if the application sends a request with > 0 numContainers. Earlier, the outstanding schedulerKeys were > also removed as soon as a container was allocated. > An additional optimization we were hoping to include is to remove the > ResourceRequests themselves once numContainers == 0, since we see in our > clusters that the RM heap space consumption increases drastically due to a > large number of ResourceRequests with 0 num containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6406) Remove SchedulerRequestKeys when no more pending ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6406: -- Fix Version/s: (was: 2.9.0) > Remove SchedulerRequestKeys when no more pending ResourceRequest > > > Key: YARN-6406 > URL: https://issues.apache.org/jira/browse/YARN-6406 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6406.001.patch, YARN-6406.002.patch > > > YARN-5540 introduced some optimizations to remove satisfied SchedulerKeys > from the AppSchedulingInfo. It looks like after YARN-6040, > SchedulerRequestKeys are removed only if the application sends a request with > 0 numContainers. Earlier, the outstanding schedulerKeys were > also removed as soon as a container was allocated. > An additional optimization we were hoping to include is to remove the > ResourceRequests themselves once numContainers == 0, since we see in our > clusters that the RM heap space consumption increases drastically due to a > large number of ResourceRequests with 0 num containers. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R reassigned YARN-6478: --- Assignee: Jinjiang Ling > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Assignee: Jinjiang Ling >Priority: Trivial > Labels: newbie > Attachments: YARN-6478-0.patch > > > Found a spelling mistake in FileSystemTimelineWriter.java: > "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968529#comment-15968529 ] Naganarasimha G R commented on YARN-6478: - [~lingjinjiang], I understand that we need to start somewhere with contributions, but this is too trivial and nearly unused code. Raising multiple issues for spelling fixes (maybe not all by a single person) is not advisable; what you could do is collect multiple such issues under a class/package/project and raise one! Adding you to the contributor's list anyway. > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > Labels: newbie > Attachments: YARN-6478-0.patch > > > Found a spelling mistake in FileSystemTimelineWriter.java: > "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6363) Extending SLS: Synthetic Load Generator
[ https://issues.apache.org/jira/browse/YARN-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968519#comment-15968519 ] Hadoop QA commented on YARN-6363: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 12 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-tools: The patch generated 192 new + 140 unchanged - 23 fixed = 332 total (was 163) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 2m 45s{color} | {color:red} The patch generated 5 new + 72 unchanged - 3 fixed = 77 total (was 75) {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 11s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 6s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s{color} | {color:green} hadoop-rumen in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 54s{color} | {color:red} hadoop-sls in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s{color} | {color:red} The patch generated 2 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 56s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.sls.TestSLSRunner | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612578f | | JIRA Issue | YARN-6363 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863407/YARN-6363.v10.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml shellcheck shelldocs | | uname | Linux a67029d8c05d 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0cab5
[jira] [Updated] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6216: -- Fix Version/s: 2.9.0 > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6216: -- Target Version/s: 2.9.0, 3.0.0-alpha3 (was: 3.0.0-alpha3) > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968514#comment-15968514 ] Arun Suresh commented on YARN-6216: --- Apologies for the comment spam - Yetus has some issues, so I kicked Jenkins multiple times to get a good run. Rebased with branch-2 and ran all YARN tests locally. Committing this to branch-2 as well. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6363) Extending SLS: Synthetic Load Generator
[ https://issues.apache.org/jira/browse/YARN-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-6363: --- Attachment: YARN-6363.v10.patch > Extending SLS: Synthetic Load Generator > --- > > Key: YARN-6363 > URL: https://issues.apache.org/jira/browse/YARN-6363 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-6363 overview.pdf, YARN-6363.v0.patch, > YARN-6363.v10.patch, YARN-6363.v1.patch, YARN-6363.v2.patch, > YARN-6363.v3.patch, YARN-6363.v4.patch, YARN-6363.v5.patch, > YARN-6363.v6.patch, YARN-6363.v7.patch, YARN-6363.v9.patch > > > This JIRA tracks the introduction of a synthetic load generator in the SLS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6363) Extending SLS: Synthetic Load Generator
[ https://issues.apache.org/jira/browse/YARN-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968488#comment-15968488 ] Carlo Curino commented on YARN-6363: [~wangda] thanks for the feedback, I agree with much of it. # I introduced a compatibility layer that allows running the old command-line format (I didn't think SLS was used widely enough to matter, but still worth it); tested only via unit tests for now, I will do local runs tomorrow. # I fixed --nodes; in the edit I had mistakenly removed an else branch for it. # Added test coverage for --nodes and the old command-line format. # The issue you observed might be due to very slow address resolution (I mentioned this to you offline); I will look into it tomorrow. If you have time you can check the "compatibility" changes and see if they make sense to you (and the nodes support) (patch v10). I will make sure things work from the command line and check the issue you found tomorrow. > Extending SLS: Synthetic Load Generator > --- > > Key: YARN-6363 > URL: https://issues.apache.org/jira/browse/YARN-6363 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-6363 overview.pdf, YARN-6363.v0.patch, > YARN-6363.v10.patch, YARN-6363.v1.patch, YARN-6363.v2.patch, > YARN-6363.v3.patch, YARN-6363.v4.patch, YARN-6363.v5.patch, > YARN-6363.v6.patch, YARN-6363.v7.patch, YARN-6363.v9.patch > > > This JIRA tracks the introduction of a synthetic load generator in the SLS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968468#comment-15968468 ] Jinjiang Ling commented on YARN-6478: - Hi [~Naganarasimha], I have attached a patch. Can you take a look? > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > Labels: newbie > Attachments: YARN-6478-0.patch > > > Found a spelling mistake in FileSystemTimelineWriter.java: > "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3666) Federation Intercepting and propagating AM-RM communications
[ https://issues.apache.org/jira/browse/YARN-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-3666: --- Attachment: YARN-3666-YARN-2915.v5.patch > Federation Intercepting and propagating AM-RM communications > > > Key: YARN-3666 > URL: https://issues.apache.org/jira/browse/YARN-3666 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Kishore Chaliparambil >Assignee: Botong Huang > Attachments: YARN-3666-YARN-2915.v1.patch, > YARN-3666-YARN-2915.v2.patch, YARN-3666-YARN-2915.v3.patch, > YARN-3666-YARN-2915.v4.patch, YARN-3666-YARN-2915.v5.patch > > > In order, to support transparent "spanning" of jobs across sub-clusters, all > AM-RM communications are proxied (via YARN-2884). > This JIRA tracks federation-specific mechanisms that decide how to > "split/broadcast" requests to the RMs and "merge" answers to > the AM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-6378) Negative usedResources memory in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-6378: --- Comment: was deleted (was: From what I can tell, there's an app : {{application_1487222361993_12379}} which was moved first from interactive to the production queue, and then from production queue to the etl queue . This was a massive application so I'm not sure if the discrepancy in accounting is an artifact of the application being moved twice of it being a massive app and some race condition being triggered. Or if this application's shenanigans were at all involved ;-)) > Negative usedResources memory in CapacityScheduler > -- > > Key: YARN-6378 > URL: https://issues.apache.org/jira/browse/YARN-6378 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > Courtesy Thomas Nystrand, we found that on one of our clusters configured > with the CapacityScheduler, usedResources occasionally becomes negative. > e.g. > {code} > 2017-03-15 11:10:09,449 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > assignedContainer application attempt=appattempt_1487222361993_17177_01 > container=Container: [ContainerId: container_1487222361993_17177_01_14, > NodeId: :27249, NodeHttpAddress: :8042, Resource: > , Priority: 2, Token: null, ] queue=: > capacity=0.2, absoluteCapacity=0.2, usedResources=, > usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, > numContainers=3 clusterResource= type=RACK_LOCAL > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6378) Negative usedResources memory in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968398#comment-15968398 ] Ravi Prakash commented on YARN-6378: From what I can tell, there's an app, {{application_1487222361993_12379}}, which was first moved from the interactive queue to the production queue, and then from the production queue to the etl queue. This was a massive application, so I'm not sure whether the discrepancy in accounting is an artifact of the application being moved twice, of it being a massive app, and some race condition being triggered. Or whether this application's shenanigans were at all involved ;-) > Negative usedResources memory in CapacityScheduler > -- > > Key: YARN-6378 > URL: https://issues.apache.org/jira/browse/YARN-6378 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > Courtesy Thomas Nystrand, we found that on one of our clusters configured > with the CapacityScheduler, usedResources occasionally becomes negative. > e.g. > {code} > 2017-03-15 11:10:09,449 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > assignedContainer application attempt=appattempt_1487222361993_17177_01 > container=Container: [ContainerId: container_1487222361993_17177_01_14, > NodeId: :27249, NodeHttpAddress: :8042, Resource: > , Priority: 2, Token: null, ] queue=: > capacity=0.2, absoluteCapacity=0.2, usedResources=, > usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, > numContainers=3 clusterResource= type=RACK_LOCAL > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-6378) Negative usedResources memory in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-6378: --- Comment: was deleted (was: I downloaded the RM logs (thanks again DP team) on dogfood. The RM for firstdata was restarted on 02-16. The first time since then that there are negative resources was on 03-01. {code} 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.etl stats: etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.011363636, absoluteUsedCapacity=0.0022727272, numApps=1, numContainers=1 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1487222361993_12379_01 released container container_1487222361993_12379_01_61 on node: host: 203-35.as1.altiscale.com:26469 #containers=9 available= used= with event: RELEASED 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed... 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1487222361993_12379_01_68 Container Transitioned from RUNNING to RELEASED 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1487222361993_12379_01_68 in state: RELEASED event:RELEASED 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1487222361993_12379_01_68 of capacity on host 203-03.as1.altiscale.com:27249, which currently has 7 containers, used and available, release resources=true 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: etl used= numContainers=0 user=vijayasarathyparanthaman user-resources= 2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1487222361993_12379_01_68, NodeId: 203-03.as1.altiscale.com:27249, NodeHttpAddress: 203-03.as1.altiscale.com:8042, Resource: , Priority: 2, Token: Token { kind: ContainerToken, service: 10.247.57.232:27249 }, ] queue=etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster={code} At 12:53, usedResources are 0,0 on etl {code} 2017-03-01 12:53:17,934 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1487222361993_12294_01_01, NodeId: 202-33.as1.altiscale.com:33675, NodeHttpAddress: 202-33.as1.altiscale.com:8042, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: 10.247.57.237:33675 }, ] queue=etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster= 2017-03-01 12:53:17,934 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.etl stats: etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 {code} Something happens between 12:53 and 13:35. Going to investigate.) 
> Negative usedResources memory in CapacityScheduler > -- > > Key: YARN-6378 > URL: https://issues.apache.org/jira/browse/YARN-6378 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > Courtesy Thomas Nystrand, we found that on one of our clusters configured > with the CapacityScheduler, usedResources occasionally becomes negative. > e.g. > {code} > 2017-03-15 11:10:09,449 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > assignedContainer application attempt=appattempt_1487222361993_17177_01 > container=Container: [ContainerId: container_1487222361993_17177_01_14, > NodeId: :27249, NodeHttpAddress: :8042, Resource: > , Priority: 2, Token: null, ] queue=: > capacity=0.2, absoluteCapacity=0.2, usedResources=, > usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, > numContainers=3 clusterResource= type=RACK_LOCAL > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-3663) Federation State and Policy Store (DBMS implementation)
[ https://issues.apache.org/jira/browse/YARN-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968218#comment-15968218 ] Carlo Curino edited comment on YARN-3663 at 4/13/17 11:17 PM: --
Thanks [~giovanni.fumarola] for the updated patch.
Conf:
# You set {{YarnConfiguration.DEFAULT_FEDERATION_STATESTORE_SQL_MAXCONNECTIONS}} to 1. Isn't this too small? What values do you commonly use? I would pick a conservative but not overly tight value, otherwise users are forced to learn/modify more params.
# We should omit {{yarn.federation.state-store.sql.max-connections}} from yarn-default.xml, like you do for all other params in this patch.
In {{SQLFederationStateStore}}:
# For most fields except {{CALL_SP_GET_ALL_POLICY_CONFIGURATIONS}} you simply use the plural to differentiate the getter of one item from the getter of all items. Be consistent (e.g., remove the ALL here, or add it everywhere else).
# Minor: userNameDB --> userName, passwordDB --> password.
# When you throw exceptions (e.g., on subcluster registration), it would be nice to include in the message the sub-cluster ID / IP or any other info one can use to debug.
# Can you comment on why we are using {{FederationStateStoreErrorCode}}? The codes don't seem to be connected to SQL error codes, and they are not used anywhere else (we normally use named exceptions, which are easier to understand/track).
# At lines 277-278: formatting.
# We should try to remove redundancy, e.g., you have lots of things that look like this:
{code}
try {
  FederationMembershipStateStoreInputValidator
      .validateGetSubClusterInfoRequest(subClusterRequest);
} catch (FederationStateStoreInvalidInputException e) {
  FederationStateStoreUtils.logAndThrowInvalidInputException(LOG,
      "Unable to obtain the infomation about the SubCluster. " + e.getMessage());
}
{code}
They could be factored out to {{FederationMembershipStateStoreInputValidator.validate(subClusterRequest)}}, where the type of the input param is used to differentiate the method and the logAndThrowInvalidInputException is done on that side. Same goes for {{checkSubClusterInfo}}.
# Similarly to the above, we should try to factor out the very repetitive code that creates connections/statements, sets params, runs, and throws. I don't have specific advice on this, but the code is mostly copy and paste, which we should avoid.
# Move {{fromStringToSubClusterState}} to the SubClusterState class (and call it {{fromString()}}).
# Why do {{getSubCluster}} and {{getSubClusters}} use different mechanics for return values (registered params vs ResultSet)? It might be worth being consistent (probably using ResultSet).
# Line 540: is this behavior (overwrite existing) consistent with general YARN? (I think so, but want to check.)
# Some of the logs are a bit vague, e.g. {{LOG.debug("Got the information about the specified application}}; say specifically what info was gotten.
# If you use {{LOG.debug}}, consider prefixing it with a check that we are in debug mode (saves time/object creation for Strings that are then not used).
# You have several {{ if (cstmt.getInt(2) != 1) }} ROWCOUNT checks. This mixes the "no tuples were changed" case with the "multiple tuples were changed" case. Distinguishing the two might help debugging (do we have duplicates in the DB, or was the entry not found?). (Not mandatory, just something to consider.)
# {{setPolicyConfiguration}}: you are doing {{cstmt.setBytes(3, new byte[policyConf.getParams().remaining()]);}}, which adds an empty byte[] instead of what came in as input.
# {{getCurrentVersion}} and {{loadVersion}} should throw a NotSupportedException or something of the sort; a silent return null easily confuses people. (I know the full version handling will be in V2; let's just have a clear breakage if someone tries to use these methods.)
( to be continued ...) ( continued...)
In {{FederationStateStoreInputValidator}}:
# Please consider renaming the validate methods (ok to have a separate JIRA for this).
In {{FederationStateStoreUtils}}:
# You log at info and debug level inconsistently for the {{set*}} methods you added. I would suggest debug for all.
In {{HSQLDBFederationStateStore}}:
# The empty constructor with super() is redundant; just omit the constructor altogether.
# I think the code would be more readable if all the schema was in a well-formatted hsqldb-federation-schema.sql (or broken down like the SQLServer one) and the code read the statements from the file and executed them.
# The use of {{SELECT *}} is kind of dangerous, because it hides field renames/moves and other schema evolution issues that might lead to hard-to-debug problems; I would always use explicitly named fields.
# Shouldn't {{SP_SUBCLUSTERHEARTBEAT}} update the {{lastHeartBeat}} field? More concerning, why do the tests miss this? If I am correct,
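To make the de-duplication suggestion above concrete, here is a minimal sketch of the overloaded {{validate()}} idea (my own illustration, not code from the patch: the request and exception types below are small stand-ins for the classes named in the review, and the slf4j logging is an assumption).
{code}
// Sketch only -- not the patch. Stub request/exception types stand in for the
// real ones named in the review; only the overload + log-and-throw structure
// matters here.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class FederationStateStoreInvalidInputException extends Exception {
  FederationStateStoreInvalidInputException(String m) { super(m); }
}

interface GetSubClusterInfoRequest { String getSubClusterId(); }

interface SubClusterRegisterRequest { Object getSubClusterInfo(); }

public final class FederationMembershipStateStoreInputValidator {

  private static final Logger LOG =
      LoggerFactory.getLogger(FederationMembershipStateStoreInputValidator.class);

  private FederationMembershipStateStoreInputValidator() {
  }

  // Overloads are differentiated by the parameter type, so every caller is a
  // single validate(request) line instead of a try/catch block.
  public static void validate(GetSubClusterInfoRequest request)
      throws FederationStateStoreInvalidInputException {
    if (request == null || request.getSubClusterId() == null) {
      logAndThrow("Missing SubClusterId in GetSubClusterInfoRequest.");
    }
  }

  public static void validate(SubClusterRegisterRequest request)
      throws FederationStateStoreInvalidInputException {
    if (request == null || request.getSubClusterInfo() == null) {
      logAndThrow("Missing SubClusterInfo in SubClusterRegisterRequest.");
    }
  }

  // Log-and-throw lives here once, instead of in every caller.
  private static void logAndThrow(String message)
      throws FederationStateStoreInvalidInputException {
    LOG.warn(message);
    throw new FederationStateStoreInvalidInputException(message);
  }
}
{code}
Callers then shrink to a single {{FederationMembershipStateStoreInputValidator.validate(subClusterRequest);}} line, and the {{LOG.debug}} point above can likewise be wrapped in an {{if (LOG.isDebugEnabled())}} check so the message String is only built when debug logging is enabled.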
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968368#comment-15968368 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 14m 18s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15635/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6378) Negative usedResources memory in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968360#comment-15968360 ] Ravi Prakash commented on YARN-6378: I downloaded the RM logs (thanks again DP team) on dogfood. The RM for firstdata was restarted on 02-16. The first time since then that negative resources appear is on 03-01.
{code}
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.etl stats: etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.011363636, absoluteUsedCapacity=0.0022727272, numApps=1, numContainers=1
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1487222361993_12379_01 released container container_1487222361993_12379_01_61 on node: host: 203-35.as1.altiscale.com:26469 #containers=9 available= used= with event: RELEASED
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1487222361993_12379_01_68 Container Transitioned from RUNNING to RELEASED
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1487222361993_12379_01_68 in state: RELEASED event:RELEASED
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1487222361993_12379_01_68 of capacity on host 203-03.as1.altiscale.com:27249, which currently has 7 containers, used and available, release resources=true
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: etl used= numContainers=0 user=vijayasarathyparanthaman user-resources=
2017-03-01 13:35:20,813 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1487222361993_12379_01_68, NodeId: 203-03.as1.altiscale.com:27249, NodeHttpAddress: 203-03.as1.altiscale.com:8042, Resource: , Priority: 2, Token: Token { kind: ContainerToken, service: 10.247.57.232:27249 }, ] queue=etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=
{code}
At 12:53, usedResources is 0,0 on etl:
{code}
2017-03-01 12:53:17,934 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1487222361993_12294_01_01, NodeId: 202-33.as1.altiscale.com:33675, NodeHttpAddress: 202-33.as1.altiscale.com:8042, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: 10.247.57.237:33675 }, ] queue=etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=
2017-03-01 12:53:17,934 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.etl stats: etl: capacity=0.2, absoluteCapacity=0.2, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
{code}
Something happens between 12:53 and 13:35. Going to investigate.
> Negative usedResources memory in CapacityScheduler > -- > > Key: YARN-6378 > URL: https://issues.apache.org/jira/browse/YARN-6378 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > Courtesy Thomas Nystrand, we found that on one of our clusters configured > with the CapacityScheduler, usedResources occasionally becomes negative. > e.g. > {code} > 2017-03-15 11:10:09,449 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > assignedContainer application attempt=appattempt_1487222361993_17177_01 > container=Container: [ContainerId: container_1487222361993_17177_01_14, > NodeId: :27249, NodeHttpAddress: :8042, Resource: > , Priority: 2, Token: null, ] queue=: > capacity=0.2, absoluteCapacity=0.2, usedResources=, > usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, > numContainers=3 clusterResource= type=RACK_LOCAL > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
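As a purely illustrative aside on the log analysis above (my own sketch, not RM code, and not a confirmed root cause): the arithmetic that produces a negative usedResources value is simply the same container's resources being subtracted more than once, e.g. a release that gets accounted twice.
{code}
// Toy illustration (hypothetical names, not the CapacityScheduler code) of how
// double-subtracting one released container drives a used-resource counter
// below zero.
public class UsedResourcesUnderflowDemo {

  static long usedMemoryMb = 0;

  static void allocate(long mb) { usedMemoryMb += mb; }
  static void release(long mb)  { usedMemoryMb -= mb; }

  public static void main(String[] args) {
    allocate(2048);   // one 2 GB container running: used = 2048
    release(2048);    // container released: used = 0
    release(2048);    // same release accounted again: used = -2048
    System.out.println("usedMemoryMb = " + usedMemoryMb);
  }
}
{code}
Whether anything like that actually happens between 12:53 and 13:35 here is exactly what remains to be confirmed from the logs.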
[jira] [Assigned] (YARN-6378) Negative usedResources memory in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned YARN-6378: -- Assignee: Ravi Prakash > Negative usedResources memory in CapacityScheduler > -- > > Key: YARN-6378 > URL: https://issues.apache.org/jira/browse/YARN-6378 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > > Courtesy Thomas Nystrand, we found that on one of our clusters configured > with the CapacityScheduler, usedResources occasionally becomes negative. > e.g. > {code} > 2017-03-15 11:10:09,449 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: > assignedContainer application attempt=appattempt_1487222361993_17177_01 > container=Container: [ContainerId: container_1487222361993_17177_01_14, > NodeId: :27249, NodeHttpAddress: :8042, Resource: > , Priority: 2, Token: null, ] queue=: > capacity=0.2, absoluteCapacity=0.2, usedResources=, > usedCapacity=0.03409091, absoluteUsedCapacity=0.006818182, numApps=1, > numContainers=3 clusterResource= type=RACK_LOCAL > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5269) Bubble exceptions and errors all the way up the calls, including to clients.
[ https://issues.apache.org/jira/browse/YARN-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968338#comment-15968338 ] Haibo Chen commented on YARN-5269: -- Yeah, especially if we want to provide detailed error information to the client (now or in the future), that needs changes in the TimelineV2Client API. > Bubble exceptions and errors all the way up the calls, including to clients. > > > Key: YARN-5269 > URL: https://issues.apache.org/jira/browse/YARN-5269 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: YARN-5355 > > Currently we ignore (swallow) exception from the HBase side in many cases > (reads and writes). > Also, on the client side, neither TimelineClient#putEntities (the v2 flavor) > nor the #putEntitiesAsync method return any value. > For the second drop we may want to consider how we properly bubble up > exceptions throughout the write and reader call paths and if we want to > return a response in putEntities and some future kind of result for > putEntitiesAsync. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-3663) Federation State and Policy Store (DBMS implementation)
[ https://issues.apache.org/jira/browse/YARN-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968218#comment-15968218 ] Carlo Curino edited comment on YARN-3663 at 4/13/17 10:20 PM: --
Thanks [~giovanni.fumarola] for the updated patch.
Conf:
# You set {{YarnConfiguration.DEFAULT_FEDERATION_STATESTORE_SQL_MAXCONNECTIONS}} to 1. Isn't this too small? What values do you commonly use? I would pick a conservative but not overly tight value, otherwise users are forced to learn/modify more params.
# We should omit {{yarn.federation.state-store.sql.max-connections}} from yarn-default.xml, like you do for all other params in this patch.
In {{SQLFederationStateStore}}:
# For most fields except {{CALL_SP_GET_ALL_POLICY_CONFIGURATIONS}} you simply use the plural to differentiate the getter of one item from the getter of all items. Be consistent (e.g., remove the ALL here, or add it everywhere else).
# Minor: userNameDB --> userName, passwordDB --> password.
# When you throw exceptions (e.g., on subcluster registration), it would be nice to include in the message the sub-cluster ID / IP or any other info one can use to debug.
# Can you comment on why we are using {{FederationStateStoreErrorCode}}? The codes don't seem to be connected to SQL error codes, and they are not used anywhere else (we normally use named exceptions, which are easier to understand/track).
# At lines 277-278: formatting.
# We should try to remove redundancy, e.g., you have lots of things that look like this:
{code}
try {
  FederationMembershipStateStoreInputValidator
      .validateGetSubClusterInfoRequest(subClusterRequest);
} catch (FederationStateStoreInvalidInputException e) {
  FederationStateStoreUtils.logAndThrowInvalidInputException(LOG,
      "Unable to obtain the infomation about the SubCluster. " + e.getMessage());
}
{code}
They could be factored out to {{FederationMembershipStateStoreInputValidator.validate(subClusterRequest)}}, where the type of the input param is used to differentiate the method and the logAndThrowInvalidInputException is done on that side. Same goes for {{checkSubClusterInfo}}.
# Similarly to the above, we should try to factor out the very repetitive code that creates connections/statements, sets params, runs, and throws. I don't have specific advice on this, but the code is mostly copy and paste, which we should avoid.
# Move {{fromStringToSubClusterState}} to the SubClusterState class (and call it {{fromString()}}).
# Why do {{getSubCluster}} and {{getSubClusters}} use different mechanics for return values (registered params vs ResultSet)? It might be worth being consistent (probably using ResultSet).
# Line 540: is this behavior (overwrite existing) consistent with general YARN? (I think so, but want to check.)
# Some of the logs are a bit vague, e.g. {{LOG.debug("Got the information about the specified application}}; say specifically what info was gotten.
# If you use {{LOG.debug}}, consider prefixing it with a check that we are in debug mode (saves time/object creation for Strings that are then not used).
# You have several {{ if (cstmt.getInt(2) != 1) }} ROWCOUNT checks. This mixes the "no tuples were changed" case with the "multiple tuples were changed" case. Distinguishing the two might help debugging (do we have duplicates in the DB, or was the entry not found?). (Not mandatory, just something to consider.)
# {{setPolicyConfiguration}}: you are doing {{cstmt.setBytes(3, new byte[policyConf.getParams().remaining()]);}}, which adds an empty byte[] instead of what came in as input.
# {{getCurrentVersion}} and {{loadVersion}} should throw a NotSupportedException or something of the sort; a silent return null easily confuses people. (I know the full version handling will be in V2; let's just have a clear breakage if someone tries to use these methods.)
( to be continued ...) ( continued...)
In {{FederationStateStoreInputValidator}}:
# Please consider renaming the validate methods (ok to have a separate JIRA for this).
In {{FederationStateStoreUtils}}:
# You log at info and debug level inconsistently for the {{set*}} methods you added. I would suggest debug for all.
In {{HSQLDBFederationStateStore}}:
# The empty constructor with super() is redundant; just omit the constructor altogether.
# I think the code would be more readable if all the schema was in a well-formatted hsqldb-federation-schema.sql (or broken down like the SQLServer one) and the code read the statements from the file and executed them.
# The use of {{SELECT *}} is kind of dangerous, because it hides field renames/moves and other schema evolution issues that might lead to hard-to-debug problems; I would always use explicitly named fields.
# I am not sure this is kosher: {{SELECT applicationId_IN, homeSubCluster_IN}} at line 133. You are not actually returning the homeSubCluste
[jira] [Commented] (YARN-6480) Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart
[ https://issues.apache.org/jira/browse/YARN-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968306#comment-15968306 ] Hadoop QA commented on YARN-6480: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 32s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 21s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMAdminService | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612578f | | JIRA Issue | YARN-6480 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863359/YARN-6480.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux b06c3af0e0c6 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0cab572 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/15625/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15625/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15625/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart > - > > Key: YARN-6480 >
[jira] [Commented] (YARN-5269) Bubble exceptions and errors all the way up the calls, including to clients.
[ https://issues.apache.org/jira/browse/YARN-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968301#comment-15968301 ] Joep Rottinghuis commented on YARN-5269: If we do want to change the API, then we should consider doing it in this drop (alpha-1). A bit of a catch-22. > Bubble exceptions and errors all the way up the calls, including to clients. > > > Key: YARN-5269 > URL: https://issues.apache.org/jira/browse/YARN-5269 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: YARN-5355 > > Currently we ignore (swallow) exception from the HBase side in many cases > (reads and writes). > Also, on the client side, neither TimelineClient#putEntities (the v2 flavor) > nor the #putEntitiesAsync method return any value. > For the second drop we may want to consider how we properly bubble up > exceptions throughout the write and reader call paths and if we want to > return a response in putEntities and some future kind of result for > putEntitiesAsync. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
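To make the API discussion concrete, one possible shape for bubbling errors up to callers is sketched below (hypothetical only: apart from {{putEntities}} / {{putEntitiesAsync}}, which the description mentions, none of these names are the real TimelineV2Client API).
{code}
// Hypothetical sketch of a client API that returns write results instead of
// swallowing them. All type names other than putEntities/putEntitiesAsync are
// made up for illustration.
import java.util.concurrent.CompletableFuture;

class TimelineEntity { /* stand-in for the real timeline entity record */ }

interface TimelineWriteResponse {
  boolean isSuccess();
  String getDiagnostics();   // e.g. an HBase-side error bubbled up to the caller
}

interface TimelineV2ClientSketch {
  // Synchronous put: errors come back in the response (or as an exception)
  // rather than being logged and dropped inside the client.
  TimelineWriteResponse putEntities(TimelineEntity... entities) throws Exception;

  // Asynchronous put: a future lets callers react to failures on completion.
  CompletableFuture<TimelineWriteResponse> putEntitiesAsync(TimelineEntity... entities);
}
{code}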
[jira] [Commented] (YARN-6455) Enhance the timelinewriter.flush() race condition fix in YARN-6382
[ https://issues.apache.org/jira/browse/YARN-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968292#comment-15968292 ] Joep Rottinghuis commented on YARN-6455: +1 looks good to me. Thanks for the patch [~haibochen] > Enhance the timelinewriter.flush() race condition fix in YARN-6382 > -- > > Key: YARN-6455 > URL: https://issues.apache.org/jira/browse/YARN-6455 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0-alpha2 >Reporter: Haibo Chen >Assignee: Haibo Chen > Attachments: YARN-6455.00.patch > > > YARN-6376 fixes timelinewriter.flush() race condition among concurrent > putEntities() calls and periodical flush by TimelineCollectorManager by > synchronizing on the writer object. > Synchronizing on the writer is still a little brittle there, because there is > a getWriter method which lets callers access the writer without synchronizing > on it. AppLevelTimelineCollector#AppLevelAggregator#agregate() does this in > line 152: getWriter().write(...) In this case it doesn't flush, but if that > were to be added, that would re-introduce the race fixed in YARN-6376. > Instead of exposing the writer, perhaps it would be better to have the > sub-classes call #putEntities instead. It defers to the private > writeTimelineEntities which does the same work to get the context: > TimelineCollectorContext context = getTimelineEntityContext(); -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
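A minimal sketch of the pattern under review (my own simplified illustration of the idea in the description, not the actual patch; all names are placeholders): every write and flush synchronizes on the single writer object, and sub-classes go through putEntities instead of touching the writer directly, so an aggregation-path write or flush cannot race with the periodic one.
{code}
// Simplified illustration of "synchronize on the writer and don't expose it".
// Placeholder classes, not the YARN timeline service code.
class TimelineWriterSketch {
  void write(String entity) { /* buffer the entity */ }
  void flush()              { /* push buffered entities to the backend */ }
}

class TimelineCollectorSketch {
  private final TimelineWriterSketch writer = new TimelineWriterSketch();

  // All writes go through here; callers (including aggregation sub-classes)
  // never see the writer, so they cannot write or flush outside the lock.
  public void putEntities(String entity) {
    synchronized (writer) {
      writer.write(entity);
    }
  }

  // The periodic flush takes the same lock, so it cannot interleave with a write.
  public void periodicFlush() {
    synchronized (writer) {
      writer.flush();
    }
  }
}
{code}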
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968291#comment-15968291 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 14m 31s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15634/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3666) Federation Intercepting and propagating AM-RM communications
[ https://issues.apache.org/jira/browse/YARN-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968274#comment-15968274 ] Hadoop QA commented on YARN-3666: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 54s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 30 unchanged - 1 fixed = 30 total (was 31) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 50s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 67 new + 0 unchanged - 0 fixed = 67 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 59s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | | org.apache.hadoop.yarn.proto.YarnServerNodemanagerRecoveryProtos$ContainerManagerApplicationProto.PARSER isn't final but should be At YarnServerNodemanagerRecoveryProtos.java:be At YarnServerNodemanagerRecoveryProtos.java:[line 229] | | | Class org.apache.hadoop.yarn.proto.YarnServerNodemanagerRecoveryProtos$ContainerManagerApplicationProto defines non-transient non-serializable instance field unknownFields In YarnServerNodemanagerRecoveryProtos.java:instance field unknownFields In YarnServerNodemanagerRecoveryProtos.java | | | org.apache.hadoop.yarn.proto.YarnServerNodemanagerRecoveryProtos$DeletionServiceDeleteTaskProto.PARSER isn't final but should be At YarnServerNodemanagerRecoveryProtos.java:be At YarnServerNodemanagerRecoveryProtos.java:[line 1690] | | | Class org.apache.hadoop.yarn.proto.YarnServerNodemanagerRecoveryProtos$DeletionServiceDeleteTaskProto defines non-transient non-serializable instance field unknownFields In YarnServerNodemanagerRecoveryProtos.java:instance field unknownFields In YarnServer
[jira] [Created] (YARN-6481) Yarn top shows negative container number in FS
Yufei Gu created YARN-6481: -- Summary: Yarn top shows negative container number in FS Key: YARN-6481 URL: https://issues.apache.org/jira/browse/YARN-6481 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.9.0 Reporter: Yufei Gu yarn top shows negative container numbers, and they didn't change even when they were supposed to.
{code}
NodeManager(s): 2 total, 2 active, 0 unhealthy, 0 decommissioned, 0 lost, 0 rebooted
Queue(s) Applications: 0 running, 12 submitted, 0 pending, 12 completed, 0 killed, 0 failed
Queue(s) Mem(GB): 0 available, 0 allocated, 0 pending, 0 reserved
Queue(s) VCores: 0 available, 0 allocated, 0 pending, 0 reserved
Queue(s) Containers: -2 allocated, -2 pending, -2 reserved
APPLICATIONID USER TYPE QUEUE #CONT #RCONT VCORES RVC
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6396) Call verifyAndCreateRemoteLogDir at service initialization instead of application initialization to decrease load for name node
[ https://issues.apache.org/jira/browse/YARN-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968239#comment-15968239 ] Robert Kanter commented on YARN-6396: - Won't this be a problem if someone deletes the remote log dir sometime after starting the NM? > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for name node > --- > > Key: YARN-6396 > URL: https://issues.apache.org/jira/browse/YARN-6396 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Affects Versions: 3.0.0-alpha2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Attachments: YARN-6396.000.patch > > > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for name node. > Currently for every application at each Node, verifyAndCreateRemoteLogDir > will be called before doing log aggregation, This will be a non trivial > overhead for name node in a large cluster since verifyAndCreateRemoteLogDir > calls getFileStatus. Once the remote log directory is created successfully, > it is not necessary to call it again. It will be better to call > verifyAndCreateRemoteLogDir at LogAggregationService service initialization. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
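A rough sketch of the proposal, and of where the question above bites (illustrative only, placeholder names, not the attached patch):
{code}
// Illustration of moving the remote-log-dir check from per-application init to
// service init. Placeholder names; not the actual LogAggregationService code.
class LogAggregationServiceSketch {

  protected void serviceInit() throws Exception {
    // Proposal: one getFileStatus/mkdirs round trip to the NameNode at NM
    // startup, instead of one per application.
    verifyAndCreateRemoteLogDir();
  }

  void initApp(String appId) {
    // No remote-dir check here any more. This is where the question above
    // matters: if the remote dir is deleted after the NM starts, nothing on
    // this path recreates it unless a re-check (e.g. on aggregation failure)
    // is added.
    startAggregationFor(appId);
  }

  private void verifyAndCreateRemoteLogDir() { /* FS getFileStatus + mkdirs */ }

  private void startAggregationFor(String appId) { /* ... */ }
}
{code}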
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968236#comment-15968236 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15633/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968237#comment-15968237 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15631/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968235#comment-15968235 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15632/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968231#comment-15968231 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 1s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15629/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968229#comment-15968229 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 3s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15627/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3666) Federation Intercepting and propagating AM-RM communications
[ https://issues.apache.org/jira/browse/YARN-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-3666: --- Attachment: YARN-3666-YARN-2915.v4.patch > Federation Intercepting and propagating AM-RM communications > > > Key: YARN-3666 > URL: https://issues.apache.org/jira/browse/YARN-3666 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Kishore Chaliparambil >Assignee: Botong Huang > Attachments: YARN-3666-YARN-2915.v1.patch, > YARN-3666-YARN-2915.v2.patch, YARN-3666-YARN-2915.v3.patch, > YARN-3666-YARN-2915.v4.patch > > > In order, to support transparent "spanning" of jobs across sub-clusters, all > AM-RM communications are proxied (via YARN-2884). > This JIRA tracks federation-specific mechanisms that decide how to > "split/broadcast" requests to the RMs and "merge" answers to > the AM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
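For readers new to this sub-task, a toy sketch of the split/broadcast/merge pattern described above (purely illustrative; all names are made up, and none of the real AMRMProxy/interceptor types or federation policies are shown):
{code}
// Toy illustration of "split an AM request across sub-cluster RMs, broadcast
// it, and merge the answers back into one response for the AM".
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FederationInterceptorSketch {

  interface SubClusterRm {
    List<String> allocate(List<String> asks);   // stand-in for the AM-RM allocate call
  }

  // Each sub-cluster RM gets its slice of the asks; the responses are merged
  // into the single view the application master sees.
  List<String> allocate(Map<SubClusterRm, List<String>> asksPerSubCluster) {
    List<String> merged = new ArrayList<>();
    for (Map.Entry<SubClusterRm, List<String>> e : asksPerSubCluster.entrySet()) {
      merged.addAll(e.getKey().allocate(e.getValue()));
    }
    return merged;
  }
}
{code}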
[jira] [Commented] (YARN-3663) Federation State and Policy Store (DBMS implementation)
[ https://issues.apache.org/jira/browse/YARN-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968218#comment-15968218 ] Carlo Curino commented on YARN-3663:
Thanks [~giovanni.fumarola] for the updated patch.
Conf:
# You set {{YarnConfiguration.DEFAULT_FEDERATION_STATESTORE_SQL_MAXCONNECTIONS}} to 1. Isn't this too small? What values do you commonly use? I would pick a conservative but not overly tight value, otherwise users are forced to learn/modify more params.
# We should omit {{yarn.federation.state-store.sql.max-connections}} from yarn-default.xml, like you do for all other params in this patch.
In {{SQLFederationStateStore}}:
# For most fields except {{CALL_SP_GET_ALL_POLICY_CONFIGURATIONS}} you simply use the plural to differentiate the getter of one item from the getter of all items. Be consistent (e.g., remove the ALL here, or add it everywhere else).
# Minor: userNameDB --> userName, passwordDB --> password.
# When you throw exceptions (e.g., on subcluster registration), it would be nice to include in the message the sub-cluster ID / IP or any other info one can use to debug.
# Can you comment on why we are using {{FederationStateStoreErrorCode}}? The codes don't seem to be connected to SQL error codes, and they are not used anywhere else (we normally use named exceptions, which are easier to understand/track).
# At lines 277-278: formatting.
# We should try to remove redundancy, e.g., you have lots of things that look like this:
{code}
try {
  FederationMembershipStateStoreInputValidator
      .validateGetSubClusterInfoRequest(subClusterRequest);
} catch (FederationStateStoreInvalidInputException e) {
  FederationStateStoreUtils.logAndThrowInvalidInputException(LOG,
      "Unable to obtain the infomation about the SubCluster. " + e.getMessage());
}
{code}
They could be factored out to {{FederationMembershipStateStoreInputValidator.validate(subClusterRequest)}}, where the type of the input param is used to differentiate the method and the logAndThrowInvalidInputException is done on that side. Same goes for {{checkSubClusterInfo}}.
# Similarly to the above, we should try to factor out the very repetitive code that creates connections/statements, sets params, runs, and throws. I don't have specific advice on this, but the code is mostly copy and paste, which we should avoid.
# Move {{fromStringToSubClusterState}} to the SubClusterState class (and call it {{fromString()}}).
# Why do {{getSubCluster}} and {{getSubClusters}} use different mechanics for return values (registered params vs ResultSet)? It might be worth being consistent (probably using ResultSet).
# Line 540: is this behavior (overwrite existing) consistent with general YARN? (I think so, but want to check.)
# Some of the logs are a bit vague, e.g. {{LOG.debug("Got the information about the specified application}}; say specifically what info was gotten.
# If you use {{LOG.debug}}, consider prefixing it with a check that we are in debug mode (saves time/object creation for Strings that are then not used).
# You have several {{ if (cstmt.getInt(2) != 1) }} ROWCOUNT checks. This mixes the "no tuples were changed" case with the "multiple tuples were changed" case. Distinguishing the two might help debugging (do we have duplicates in the DB, or was the entry not found?). (Not mandatory, just something to consider.)
# {{setPolicyConfiguration}}: you are doing {{cstmt.setBytes(3, new byte[policyConf.getParams().remaining()]);}}, which adds an empty byte[] instead of what came in as input.
# {{getCurrentVersion}} and {{loadVersion}} should throw a NotSupportedException or something of the sort; a silent return null easily confuses people. (I know the full version handling will be in V2; let's just have a clear breakage if someone tries to use these methods.)
( to be continued ...)
> Federation State and Policy Store (DBMS implementation) > --- > > Key: YARN-3663 > URL: https://issues.apache.org/jira/browse/YARN-3663 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-2915 >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola > Attachments: YARN-3663-YARN-2915.v1.patch, > YARN-3663-YARN-2915.v2.patch > > > This JIRA tracks a SQL-based implementation of the Federation State and > Policy Store, which implements YARN-3662 APIs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968210#comment-15968210 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15626/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968212#comment-15968212 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15628/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5331) Extend RLESparseResourceAllocation with period for supporting recurring reservations in YARN ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968207#comment-15968207 ] Sean Po commented on YARN-5331: --- Thanks for the patch [~ajsangeetha], it looks good to me. +1 > Extend RLESparseResourceAllocation with period for supporting recurring > reservations in YARN ReservationSystem > -- > > Key: YARN-5331 > URL: https://issues.apache.org/jira/browse/YARN-5331 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Subru Krishnan >Assignee: Sangeetha Abdu Jyothi > Labels: oct16-medium > Attachments: YARN-5331.001.patch, YARN-5331.002.patch, > YARN-5331.003.patch, YARN-5331.004.patch, YARN-5331.005.patch, > YARN-5331.006.patch, YARN-5331.007.patch, YARN-5331.008.patch > > > YARN-5326 proposes adding native support for recurring reservations in the > YARN ReservationSystem. This JIRA is a sub-task to add a > PeriodicRLESparseResourceAllocation. Please refer to the design doc in the > parent JIRA for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968206#comment-15968206 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 26s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15624/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6480) Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart
[ https://issues.apache.org/jira/browse/YARN-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-6480: -- Attachment: YARN-6480.001.patch Updating timeout to 60s > Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart > - > > Key: YARN-6480 > URL: https://issues.apache.org/jira/browse/YARN-6480 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: YARN-6480.001.patch > > > Timeout is set to 20 seconds, but the test runs regularly at 15 seconds on my > machine. Any load and it could timeout. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
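For context, a small illustration of the kind of one-line change the attachment describes; the real method lives in TestAMRestart and its body is omitted/hypothetical here. JUnit 4 expresses the per-test timeout in milliseconds, so 60s becomes 60000.
{code}
import org.junit.Test;

// Illustration only: sketch of bumping the JUnit timeout on the flaky test.
// The real method is TestAMRestart#testPreemptedAMRestartOnRMRestart; its body
// is omitted/hypothetical here.
public class TestAMRestartTimeoutSketch {

  @Test(timeout = 60000) // was timeout = 20000 (20s), too tight under load
  public void testPreemptedAMRestartOnRMRestart() throws Exception {
    // ... actual test body unchanged ...
  }
}
{code}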
[jira] [Commented] (YARN-6075) Yarn top for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968191#comment-15968191 ] Yufei Gu commented on YARN-6075: Yarn top works well for FS as I tested. So close this. > Yarn top for FairScheduler > -- > > Key: YARN-6075 > URL: https://issues.apache.org/jira/browse/YARN-6075 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Prabhu Joseph >Assignee: Yufei Gu > Attachments: Yarn_Top_FairScheduler.png > > > Yarn top output for FairScheduler shows empty values. (attached output) We > need to handle yarn top with FairScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6075) Yarn top for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu resolved YARN-6075. Resolution: Won't Fix > Yarn top for FairScheduler > -- > > Key: YARN-6075 > URL: https://issues.apache.org/jira/browse/YARN-6075 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Prabhu Joseph >Assignee: Yufei Gu > Attachments: Yarn_Top_FairScheduler.png > > > Yarn top output for FairScheduler shows empty values. (attached output) We > need to handle yarn top with FairScheduler. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968188#comment-15968188 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 21m 12s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15622/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968176#comment-15968176 ] Hadoop QA commented on YARN-6216: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 50s{color} | {color:red} Docker failed to build yetus/hadoop:b59b8b7. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6216 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863355/YARN-6216-branch-2.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15623/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6216) Unify Container Resizing code paths with Container Updates making it scheduler agnostic
[ https://issues.apache.org/jira/browse/YARN-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6216: -- Attachment: YARN-6216-branch-2.001.patch Now that YARN-6040 is committed, would like this to get into branch-2. Uploading branch-2 patch > Unify Container Resizing code paths with Container Updates making it > scheduler agnostic > --- > > Key: YARN-6216 > URL: https://issues.apache.org/jira/browse/YARN-6216 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, fairscheduler, resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 3.0.0-alpha3 > > Attachments: YARN-6216.001.patch, YARN-6216.002.patch, > YARN-6216.003.patch, YARN-6216-branch-2.001.patch > > > YARN-5959 introduced an {{ContainerUpdateContext}} which can be used to > update the ExecutionType of a container in a scheduler agnostic manner. As > mentioned in that JIRA, extending that to encompass Container resizing is > trivial. > This JIRA proposes to remove all the CapacityScheduler specific code paths. > (CapacityScheduler, CSQueue, FicaSchedulerApp etc.) and modify the code to > use the framework introduced in YARN-5959 without any loss of functionality. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5617) AMs only intended to run one attempt can be run more than once
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968158#comment-15968158 ] Hadoop QA commented on YARN-5617: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 171 unchanged - 6 fixed = 171 total (was 177) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 38m 39s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 0s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:612578f | | JIRA Issue | YARN-5617 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863341/YARN-5617.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 830e91818f25 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0cab572 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15621/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15621/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > AMs only intended to run one attempt can be run more than once > -- > > Key: YARN-5617 > URL: https://issues.apache.org/jira/browse/YARN-5617 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versi
[jira] [Comment Edited] (YARN-6396) Call verifyAndCreateRemoteLogDir at service initialization instead of application initialization to decrease load for name node
[ https://issues.apache.org/jira/browse/YARN-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961891#comment-15961891 ] zhihai xu edited comment on YARN-6396 at 4/13/17 7:47 PM: -- Thanks for the review [~haibochen], [~jianhe], [~rkanter], [~xgong] Could you also help review the patch? thanks was (Author: zxu): Thanks for the review [~haibochen], [~rkanter], [~xgong] Could you also help review the patch? thanks > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for name node > --- > > Key: YARN-6396 > URL: https://issues.apache.org/jira/browse/YARN-6396 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Affects Versions: 3.0.0-alpha2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Attachments: YARN-6396.000.patch > > > Call verifyAndCreateRemoteLogDir at service initialization instead of > application initialization to decrease load for name node. > Currently for every application at each Node, verifyAndCreateRemoteLogDir > will be called before doing log aggregation, This will be a non trivial > overhead for name node in a large cluster since verifyAndCreateRemoteLogDir > calls getFileStatus. Once the remote log directory is created successfully, > it is not necessary to call it again. It will be better to call > verifyAndCreateRemoteLogDir at LogAggregationService service initialization. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6480) Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart
Eric Badger created YARN-6480: - Summary: Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart Key: YARN-6480 URL: https://issues.apache.org/jira/browse/YARN-6480 Project: Hadoop YARN Issue Type: Bug Reporter: Eric Badger Assignee: Eric Badger Timeout is set to 20 seconds, but the test runs regularly at 15 seconds on my machine. Any load and it could timeout. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6363) Extending SLS: Synthetic Load Generator
[ https://issues.apache.org/jira/browse/YARN-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968110#comment-15968110 ] Wangda Tan commented on YARN-6363: -- Thanks [~curino]. For the patch, my biggest concern is compatibility: - This patch removed several options; if you plan to commit this patch to trunk (hadoop-3) only, that is fine, but it will be problematic if you want to commit it to branch-2. Do you think we should make it work in a compatible way? Other comments: - It looks like {{--nodes}} is not honored by the new patch; SLSRunner#nodeFile is not read by anyone. Also, I tested the patch and it works fine for the original SLS load. However, I cannot get it to run for the SYNTH workload; it gets stuck before the RM starts (I cannot access the web UI). I uploaded the jstack output to: https://www.dropbox.com/s/tz5qaxy5qqt7j44/YARN-6363-jstack.001.txt?dl=0. And this is the command I'm using: {code} ./bin/slsrun.sh --tracetype=SYNTH --tracelocation=src/test/resources/syn.json --output-dir=/tmp/sls-out {code} I'm using {{hadoop-tools/hadoop-sls/src/main/sample-conf}} as HADOOP_CONF_DIR. Could you please share an example of how to run the SYNTH SLS? > Extending SLS: Synthetic Load Generator > --- > > Key: YARN-6363 > URL: https://issues.apache.org/jira/browse/YARN-6363 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-6363 overview.pdf, YARN-6363.v0.patch, > YARN-6363.v1.patch, YARN-6363.v2.patch, YARN-6363.v3.patch, > YARN-6363.v4.patch, YARN-6363.v5.patch, YARN-6363.v6.patch, > YARN-6363.v7.patch, YARN-6363.v9.patch > > > This JIRA tracks the introduction of a synthetic load generator in the SLS. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-2915: --- Labels: federation (was: ) > Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Labels: federation > Attachments: Federation-BoF.pdf, > FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, federation-prototype.patch, > Yarn_federation_design_v1.pdf, YARN-Federation-Hadoop-Summit_final.pptx > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising of tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6203) Occasional test failure in TestWeightedRandomRouterPolicy
[ https://issues.apache.org/jira/browse/YARN-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968097#comment-15968097 ] Carlo Curino commented on YARN-6203: Thanks [~subru] I committed this to branch YARN-2915. > Occasional test failure in TestWeightedRandomRouterPolicy > - > > Key: YARN-6203 > URL: https://issues.apache.org/jira/browse/YARN-6203 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: YARN-2915 >Reporter: Botong Huang >Assignee: Carlo Curino >Priority: Minor > Fix For: YARN-2915 > > Attachments: YARN-6203-YARN-2915.v0.patch, > YARN-6203-YARN-2915.v1.patch > > > Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.432 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy > testClusterChosenWithRightProbability(org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy) > Time elapsed: 7.437 sec <<< FAILURE! > java.lang.AssertionError: Id sc5 Actual weight: 0.00228 expected weight: > 0.0018106138 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy.testClusterChosenWithRightProbability(TestWeightedRandomRouterPolicy.java:118) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6203) Occasional test failure in TestWeightedRandomRouterPolicy
[ https://issues.apache.org/jira/browse/YARN-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-6203: --- Component/s: federation > Occasional test failure in TestWeightedRandomRouterPolicy > - > > Key: YARN-6203 > URL: https://issues.apache.org/jira/browse/YARN-6203 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: YARN-2915 >Reporter: Botong Huang >Assignee: Carlo Curino >Priority: Minor > Fix For: YARN-2915 > > Attachments: YARN-6203-YARN-2915.v0.patch, > YARN-6203-YARN-2915.v1.patch > > > Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.432 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy > testClusterChosenWithRightProbability(org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy) > Time elapsed: 7.437 sec <<< FAILURE! > java.lang.AssertionError: Id sc5 Actual weight: 0.00228 expected weight: > 0.0018106138 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.federation.policies.router.TestWeightedRandomRouterPolicy.testClusterChosenWithRightProbability(TestWeightedRandomRouterPolicy.java:118) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5617) AMs only intended to run one attempt can be run more than once
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5617: - Attachment: YARN-5617.003.patch Updated the patch. > AMs only intended to run one attempt can be run more than once > -- > > Key: YARN-5617 > URL: https://issues.apache.org/jira/browse/YARN-5617 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Attachments: YARN-5617.001.patch, YARN-5617.002.patch, > YARN-5617.003.patch > > > There are times when a user only wants to run an application with one > attempt. Examples would be cases where the second AM attempt is not prepared > to handle recovery or will accidentally corrupt state (e.g.: by re-executing > something from scratch that should not be). Prior to YARN-614 setting the > max attempts to 1 would guarantee the app ran at most one attempt, but now it > can run more than one attempt if the attempts fail due to a fault not > attributed to the application. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968035#comment-15968035 ] Wangda Tan commented on YARN-5881: -- [~sunilg], For the configuration of the absolute min/max resources, I suggest making it more compact, for example: {code} ${queue-path}.min/max-resource = {code} And in the future we can support per-resource-type capacities as percentages as well: {code} ${queue-path}.(maximum-)capacity = {code} To me this would be better than flattening all resource types as in your prototype patch. Thoughts? > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Wangda Tan > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968022#comment-15968022 ] Jason Lowe commented on YARN-3839: -- My understanding is the same. It looks like the existing cases when we throw it will already be covered by the NMToken or ContainerToken so we know whether the launch is valid or not. As Vinod pointed out we still need to keep the NMNotYetReadyException class around for compatibility with clients but the NM would stop throwing the exception. > Quit throwing NMNotYetReadyException > > > Key: YARN-3839 > URL: https://issues.apache.org/jira/browse/YARN-3839 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Karthik Kambatla > > Quit throwing NMNotYetReadyException when NM has not yet registered with the > RM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6471) Support to add min/max resource configuration for a queue
[ https://issues.apache.org/jira/browse/YARN-6471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968018#comment-15968018 ] Wangda Tan commented on YARN-6471: -- [~sunilg], thanks for working on the patch. Let's keep all configuration options in YARN-5881 and all implementation discussions in this one. I will post comments/suggestions on the configurations to YARN-5881 later. > Support to add min/max resource configuration for a queue > - > > Key: YARN-6471 > URL: https://issues.apache.org/jira/browse/YARN-6471 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-6471.001.patch > > > This jira will track the new configurations which are needed to configure min > resource and max resource of various resource types in a queue. > For eg: > {noformat} > yarn.scheduler.capacity.root.default.memory.min-resource > yarn.scheduler.capacity.root.default.memory.max-resource > yarn.scheduler.capacity.root.default.vcores.min-resource > yarn.scheduler.capacity.root.default.vcores.max-resource > {noformat} > Uploading a patch soon -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6040) Introduce api independent PendingAsk to replace usage of ResourceRequest within Scheduler classes
[ https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6040: - Fix Version/s: 2.9.0 > Introduce api independent PendingAsk to replace usage of ResourceRequest > within Scheduler classes > - > > Key: YARN-6040 > URL: https://issues.apache.org/jira/browse/YARN-6040 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-6040.001.patch, YARN-6040.002.patch, > YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch, > YARN-6040.006.patch, YARN-6040.007.patch, YARN-6040.branch-2.007.patch, > YARN-6040.branch-2.008.patch, YARN-6040.branch-2.009.patch > > > As mentioned by YARN-5906, currently schedulers are using ResourceRequest > heavily so it will be very hard to adopt the new PowerfulResourceRequest > (YARN-4902). > This JIRA is the 2nd step of refactoring, which remove usage of > ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead > of returning ResourceRequest, it returns a lightweight and API-independent > object - {{PendingAsk}}. > The only remained ResourceRequest API of AppSchedulingInfo will be used by > web service to get list of ResourceRequests. > So after this patch, usage of ResourceRequest will be isolated inside > AppSchedulingInfo, so it will be more flexible to update internal data > structure and upgrade old ResourceRequest API to new. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967982#comment-15967982 ] Manikandan R commented on YARN-6467: [~Naganarasimha], Yes, I went through the YARN-6195 changes and got some understanding of how usedCapacity and AbsoluteUsedCapacity are set for the default partition. CSQueueMetrics has a few more metrics - AMResourceLimitMB, AMResourceLimitVCores, usedAMResourceMB & usedAMResourceVCores - which are set from LeafQueue.java. Similar to YARN-6195, we will need to ensure that the other metrics set in LeafQueue.java are updated only for the default partition. Please validate and provide suggestions. > CSQueueMetrics needs to update the current metrics for default partition only > - > > Key: YARN-6467 > URL: https://issues.apache.org/jira/browse/YARN-6467 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > > As a followup to YARN-6195, we need to update existing metrics to only > default Partition. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
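A minimal sketch of the idea discussed above, assuming the YARN-6195 approach (this is not an actual YARN-6467 patch): LeafQueue restricts each CSQueueMetrics update to the default partition. The helper class and the commented call site are hypothetical; {{RMNodeLabelsManager.NO_LABEL}} is the existing constant for the default (empty) partition.
{code}
import org.apache.hadoop.yarn.server.resourcemanager.nodelabels.RMNodeLabelsManager;

// Hypothetical sketch, for illustration only: LeafQueue would wrap each
// CSQueueMetrics update (AMResourceLimit, usedAMResource, ...) in a
// default-partition check like this one, mirroring YARN-6195.
final class DefaultPartitionGuard {

  private DefaultPartitionGuard() {
  }

  /** True when the update refers to the default (empty) partition. */
  static boolean isDefaultPartition(String nodePartition) {
    return nodePartition == null
        || RMNodeLabelsManager.NO_LABEL.equals(nodePartition);
  }

  // Hypothetical call site inside LeafQueue (setter name invented):
  //
  //   if (DefaultPartitionGuard.isDefaultPartition(nodePartition)) {
  //     queueMetrics.setAMResourceLimit(amLimitForPartition);
  //   }
}
{code}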
[jira] [Commented] (YARN-6392) add submit time to Application Summary log
[ https://issues.apache.org/jira/browse/YARN-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967981#comment-15967981 ] Wangda Tan commented on YARN-6392: -- +1, patch LGTM. Thanks [~zxu] > add submit time to Application Summary log > -- > > Key: YARN-6392 > URL: https://issues.apache.org/jira/browse/YARN-6392 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.0.0-alpha2 >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Attachments: YARN-6392.000.patch > > > add submit time to Application Summary log, application submit time will be > passed to Application Master in env variable "APP_SUBMIT_TIME_ENV". It is a > very important parameter, So it will be useful to log it in Application > Summary. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6479) TestDistributedShell.testDSShellWithoutDomainV1_5 fails
[ https://issues.apache.org/jira/browse/YARN-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-6479: -- Affects Version/s: 2.8.0 > TestDistributedShell.testDSShellWithoutDomainV1_5 fails > --- > > Key: YARN-6479 > URL: https://issues.apache.org/jira/browse/YARN-6479 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Eric Badger > > {noformat} > java.lang.AssertionError: expected:<2> but was:<0> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:385) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV1_5(TestDistributedShell.java:236) > {noformat} > This particular run was in 2.8, but may also be present through trunk. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6479) TestDistributedShell.testDSShellWithoutDomainV1_5 fails
Eric Badger created YARN-6479: - Summary: TestDistributedShell.testDSShellWithoutDomainV1_5 fails Key: YARN-6479 URL: https://issues.apache.org/jira/browse/YARN-6479 Project: Hadoop YARN Issue Type: Bug Reporter: Eric Badger {noformat} java.lang.AssertionError: expected:<2> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:385) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV1_5(TestDistributedShell.java:236) {noformat} This particular run was in 2.8, but may also be present through trunk. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967899#comment-15967899 ] Manikandan R commented on YARN-3839: [~jianhe], [~jlowe], [~kasha] I am trying to understand the changes required for this jira with the help of the YARN-3842 discussion/comments, and to see if I can contribute to this jira. My understanding is: ensure that NMNotYetReadyException is no longer thrown anywhere, and make sure other exceptions (for example, an invalid token exception) are thrown in situations like NM restarts; this can be validated by running the corresponding test cases. Since NMNotYetReadyException is no longer thrown, there is no use in keeping the blockNewContainerRequests atomic boolean variable either, because the start/stop/increase container methods depend on its value only to decide whether to throw NMNotYetReadyException. Hence, the corresponding setter and getter methods can also be removed. In addition, the corresponding test cases also need to be cleaned up. Currently, failed containers add up in case of InvalidToken and YARN exceptions (in the startContainers() method) and retries don't happen, as those are system errors; that behavior should be retained as is. Can you please validate this and provide suggestions? > Quit throwing NMNotYetReadyException > > > Key: YARN-3839 > URL: https://issues.apache.org/jira/browse/YARN-3839 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Karthik Kambatla > > Quit throwing NMNotYetReadyException when NM has not yet registered with the > RM. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6423) Queue metrics doesn't work for Fair Scheduler in SLS
[ https://issues.apache.org/jira/browse/YARN-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967852#comment-15967852 ] Yufei Gu commented on YARN-6423: Thanks [~miklos.szeg...@cloudera.com]! > Queue metrics doesn't work for Fair Scheduler in SLS > > > Key: YARN-6423 > URL: https://issues.apache.org/jira/browse/YARN-6423 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler-load-simulator >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6423.001.patch, YARN-6423.002.patch > > > Queue allocated memory and vcores always be 0. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6423) Queue metrics doesn't work for Fair Scheduler in SLS
[ https://issues.apache.org/jira/browse/YARN-6423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967843#comment-15967843 ] Miklos Szegedi commented on YARN-6423: -- +1 (non-binding) Thank you, [~yufeigu]. > Queue metrics doesn't work for Fair Scheduler in SLS > > > Key: YARN-6423 > URL: https://issues.apache.org/jira/browse/YARN-6423 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler-load-simulator >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6423.001.patch, YARN-6423.002.patch > > > Queue allocated memory and vcores always be 0. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-6478: --- Labels: newbie (was: ) > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > Labels: newbie > Attachments: YARN-6478-0.patch > > > Find a spelling mistake in FileSystemTimelineWriter.java > the "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3509) CollectorNodemanagerProtocol's authorization doesn't work
[ https://issues.apache.org/jira/browse/YARN-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967762#comment-15967762 ] Vrushali C commented on YARN-3509: -- cc [~varun_saxena] for security jiras > CollectorNodemanagerProtocol's authorization doesn't work > - > > Key: YARN-3509 > URL: https://issues.apache.org/jira/browse/YARN-3509 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, security, timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Labels: YARN-5355 > Attachments: YARN-3509.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2985) YARN should support to delete the aggregated logs for Non-MapReduce applications
[ https://issues.apache.org/jira/browse/YARN-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967748#comment-15967748 ] Jason Lowe commented on YARN-2985: -- Based on the description of this JIRA, I think there's some confusion here. Aggregated logs are deleted for non-MapReduce applications as long as the deletion service is running, whether that deletion service is hosted by the MapReduce job history server or somewhere else. That's why the proposed patch is so small -- it's simply reusing the same code the JHS is already running. The log deletion service looks at the remote log directory in HDFS. It doesn't filter the list of application logs it finds there based on whether it thinks the app is MapReduce or not; rather, it just treats them as generic applications. It happens to run in the MapReduce history server, but it is _not_ MapReduce-specific. If users don't want to run MapReduce applications but want to do log aggregation then they just need to run the MapReduce history server. They won't use it for MapReduce job history since there are no MapReduce jobs, but that server will perform aggregated log retention for *all* applications. Therefore this JIRA is really about adding the ability to relocate the aggregated log deletion service from the MapReduce job history server to the YARN timeline server. We don't want two of these things running in the cluster if someone has deployed the MapReduce history server and the YARN timeline server. That could lead to error messages in the logs as one of them goes to traverse/delete the logs just as the other is already deleting them. However we also don't want to just rip it out of the MapReduce history server and move it to the timeline server because the timeline server is still an optional server in YARN. So we either need a way for the user to specify where they want the deletion service to run, whether that's the legacy location in the MapReduce history server (since they aren't going to run a timeline server which is still an optional YARN server) or in the timeline server. Or we need to just declare the timeline server a mandatory server to run (at least for log aggregation support) and move it from one to the other. In addition the MapReduce history server supports dynamic refresh of the log deletion service configs, and it would be nice not to lose that ability when it is hosted in the timeline server. That could be a separate JIRA unless we're ripping it out of the JHS. If it can only run in the timeline server then we would lose refresh functionality unless that JIRA was completed. As for unit tests, I agree the existing tests for the deletion service cover the correctness of the service itself, so we just need unit tests for the timeline server and MapReduce JHS to verify each is starting the deletion service or not starting the service based on how the cluster is configured. > YARN should support to delete the aggregated logs for Non-MapReduce > applications > > > Key: YARN-2985 > URL: https://issues.apache.org/jira/browse/YARN-2985 > Project: Hadoop YARN > Issue Type: New Feature > Components: log-aggregation, nodemanager >Affects Versions: 2.8.0 >Reporter: Xu Yang >Assignee: Steven Rand > Attachments: YARN-2985-branch-2-001.patch > > > Before Hadoop 2.6, the LogAggregationService is started in NodeManager. But > the AggregatedLogDeletionService is started in mapreduce`s JobHistoryServer.
> Therefore, the Non-MapReduce application can aggregate their logs to HDFS, > but can not delete those logs. Need the NodeManager take over the function of > aggregated log deletion. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
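As a purely illustrative sketch of the "let the user pick where the deletion service runs" option discussed above (not part of the posted patch): the timeline server could start the existing AggregatedLogDeletionService only when a flag asks it to. The property name below is invented for illustration; only {{AggregatedLogDeletionService}} and the {{Configuration}} API are real.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService;

// Hypothetical sketch: conditionally host the aggregated-log deletion service
// in the timeline server. The config key is invented here for illustration and
// is not an actual YARN property.
public class TimelineServerLogDeletionStarter {

  // Hypothetical property name, for illustration only.
  static final String RUN_DELETION_IN_TIMELINE_SERVER =
      "yarn.log-aggregation.deletion-service.in-timeline-server";

  private AggregatedLogDeletionService deletionService;

  void maybeStartDeletionService(Configuration conf) {
    // Only one deletion service should run in the cluster: either here or in
    // the MapReduce JobHistoryServer, never both.
    if (conf.getBoolean(RUN_DELETION_IN_TIMELINE_SERVER, false)) {
      deletionService = new AggregatedLogDeletionService();
      deletionService.init(conf);
      deletionService.start();
    }
  }
}
{code}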
[jira] [Commented] (YARN-6445) Improve YARN-3926 performance with respect to SLS
[ https://issues.apache.org/jira/browse/YARN-6445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967407#comment-15967407 ] Hadoop QA commented on YARN-6445: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 41s{color} | {color:green} YARN-3926 passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 5m 5s{color} | {color:red} hadoop-yarn in YARN-3926 failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 7s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 15s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 43s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s{color} | {color:green} YARN-3926 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 4m 39s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 39s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 50 unchanged - 42 fixed = 50 total (was 92) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 41s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 16s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}106m 53s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6445 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12863224/YARN-6445-YARN-3926.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux b5d54a899376 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-3926 / f3dc4ca | | Default Java | 1.8.0_121 | | compile | https://builds.apache.org/job/PreCommit-YARN-Build/15620/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn.txt | | findb
[jira] [Commented] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967324#comment-15967324 ] ASF GitHub Bot commented on YARN-6478: -- GitHub user lingjinjiang opened a pull request: https://github.com/apache/hadoop/pull/212 [YARN-6478] Fix a spelling mistake Fix a spelling mistake in FileSystemTimelineWriter.java. The "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". https://issues.apache.org/jira/browse/YARN-6478 You can merge this pull request into a Git repository by running: $ git pull https://github.com/lingjinjiang/hadoop trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/212.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #212 commit a249a6ec6628955bb066f218cb7d30f455915bb4 Author: lingjinjiang Date: 2017-04-13T09:33:25Z [YARN-6478] Fix a spelling mistake > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > Attachments: YARN-6478-0.patch > > > Find a spelling mistake in FileSystemTimelineWriter.java > the "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinjiang Ling updated YARN-6478: Attachment: YARN-6478-0.patch > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > Attachments: YARN-6478-0.patch > > > Find a spelling mistake in FileSystemTimelineWriter.java > the "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
[ https://issues.apache.org/jira/browse/YARN-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinjiang Ling updated YARN-6478: Description: Find a spelling mistake in FileSystemTimelineWriter.java the "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". was: Find a spelling mistake in FileSystemTimelineWriter.java the "write*Summmary*EntityLogs" should be "write*Summary*EntityLogs". > Fix a spelling mistake in FileSystemTimelineWriter > -- > > Key: YARN-6478 > URL: https://issues.apache.org/jira/browse/YARN-6478 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling >Priority: Trivial > > Find a spelling mistake in FileSystemTimelineWriter.java > the "writeSummmaryEntityLogs" should be "writeSummaryEntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6478) Fix a spelling mistake in FileSystemTimelineWriter
Jinjiang Ling created YARN-6478: --- Summary: Fix a spelling mistake in FileSystemTimelineWriter Key: YARN-6478 URL: https://issues.apache.org/jira/browse/YARN-6478 Project: Hadoop YARN Issue Type: Bug Reporter: Jinjiang Ling Priority: Trivial Find a spelling mistake in FileSystemTimelineWriter.java the "write*Summmary*EntityLogs" should be "write*Summary*EntityLogs". -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6445) Improve YARN-3926 performance with respect to SLS
[ https://issues.apache.org/jira/browse/YARN-6445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-6445: Attachment: YARN-6445-YARN-3926.003.patch Thanks for the review [~templedf]! bq. I don't see why we need the ExtendedResources class. You don't use the none() method that I see, and I don't see where the unbounded() method is materially different from what's in Resources. The tests in TestResources run with different resource types. This, combined with the fact that Resources.NONE and Resources.UNBOUNDED are final static variables, means that depending on the ordering of the tests, Resources.NONE and Resources.UNBOUNDED may or may not have the entries for all the resource types. The ExtendedResources class just lets me create a new FixedValueResource every time. {quote} You left in some commented-out code: // ResourceInformation resourceInformation = ResourceInformation //.newInstance(numerator.getResourceInformation(resource)); // ResourceInformation tmp = //ResourceInformation.newInstance(rResourceInformation); You left in commented-out code: // tmp.setValue(value); ResourceInformation .copy(rResourceInformation, ret.getResourceInformation(resource)); ret.getResourceInformation(resource).setValue(value); // ret.setResourceInformation(resource, tmp); {quote} Doh! Removed all of these. {quote} I found this logic in divideAndCeil() to be surprisingly obtuse: ResourceInformation resourceInformation = ret.getResourceInformation(resource); ResourceInformation.copy(numerator.getResourceInformation(resource), resourceInformation); It would really help to add a comment that explains what's happening. In general, I'm finding the idiom of getting the RI into a tmp object and then copying the RI from the source into the tmp RI to be obtuse. Can you add a wrapper method that does the same thing but labels it with an obvious name? Or maybe just do what you do in some places: ResourceInformation .copy(rResourceInformation, ret.getResourceInformation(resource)); {quote} Fair point. I've removed the copy calls and cleaned up the code a bit. {quote} Resource In newInstance() you have this: public static Resource newInstance(Resource resource) \{ Resource ret = Resource.newInstance(0, 0); for (Map.Entry entry : resource.getResources() .entrySet()) \{ try \{ ResourceInformation .copy(entry.getValue(), ret.getResourceInformation(entry.getKey())); } catch (YarnException ye) \{ ret.setResourceInformation(entry.getKey(), ResourceInformation.newInstance(entry.getValue())); } } return ret; } Since you know that ret only has CPU and memory, this code seems odd to me. Maybe add a comment to be clear about what you're doing so no one's confused. {quote} ret doesn't have only memory and cpu - it will have as many resources as are set up on the RM. However, this code also got cleaned up as part of the copy cleanup. Let me know if it still needs a comment. {quote} ResourcePBImpl In getResources(), I don't see the reason to remove the unmodifiable wrapper. {quote} A couple of reasons for this: 1. It was creating a lot of objects that were never used. 2. The API is inconsistent - I can get the ResourceInformation for a specific type and modify it, but I can't get the ResourceInformation for all the types to modify them. {quote} There are a lot of places where Resource.getResourceInformation() is called in the context of a try-catch that just wraps the exception in an IllegalArgumentException and rethrows it.
It's out of scope for this JIRA, but I think that's the wrong thing to do. The correct approach is what you do in Resource.newInstance(), i.e. treat it like what it is: a missing resource, and move on. {quote} Hmm - let me think about this and get back to you. > Improve YARN-3926 performance with respect to SLS > - > > Key: YARN-6445 > URL: https://issues.apache.org/jira/browse/YARN-6445 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-6445-YARN-3926.001.patch, > YARN-6445-YARN-3926.002.patch, YARN-6445-YARN-3926.003.patch > > > As part of the SLS runs on YARN-3926, we discovered a bunch of bottle
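To make the two points in the discussion above concrete - building a fresh unbounded resource per call instead of sharing a static final constant, and copying resource information while treating an unknown resource type as missing rather than as an error - here is a minimal, self-contained sketch. The class and method names (Res, ResInfo, newUnbounded, copyOf) are hypothetical stand-ins, not the actual YARN Resource/ResourceInformation API.
{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the YARN-3926 resource types; the names and
// signatures below are illustrative only, not the actual Hadoop API.
class ResInfo {
  final String name;
  long value;
  ResInfo(String name, long value) { this.name = name; this.value = value; }
  static ResInfo copyOf(ResInfo src) { return new ResInfo(src.name, src.value); }
}

class Res {
  final Map<String, ResInfo> resources = new HashMap<>();

  void set(String type, long value) { resources.put(type, new ResInfo(type, value)); }

  Map<String, ResInfo> view() { return Collections.unmodifiableMap(resources); }

  // Build a fresh "unbounded" resource per call. A shared static final
  // constant could be initialized before all resource types are registered,
  // so its entries would depend on initialization (or test) ordering.
  static Res newUnbounded(Iterable<String> knownTypes) {
    Res r = new Res();
    for (String type : knownTypes) {
      r.set(type, Long.MAX_VALUE);
    }
    return r;
  }

  // Mirrors the shape of the quoted newInstance(Resource) code: start from a
  // resource that only knows the mandatory types, then copy every entry from
  // the source. A type the target does not know about is treated as missing
  // and simply added, rather than being surfaced as an error.
  static Res copyOf(Res src) {
    Res ret = new Res();
    ret.set("memory-mb", 0);
    ret.set("vcores", 0);
    for (Map.Entry<String, ResInfo> e : src.resources.entrySet()) {
      ResInfo existing = ret.resources.get(e.getKey());
      if (existing != null) {
        existing.value = e.getValue().value;                          // known type: copy the value
      } else {
        ret.resources.put(e.getKey(), ResInfo.copyOf(e.getValue()));  // unknown type: just add it
      }
    }
    return ret;
  }
}

public class ResourceCopySketch {
  public static void main(String[] args) {
    Res src = new Res();
    src.set("memory-mb", 4096);
    src.set("vcores", 4);
    src.set("gpu", 2); // a type an early-initialized constant might not know about

    Res copy = Res.copyOf(src);
    System.out.println(copy.view().keySet());      // all three types survive the copy

    Res unbounded = Res.newUnbounded(Arrays.asList("memory-mb", "vcores", "gpu"));
    System.out.println(unbounded.view().size());   // 3 - reflects whatever types were passed in
  }
}
{code}
The fresh-instance approach trades a little extra allocation for deterministic contents, which is the trade-off argued for with ExtendedResources above; the copy loop shows one way the "missing resource" case could be handled without wrapping it in an exception.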
[jira] [Updated] (YARN-6419) Support to launch native-service deployment from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-6419: --- Attachment: YARN-6419.004.patch > Support to launch native-service deployment from new YARN UI > > > Key: YARN-6419 > URL: https://issues.apache.org/jira/browse/YARN-6419 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Reporter: Akhil PB >Assignee: Akhil PB > Attachments: YARN-6419.001.patch, YARN-6419.002.patch, > YARN-6419.003.patch, YARN-6419.004.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org